2009/02/28
2009/02/27
How does noise affect generalization
2009/02/26
2009/02/25
2009/02/24
Preprocessing samples before giving them to the learning machines
A large body of work exists on preprocessing samples before giving them to the learning machines:
· Remove noise
o Algorithms for detecting noisy samples based on k-NN algorithms.
· Add noise
o Adding a small amount of noise gives better performance in neural networks (and maybe also in other algorithms).
· Restructure the dimensionality and distance metric.
o Mahalanobis distance.
o Scaling the data: this gives an improvement in SVMs.
o Kernels: implicitly increase the number of dimensions.
o Genetic kernel (GK SVM)
o Removing features:
§ Removing dimensions (feature selection)
· information gain (the best)
· mutual information
· χ² (chi-squared) statistic (second best)
· term strength
§ Principal component analysis.
§ Neighbourhood components analysis.
· Re-sampling:
o Under-sampling:
§ Randomly
§ Inconsistent data
§ Duplicate data
§ Removing noise (again)
o Over-sampling:
§ Randomly
§ SMOTE
§ Borderline-SMOTE1
§ Borderline-SMOTE2
§ Adding noise (again)
§ Give more weight to hard samples.
· Windowed data:
o In some cases, context information increases the accuracy.
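
A few of the items in the list above can be sketched in plain Python to make them concrete. This is only a minimal illustration under my own assumptions, not the method of any particular paper: the function names (`scale_min_max`, `knn_noise_filter`, `smote_like`, `windowed`), the Euclidean distance, and the default values of `k` are all hypothetical choices. The noise filter keeps a sample only when the majority of its k nearest neighbours share its label, and the over-sampler follows the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random
from collections import Counter


def scale_min_max(X):
    """Rescale every feature to [0, 1]; constant features map to 0."""
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in X
    ]


def nearest(X, i, k):
    """Indices of the k samples closest to X[i] (Euclidean), excluding i."""
    dists = sorted((math.dist(X[i], xj), j) for j, xj in enumerate(X) if j != i)
    return [j for _, j in dists[:k]]


def knn_noise_filter(X, y, k=3):
    """Indices of samples whose label matches the majority of their k neighbours."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(X, i, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return keep


def smote_like(X_minority, n_new, k=2, rng=None):
    """Create n_new synthetic minority points by SMOTE-style interpolation."""
    rng = rng or random.Random(0)  # fixed seed: an assumption for repeatability
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(X_minority))
        j = rng.choice(nearest(X_minority, i, k))
        gap = rng.random()  # position on the segment between the two points
        synthetic.append(
            [a + gap * (b - a) for a, b in zip(X_minority[i], X_minority[j])]
        )
    return synthetic


def windowed(seq, size):
    """Sliding windows, so each sample carries its neighbouring context."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]
```

In this sketch the noise filter is meant to run before training, and the over-sampler only on the minority class. Scaling first is usually sensible, because both functions rely on distances between samples.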