Nearest Neighbor is a simple classification technique that nevertheless performs quite well, although it has both advantages and disadvantages. K-NN uses all of the training data to perform classification, which makes prediction very slow on large datasets. An alternative approach is to compute the mean of each class and assign a test point to the class whose mean is closest. This is much faster, but the result is usually less satisfactory because the model only forms a linear hyperplane midway between the two class means. The more training data is used, the smoother the resulting decision boundary becomes, so there is a trade-off between the amount of training data, the computational cost, and the quality of the resulting decisions.
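The following is a minimal sketch (the toy two-class data and variable names are only illustrative) contrasting plain nearest-neighbor prediction, which scans every training point, with the class-mean (nearest-centroid) shortcut described above.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0, 1, (100, 2)),      # class 0
                     rng.normal(3, 1, (100, 2))])     # class 1
y_train = np.array([0] * 100 + [1] * 100)
x_test = np.array([1.5, 1.5])

# 1) Nearest neighbor: a distance to *every* training point (slow for big data).
dists = np.linalg.norm(X_train - x_test, axis=1)
nn_label = y_train[np.argmin(dists)]

# 2) Nearest centroid: one distance per class (fast, but the decision boundary
#    is just the linear hyperplane midway between the two class means).
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])
centroid_label = np.argmin(np.linalg.norm(centroids - x_test, axis=1))

print("nearest neighbor:", nn_label, "| nearest centroid:", centroid_label)
```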
The Nearest Neighbor or K-NN algorithm does not assign a weight to each feature, unlike an Artificial Neural Network (ANN), which tries to push the weights of features that do not contribute to the classification toward 0. Because K-NN falls into the lazy learning category, it simply stores some or all of the training data and involves almost no training process: training is extremely fast because there is essentially nothing to do, but prediction is very slow.
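A small sketch of this lazy-learning behavior (the class name and K value are illustrative, not from any particular library): fit() only memorizes the data, while predict_one() does all the distance work, which is why training is instant and prediction is the expensive step.

```python
import numpy as np
from collections import Counter

class SimpleKNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" is just memorizing the data; no per-feature weights are
        # learned, unlike an ANN.
        self.X, self.y = np.asarray(X), np.asarray(y)
        return self

    def predict_one(self, x):
        # All the cost lands here: a distance to every stored sample.
        d = np.linalg.norm(self.X - x, axis=1)
        nearest = self.y[np.argsort(d)[:self.k]]
        return Counter(nearest).most_common(1)[0][0]

model = SimpleKNN(k=3).fit([[0, 0], [0, 1], [5, 5], [6, 5]], [0, 0, 1, 1])
print(model.predict_one(np.array([5.5, 5.0])))   # -> 1
```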
The trick is to determine the most appropriate value of K. Since K-NN in principle chooses the nearest neighbors, the distance metric must also be chosen to suit the data at hand: Euclidean distance is well suited when the shortest (straight-line) distance between two samples is what matters, while Manhattan distance is often a better choice when the data contain outliers.
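One common way to settle both choices is cross-validation. Below is a hedged sketch (assuming scikit-learn and its built-in iris dataset, neither of which is mentioned above) that searches over K and over the metric; in KNeighborsClassifier, p=2 gives Euclidean distance and p=1 gives Manhattan.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

best = None
for k in (1, 3, 5, 7, 9):
    for p, name in ((2, "euclidean"), (1, "manhattan")):
        # Mean 5-fold cross-validated accuracy for this (K, metric) pair.
        score = cross_val_score(KNeighborsClassifier(n_neighbors=k, p=p),
                                X, y, cv=5).mean()
        if best is None or score > best[0]:
            best = (score, k, name)

print(f"best accuracy {best[0]:.3f} with K={best[1]} and {best[2]} distance")
```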