Classification is the task of assigning a data object to one of a number of available classes. It involves two main tasks: building a model, which is stored as a kind of memory, and using that model to recognize/classify/predict the class of another data object, so that the object's class can be determined from the stored model. A common example is classifying animal species, each described by a number of attributes; given these attributes, the class of a new animal can be determined immediately. Another example is diagnosing melanoma skin cancer: a model is built from existing training data and then used to assess a new patient's condition, so it can be determined whether or not the patient has cancer.
The Model of the Classification Concept
Classification can be defined as the task of training/learning a target function f that maps each set of attributes x to one of a number of available class labels y. The training process produces a model, which is then stored as memory.
The model in classification behaves like a black box: it receives an input, performs some computation on that input, and returns an answer as its output.
A model built during training can then be used to predict the class labels of new, unseen data. Building the model during the training process requires an algorithm, called a training algorithm. Many training algorithms have been developed by researchers, such as K-Nearest Neighbor, Artificial Neural Network, and Support Vector Machine. Each algorithm has its advantages and disadvantages, but they all share the same principle: perform training such that, at the end of training, the model can map each input vector to the correct output class label.
Induction is the step of building a classification model from the training data provided, also called the training process, while deduction is the step of applying that model to test data so that the class of the test data can be determined, also called the prediction process.
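The induction/deduction split above can be sketched in code. The following is a minimal illustration, not any particular algorithm from this article: it uses a hypothetical one-feature threshold classifier, where "induction" derives a single threshold from labeled data and "deduction" applies it to unseen values.

```python
def induce(training_data):
    """Induction: build a model (here, just one threshold) from labeled data.
    training_data: list of (feature_value, class_label) pairs, labels 0 or 1."""
    zeros = [x for x, y in training_data if y == 0]
    ones = [x for x, y in training_data if y == 1]
    # Place the threshold midway between the two class means
    # (an assumption made for illustration, not a general training rule).
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def deduce(model, test_features):
    """Deduction: apply the stored model to new, unseen data."""
    return [1 if x >= model else 0 for x in test_features]

train = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
model = induce(train)              # induction / training step
print(deduce(model, [1.2, 4.8]))   # deduction / prediction step -> [0, 1]
```

Whatever the algorithm, the interface is the same: induction consumes training data and emits a model; deduction consumes the model plus test data and emits class labels.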
Based on how they train, classification algorithms can be divided into two kinds: eager learners and lazy learners. Algorithms in the eager-learner category are designed to read/train/learn on the training data so that each input vector is mapped correctly to its output class label; by the end of the training process, the model should also map test vectors to the correct class labels. Once training is complete, the model (in the form of weights or some other set of parameter values) is stored as memory, while all of the training data is discarded. Prediction is performed using only the stored model, without involving the training data at all. This makes the prediction process fast, but the price is a long training process. Algorithms in this category include Artificial Neural Network, Support Vector Machine, Decision Tree, Bayesian classifiers, and others.
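A minimal sketch of the eager-learner idea follows. A nearest-centroid classifier is used here purely as an illustration (it is not one of the algorithms named above): training produces compact parameters, one centroid per class, after which the training data itself can be thrown away.

```python
def train_centroids(samples, labels):
    """Eager training: reduce the data to a {label: centroid} model."""
    centroids = {}
    for label in set(labels):
        pts = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return centroids

def predict(centroids, point):
    """Prediction uses only the stored centroids, never the training data."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], point))

X = [(1, 1), (1, 2), (6, 6), (7, 6)]
y = ["cat", "cat", "dog", "dog"]
model = train_centroids(X, y)
del X, y                        # training data discarded; only the model remains
print(predict(model, (0, 1)))   # -> cat
print(predict(model, (6, 7)))   # -> dog
```

Note where the cost falls: all the data scanning happens in `train_centroids`, while each prediction touches only a handful of stored parameters.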
Algorithms in the lazy-learner category, by contrast, do little or no training at all: they simply store some or all of the training data and use it during the prediction process. This makes prediction slow, because the model must re-read the stored training data in order to produce the correct class label for each test instance. The advantage of such algorithms is a very fast training process. Algorithms in this category include K-Nearest Neighbor, Fuzzy K-Nearest Neighbor, and others.
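The lazy-learner trade-off can be sketched with a bare-bones 1-nearest-neighbor classifier: "training" merely memorizes the data, and every prediction scans the whole stored training set.

```python
class OneNN:
    """A minimal 1-nearest-neighbor classifier (illustration only)."""

    def fit(self, samples, labels):
        # Lazy "training": memorize the data, build nothing else.
        self.samples = list(samples)
        self.labels = list(labels)
        return self

    def predict(self, point):
        # Each prediction re-reads the entire stored training set.
        def dist2(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        best = min(range(len(self.samples)),
                   key=lambda i: dist2(self.samples[i], point))
        return self.labels[best]

clf = OneNN().fit([(1, 1), (2, 1), (8, 8)], ["spam", "spam", "ham"])
print(clf.predict((1, 2)))   # -> spam
print(clf.predict((7, 9)))   # -> ham
```

Compared with the eager sketch above, the costs are inverted: `fit` is almost free, while `predict` does work proportional to the size of the training set.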