Jobs related to data mining can be divided into four groups, namely prediction modeling, cluster analysis, association analysis, anomaly detection.Data Mining is a set of processes to add value in the form of the information for this unknown manually from a database by doing the digging patterns of data in order to manipulate the data into information more valuable is obtained by means of extracting and recognize the important or interesting patterns from the data contained in the database.
For example, the work to detect the type of patient illness is based on the number of parameters of the illness suffered by the type of classification because here the expected target is discrete, only some kind of possible target value is obtained, no time value must be obtained to get the final target value . Meanwhile, the prediction job of the number of sales earned in the next three months includes the regression because to get the third month sales value, the second month sales value must be obtained and to get the second month sales value, the first month sales value must be obtained. Here there is a time series value that must be calculated to arrive at the desired final target, there is a continuous value that must be calculated to get the desired final target value.
Cluster analysis performs grouping of data into a number of existing groups. data that fall within the limits of similarity with the group will join the group, and will be separated in different groups if out of line with the group.
The closest adoption to daily life is the analysis of shopping cart data. for example, a buyer is a housewife who will buy household goods in a supermarket. if the mother buys rice, it is very likely that she will also buy other items, such as oil, eggs, and it is not possible or rarely buy other items such as hats or books. by knowing the stronger relationship between rice and eggs than rice with hats, retailers can determine what items should be available.
Prediction Modeling
The prediction model relates to the creation of a model that can map from each set of variables to each target, then use the model to assign target values to the new set of results. there are two types of prediction models, namely classification and regression. classification is used for discrete target variables, while regression for continuous target variables.For example, the work to detect the type of patient illness is based on the number of parameters of the illness suffered by the type of classification because here the expected target is discrete, only some kind of possible target value is obtained, no time value must be obtained to get the final target value . Meanwhile, the prediction job of the number of sales earned in the next three months includes the regression because to get the third month sales value, the second month sales value must be obtained and to get the second month sales value, the first month sales value must be obtained. Here there is a time series value that must be calculated to arrive at the desired final target, there is a continuous value that must be calculated to get the desired final target value.
Cluster analysis
Examples of work related to cluster analysis is how to know the pattern of purchasing goods by consumers at certain times. By knowing the pattern of the purchasing group, the company or retailer can determine the promotional schedule that can be given so that sales turnover can be improved.Cluster analysis performs grouping of data into a number of existing groups. data that fall within the limits of similarity with the group will join the group, and will be separated in different groups if out of line with the group.
Association analysis
Association analysis is used to find patterns that describe the strength of feature relationships in the data. The pattern found usually represents the form of the implication rule or feature subset. The goal is to find an attractive pattern in an efficient way.The closest adoption to daily life is the analysis of shopping cart data. for example, a buyer is a housewife who will buy household goods in a supermarket. if the mother buys rice, it is very likely that she will also buy other items, such as oil, eggs, and it is not possible or rarely buy other items such as hats or books. by knowing the stronger relationship between rice and eggs than rice with hats, retailers can determine what items should be available.
Anomaly detection
The anomaly detection work is concerned with observing a data from a number of data that has significantly different characteristics than the rest of the data. data whose characteristics are distorted or different from other data are called outliers. A good anomaly detection algorithm should have a high detection rate and a low error lane. anomaly detection can be applied to the network system to know the pattern of data entering the network so that intrusions can be found if the data work patterns that come different. the behavior of anomalous weather conditions can also be detected by this algorithm.
Advertisement
No comments