C4.5 is an algorithm for building decision trees based on the criteria of decision makers. The decision tree is a powerful and well-known method of classification and prediction. It turns a very large collection of facts into a tree that represents rules, and these rules can be easily understood in natural language. They can also be expressed in database form, such as Structured Query Language (SQL), to search for records in certain categories. Decision trees are also useful for exploring data and finding hidden relationships between a number of potential input variables and a target variable. Because decision trees combine data exploration and modeling, they are an excellent first step in the modeling process, even when the final model is built with some other technique.
A decision tree is a structure that can be used to divide a large dataset into smaller sets of records by applying a series of decision rules. With each successive split, the members of the resulting sets become more similar to one another. A decision tree contains several elements:
- Root
- Node
- Relationship
When solving a problem with the C4.5 algorithm, there are two quantities that must be understood:
- Entropy
- Gain
In general, the C4.5 algorithm builds a decision tree as follows.
- Select an attribute as the root
- Create a branch for each value of that attribute
- Divide the cases among the branches
- Repeat the process for each branch until all the cases in the branch have the same class
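The steps above can be sketched as a short recursive procedure. This is a minimal illustration for nominal attributes only; the function and variable names and the toy dataset are my own, and real C4.5 additionally handles numeric attributes, the gain ratio criterion, and pruning:

```python
import math
from collections import Counter

def entropy(rows, target):
    """Entropy of the class distribution among a set of cases."""
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain(rows, attr, target):
    """Information gain of splitting the cases on one attribute."""
    total = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder

def build_tree(rows, attrs, target):
    classes = {r[target] for r in rows}
    if len(classes) == 1:              # all cases in the branch share a class: leaf
        return classes.pop()
    if not attrs:                      # no attribute left to split on: majority class
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    root = max(attrs, key=lambda a: gain(rows, a, target))   # highest-gain attribute
    rest = [a for a in attrs if a != root]
    return {root: {v: build_tree([r for r in rows if r[root] == v], rest, target)
                   for v in {r[root] for r in rows}}}

# Hypothetical toy data: whether to play, by outlook and wind.
cases = [
    {"outlook": "sunny",    "windy": False, "play": "no"},
    {"outlook": "sunny",    "windy": True,  "play": "no"},
    {"outlook": "overcast", "windy": False, "play": "yes"},
    {"outlook": "overcast", "windy": True,  "play": "yes"},
    {"outlook": "rainy",    "windy": False, "play": "yes"},
    {"outlook": "rainy",    "windy": True,  "play": "no"},
]
tree = build_tree(cases, ["outlook", "windy"], "play")
```

Here "outlook" wins the first split because two of its three values already produce pure branches; only the rainy branch needs a further split on "windy".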
To select the attribute to use as the root, the gain values of the existing attributes are compared and the highest is chosen. Gain is calculated using the following equation.

Gain(S, A) = Entropy(S) - Σ (|Si| / |S|) × Entropy(Si), summed over i = 1 to n

Information:
S: The set of cases
A: An attribute
n: Number of partitions of attribute A
|Si|: Number of cases in partition i
|S|: Number of cases in S
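As a numeric illustration of the gain equation (the counts are a hypothetical 14-case set S with 9 positive and 5 negative cases, split by a two-valued attribute A), the terms can be evaluated directly:

```python
import math

def entropy(counts):
    """Entropy from class counts: -sum of (c/total) * log2(c/total)."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

# Hypothetical set S: 14 cases, 9 "yes" and 5 "no".
# Attribute A has n = 2 partitions: S1 with 8 cases (6 yes, 2 no)
# and S2 with 6 cases (3 yes, 3 no).
gain_A = entropy([9, 5]) - (8/14 * entropy([6, 2]) + 6/14 * entropy([3, 3]))
```

Each partition's entropy is weighted by |Si| / |S| before being subtracted from the entropy of the whole set, so an attribute that produces purer partitions yields a larger gain.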
Meanwhile, the entropy value is calculated with the following equation.

Entropy(S) = Σ -pi × log2(pi), summed over i = 1 to n

Information:
S: The set of cases
n: Number of partitions (classes) of S
pi: The proportion of Si to S
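A quick check of the entropy equation at its extremes (the proportions below are my own illustrative numbers): a pure set has entropy 0, an evenly split set has entropy 1, and mixed sets fall in between.

```python
import math

def entropy(proportions):
    """Entropy(S) = sum of -p_i * log2(p_i) over the class proportions."""
    return -sum(p * math.log2(p) for p in proportions if p)

pure  = entropy([1.0])         # one class only -> 0 bits
even  = entropy([0.5, 0.5])    # 50/50 split    -> 1 bit
mixed = entropy([9/14, 5/14])  # 9-vs-5 split   -> about 0.940 bits
```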