Naive Bayes Algorithm in Data Mining

Naive Bayes is a simple probabilistic prediction technique based on Bayes' theorem with strong (naive) independence assumptions. In other words, the model used in Naive Bayes is an independent feature model: given the class, the presence or absence of one feature in a record is assumed to be unrelated to the presence or absence of any other feature in the same record. Take, for example, animal classification with the features skin cover, childbirth, weight, and lactation. In the real world, animals that give birth to live young also breastfeed them. The lactation feature therefore depends on the childbirth feature, because animals that breastfeed usually give birth, while animals that lay eggs usually do not breastfeed. Naive Bayes ignores this dependency and treats each feature as if it had no relationship to the others. Prediction in Naive Bayes is based on Bayes' theorem, which has the following formula.

P(H | E) = (P(E | H) × P(H)) / P(E)

Where:
P(H | E) is the posterior probability: the probability that hypothesis H holds given that evidence E has occurred.
P(E | H) is the likelihood: the probability that evidence E occurs given that hypothesis H holds.
P(H) is the prior probability of hypothesis H, regardless of any evidence.
P(E) is the prior probability of evidence E, regardless of any hypothesis or other evidence.
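
The formula translates almost directly into code. The following is a minimal sketch, not a reference implementation: the function name `bayes_posterior` and the numbers passed to it are hypothetical and chosen only to exercise the formula.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Return P(H | E) = P(E | H) * P(H) / P(E)."""
    # P(E) appears in the denominator, so it must be strictly positive.
    if not 0.0 < evidence <= 1.0:
        raise ValueError("P(E) must be in (0, 1]")
    for name, p in (("P(E | H)", likelihood), ("P(H)", prior)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return likelihood * prior / evidence


# Hypothetical inputs: P(E | H) = 0.5, P(H) = 0.2, P(E) = 0.25.
posterior = bayes_posterior(likelihood=0.5, prior=0.2, evidence=0.25)
print(posterior)
```

With these made-up inputs the posterior comes out at about 0.4, twice the prior of 0.2, because the evidence is twice as likely under H as it is overall.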

The basic idea of Bayes' rule is that the outcome of a hypothesis or event (H) can be estimated from some observed evidence (E). Two points about Bayes' rule are important:
1. The prior probability of H, P(H), is the probability of the hypothesis before the evidence is observed.
2. The posterior probability of H, P(H | E), is the probability of the hypothesis after the evidence is observed.
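
To see how a prior turns into a posterior when several features are observed, the sketch below combines Bayes' rule with the independence assumption from the introduction, using the childbirth and lactation features of the animal example. The class labels `mammal` and `non_mammal`, the function name, and every probability are assumptions invented for illustration; none of them comes from real data.

```python
from math import prod

# Hypothetical priors and per-feature likelihoods, invented for illustration.
priors = {"mammal": 0.5, "non_mammal": 0.5}
likelihoods = {
    "mammal": {"gives_birth": 0.9, "lactates": 0.95},
    "non_mammal": {"gives_birth": 0.1, "lactates": 0.05},
}


def naive_bayes_posteriors(observed, priors, likelihoods):
    """Return P(class | observed features) under the naive independence assumption."""
    # Unnormalized score: P(class) times the product of P(feature | class).
    # Multiplying the per-feature terms is exactly the independence assumption.
    scores = {
        cls: priors[cls] * prod(likelihoods[cls][f] for f in observed)
        for cls in priors
    }
    total = sum(scores.values())  # plays the role of P(E)
    return {cls: score / total for cls, score in scores.items()}


posteriors = naive_bayes_posteriors(["gives_birth", "lactates"], priors, likelihoods)
print(posteriors)
```

Both classes start at a prior of 0.5; after the two features are observed, nearly all of the probability mass moves to `mammal`. Because the two features are strongly related in reality, treating them as independent counts similar evidence twice, which is the simplification the method accepts.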

For example, in weather forecasting, one factor that affects the occurrence of rain is cloudy weather. Applied in Naive Bayes, the probability of rain given that cloudy evidence has been observed is expressed as:

P(Rain | Cloudy) = (P(Cloudy | Rain) × P(Rain)) / P(Cloudy)

P(Rain | Cloudy) is the probability of the rain hypothesis given that cloudy evidence has been observed. P(Cloudy | Rain) is the probability that cloudy weather is observed when it rains. P(Rain) is the prior probability of rain regardless of any evidence, while P(Cloudy) is the prior probability of cloudy weather.
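
The rain example can be worked through numerically. This is a sketch under stated assumptions: the three input probabilities are hypothetical, not real weather statistics, and P(Cloudy) is derived from them with the law of total probability rather than measured directly.

```python
# Hypothetical probabilities, not taken from real weather data.
p_rain = 0.3               # P(Rain): prior probability of rain
p_cloudy_given_rain = 0.9  # P(Cloudy | Rain): likelihood
p_cloudy_given_dry = 0.4   # P(Cloudy | No rain)

# P(Cloudy) by the law of total probability over "rain" and "no rain".
p_cloudy = p_cloudy_given_rain * p_rain + p_cloudy_given_dry * (1 - p_rain)

# Posterior: probability of rain once cloudy weather has been observed.
p_rain_given_cloudy = p_cloudy_given_rain * p_rain / p_cloudy

print(f"P(Cloudy) = {p_cloudy:.3f}")
print(f"P(Rain | Cloudy) = {p_rain_given_cloudy:.3f}")
```

Under these assumed numbers, observing clouds raises the probability of rain from the prior of 0.3 to a posterior of roughly 0.49.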