Gaussian Naive Bayes Classifier.
GNB is a probabilistic classifier that relies on Bayes' rule to estimate the posterior probabilities of labels given the data. Its naive assumption is the independence of the features, which allows per-feature likelihoods to be combined by a simple product across the "independent" features. See http://en.wikipedia.org/wiki/Naive_bayes for more information.
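The estimation and decision rule can be sketched in a few lines of NumPy. This is a minimal illustration of the technique, not this library's actual implementation; the function names gnb_fit and gnb_predict are made up for the example. The product of per-feature Gaussian likelihoods is computed as a sum of log-densities:

```python
import numpy as np

def gnb_fit(X, y):
    # Per-class means, variances, and priors (hypothetical helper, for illustration)
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, priors, variances

def gnb_predict(X, classes, means, priors, variances):
    # The "naive" product of per-feature Gaussian likelihoods becomes a
    # sum of per-feature log-densities in the log domain
    log_lik = np.stack([
        -0.5 * (np.log(2 * np.pi * v) + (X - m) ** 2 / v).sum(axis=1)
        for m, v in zip(means, variances)
    ], axis=1)
    # Pick the class maximizing the (unnormalized) log-posterior
    return classes[np.argmax(log_lik + np.log(priors), axis=1)]
```

On two well-separated Gaussian blobs this recovers the labels almost perfectly.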
The implementation provided here is somewhat "naive" in its own right – various aspects could be improved – but it has its own advantages:
GNB is listed as both a linear and a non-linear classifier, since the shape of the separating boundary depends on the data and/or parameters: linear separation is achieved whenever the samples are balanced (or prior='uniform') and the features have the same variance across the different classes (i.e. when common_variance=True is set to enforce this).
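The linearity claim can be checked numerically: with a shared variance and equal priors, the quadratic terms in the two classes' log-likelihoods cancel, so the log-posterior difference reduces to a linear function of x. A small sketch (the variable names are illustrative, not part of the library's API):

```python
import numpy as np

# Two class means and one variance vector shared by both classes,
# as enforced by common_variance=True
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
var = np.array([1.5, 0.5])

def log_lik(x, mu):
    # Per-feature Gaussian log-likelihood, summed (up to a constant)
    return -0.5 * ((x - mu) ** 2 / var).sum()

def g(x):
    # Discriminant: log p(x|c1) - log p(x|c0); the x**2 terms cancel
    return log_lik(x, mu1) - log_lik(x, mu0)

# g coincides everywhere with the linear function w.x + b
w = (mu1 - mu0) / var
b = g(np.zeros(2))
for x in [np.array([1.0, -2.0]), np.array([-3.0, 4.0])]:
    assert np.isclose(g(x), w @ x + b)
```

With per-class variances the quadratic terms no longer cancel and the boundary becomes a quadric.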
Whenever decisions are made based on log-probabilities (parameter logprob=True, which is the default), the values conditional attribute, if enabled, also contains log-probabilities. Note also that normalization by the evidence P(data) is disabled by default, since it has no impact on the classification decision per se. Set the normalize parameter to True if you want properly scaled probabilities in the values conditional attribute.
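Why normalization does not affect the decision can be seen in a short sketch: dividing by the evidence rescales all class scores by the same amount, so the argmax is unchanged. The joint log-probabilities below are made-up numbers for illustration:

```python
import numpy as np

# Hypothetical joint log-probabilities log P(x|c) + log P(c) for 3 classes
log_joint = np.array([-10.2, -3.7, -8.1])

# Evidence P(data) = sum_c P(x|c) P(c); in the log domain this is a
# logsumexp, subtracted before exponentiating to get proper posteriors
log_evidence = np.logaddexp.reduce(log_joint)
posteriors = np.exp(log_joint - log_evidence)

# Properly scaled: posteriors sum to 1
assert np.isclose(posteriors.sum(), 1.0)
# ...but the winning class is the same with or without normalization
assert np.argmax(posteriors) == np.argmax(log_joint)
```

This also shows why logprob=True is preferable: the subtraction happens in the log domain, avoiding underflow from exponentiating very small likelihoods.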
Notes
Available conditional attributes:
(Conditional attributes enabled by default are suffixed with +)
Initialize a GNB classifier.
Parameters:
  common_variance : bool
    If True, enforce the same variance for each feature across all classes.
  prior : str
    How class prior probabilities are computed (e.g. prior='uniform'; see note above).
  logprob : bool
    If True (the default), operate on log-probabilities.
  normalize : bool
    If True, normalize (log-)probabilities by the evidence P(data).
  enable_ca : None or list of str
  disable_ca : None or list of str
  auto_train : bool
  force_train : bool
  space : str, optional
  postproc : Node instance, optional
  descr : str
- Means of features per class
- Class probabilities
- Labels the classifier was trained on
- Variances per class (so named because "vars" is taken ;))