Random Forest
A Random Forest classifier is built with an algorithm developed by Leo Breiman and Adele Cutler. The classifier uses a large number of individual decision trees and decides the class by taking the mode (the most frequently occurring value) of the classes predicted by the individual trees. (Random Forest is a trademark of Leo Breiman and Adele Cutler.) The individual trees are constructed using the following algorithm:
- Let N be the number of cases in the training set and M the number of input variables. Choose m, the number of input variables used to determine the decision at each node of the tree; m should be much smaller than M.
- Draw a training set for this tree by sampling N cases with replacement from the original training set (a bootstrap sample).
- At each node of the tree, randomly select m of the M variables on which to base the decision at that node, and calculate the best split on these m variables using the tree's training set.
- Grow each tree fully and do not prune it (pruning is what would normally be done when constructing a single tree classifier).
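The steps above, together with the mode vote described earlier, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`grow_forest`, `grow_tree`, `best_split`, `predict_forest`), the use of Gini impurity as the split criterion, the midpoint thresholds, and the restriction to numeric variables are all choices made here for the example.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (variable, threshold) pair with the lowest weighted Gini
    impurity among the given variables, or None if none of them varies."""
    best, best_score = None, None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, m, rng):
    """Grow an unpruned tree; every node looks at m randomly chosen variables.
    Leaves are class labels, inner nodes are tuples (labels must not be tuples)."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node: stop
    M = len(rows[0])
    split = best_split(rows, labels, rng.sample(range(M), m))
    if split is None:
        # Simplification: if the sampled variables offer no split, make a leaf.
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], m, rng),
            grow_tree([r for r, _ in right], [y for _, y in right], m, rng))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def grow_forest(rows, labels, n_trees, m, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the N cases."""
    rng = random.Random(seed)
    N = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(N) for _ in range(N)]  # N draws with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], m, rng))
    return forest


def predict_forest(forest, row):
    """Return the mode of the classes predicted by the individual trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

A forest would be built with something like `forest = grow_forest(rows, labels, n_trees=25, m=1)` and queried with `predict_forest(forest, row)`, where `rows` is a list of equal-length numeric lists and `labels` is the matching list of class labels.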
The random forest classifier counts the following among its advantages:
- As of 2004, it was reported to be among the most accurate of the available classification algorithms.
- It handles a very large number of input variables.
- It can estimate the importance of each variable in determining the classification.
- It generates an internal, unbiased estimate of the generalization error (the out-of-bag estimate) as the forest is built.
- It includes a good method for estimating missing data and maintains accuracy when a large proportion of the data are missing.
- It provides an experimental method for detecting variable interactions.
- The algorithm runs quickly, so a forest of decision trees for the classifier can be produced fast.
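The internal error estimate mentioned in the list can be illustrated with a short sketch. Because each learner is trained on a bootstrap sample, some cases are left out of that sample, and those cases can be used to test the learner without a separate test set. The code below is an assumption-laden illustration rather than the authors' procedure: `oob_error` and `fit_stump` are hypothetical names, and `fit_stump` is a deliberately tiny stand-in for a fully grown tree (a single split on one randomly chosen numeric variable) so that the example stays self-contained.

```python
import random
from collections import Counter


def fit_stump(rows, labels, rng):
    """Stand-in base learner (NOT a full tree): one split on one randomly
    chosen variable, with the threshold that misclassifies the fewest cases."""
    f = rng.randrange(len(rows[0]))
    majority = Counter(labels).most_common(1)[0][0]
    best = None  # (errors, threshold, left_label, right_label)
    values = sorted({r[f] for r in rows})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # midpoint between neighbouring values
        left = Counter(y for r, y in zip(rows, labels) if r[f] <= t)
        right = Counter(y for r, y in zip(rows, labels) if r[f] > t)
        l_lab, l_n = left.most_common(1)[0]
        r_lab, r_n = right.most_common(1)[0]
        errors = len(labels) - l_n - r_n
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:  # the chosen variable is constant in this sample
        return lambda row: majority
    _, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab


def oob_error(rows, labels, fit, n_trees, seed=0):
    """Estimate the error rate using, for each case, only the votes of
    learners whose bootstrap sample did not contain that case."""
    rng = random.Random(seed)
    N = len(rows)
    votes = [Counter() for _ in range(N)]
    for _ in range(n_trees):
        idx = [rng.randrange(N) for _ in range(N)]  # bootstrap sample
        predict = fit([rows[i] for i in idx], [labels[i] for i in idx], rng)
        for i in set(range(N)) - set(idx):  # cases this learner never saw
            votes[i][predict(rows[i])] += 1
    scored = [i for i in range(N) if votes[i]]  # cases with at least one vote
    wrong = sum(votes[i].most_common(1)[0][0] != labels[i] for i in scored)
    return wrong / len(scored) if scored else float("nan")
```

Calling `oob_error(rows, labels, fit_stump, n_trees=50)` returns the fraction of cases that the out-of-bag vote misclassifies. Passing the learner in as the `fit` argument is a design choice made for this sketch, so that a real tree-growing function could be substituted for the stump.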
External Link
- Random Forest classifier description
| Copyright 2005-2009 OnPedia.com. All Rights Reserved |