Pruning Decision Trees

Expert systems often rely on massive data structures to store information that must later be searched. These data structures, called decision trees, are in most cases incomprehensible to a user because of their complexity, and methods of prioritizing, storing, and manipulating them have been the subject of exhaustive research. In an effort to reduce search time, many algorithms have been invented that simplify, or prune, these trees, and there are a large number of ways to implement them. A good pruning algorithm can have the dual effect of both decreasing the size of a tree and increasing the accuracy of searches on it. This review article provides interested parties with a concise overview of current pruning methods, a discussion of the tradeoffs between simplicity and accuracy, and an introduction to some new findings in this field of research.
Background.
Most pruning algorithms have much in common. The basic approach replaces a subtree with a leaf node labeled with the most common class among the training examples covered by the original subtree. The pruned tree is then tested and verified against a representative validation set: a node is removed only if the validation set performs no worse on the pruned tree than on the original. Nodes are usually removed in an order determined by how much their removal will improve accuracy or decrease tree size, and pruning continues until further iterations of the algorithm hurt the accuracy of the tree, as sketched in the example below. The motivations for pruning are plentiful: decision trees often exhibit common problems that can be addressed with intelligent algorithms, including, but not limited to, subtree replication, fragmentation, and the presence of noise.
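The procedure described above corresponds closely to reduced-error pruning. The following Python sketch illustrates one way it might be written; the Node class, its examples field, and the helper functions are assumptions made for this illustration rather than part of any particular expert-system library.

# A minimal sketch of reduced-error pruning over a toy tree structure.
# Node, its `examples` field, and the helpers below are illustrative assumptions.
from collections import Counter


class Node:
    def __init__(self, feature=None, children=None, label=None, examples=None):
        self.feature = feature          # attribute tested at this node (internal nodes)
        self.children = children or {}  # attribute value -> child Node
        self.label = label              # class label (meaningful at leaves)
        self.examples = examples or []  # (x, y) training pairs routed through this node


def majority_class(node):
    """Most common class among the training examples covered by this subtree."""
    return Counter(y for _, y in node.examples).most_common(1)[0][0]


def classify(node, x):
    """Route an example down the tree; fall back to the majority class on unseen values."""
    while node.children:
        child = node.children.get(x.get(node.feature))
        if child is None:
            return majority_class(node)
        node = child
    return node.label


def accuracy(tree, validation):
    """Fraction of validation examples the tree classifies correctly."""
    return sum(classify(tree, x) == y for x, y in validation) / len(validation)


def prune(root, node, validation):
    """Bottom-up pruning: collapse a subtree into a leaf whenever the
    validation set performs no worse on the pruned tree than on the original."""
    for child in list(node.children.values()):
        prune(root, child, validation)
    if not node.children:
        return
    before = accuracy(root, validation)
    saved_feature, saved_children = node.feature, node.children
    # Tentatively replace the subtree with a leaf labelled by its majority class.
    node.feature, node.children = None, {}
    node.label = majority_class(node)
    if accuracy(root, validation) < before:
        # Pruning hurt validation accuracy, so restore the original subtree.
        node.feature, node.children, node.label = saved_feature, saved_children, None

A top-level call such as prune(root, root, validation_set) prunes the whole tree in a single bottom-up pass; implementations that instead rank candidate nodes by how much their removal improves accuracy or shrinks the tree apply the same accept-or-reject test in a different order.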
Subtree replication refers to the case in which two or more nodes have identical subtrees. Fragmentation denotes the case in which each leaf node represents only a few of the possible classes, or outcomes, leaving the tree highly susceptible to the influence of noise.

