How can we avoid overfitting?
- stop growing when data split not statistically significant
- grow full tree, then post-prune
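One way to make the first option concrete is a chi-square test on the class counts that a candidate split produces. The sketch below is only an illustration under stated assumptions, not a prescribed method: the function names are hypothetical, it assumes two classes and a binary split (1 degree of freedom), and it uses the standard 5% critical value of 3.841.

```python
# Assumed setup: two classes, binary split, so 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value for alpha = 0.05, df = 1


def chi_square_statistic(children):
    """children: one (positives, negatives) count pair per branch; must be non-empty."""
    total_pos = sum(p for p, _ in children)
    total_neg = sum(n for _, n in children)
    total = total_pos + total_neg
    stat = 0.0
    for p, n in children:
        size = p + n
        # Expected counts if the split were unrelated to the class label.
        exp_pos = size * total_pos / total
        exp_neg = size * total_neg / total
        if exp_pos > 0:
            stat += (p - exp_pos) ** 2 / exp_pos
        if exp_neg > 0:
            stat += (n - exp_neg) ** 2 / exp_neg
    return stat


def split_is_significant(children, critical=CRITICAL_5PCT_DF1):
    """Return True if the split should be kept, False if growth should stop here."""
    return chi_square_statistic(children) > critical
```

A split that separates the classes cleanly, such as `[(10, 0), (0, 10)]`, passes the test, while a near-even split such as `[(6, 4), (4, 6)]` does not, so the tree would stop growing at that node.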
How to select the "best" tree:
- Measure performance over training data
- Measure performance over separate validation data set
- MDL (minimum description length): minimize
  size(tree) + size(misclassifications(tree))
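The MDL criterion can be sketched as a cost in bits that trades tree size against the exceptions the tree fails to explain. The encoding below is an assumption chosen for simplicity (each node costs the bits needed to name one attribute, each misclassified example costs the bits needed to name one training example); the function names and the candidate figures are made up for illustration.

```python
import math


def mdl_cost(num_nodes, num_errors, num_attributes, num_examples):
    """Description length in bits under an assumed, simplified encoding."""
    # size(tree): each node names one of the available attributes.
    tree_bits = num_nodes * math.log2(num_attributes)
    # size(misclassifications(tree)): each exception names one training example.
    exception_bits = num_errors * math.log2(num_examples)
    return tree_bits + exception_bits


def best_by_mdl(candidates, num_attributes, num_examples):
    """candidates: (label, number of nodes, training errors) tuples."""
    return min(
        candidates,
        key=lambda c: mdl_cost(c[1], c[2], num_attributes, num_examples),
    )


# Illustrative, made-up candidates: a stump, a pruned tree, and the full tree.
candidates = [("stump", 1, 40), ("pruned", 15, 5), ("full", 101, 0)]
best = best_by_mdl(candidates, num_attributes=8, num_examples=1024)
```

With these numbers neither the smallest tree nor the error-free tree wins: the stump pays heavily for its exceptions and the full tree pays for its size, so the pruned tree has the lowest total cost.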