Decision trees🔗

This class introduces the concept of decision trees as supervized learning methods for classification and regression. A decision tree is a simple model that can even be visualized and understood by a human after training. It consists in a graph with a "tree" structure, meaning that there exists a root node with a pair of child nodes, themself having pairs of child nodes, etc., until some leaf nodes. The way data are processed by the model is by flowing through the nodes until reaching a leaf, that corresponds to a class or a value. At each node, a "split" was created at training time, testing some features of the data point. The binary outcome of the test determines whether the next node seen by the data will be the first or the second child of the current node.

Notebook

References🔗

Scikit-learn documentation on decision trees

Wikipedia on decision tree learning

T. Hastie, R. Tibshirani and J. Friedman. Elements of Statistical Learning, Springer, 2009.