I could not immediately find the idea behind the Gini impurity index on the internet. The following derivation helped me understand the intuition a little bit better:
The idea is that this captures "how often a randomly selected element is labeled incorrectly if the label is chosen randomly according to the actual distribution (in a leaf)".
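A minimal sketch of that interpretation in Python, assuming a leaf is just a list of class labels (the function names `gini_impurity` and `mislabel_probability` are made up for illustration). It compares the usual closed form, 1 minus the sum of squared class proportions, with the mislabeling probability spelled out term by term:

```python
from collections import Counter


def gini_impurity(labels):
    """Closed form: 1 - sum_k p_k^2 over the class proportions p_k."""
    if not labels:
        raise ValueError("labels must be non-empty")
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def mislabel_probability(labels):
    """Chance that a random element gets a wrong label when the label
    is drawn at random from the leaf's own class distribution."""
    if not labels:
        raise ValueError("labels must be non-empty")
    n = len(labels)
    probs = [c / n for c in Counter(labels).values()]
    # Draw an element of class k (probability p_k),
    # then draw a label that is not k (probability 1 - p_k).
    return sum(p * (1.0 - p) for p in probs)


leaf = ["a", "a", "a", "b"]  # hypothetical leaf: 3 of class "a", 1 of class "b"
print(gini_impurity(leaf))         # 0.375
print(mislabel_probability(leaf))  # 0.375
```

Both functions agree because the proportions sum to 1, so the sum of p_k (1 - p_k) equals 1 minus the sum of p_k^2.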
In the definition of information gain, it is unclear to me what X_i is exactly. I would have expected Gain(X, i), with |X| in the denominator of the fraction. Would that make sense? Furthermore, am I correct that the sum from l=1 to L loops over what some call the levels of this feature?
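To make the reading I have in mind concrete, here is a rough sketch in Python. It assumes the common textbook form: entropy as the impurity measure, each subset weighted by |X_l| / |X|, and the sum running over the levels (distinct values) of feature i. The names `entropy` and `information_gain` and the toy data are my own and may not match the notation in the text.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, i):
    """Gain(X, i): entropy of X minus the weighted entropy of the
    subsets X_l obtained by splitting on the levels of feature i."""
    # Group the labels by the level that feature i takes in each row.
    subsets = defaultdict(list)
    for row, label in zip(rows, labels):
        subsets[row[i]].append(label)

    n = len(labels)  # |X|, the denominator in each weight
    weighted = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


# Hypothetical toy data: one feature with two levels.
rows = [("sunny",), ("sunny",), ("rainy",), ("rainy",)]
labels = ["yes", "yes", "no", "no"]
print(information_gain(rows, labels, 0))  # 1.0
```

In this toy case the feature separates the classes perfectly, so every subset has zero entropy and the gain equals the entropy of the full set.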