On a real production dataset with 5 explanatory variables and ~1000 rows, I got `SystemStackError: stack level too deep` when calling `DecisionTree::ID3Tree#train`.
While trying to figure out what was happening, I built the following minimal dataset, which reproduces the bug:

    require 'decisiontree'

    attributes = ["X0", "X1", "X2", "X3"]
    # Each row is [X0, X1, X2, X3, class label].
    data = [
      ["a", 0, 1, 1, 1],
      ["a", 0, 1, 0, 0],
      ["a", 0, 0, 1, 0],
      ["a", 0, 0, 0, 1]
    ]
    data_type = { X0: :discrete, X1: :continuous, X2: :continuous, X3: :continuous }
    tree = DecisionTree::ID3Tree.new(attributes, data, 1, data_type)
    tree.train # SystemStackError is raised here!

The cause of this bug seems to be the return value `[-1, -1]` of `DecisionTree::ID3Tree#id3_continuous` when `values.size == 1` (see this line).
Returning `[0, -1]` instead of `[-1, -1]` in both the `if values.size == 1` and `if gain.size == 1` cases of `#id3_continuous` solves the problem.
It would also make sense to stop the recursion when no attribute offers a positive gain. That can be done by adding the following line to `#id3_train` (after this line):

    return data.first.last if performance.all? { |a, b| a <= 0 }

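To illustrate why such a guard would trigger on the toy dataset above, here is a minimal standalone sketch that computes the information gain of each column by hand. It does not use the gem: `entropy` and `split_gain` are hypothetical helpers written for this example, every column is treated as a discrete grouping for simplicity, and the `-1` threshold is only a placeholder that mimics the `[gain, threshold]` pair shape.

```ruby
# Shannon entropy (in bits) of a list of class labels.
# Hypothetical helper, not part of the decisiontree gem.
def entropy(labels)
  total = labels.size.to_f
  counts = labels.group_by(&:itself).values.map(&:size)
  counts.sum { |count| -(count / total) * Math.log2(count / total) }
end

# Information gain obtained by grouping rows on the value in column `index`.
# Assumes the class label is the last element of each row.
def split_gain(rows, index)
  remainder = rows.group_by { |row| row[index] }.values.sum do |group|
    (group.size.to_f / rows.size) * entropy(group.map(&:last))
  end
  entropy(rows.map(&:last)) - remainder
end

data = [
  ["a", 0, 1, 1, 1],
  ["a", 0, 1, 0, 0],
  ["a", 0, 0, 1, 0],
  ["a", 0, 0, 0, 1]
]

# One [gain, threshold] pair per attribute column (X0..X3).
performance = (0..3).map { |index| [split_gain(data, index), -1] }

# Same condition as the proposed guard: no attribute has a positive gain.
no_useful_split = performance.all? { |gain, _threshold| gain <= 0 }

puts performance.inspect
puts no_useful_split
```

In this dataset `X0` and `X1` are constant, and the class label depends on `X2` and `X3` only jointly, so no single column reduces the entropy on its own. Under these assumptions, every split leaves the data as mixed as before, which is the situation the guard is meant to catch.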
What do you think?
Do you want me to make a pull request with these changes?