Skip to content

potential bug in plot_tree.R and generate_tree.R #90

@LordAidan

Description

@LordAidan

Moin,

first I want to start out and say what a nice project! I really like it and I look forward into using this in my work. The idea and implementation is just wonderful. So thanks a lot.

But while using it, I potentialy stumbled oppon two eventual bugs. I say eventual as I am not that proficiant in R so maybe I am overlooking something.

So to start out with the potential bug in plot_tree.R liese in the following lines 91-98:

  if(!show_cpd & any(!is.null(cal_data_df) |
                     !is.null(cpd_plot_width) |
                     !is.null(threshold) |
                     !is.null(interval_type) |
                     !is.null(show_prediction_nodes) |
                     !is.null(show_point_prediction) |
                     !is.null(show_prediction_interval) |
                     !is.null(show_sample_size))){

The following out function is allways called since show_prediction_nodes, show_point_prediction and so on are booleans by default. Since you cannot provide a NULL as an parameter this seams to fail everytime even if you do not want to use cpd. I fixed it with the following:

     if(!show_cpd & any(!is.null(cal_data_df) |
                        !is.null(cpd_plot_width) |
                        !is.null(threshold) |
                        !is.null(interval_type) |
                        show_prediction_nodes |
                        show_point_prediction |
                        show_prediction_interval |
                        show_sample_size)){

The second thing I stumbled upon are the following lines of code in generate_tree.R:

if(is.numeric(imp.num.var)){
      # Recode variable importance values
      imp <- data.frame(imp = rf$variable.importance, var = names(rf$variable.importance))

      # Warning if all variables have importance of zero
      if(all(imp$imp==0)){
        warning("All variables have RF variable importance of zero.")
      }

      # Select fraction of variables
      imp <- imp[sort(imp$imp, decreasing = TRUE, index.return = TRUE)$ix[1:imp.num.var],]

      # Match split points with importance values
      split_points <- split_points[split_points$split_var %in% imp$var,]

      # Stop if no split points are left
      stop("No split points left after filtering for important variables. Please take a look at your RF and it's importance values.")

    }
  }

If I understand this portion of code correcly it also allways throws an error when imp.num.var is numeric but in your example you used 2.
So this does not seem to be the desired behavior.

I hoped this helped. I wish you a nice day.
Yours truly,
Aidan

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions