Release CRAN release 1.0.0 · ModelOriented/hstats

Major changes

Quantile approximation: hstats() has a new argument approx (default FALSE). Set it to TRUE to replace the values of dense numeric columns with grid_size = 50 quantile midpoints. This considerably speeds up one-way calculations. Use this option when one-way calculations are slow, or when you want to increase n_max.
hstats(): n_max has been increased from 300 to 500 rows. This will make estimates of H-statistics more stable at the price of longer run time. Reduce to 300 for the old behaviour.
hstats(): Three-way interactions are no longer calculated by default. Set threeway_m to 5 for the old behaviour.
Revised plots: The colors and color palettes have changed and can now also be controlled via global options. For instance, to change the fill color of all bars, set options(hstats.fill = <new value>). Value labels are clearer, and there are more options. Varying color/fill scales now use viridis (inferno). This can be modified on the fly or via options(hstats.viridis_args = list(...)).
"hstats_matrix" object: All statistics functions, e.g., h2_pairwise() or perm_importance(), now return a "hstats_matrix" object. The values are stored in $M and can be plotted via plot(). Other methods include dimnames(), rownames(), colnames(), dim(), nrow(), ncol(), head(), tail(), and subsetting like a normal matrix. This makes it possible, e.g., to select and plot only one column of the results.
perm_importance(): The perms argument has been renamed to m_rep.
print() and summary() methods have been revised.
The arguments w (case weights) and y (response) can now also be passed as column names.
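
The idea behind the quantile approximation can be illustrated in base R. This is a minimal sketch, not the package's implementation: the helper name approx_midpoints() is hypothetical, and the rule used here for a "dense" column (more unique values than grid_size) is an assumption.

```r
# Hypothetical helper: replace the values of a dense numeric vector by the
# midpoints of its quantile bins, so that at most `grid_size` distinct
# values remain. Non-numeric or already sparse vectors are returned as is.
approx_midpoints <- function(x, grid_size = 50L) {
  if (!is.numeric(x) || length(unique(x)) <= grid_size) {
    return(x)
  }
  probs <- seq(0, 1, length.out = grid_size + 1L)
  # Quantile breaks; unique() drops repeated breaks caused by ties
  breaks <- unique(stats::quantile(x, probs = probs, names = FALSE, na.rm = TRUE))
  # One midpoint per bin
  mids <- (breaks[-1L] + breaks[-length(breaks)]) / 2
  # Map every value to the midpoint of the bin it falls into
  mids[findInterval(x, breaks, all.inside = TRUE)]
}

set.seed(1)
x <- rnorm(1000)
x_approx <- approx_midpoints(x)
length(unique(x_approx))  # at most 50 distinct values
```

Fewer distinct values mean fewer distinct grid points to evaluate, which is presumably where the speed-up for one-way calculations comes from.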

Minor changes

Statistics: The argument top_m has been moved to the plot() method.
Statistics: The clipping threshold eps of squared numerator statistics has been reduced from 1e-8 to 1e-10. It is now handled in hstats() instead of the statistic functions.
H-squared: The $H^2$ statistic stored in a "hstats" object is now a matrix with one row (it was a vector).
pd_importance(): The "hstats" object now contains pre-calculated PD-based importance values in $pd_importance.
summary.hstats() now returns an object of class "hstats_summary" instead of "summary_hstats".
average_loss() is more flexible regarding the grouping argument BY. It can now also be a variable name. Non-discrete BY variables are now automatically binned. As in partial_dep(), binning is controlled by the by_size = 4 argument.
average_loss() also returns a "hstats_matrix" object with print() and plot() method. The values can be extracted via $M.
The default v of hstats() and perm_importance() is now NULL. Internally, it is set to colnames(X) (minus the column names of w and y if these are passed as names).
Missing grid values: partial_dep() and ice() have received a na.rm argument that controls if missing values are dropped during grid creation. The default TRUE is compatible with earlier releases.
Missing values in hstats(): Discrete variables with missing values would cause rowsum() to launch repeated warnings. This case is now caught.
The positions of some function arguments have changed.
perm_importance(): The default of verbose is TRUE again.
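
The following sketch shows how several of the changes listed above might be used together. It assumes that hstats >= 1.0.0 is installed. The model and the use of the iris data are purely illustrative, and the argument values are taken from the notes above.

```r
library(hstats)

# Illustrative model on a built-in dataset
fit <- lm(Sepal.Length ~ . + Petal.Width:Species, data = iris)

# Settings that, per the notes above, restore the old behaviour:
# 300 rows and three-way interactions switched on
s_old <- hstats(fit, X = iris[, -1], n_max = 300, threeway_m = 5, verbose = FALSE)

# Quantile approximation for dense numeric columns
s_fast <- hstats(fit, X = iris[, -1], approx = TRUE, verbose = FALSE)

# Response passed as a column name; m_rep replaces the former perms argument
imp <- perm_importance(
  fit, X = iris, y = "Sepal.Length", m_rep = 4, verbose = FALSE
)
imp$M     # underlying matrix of the "hstats_matrix" object
dim(imp)  # matrix-like methods are available

# Grouping by a non-discrete variable name, binned via by_size
al <- average_loss(
  fit, X = iris, y = "Sepal.Length", BY = "Sepal.Width", by_size = 4
)
al$M
```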
