Commit e561dd1

Merge pull request #75 from ModelOriented/release_candidate
Release candidate
2 parents ed6a7d6 + 2be114a commit e561dd1

24 files changed: +2358 -4652 lines changed

CRAN-SUBMISSION

+3-3
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,3 @@
1-
Version: 0.6.0
2-
Date: 2023-03-05 16:47:24 UTC
3-
SHA: d9ee1aa8903020cb4ff34149f6e622a137be42d5
1+
Version: 0.7.0
2+
Date: 2023-04-10 16:20:03 UTC
3+
SHA: 1d5cdf85049baa8819da6a36b1ee91cdf6637a69

NEWS.md

+29-6
@@ -1,15 +1,32 @@
11
# shapviz 0.7.0
22

3-
## New features
3+
## Milestone: Working with multiple 'shapviz' objects
44

5-
- Multiple models: Use `c(xgb = s1, rf = s2, ...)` or `mshapviz(list(xgb = s1, rf = s2, ...))` to combine multiple "shapviz" objects to a "mshapviz" object. Their plots are glued together by the {patchwork} package and can modified, e.g., using `&` and other {patchwork} functionalities.
6-
- Multiclass: Another way to create a "mshapviz" object is to call `shapviz()` to multiclass XGBoost/LightGBM/kernelshap objects.
5+
Sometimes, you will find it necessary to work with several "shapviz" objects at the same time:
6+
7+
- To visualize SHAP values of a multiclass or multi-output model.
8+
- To compare SHAP plots of different models.
9+
- To compare SHAP plots between subgroups.
10+
11+
To simplify the workflow, {shapviz} introduces the "mshapviz" object ("m" like "multi"). You can create it in different ways:
12+
13+
- Use `shapviz()` on multiclass XGBoost or LightGBM models.
14+
- Use `shapviz()` on "kernelshap" objects created from multiclass/multioutput models.
15+
- Use `c(Mod_1 = s1, Mod_2 = s2, ...)` on "shapviz" objects `s1`, `s2`, ...
16+
- Or `mshapviz(list(Mod_1 = s1, Mod_2 = s2, ...))`
17+
18+
The `sv_*()` functions use the {patchwork} package to glue the individual plots together.
19+
20+
See the new vignette for more info and specific examples.
21+
22+
## Other new features
23+
24+
- `sv_dependence()` now allows multiple `v` and/or `color_var` to be plotted (glued via {patchwork}).
725
- {DALEX}: Support for "predict_parts" objects from {DALEX}, thanks to Adrian Stando.
826
- Aggregated SHAP values: The argument `row_id` of `sv_waterfall()` and `sv_force()` now also allows a vector of integers or a logical vector. If more than one row is selected, SHAP values and predictions are averaged before plotting (*aggregated SHAP values* in {DALEX}).
927
- Row bind: "shapviz" objects `x1`, `x2` can now be concatenated in rowwise manner using `x1 + x2` or `rbind(x1, x2)`, again thanks to Adrian.
1028
- `colnames()`: "shapviz" objects `x` have received a `dimnames()` function, so you can now, e.g., use `colnames(x)` to see the feature names.
1129
- Subsetting: "shapviz" `x` can now be subsetted using `x[cond, features]`.
12-
- New vignette on working with multiple "shapviz" objects.
1330

1431
## Maintenance
1532

@@ -18,13 +35,19 @@
1835
- Webpage created with "pgkdown"
1936
- New dependency: {patchwork}
2037

21-
## Other changes and bug fixes
38+
## Other changes
2239

40+
- Color guides are closer to the plot area. This affects `sv_dependence()`, `sv_importance(kind="bee")`, and `sv_interaction()`.
41+
- The lengthy y axis title "SHAP interaction value" in `sv_dependence()` has been shortened to "SHAP interaction".
2342
- As announced, the argument `show_other` of `sv_importance()` has been removed.
2443
- Slightly less picky checks on `S_inter`.
25-
- `sv_waterfall()`: Using `order_fun()` would not work as expected with `max_display`.
2644
- `print.shapviz()` is much more compact, use `summary.shapviz()` for more info.
2745

46+
## Bug fixes
47+
48+
- `sv_waterfall()`: Using `order_fun()` would not work as expected with `max_display`. This has been fixed.
49+
- `sv_dependence()`: Passing `viridis_args = NULL` would hide the color guide title. This has been fixed. But please pass `viridis_args = list()` instead.
50+
2851
# shapviz 0.6.0
2952

3053
## Change in defaults
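
Note on the NEWS.md entries above: a minimal sketch of how an "mshapviz" object can be built from two "shapviz" objects, assuming {shapviz} >= 0.7.0 is installed. The data `X` and the SHAP matrices `S1` and `S2` are made up for illustration and do not come from a fitted model; only `shapviz()`, `c()`, `mshapviz()`, and `sv_importance()` are taken from the package.

```r
library(shapviz)

set.seed(1)

# Made-up feature data and two made-up SHAP matrices (assumption: illustration only)
X <- data.frame(a = rnorm(20), b = runif(20))
S1 <- cbind(a = 2 * X$a, b = X$b - 0.5)
S2 <- cbind(a = -X$a, b = 3 * (X$b - 0.5))

# One "shapviz" object per (pretend) model, built from a SHAP matrix
s1 <- shapviz(S1, X = X, baseline = 0)
s2 <- shapviz(S2, X = X, baseline = 0)

# Two equivalent ways to combine them, as listed in NEWS.md
mx <- c(Mod_1 = s1, Mod_2 = s2)
mx2 <- mshapviz(list(Mod_1 = s1, Mod_2 = s2))

# The sv_*() functions glue one plot per object together via {patchwork}
sv_importance(mx)
```

The names given in `c()` or in the list are expected to end up as the titles of the individual plots.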

R/sv_dependence.R

+44-7
@@ -1,27 +1,31 @@
11
#' SHAP Dependence Plot
22
#'
3-
#' Scatter plot of the SHAP values of a feature against its feature values.
3+
#' Scatterplot of the SHAP values of a feature against its feature values.
44
#' If SHAP interaction values are available, setting \code{interactions = TRUE} allows
55
#' to focus on pure interaction effects (multiplied by two) or on pure main effects.
66
#'
77
#' @importFrom rlang .data
88
#' @param object An object of class "(m)shapviz".
99
#' @param v Column name of feature to be plotted.
10+
#' Can be a vector/list if \code{object} is of class "shapviz".
1011
#' @param color_var Feature name to be used on the color scale to investigate interactions.
1112
#' The default ("auto") uses SHAP interaction values (if available) or a heuristic to
1213
#' select the strongest interacting feature. Set to \code{NULL} to not use the color axis.
14+
#' Can be a vector/list if \code{object} is of class "shapviz".
1315
#' @param color Color to be used if \code{color_var = NULL}.
16+
#' Can be a vector/list if \code{v} is a vector.
1417
#' @param viridis_args List of viridis color scale arguments, see
1518
#' \code{?ggplot2::scale_color_viridis_c()}. The default points to the global
1619
#' option \code{shapviz.viridis_args}, which corresponds to
1720
#' \code{list(begin = 0.25, end = 0.85, option = "inferno")}.
1821
#' These values are passed to \code{ggplot2::scale_color_viridis_*()}.
1922
#' For example, to switch to a standard viridis scale, you can either change the default
20-
#' with \code{options(shapviz.viridis_args = NULL)} or set \code{viridis_args = NULL}.
23+
#' with \code{options(shapviz.viridis_args = list())} or set \code{viridis_args = list()}.
2124
#' Only relevant if \code{color_var} is not \code{NULL}.
2225
#' @param jitter_width The amount of horizontal jitter. The default (\code{NULL}) will
2326
#' use a value of 0.2 in case \code{v} is discrete, and no jitter otherwise.
2427
#' (Numeric variables are considered discrete if they have at most 7 unique values.)
28+
#' Can be a vector/list if \code{v} is a vector.
2529
#' @param interactions Should SHAP interaction values be plotted? Default is \code{FALSE}.
2630
#' Requires SHAP interaction values. If \code{color_var = NULL} (or it is equal to
2731
#' \code{v}), the pure main effect of \code{v} is visualized. Otherwise, twice the SHAP
@@ -35,11 +39,13 @@
3539
#' sv_dependence(x, "Petal.Length")
3640
#' sv_dependence(x, "Petal.Length", color_var = "Species")
3741
#' sv_dependence(x, "Petal.Length", color_var = NULL)
42+
#' sv_dependence(x, c("Species", "Petal.Length"))
43+
#' sv_dependence(x, "Petal.Width", color_var = c("Species", "Petal.Length"))
3844
#'
3945
#' # SHAP interaction values
4046
#' x2 <- shapviz(fit, X_pred = dtrain, X = iris, interactions = TRUE)
4147
#' sv_dependence(x2, "Petal.Length", interactions = TRUE)
42-
#' sv_dependence(x2, "Petal.Length", color_var = NULL, interactions = TRUE)
48+
#' sv_dependence(x2, c("Petal.Length", "Species"), color_var = NULL, interactions = TRUE)
4349
#'
4450
#' # Show main effect of "Petal.Length" for setosa and virginica separately
4551
#' mx <- c(
@@ -64,12 +70,38 @@ sv_dependence.default <- function(object, ...) {
6470
sv_dependence.shapviz <- function(object, v, color_var = "auto", color = "#3b528b",
6571
viridis_args = getOption("shapviz.viridis_args"),
6672
jitter_width = NULL, interactions = FALSE, ...) {
73+
p <- length(v)
74+
if (p > 1L || length(color_var) > 1L) {
75+
if (is.null(color_var)) {
76+
color_var <- replicate(p, NULL)
77+
}
78+
if (is.null(jitter_width)) {
79+
jitter_width <- replicate(p, NULL)
80+
}
81+
plot_list <- mapply(
82+
FUN = sv_dependence,
83+
v = v,
84+
color_var = color_var,
85+
color = color,
86+
jitter_width = jitter_width,
87+
MoreArgs = list(
88+
object = object,
89+
viridis_args = viridis_args,
90+
interactions = interactions,
91+
...
92+
),
93+
SIMPLIFY = FALSE
94+
)
95+
nms <- if (length(v) > 1L) v
96+
plot_list <- add_titles(plot_list, nms = nms) # see sv_waterfall()
97+
return(patchwork::wrap_plots(plot_list))
98+
}
99+
67100
S <- get_shap_values(object)
68101
X <- get_feature_values(object)
69102
S_inter <- get_shap_interactions(object)
70103
nms <- colnames(object)
71104
stopifnot(
72-
length(v) == 1L,
73105
v %in% nms,
74106
is.null(color_var) || (color_var %in% c("auto", nms))
75107
)
@@ -94,7 +126,7 @@ sv_dependence.shapviz <- function(object, v, color_var = "auto", color = "#3b528
94126
if (color_var == v) {
95127
y_lab <- "SHAP main effect"
96128
} else {
97-
y_lab <- "SHAP interaction value"
129+
y_lab <- "SHAP interaction"
98130
}
99131
s <- S_inter[, v, color_var]
100132
if (color_var != v) {
@@ -119,19 +151,24 @@ sv_dependence.shapviz <- function(object, v, color_var = "auto", color = "#3b528
119151
vir <- scale_color_viridis_c
120152
}
121153
if (is.null(viridis_args)) {
122-
viridis_args <- list(NULL)
154+
viridis_args <- list()
123155
}
124156
ggplot(dat, aes(x = .data[[v]], y = shap, color = .data[[color_var]])) +
125157
geom_jitter(width = jitter_width, height = 0, ...) +
126158
ylab(y_lab) +
127-
do.call(vir, viridis_args)
159+
do.call(vir, viridis_args) +
160+
theme(legend.box.spacing = grid::unit(0, "pt"))
128161
}
129162

130163
#' @describeIn sv_dependence SHAP dependence plot for "mshapviz" object.
131164
#' @export
132165
sv_dependence.mshapviz <- function(object, v, color_var = "auto", color = "#3b528b",
133166
viridis_args = getOption("shapviz.viridis_args"),
134167
jitter_width = NULL, interactions = FALSE, ...) {
168+
stopifnot(
169+
length(v) == 1L,
170+
length(color_var) <= 1L
171+
)
135172
plot_list <- lapply(
136173
object,
137174
FUN = sv_dependence,
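
Note on the `mapply()` branch added to `sv_dependence.shapviz()` above: the following base R sketch suggests why `color_var = NULL` and `jitter_width = NULL` are first expanded with `replicate(p, NULL)`. The function `describe_plot()` is a hypothetical stand-in for `sv_dependence()` and is not part of {shapviz}.

```r
# Hypothetical stand-in for sv_dependence(): reports what it received
describe_plot <- function(v, color_var) {
  paste0(v, " / ", if (is.null(color_var)) "no color" else color_var)
}

v <- c("Species", "Petal.Length")

# A bare NULL has length zero, so mapply() refuses to recycle it along v
res_null <- tryCatch(
  mapply(describe_plot, v = v, color_var = NULL, SIMPLIFY = FALSE),
  error = function(e) "error"
)

# replicate(p, NULL) gives a list of p NULLs, one per element of v
color_var <- replicate(length(v), NULL)
res_list <- mapply(describe_plot, v = v, color_var = color_var, SIMPLIFY = FALSE)

print(res_null)
print(res_list)
```

Length-one arguments such as the default `color` need no such treatment, because `mapply()` recycles them across all elements of `v`.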

R/sv_importance.R

+3-2
@@ -27,7 +27,7 @@
2727
#' corresponds to \code{list(begin = 0.25, end = 0.85, option = "inferno")}.
2828
#' These values are passed to \code{ggplot2::scale_color_viridis_c()}.
2929
#' For example, to switch to a standard viridis scale, you can either change the default
30-
#' with \code{options(shapviz.viridis_args = NULL)} or set \code{viridis_args = NULL}.
30+
#' with \code{options(shapviz.viridis_args = list())} or set \code{viridis_args = list()}.
3131
#' @param color_bar_title Title of color bar of the beeswarm plot.
3232
#' Set to \code{NULL} to hide the color bar altogether.
3333
#' @param show_numbers Should SHAP feature importances be printed?
@@ -127,7 +127,8 @@ sv_importance.shapviz <- function(object, kind = c("bar", "beeswarm", "both", "n
127127
bar = !is.null(color_bar_title),
128128
ncol = length(unique(df$color)) # Special case of constant feature values
129129
) +
130-
labs(x = "SHAP value", y = element_blank(), color = color_bar_title)
130+
labs(x = "SHAP value", y = element_blank(), color = color_bar_title) +
131+
theme(legend.box.spacing = grid::unit(0, "pt"))
131132
}
132133
if (show_numbers) {
133134
p <- p +
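
Note on the `viridis_args` documentation change above and the related switch from `list(NULL)` to `list()` in R/sv_dependence.R: a base R sketch of how the two differ under `do.call()`. The function `fake_scale()` is a hypothetical stand-in whose first argument plays the role of a title; it is an assumption for illustration and not the actual {ggplot2} scale signature.

```r
# Hypothetical stand-in for a scale function whose first argument is a title
fake_scale <- function(name = "default title", ...) {
  if (is.null(name)) "<no title>" else name
}

# list(NULL) has length one: NULL is passed positionally and overrides the default
with_null <- do.call(fake_scale, list(NULL))

# list() has length zero: the function is called without arguments
with_empty <- do.call(fake_scale, list())

print(with_null)
print(with_empty)
```

Under this reading, an empty list is the safer way to say "use the defaults", which matches the recommendation in NEWS.md to pass `viridis_args = list()`.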

R/sv_interaction.R

+3-2
@@ -22,7 +22,7 @@
2222
#' corresponds to \code{list(begin = 0.25, end = 0.85, option = "inferno")}.
2323
#' These values are passed to \code{ggplot2::scale_color_viridis_c()}.
2424
#' For example, to switch to a standard viridis scale, you can either change the default
25-
#' with \code{options(shapviz.viridis_args = NULL)} or set \code{viridis_args = NULL}.
25+
#' with \code{options(shapviz.viridis_args = list())} or set \code{viridis_args = list()}.
2626
#' @param color_bar_title Title of color bar of the beeswarm plot.
2727
#' Set to \code{NULL} to hide the color bar altogether.
2828
#' @param ... Arguments passed to \code{geom_point()}.
@@ -106,7 +106,8 @@ sv_interaction.shapviz <- function(object, kind = c("beeswarm", "no"),
106106
ncol = length(unique(X_long$Freq))
107107
) +
108108
theme(
109-
panel.spacing = unit(0.2, "lines"),
109+
panel.spacing = grid::unit(0.2, "lines"),
110+
legend.box.spacing = grid::unit(0, "pt"),
110111
axis.ticks.y = element_blank(),
111112
axis.text.y = element_blank()
112113
)

README.md

+31-30
@@ -1,4 +1,4 @@
1-
# shapviz <a href='https://github.com/mayer79/shapviz'><img src='man/figures/logo.png' align="right" height="139" /></a>
1+
# {shapviz} <a href='https://github.com/ModelOriented/shapviz'><img src='man/figures/logo.png' align="right" height="139" /></a>
22

33
<!-- badges: start -->
44

@@ -11,7 +11,7 @@
1111

1212
<!-- badges: end -->
1313

14-
## Introduction
14+
## Overview
1515

1616
SHAP (SHapley Additive exPlanations, [1]) is an ingenious way to study black box models. SHAP values decompose - as fair as possible - predictions into additive feature contributions. Crunching SHAP values requires clever algorithms by clever people. Analyzing them, however, is super easy with the right visualizations. {shapviz} offers the latter:
1717

@@ -39,12 +39,12 @@ To further simplify the use of {shapviz}, we added direct connectors to:
3939
- [`kernelshap`](https://CRAN.R-project.org/package=kernelshap)
4040
- [`fastshap`](https://CRAN.R-project.org/package=fastshap)
4141
- [`shapr`](https://CRAN.R-project.org/package=shapr)
42-
- [`treeshap`](https://github.com/ModelOriented/treeshap)
43-
- [`DALEX`](https://cran.r-project.org/web/packages/DALEX)
42+
- [`treeshap`](https://github.com/ModelOriented/treeshap/)
43+
- [`DALEX`](https://CRAN.R-project.org/package=DALEX)
4444

4545
For XGBoost, LightGBM, and H2O, the SHAP values are directly calculated from the fitted model.
4646

47-
[`CatBoost`](https://github.com/catboost) is not included, but see the vignette how to use its SHAP calculation backend with {shapviz}.
47+
[`CatBoost`](https://github.com/catboost/) is not included, but see the vignette how to use its SHAP calculation backend with {shapviz}.
4848

4949
Multiple "shapviz" objects can be glued together, see Vignette "Multiple shapviz objects".
5050

@@ -59,9 +59,9 @@ install.packages("shapviz")
5959
devtools::install_github("mayer79/shapviz")
6060
```
6161

62-
## Example
62+
## Usage
6363

64-
Shiny diamonds... let's model their prices by four "c" variables with XGBoost:
64+
Shiny diamonds... let's use XGBoost to model their prices by the four "C" variables:
6565

6666
### Model
6767

@@ -72,18 +72,9 @@ library(xgboost)
7272

7373
set.seed(3653)
7474

75-
# Explanation data
76-
dia_small <- diamonds[sample(nrow(diamonds), 2000L), ]
77-
78-
# XGBoost model
7975
x <- c("carat", "cut", "color", "clarity")
8076
dtrain <- xgb.DMatrix(data.matrix(diamonds[x]), label = diamonds$price)
81-
82-
fit <- xgb.train(
83-
params = list(learning_rate = 0.1, objective = "reg:squarederror"),
84-
data = dtrain,
85-
nrounds = 65L
86-
)
77+
fit <- xgb.train(params = list(learning_rate = 0.1), data = dtrain, nrounds = 65L)
8778
```
8879

8980
### Create "shapviz" object
@@ -93,6 +84,9 @@ One line of code creates a "shapviz" object. It contains SHAP values and feature
9384
In this example, we construct the "shapviz" object directly from the fitted XGBoost model. Thus we also need to pass a corresponding prediction dataset `X_pred` used for calculating SHAP values by XGBoost.
9485

9586
``` r
87+
# Explanation data
88+
dia_small <- diamonds[sample(nrow(diamonds), 2000L), ]
89+
9690
shp <- shapviz(fit, X_pred = data.matrix(dia_small[x]), X = dia_small)
9791
```
9892

@@ -148,14 +142,6 @@ sv_importance(shp, kind = "beeswarm")
148142

149143
![](man/figures/README-imp2.png)
150144

151-
#### Or both combined
152-
153-
``` r
154-
sv_importance(shp, kind = "both", show_numbers = TRUE, bee_width = 0.2)
155-
```
156-
157-
![](man/figures/README-imp3.png)
158-
159145
### Dependence plot
160146

161147
A scatterplot of SHAP values of a feature like `color` against its observed values gives a great impression on the feature effect on the response. Vertical scatter gives additional info on interaction effects (using a heuristic to select the feature on the color axis).
@@ -166,24 +152,39 @@ sv_dependence(shp, v = "color")
166152

167153
![](man/figures/README-dep.svg)
168154

155+
Or multiple features together, using {patchwork}:
156+
157+
``` r
158+
library(patchwork) # We need the & operator
159+
160+
sv_dependence(shp, v = x) &
161+
theme_gray(base_size = 9) &
162+
ylim(-5000, 15000)
163+
```
164+
165+
![](man/figures/README-dep-multi.png)
166+
169167
### Interactions
170168

171-
If SHAP interaction values have been computed (via {xgboost} or {treeshap}), the dependence plot can focus on main effects or SHAP interaction effects (multiplied by two due to symmetry):
169+
If SHAP interaction values have been computed (via {xgboost} or {treeshap}), the dependence plot can focus on main effects or SHAP interaction effects (multiplied by two due to symmetry).
172170

173171
``` r
174-
shp_with_inter <- shapviz(
172+
shp_i <- shapviz(
175173
fit, X_pred = data.matrix(dia_small[x]), X = dia_small, interactions = TRUE
176174
)
177175

178-
sv_dependence(shp_with_inter, v = "color", color_var = "cut", interactions = TRUE)
176+
# Main effect of carat and its interactions
177+
sv_dependence(
178+
shp_i, v = "carat", color_var = x, interactions = TRUE) &
179+
ylim(-6000, 13000)
179180
```
180181

181-
![](man/figures/README-dep2.svg)
182+
![](man/figures/README-dep2.png)
182183

183184
We can also study all interactions and main effects together using the following beeswarm visualization:
184185

185186
```{r}
186-
sv_interaction(shp_with_inter) +
187+
sv_interaction(shp_i) +
187188
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
188189
```
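
Note on the `&` operator used in the README examples above: a small sketch, assuming {ggplot2} and {patchwork} are installed, of how `&` applies a theme or scale to every panel of a combined plot, whereas `+` would only affect the last one. The `mtcars` plots are placeholders and unrelated to SHAP values.

```r
library(ggplot2)
library(patchwork)

# Two placeholder scatterplots, one per variable
plot_list <- lapply(c("wt", "hp"), function(v) {
  ggplot(mtcars, aes(x = .data[[v]], y = mpg)) +
    geom_point()
})

# wrap_plots() glues the list together; & modifies all panels at once
p <- wrap_plots(plot_list) &
  theme_gray(base_size = 9) &
  ylim(0, 40)

p
```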
189190

cran-comments.md

+19-4
@@ -1,13 +1,27 @@
1-
# Submission of shapviz 0.6.0
1+
# Re-Submission of {shapviz} 0.7.0
22

3-
Dear CRAN team. The dependence plot now uses better defaults. As they are user visible in some cases, the version jumps from 0.5.0 to 0.6.0.
3+
This second resubmission removes non-standard file (sorry, my bad!)
4+
5+
# Re-Submission of {shapviz} 0.7.0
6+
7+
This re-submission fixes non-standard links in the new vignette (and the README).
8+
9+
## Original message
10+
11+
Dear CRAN team.
12+
13+
- {shapviz} can now deal with multiclass models or SHAP values of multiple models. Hurray ;).
14+
- Many additional features
15+
- New contributor
16+
- Additional vignette
17+
- New home: github/ModelOriented/shapviz
418

519
## Checks
620

721
### check(manual = TRUE, cran = TRUE)
822

9-
-> WARNING
10-
'qpdf' is needed for checks on size reduction of PDFs
23+
- WARNING: 'qpdf' is needed for checks on size reduction of PDFs
24+
- Note: unable to verify current time
1125

1226
### check_rhub()
1327

@@ -21,3 +35,4 @@ Found the following files/directories:
2135
### check_win_devel()
2236

2337
Status: OK
38+

man/figures/README-dep-multi.png

30.6 KB