Commit 7e9580d

authored
Merge pull request #147 from ModelOriented/update-docstrings
Updated docstrings
2 parents 6105947 + 3277da8 commit 7e9580d

8 files changed, +153 -140 lines changed

NEWS.md

+1
@@ -3,6 +3,7 @@
 ## Documentation
 
 - More compact README.
+- Updated function description.
 
 # kernelshap 0.7.0
 

R/additive_shap.R

+14-13
@@ -9,19 +9,20 @@
 #' - `gam::gam()`,
 #' - [survival::coxph()], and
 #' - [survival::survreg()].
-#'
+#'
 #' The SHAP values are extracted via `predict(object, newdata = X, type = "terms")`,
-#' a logic heavily inspired by `fastshap:::explain.lm(..., exact = TRUE)`.
+#' a logic adopted from `fastshap:::explain.lm(..., exact = TRUE)`.
 #' Models with interactions (specified via `:` or `*`), or with terms of
 #' multiple features like `log(x1/x2)` are not supported.
-#'
+#'
 #' Note that the SHAP values obtained by [additive_shap()] are expected to
 #' match those of [permshap()] and [kernelshap()] as long as their background
 #' data equals the full training data (which is typically not feasible).
 #'
-#' @inheritParams kernelshap
-#' @param X Dataframe with rows to be explained. Will be used like
+#' @param object Fitted additive model.
+#' @param X Dataframe with rows to be explained. Passed to
 #' `predict(object, newdata = X, type = "terms")`.
+#' @param verbose Set to `FALSE` to suppress messages.
 #' @param ... Currently unused.
 #' @returns
 #' An object of class "kernelshap" with the following components:
@@ -38,15 +39,15 @@
 #' fit <- lm(Sepal.Length ~ ., data = iris)
 #' s <- additive_shap(fit, head(iris))
 #' s
-#'
+#'
 #' # MODEL TWO: More complicated (but not very clever) formula
 #' fit <- lm(
 #'   Sepal.Length ~ poly(Sepal.Width, 2) + log(Petal.Length) + log(Sepal.Width),
 #'   data = iris
 #' )
 #' s_add <- additive_shap(fit, head(iris))
 #' s_add
-#'
+#'
 #' # Equals kernelshap()/permshap() when background data is full training data
 #' s_kernel <- kernelshap(
 #'   fit, head(iris[c("Sepal.Width", "Petal.Length")]), bg_X = iris
@@ -59,28 +60,28 @@ additive_shap <- function(object, X, verbose = TRUE, ...) {
   if (any(attr(stats::terms(object), "order") > 1)) {
     stop("Additive SHAP not appropriate for models with interactions.")
   }
-
+
   txt <- "Exact additive SHAP via predict(..., type = 'terms')"
   if (verbose) {
     message(txt)
   }
-
+
   S <- stats::predict(object, newdata = X, type = "terms")
   rownames(S) <- NULL
-
+
   # Baseline value
   b <- as.vector(attr(S, "constant"))
   if (is.null(b)) {
     b <- 0
   }
-
+
   # Which columns of X are used in each column of S?
   s_names <- colnames(S)
   cols_used <- lapply(s_names, function(z) all.vars(stats::reformulate(z)))
   if (any(lengths(cols_used) > 1L)) {
     stop("The formula contains terms with multiple features (not supported).")
   }
-
+
   # Collapse all columns in S using the same column in X and rename accordingly
   mapping <- split(
     s_names, factor(unlist(cols_used), levels = colnames(X)), drop = TRUE
@@ -89,7 +90,7 @@ additive_shap <- function(object, X, verbose = TRUE, ...) {
     cbind,
     lapply(mapping, function(z) rowSums(S[, z, drop = FALSE], na.rm = TRUE))
   )
-
+
   structure(
     list(
       S = S,

R/kernelshap.R

+105-98
Large diffs are not rendered by default.

R/permshap.R

+11-11
@@ -2,7 +2,7 @@
 #'
 #' Exact permutation SHAP algorithm with respect to a background dataset,
 #' see Strumbelj and Kononenko. The function works for up to 14 features.
-#' For eight or more features, we recomment to switch to [kernelshap()].
+#' For more than eight features, we recommend [kernelshap()] due to its higher speed.
 #'
 #' @inheritParams kernelshap
 #' @returns
@@ -16,12 +16,12 @@
 #' - `bg_w`: The background case weights.
 #' - `m_exact`: Integer providing the effective number of exact on-off vectors used.
 #' - `exact`: Logical flag indicating whether calculations are exact or not
-#' (currently `TRUE`).
+#' (currently always `TRUE`).
 #' - `txt`: Summary text.
 #' - `predictions`: \eqn{(n \times K)} matrix with predictions of `X`.
 #' - `algorithm`: "permshap".
 #' @references
-#' 1. Erik Strumbelj and Igor Kononenko. Explaining prediction models and individual
+#' 1. Erik Strumbelj and Igor Kononenko. Explaining prediction models and individual
 #' predictions with feature contributions. Knowledge and Information Systems 41, 2014.
 #' @export
 #' @examples
@@ -80,7 +80,7 @@ permshap.default <- function(
   if (verbose) {
     message(txt)
   }
-
+
   basic_checks(X = X, feature_names = feature_names, pred_fun = pred_fun)
   prep_bg <- prepare_bg(X = X, bg_X = bg_X, bg_n = bg_n, bg_w = bg_w, verbose = verbose)
   bg_X <- prep_bg$bg_X
@@ -92,32 +92,32 @@ permshap.default <- function(
   bg_preds <- align_pred(pred_fun(object, bg_X, ...))
   v0 <- wcolMeans(bg_preds, w = bg_w) # Average pred of bg data: 1 x K
   v1 <- align_pred(pred_fun(object, X, ...)) # Predictions on X: n x K
-
+
   # Drop unnecessary columns in bg_X. If X is matrix, also column order is relevant
   # Predictions will never be applied directly to bg_X anymore
   if (!identical(colnames(bg_X), feature_names)) {
     bg_X <- bg_X[, feature_names, drop = FALSE]
   }
-
+
   # Precalculations that are identical for each row to be explained
   Z <- exact_Z(p, feature_names = feature_names, keep_extremes = TRUE)
   m_exact <- nrow(Z) - 2L # We won't evaluate vz for first and last row
   precalc <- list(
     Z = Z,
-    Z_code = rowpaste(Z),
+    Z_code = rowpaste(Z),
     bg_X_rep = rep_rows(bg_X, rep.int(seq_len(bg_n), m_exact))
   )
-
+
   if (m_exact * bg_n > 2e5) {
     warning_burden(m_exact, bg_n = bg_n)
   }
-
+
   # Apply permutation SHAP to each row of X
   if (isTRUE(parallel)) {
     parallel_args <- c(list(i = seq_len(n)), parallel_args)
     res <- do.call(foreach::foreach, parallel_args) %dopar% permshap_one(
       x = X[i, , drop = FALSE],
-      v1 = v1[i, , drop = FALSE],
+      v1 = v1[i, , drop = FALSE],
       object = object,
       pred_fun = pred_fun,
       bg_w = bg_w,
@@ -133,7 +133,7 @@ permshap.default <- function(
     for (i in seq_len(n)) {
       res[[i]] <- permshap_one(
         x = X[i, , drop = FALSE],
-        v1 = v1[i, , drop = FALSE],
+        v1 = v1[i, , drop = FALSE],
         object = object,
         pred_fun = pred_fun,
         bg_w = bg_w,

README.md

+8-9
@@ -15,19 +15,18 @@
 
 The package contains three functions to crunch SHAP values:
 
-- `permshap()`: Exact permutation SHAP algorithm of [1]. Recommended for models with up to 8 features.
-- `kernelshap()`: Kernel SHAP algorithm of [2] and [3]. Recommended for models with more than 8 features.
-- `additive_shap()`: For *additive models* fitted via `lm()`, `glm()`, `mgcv::gam()`, `mgcv::bam()`, `gam::gam()`, `survival::coxph()`, or `survival::survreg()`. Exponentially faster than the model-agnostic options above, and recommended if possible.
+- **`permshap()`**: Exact permutation SHAP algorithm of [1]. Recommended for models with up to 8 features.
+- **`kernelshap()`**: Kernel SHAP algorithm of [2] and [3]. Recommended for models with more than 8 features.
+- **`additive_shap()`**: For *additive models* fitted via `lm()`, `glm()`, `mgcv::gam()`, `mgcv::bam()`, `gam::gam()`, `survival::coxph()`, or `survival::survreg()`. Exponentially faster than the model-agnostic options above, and recommended if possible.
 
-To explain your model, select an explanation dataset `X` (up to 1000 rows from the training data) and apply the recommended function. Use {shapviz} to visualize the resulting SHAP values.
+To explain your model, select an explanation dataset `X` (up to 1000 rows from the training data, feature columns only) and apply the recommended function. Use {shapviz} to visualize the resulting SHAP values.
 
-**Remarks for `permshap()` and `kernelshap()`**
+**Remarks to `permshap()` and `kernelshap()`**
 
-- `X` should only contain feature columns.
 - Both algorithms need a representative background data `bg_X` to calculate marginal means (up to 500 rows from the training data). In cases with a natural "off" value (like MNIST digits), this can also be a single row with all values set to the off value. If unspecified, 200 rows are randomly sampled from `X`.
-- By changing the defaults in `kernelshap()`, the iterative pure sampling approach of [3] can be enforced.
-- `permshap()` vs. `kernelshap()`: For models with interactions of order up to two, exact Kernel SHAP agrees with exact permutation SHAP.
-- `additive_shap()` vs. the model-agnostic explainers: The results would agree if the full training data would be used as background data.
+- Exact Kernel SHAP is an approximation to exact permutation SHAP. Since exact calculations are usually sufficiently fast for up to eight features, we recommend `permshap()` in this case. With more features, `kernelshap()` switches to a comparably fast, almost exact algorithm. That is why we recommend `kernelshap()` in this case.
+- For models with interactions of order up to two, SHAP values of exact permutation SHAP and exact Kernel SHAP agree.
+- `permshap()` and `kernelshap()` give the same results as `additive_shap` as long as the full training data would be used as background data.
 
 ## Installation
 

man/additive_shap.Rd

+4-4
Some generated files are not rendered by default.

man/kernelshap.Rd

+8-3
Some generated files are not rendered by default.

man/permshap.Rd

+2-2
Some generated files are not rendered by default.
