
Commit

Merge pull request #42 from mayer79/release_candidate
Release candidate
mayer79 authored Sep 29, 2022
2 parents 81814e6 + 499dd8e commit b75486e
Showing 9 changed files with 62 additions and 26 deletions.
3 changes: 3 additions & 0 deletions CRAN-SUBMISSION
@@ -0,0 +1,3 @@
Version: 0.3.0
Date: 2022-09-29 15:39:13 UTC
SHA: 0652b701dc44c6446c9ed4c92e4da3d76089f7b6
20 changes: 10 additions & 10 deletions DESCRIPTION
@@ -1,22 +1,22 @@
Package: kernelshap
Title: Kernel SHAP
-Version: 0.2.0.900
+Version: 0.3.0
Authors@R: c(
person("Michael", "Mayer", , "[email protected]", role = c("aut", "cre")),
person("David", "Watson", , "[email protected]", role = "ctb")
)
Description: Multidimensional refinement of the Kernel SHAP algorithm
described in Ian Covert and Su-In Lee (2021)
-<http://proceedings.mlr.press/v130/covert21a>. Depending on the
-number of features, Kernel SHAP values can be calculated exactly, by
-sampling, or by a combination of the two. As soon as sampling is
-involved, the algorithm iterates until convergence, and standard
-errors are provided. The package allows to work with any model that
+<http://proceedings.mlr.press/v130/covert21a>. The package allows to
+calculate Kernel SHAP values in an exact way, by iterative sampling
+(as in the reference above), or by a hybrid of the two. As soon as
+sampling is involved, the algorithm iterates until convergence, and
+standard errors are provided. The package works with any model that
provides numeric predictions of dimension one or higher. Examples
-include linear regression, logistic regression (logit or probability
-scale), other generalized linear models, generalized additive models,
-and neural networks. The package plays well together with
-meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
+include linear regression, logistic regression (on logit or
+probability scale), other generalized linear models, generalized
+additive models, and neural networks. The package plays well together
+with meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
Visualizations can be done using the R package 'shapviz'.
License: GPL (>= 2)
Depends:
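
A quick, hedged illustration of the workflow the Description advertises (the model and data are illustrative choices for this sketch, not taken from the package's own examples):

```r
library(kernelshap)

# Any model with numeric predictions works; here a plain linear regression
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Species, data = iris)

# Rows to explain and background data
X <- iris[1:2, c("Sepal.Width", "Petal.Length", "Species")]
s <- kernelshap(fit, X, bg_X = iris)  # few features: exact Kernel SHAP

s$S  # SHAP values, one row per explained observation
```
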
2 changes: 1 addition & 1 deletion NEWS.md
@@ -1,4 +1,4 @@
-# kernelshap 0.2.0.900 DEVEL
+# kernelshap 0.3.0

## Major improvements

4 changes: 2 additions & 2 deletions R/kernelshap.R
@@ -5,7 +5,7 @@
#' The function allows to calculate Kernel SHAP values in an exact way, by iterative sampling
#' as in CL21, or by a hybrid of these two options. As soon as sampling is involved,
#' the algorithm iterates until convergence, and standard errors are provided.
-#' The default behaviour depends on the number of features p:
+#' The default behaviour depends on the number of features p, see also Details below:
#' \itemize{
#' \item 2 <= p <= 8: Exact Kernel SHAP values are returned (for the given background data).
#' \item p > 8: Hybrid (partly exact) iterative version of Kernel SHAP
@@ -191,7 +191,7 @@ kernelshap.default <- function(object, X, bg_X, pred_fun = stats::predict, bg_w
all(nms %in% colnames(bg_X)),
is.function(pred_fun),
exact %in% c(TRUE, FALSE),
-p == 1L || hybrid_degree %in% 0:(p / 2),
+p == 1L || exact || hybrid_degree %in% 0:(p / 2),
paired_sampling %in% c(TRUE, FALSE),
"m must be even" = trunc(m / 2) == m / 2
)
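
The second hunk relaxes the argument validation: `exact = TRUE` now short-circuits the `hybrid_degree` range check. A standalone sketch of the new condition with hypothetical values:

```r
# Hypothetical inputs, only to illustrate the short-circuit added here
p <- 10            # number of features
exact <- TRUE      # exact algorithm requested
hybrid_degree <- 2 # irrelevant when exact = TRUE

stopifnot(
  p == 1L || exact || hybrid_degree %in% 0:(p / 2)
)
# Without the added "|| exact", hybrid_degree would still have to pass the
# range check even though it is not used in exact mode.
```
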
8 changes: 4 additions & 4 deletions README.md
@@ -4,7 +4,7 @@

SHAP values (Lundberg and Lee, 2017) decompose model predictions into additive contributions of the features in a fair way. A model agnostic approach is called Kernel SHAP, introduced in Lundberg and Lee (2017), and investigated in detail in Covert and Lee (2021).

The "kernelshap" package implements a multidimensional refinement of the Kernel SHAP Algorithm described in Covert and Lee (2021). The package allows to calculate Kernel SHAP values in an exact way, by iterative sampling (as in Covert and Lee, 2021), or a hybrid of the two. As soon as sampling is involved, the algorithm iterates until convergence, and standard errors are provided.
The "kernelshap" package implements a multidimensional refinement of the Kernel SHAP Algorithm described in Covert and Lee (2021). The package allows to calculate Kernel SHAP values in an exact way, by iterative sampling (as in Covert and Lee, 2021), or by a hybrid of the two. As soon as sampling is involved, the algorithm iterates until convergence, and standard errors are provided.

The default behaviour depends on the number of features $p$:

@@ -283,7 +283,7 @@ fit <- gam(Sepal.Length ~ s(Sepal.Width) + Species, data = iris)

system.time(
s <- kernelshap(
-fit,
+fit,
iris[c(2, 5)],
bg_X = iris,
parallel = TRUE,
@@ -300,7 +300,7 @@ SHAP values of first 2 observations:

## Exact/sampling/hybrid

-In above examples, since $p$ was small, exact Kernel SHAP values were calculated. Here, we want to show how to use the different strategies (exact, hybrid, and pure sampling) in a situation with ten features.
+In above examples, since $p$ was small, exact Kernel SHAP values were calculated. Here, we want to show how to use the different strategies (exact, hybrid, and pure sampling) in a situation with ten features, see `?kernelshap` for details about those strategies.

With ten features, a degree 2 hybrid is being used by default:

@@ -343,7 +343,7 @@ s$S[1:5]

The results are identical. While more on-off vectors $z$ were required (1022), only a single call to `predict()` was necessary.

-Pure sampling can be enforced by setting the hybrid degree to 0:
+Pure sampling (not recommended!) can be enforced by setting the hybrid degree to 0:

```r
s <- kernelshap(fit, X[1L, ], bg_X = X, hybrid_degree = 0)
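
The README hunks above refer to `?kernelshap` for the exact, hybrid, and pure sampling strategies. A hedged sketch of selecting the three modes for a ten-feature model (`fit` and `X` are placeholders from the surrounding README example):

```r
# Placeholders: a fitted model `fit` and a data set `X` with ten features
s_hybrid   <- kernelshap(fit, X[1L, ], bg_X = X)                     # default: degree 2 hybrid for p = 10
s_exact    <- kernelshap(fit, X[1L, ], bg_X = X, exact = TRUE)       # exact: all 2^p - 2 on-off vectors
s_sampling <- kernelshap(fit, X[1L, ], bg_X = X, hybrid_degree = 0)  # pure sampling (not recommended)
```
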
4 changes: 2 additions & 2 deletions compare_with_python.R
@@ -65,8 +65,8 @@ fit <- lm(
X_small <- diamonds[seq(1, nrow(diamonds), 53), setdiff(names(diamonds), "price")]

# Exact KernelSHAP on X_small, using X_small as background data
-# (71/59 seconds for exact, 27/17 for hybrid deg 2, 17/9 for hybrid deg 1,
-# 26/15 for pure sampling; second number with 2 parallel sessions on Windows)
+# (58/67(?) seconds for exact, 25/18 for hybrid deg 2, 16/9 for hybrid deg 1,
+# 26/17 for pure sampling; second number with 2 parallel sessions on Windows)
system.time(
ks <- kernelshap(fit, X_small, bg_X = bg_X)
)
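
The timing comment reports a second number for two parallel sessions. A sketch of how such a session pool is typically set up (the doFuture/future backend is an assumption; only the `parallel` argument itself appears in the README example):

```r
library(doFuture)  # assumed backend for the parallel timings

registerDoFuture()
plan(multisession, workers = 2)  # two parallel R sessions

system.time(
  ks <- kernelshap(fit, X_small, bg_X = bg_X, parallel = TRUE)
)
```
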
32 changes: 32 additions & 0 deletions cran-comments.md
@@ -0,0 +1,32 @@
Hello CRAN

This is an update with

- a much better way to calculate *exact* KernelSHAP values,
- and a very fast and accurate hybrid between exact and sampling.

Furthermore, some defaults have been improved. As the package is maturing, the next
update will hopefully be version 1.0.0.

## Checks

### `check(manual = TRUE, cran = TRUE)`

0 errors ✔ | 0 warnings ✔ | 0 notes ✔

### `check_win_devel()`

* checking for detritus in the temp directory ... NOTE
Found the following files/directories:
'lastMiKTeXException'

0 errors ✔ | 0 warnings ✔ | 1 note ✖

### `check_rhub()`

- Ubuntu Linux 20.04.1 LTS, R-release, GCC: Okay
- Platform: Fedora Linux, R-devel, clang, gfortran: Note

* checking HTML version of manual ... NOTE
Skipping checking HTML validation: no command 'tidy' found
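
For reference, the three checks listed above are typically run from the package directory with devtools (a sketch; the exact invocations used here are assumptions):

```r
library(devtools)

check(manual = TRUE, cran = TRUE)  # local R CMD check with CRAN settings
check_win_devel()                  # Windows builder against R-devel
check_rhub()                       # R-hub builders (e.g. Ubuntu, Fedora)
```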

2 changes: 1 addition & 1 deletion man/kernelshap.Rd

Some generated files are not rendered by default.

13 changes: 7 additions & 6 deletions packaging.R
@@ -15,14 +15,15 @@ library(usethis)
use_description(
fields = list(
Title = "Kernel SHAP",
Version = "0.2.0.900",
Version = "0.3.0",
Description = "Multidimensional refinement of the Kernel SHAP algorithm described in
Ian Covert and Su-In Lee (2021) <http://proceedings.mlr.press/v130/covert21a>.
-Depending on the number of features, Kernel SHAP values can be calculated exactly,
-by sampling, or by a combination of the two. As soon as sampling is involved,
-the algorithm iterates until convergence, and standard errors are provided.
-The package allows to work with any model that provides numeric predictions of dimension one or higher.
-Examples include linear regression, logistic regression (logit or probability scale),
+The package allows to calculate Kernel SHAP values in an exact way, by iterative
+sampling (as in the reference above), or by a hybrid of the two.
+As soon as sampling is involved, the algorithm iterates until convergence,
+and standard errors are provided.
+The package works with any model that provides numeric predictions of dimension one or higher.
+Examples include linear regression, logistic regression (on logit or probability scale),
other generalized linear models, generalized additive models, and neural networks.
The package plays well together with meta-learning packages like 'tidymodels', 'caret' or 'mlr3'.
Visualizations can be done using the R package 'shapviz'.",
