
Releases: deel-ai/xplique

Example-based

08 Oct 12:36

Release v1.4.0: Introduce the Example-based module

This module introduces example-based explanation methods from four families:

  • Similar Examples
  • Semi-factuals
  • Counterfactuals
  • Prototypes

The API

The API is common to the four families and follows three steps:

  • Construct a projection function with the model.
  • Initialize the method with a dataset, projection, and other parameters.
  • Explain given samples with a local explanation.
projection = ProjectionMethod(model)

explainer = ExampleMethod(
    cases_dataset=cases_dataset, 
    k=k,
    projection=projection,
    case_returns=case_returns,
    distance=distance,
)

examples = explainer.explain(inputs, targets)

It works for TensorFlow and PyTorch without additional code. For more information, please refer to the documentation or the tutorial.
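To make the retrieval idea concrete, here is a minimal plain-Python sketch (not the Xplique API; all names are illustrative) of what a similar-examples explainer conceptually does: project the cases and the query, then return the k nearest cases under a chosen distance.

```python
def euclidean(a, b):
    # Euclidean distance between two equal-length vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def similar_examples(cases, query, projection, k=2, distance=euclidean):
    # Project the cases and the query, then rank cases by distance
    projected_query = projection(query)
    ranked = sorted(cases, key=lambda c: distance(projection(c), projected_query))
    return ranked[:k]

# Toy usage: identity projection over 2-D points
cases = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
examples = similar_examples(cases, [0.9, 1.1], projection=lambda x: x, k=2)
print(examples)  # the two cases closest to the query
```

In the real API, the projection typically maps inputs to a model's latent space so that "similar" means similar for the model, not just in pixel space.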

Other modifications

  • Xplique is limited to TensorFlow < 2.16, as TensorFlow 2.16 introduces many breaking changes.
  • The feature visualization module now supports grayscale images.

Fix issues #150 and #151 on fidelity metrics

13 Dec 15:45

Fix issues:

  • #150 [Bug]: Causal fidelity problem with steps=-1
  • #151 [Bug]: MuFidelity does not work for tabular data and time series

The CausalFidelity metric for attribution methods now works with steps=-1, so that Insertion and Deletion are done feature by feature.

MuFidelity now works as intended for tabular data and time series.

Fix issue #143

13 Dec 14:46

Patch

Fix issue #143: all VRAM was allocated when Xplique was imported; this is no longer the case.

Data type coverage and bug correction

09 Nov 11:18

Release Notes v1.3.1

Data type coverage

The first part of this release extends Xplique's data type coverage. Several methods were modified to this end.

Non-square images

SobolAttributionMethod and HsicAttributionMethod now support non-square images.

Image explanation shape harmonization

For image explanations, depending on the method, the explanation shape could be $(n, h, w)$, $(n, h, w, 1)$, or $(n, h, w, 3)$. It was decided to harmonize it to $(n, h, w, 1)$.

Reducer for gradient-based methods

For images, most gradient-based methods provide a value for each channel. However, for consistency, it was decided that image explanations will have the shape $(n, h, w, 1)$. Therefore, gradient-based methods need to reduce the channel dimension of their image explanations, and the reducer parameter chooses how, among {"mean", "min", "max", "sum", None}. If None is given, the channel dimension is not reduced. The default value is "mean" for all methods except Saliency, which uses "max" to comply with the original paper, and GradCAM and GradCAMPP, which are not concerned.
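As a sketch of what the reducer parameter selects (illustrative plain Python, not Xplique internals), here is a channel reduction from shape (h, w, c) to (h, w, 1):

```python
# Map reducer names to per-pixel reduction functions over the channels
REDUCERS = {
    "mean": lambda px: sum(px) / len(px),
    "min": min,
    "max": max,
    "sum": sum,
}

def reduce_channels(explanation, reducer="mean"):
    # `explanation` is a nested list of shape (h, w, c);
    # the result keeps a trailing channel dimension of size 1
    if reducer is None:
        return explanation  # None: leave the channel dimension untouched
    fn = REDUCERS[reducer]
    return [[[fn(pixel)] for pixel in row] for row in explanation]

# A 1x2 "image" with 3 channel values per pixel
expl = [[[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]]
print(reduce_channels(expl, "mean"))  # [[[2.0], [4.0]]]
print(reduce_channels(expl, "max"))   # [[[3.0], [4.0]]]
```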

Time series

Xplique was initially designed for images, but it also supports attribution methods for tabular data and now for time series.

Xplique considers data with:

  • 4 dimensions as images.
  • 3 dimensions as time series.
  • 2 dimensions as tabular data.
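This convention can be sketched as a small helper (illustrative only; not an actual Xplique function):

```python
def infer_data_type(shape):
    # `shape` is the shape of an input batch, e.g. (n, h, w, c) for images
    ndim = len(shape)
    if ndim == 4:
        return "images"          # (n, h, w, c)
    if ndim == 3:
        return "time series"     # (n, time steps, features)
    if ndim == 2:
        return "tabular data"    # (n, features)
    raise ValueError(f"unsupported number of dimensions: {ndim}")

print(infer_data_type((32, 224, 224, 3)))  # images
print(infer_data_type((32, 48, 7)))        # time series
print(infer_data_type((32, 13)))           # tabular data
```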

Tutorial

To show how to use Xplique on time series, a new tutorial was designed: Attributions: Time Series and Regression.

Plot

The function xplique.plots.plot_timeseries_attributions was modified to match the xplique.plots.plot_attributions API. Here is an example from the tutorial on temperature forecasting for the next 24 hours based on weather data from the last 48 hours:

[Figure: time series attributions from the temperature forecasting tutorial]

Methods

Rise, Lime, and KernelShap now support time series natively.

Overview of covered data types and tasks

| Attribution Method | Type of Model | Images | Time Series and Tabular Data |
| --- | --- | --- | --- |
| Deconvolution | TF | C✔️ OD❌ SS❌ | C✔️ R✔️ |
| Grad-CAM | TF | C✔️ OD❌ SS❌ | |
| Grad-CAM++ | TF | C✔️ OD❌ SS❌ | |
| Gradient Input | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Guided Backprop | TF | C✔️ OD❌ SS❌ | C✔️ R✔️ |
| Integrated Gradients | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Kernel SHAP | TF, PyTorch**, Callable* | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Lime | TF, PyTorch**, Callable* | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Occlusion | TF, PyTorch**, Callable* | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Rise | TF, PyTorch**, Callable* | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Saliency | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| SmoothGrad | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| SquareGrad | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| VarGrad | TF, PyTorch** | C✔️ OD✔️ SS✔️ | C✔️ R✔️ |
| Sobol Attribution | TF, PyTorch** | C✔️ OD✔️ SS✔️ | 🔵 |
| Hsic Attribution | TF, PyTorch** | C✔️ OD✔️ SS✔️ | 🔵 |
| FORGrad enhancement | TF, PyTorch** | C✔️ OD✔️ SS✔️ | |

TF: TensorFlow compatible
C: Classification | R: Regression
OD: Object Detection | SS: Semantic Segmentation

* : See the Callable documentation

** : See the Xplique for PyTorch documentation, and the PyTorch models: Getting started notebook.

✔️ : Supported by Xplique | ❌ : Not applicable | 🔵 : Work in Progress

Metrics

Naturally, metrics now support time series too.


Bugs correction

The second part of this release solves pending issues: #102, #123, #127, #128, #131, and #137.

Memory problems

Several of the reported issues concerned memory management.

SmoothGrad, VarGrad, and SquareGrad issue #137

SmoothGrad, VarGrad, and SquareGrad now use online statistics to compute explanations, which allows batched inference. Furthermore, their implementation was refactored behind a GradientStatistic abstraction. Usage is unchanged.
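The release does not spell out the statistics involved; as a hedged sketch, a standard way to accumulate a mean and variance one sample (or batch) at a time, without materializing all samples, is Welford's online algorithm. The OnlineStatistic class below is a hypothetical, minimal stand-in for the kind of abstraction described, applied to scalars rather than gradient tensors:

```python
class OnlineStatistic:
    # Hypothetical name, loosely inspired by the GradientStatistic
    # abstraction mentioned above; implements Welford's algorithm
    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        # Incorporate one new observation in O(1) memory
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def variance(self):
        # Population variance of the observations seen so far
        return self.m2 / self.count if self.count else 0.0

stats = OnlineStatistic()
for v in [1.0, 2.0, 3.0, 4.0]:
    stats.update(v)
print(stats.mean)        # 2.5
print(stats.variance())  # 1.25
```

SmoothGrad needs exactly a running mean of gradients, SquareGrad a running mean of squared gradients, and VarGrad a running variance, so all three fit this pattern.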

MuFidelity issue #137

The MuFidelity metric had the same problem as the three previous methods; it was also solved.

HsicAttributionMethod

This method had a different memory problem. The batch_size for the model was used correctly; however, when computing the estimator, a tensor of size grid_size**2 * nb_design**2 was created. For big images and/or small objects in images, grid_size needs to be increased, and for the estimator to converge, nb_design should be increased accordingly, which creates out-of-memory errors.

Thus an estimator_batch_size parameter (distinct from the initial batch_size) was introduced to batch over the grid_size**2 dimension. The default value is None, preserving the previous behavior of the method; when an out-of-memory error occurs, setting estimator_batch_size smaller than grid_size**2 reduces the memory cost of the method.
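The batching idea can be sketched generically (a hypothetical toy, not the HSIC estimator itself): process a large first dimension in chunks so that only one chunk of the intermediate rows is alive at a time.

```python
def batched_row_sums(make_row, n_rows, batch_size=None):
    # make_row(i) builds row i of a (conceptually huge) intermediate;
    # instead of materializing all n_rows at once, we build and reduce
    # them `batch_size` rows at a time
    batch_size = batch_size or n_rows  # None keeps the unbatched behavior
    sums = []
    for start in range(0, n_rows, batch_size):
        rows = [make_row(i) for i in range(start, min(start + batch_size, n_rows))]
        sums.extend(sum(row) for row in rows)
    return sums

# Toy intermediate: row i is [i, i, i, i], so its sum is 4*i
result = batched_row_sums(lambda i: [i] * 4, n_rows=6, batch_size=2)
print(result)  # [0, 4, 8, 12, 16, 20]
```

The result is identical with or without batching; only the peak memory changes, which is why the default None preserves the original behavior.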

Other issues

Metrics input types issues #102 and #128

Inputs and targets are now sanitized to numpy arrays.

Feature visualization latent dtype issue #131

In issue #131, there was a conflict between the model's internal dtype and Xplique's dtype. We made sure that the conflicting computation uses the model's internal dtype.

Other corrections

Naturally, other problems were reported to us outside of issues or discovered by the team; we addressed these as well.

Some refactoring

Lime was refactored, but this does not impact usage.

Small fixes

In HsicAttributionMethod and SobolAttributionMethod, the documentation of the perturbation_function did not match the actual code.

For Craft, there were some leftover prints. As they may be useful, Craft's methods that print now take a verbose parameter.

CRAFT

19 Oct 13:22

Release Note v1.3.0

New Features

CRAFT or Concept Recursive Activation FacTorization for Explainability

Introduction of the CRAFT method (see the paper) for both frameworks: TensorFlow and PyTorch. CRAFT is a method for automatically extracting human-interpretable concepts from deep networks.

import tensorflow as tf

from xplique.concepts import CraftTf as Craft

# Cut the model in two parts (as explained in the paper):
# the first part is g(.), our 'input_to_latent' model returning positive activations,
# the second part is h(.), our 'latent_to_logit' model
g = tf.keras.Model(model.input, model.layers[-3].output)
h = tf.keras.Model(model.layers[-2].input, model.layers[-1].output)

# Create a Craft concept extractor from these two models
craft = Craft(input_to_latent_model=g,
              latent_to_logit_model=h,
              number_of_concepts=10,
              patch_size=80,
              batch_size=64)

# Use Craft to get the crops (crops), the embedding of the crops (crops_u),
# and the concept bank (w)
crops, crops_u, w = craft.fit(images_preprocessed, class_id=rabbit_class_id)

# Compute Sobol indices to understand which concepts matter
importances = craft.estimate_importance()

# Display the concepts by showing the 10 best crops for each concept
craft.plot_concepts_crops(nb_crops=10)

See the related documentation, the TensorFlow tutorials, and the PyTorch tutorial.

Minor fix

18 Oct 12:08

Release Note v1.2.1

Minor fix

Dead links

Several dead links were fixed in 88165b3.

Update methods table data types coverage

Lime does not currently work for semantic segmentation, hence the table was updated from "supported by Xplique" to "work in progress" in c307bb5.

Add tutorial links for feature visualization

The tutorials for feature visualization were not visible, so links were added in several places in 0547dbb.

Modify setup.cfg

In the setup, a tag was created when calling bump2version. However, this tag was created on the current branch rather than on master, so it could not be used. This behavior was removed in 1139740.

Semantic Segmentation and Object Detection

05 Oct 10:17

Release Note v1.2.0

New features

Semantic Segmentation

See related documentation and tutorial.

explainer = Method(model, operator=xplique.Tasks.SEMANTIC_SEGMENTATION)

A new operator was designed to handle the semantic segmentation task, with the related documentation and tutorial. It is used similarly to classification and regression, as shown in the example above, but the model specification changes and the definition of the targets parameter differs (to design targets, the user should use the xplique.utils_functions.segmentation set of functions).

Object Detection

See related documentation and tutorial.

explainer = Method(model, operator=xplique.Tasks.OBJECT_DETECTION)

The object detection API was adapted to the operator API: an object detection operator was designed to enable white-box methods for object detection. Furthermore, the related documentation and tutorials were introduced. Here also, the targets and model specifications differ from the classification ones.

Therefore, the BoundingBoxExplainer is now deprecated.

Documentation

Merge model, operator, and API description page into one

As @fel-thomas highlighted in the #132 remarks, the documentation was too fragmented; furthermore, a lot of information was redundant between those pages, and they were interdependent. Hence the choice was made to merge the model, operator, and API description pages into one. We believe it will simplify the use of the library.

Create task-related pages

As aforementioned, two tasks (Object Detection and Semantic Segmentation) were introduced in the documentation, and their complexity warranted specific documentation pages. However, it was not consistent to have documentation pages only for those two tasks. Therefore, information about Classification and Regression was extracted from the common API page to create two further task-specific pages. In total, four task-specific pages were introduced to the documentation.

Bug fixes

Regression

The regression operator was set to the MAE function in the previous release to allow the explanation of multi-output regression. However, such a function is not differentiable at zero, so gradient-based methods did not work.

Hence, the behavior was reverted to the previous one (a sum of the targeted outputs). Nonetheless, this operator is limited to single-output explanations; for multi-output regression, each output should be explained individually.
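As a toy illustration of the difference (plain-Python stand-ins for a single sample, not the actual TF operators; the mask-style target in the second function is an assumption for the single-output case):

```python
def mae_operator(prediction, target):
    # Previous release's behavior: mean absolute error over the outputs.
    # |p - t| is not differentiable where p == t, which broke
    # gradient-based methods.
    diffs = [abs(p - t) for p, t in zip(prediction, target)]
    return sum(diffs) / len(diffs)

def sum_operator(prediction, target):
    # Restored behavior: a sum of the targeted outputs, where `target`
    # acts as a mask selecting the single output to explain
    return sum(p * t for p, t in zip(prediction, target))

pred = [2.0, 5.0, 1.0]
print(mae_operator(pred, [2.0, 4.0, 1.0]))  # mean of |diffs| ≈ 0.333
print(sum_operator(pred, [0.0, 1.0, 0.0]))  # 5.0 (explains output 1 only)
```

The sum form is smooth everywhere, which is what gradient-based methods require; the cost is that each output must be explained with its own mask.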

MaCO, ForGRAD and a Torch Wrapper

07 Sep 07:55

Release note v1.1.0

New Features

MaCO

Introduction of a recent method for scaling up feature visualization on state-of-the-art deep models: MaCo. This method is described in the following arXiv paper: https://arxiv.org/pdf/2306.06805.pdf. It consists in fixing the amplitude of the Fourier spectrum and optimizing only the phase during the optimization of a neuron/channel/layer.

It comes with the associated documentation, tests, and notebook.

FORGrad

Introduction of FORGrad (paper here: https://arxiv.org/pdf/2307.09591.pdf). In a nutshell, this method filters out the noise in explanations to make them more interpretable.

It comes with the associated documentation, tests, and notebook.

PyTorch Wrapper

Provides within Xplique a convenient wrapper for PyTorch models that works with most attribution methods and is compatible with metrics.

It comes with the associated documentation, tests, and notebook. It also introduces its own CI pipeline testing cross-version compatibility between TF and PyTorch.

Introduce a Tasks Enum

Added the Tasks enum, which includes the operators for classification and regression tasks. Existing operators can now also be chosen by name.

Add an activation parameter for metrics

While we recommend using the logits to generate explanations, it might be more relevant to look at the probability (after a softmax or sigmoid layer) of a prediction when computing metrics, for instance if the metric measures 'a drop in probability for the classification of an input occluded in the more relevant part'. Thus, we introduce this option when you build a metric: activation can be either None, 'softmax', or 'sigmoid'.
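A minimal sketch of what the activation option amounts to (plain Python over a list of logits, not the actual metric code):

```python
import math

def apply_activation(logits, activation=None):
    # Turn raw logits into the scores a metric will consume
    if activation is None:
        return logits  # default: metrics work on raw logits
    if activation == "softmax":
        # Numerically stable softmax: shift by the max logit first
        exps = [math.exp(l - max(logits)) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]
    if activation == "sigmoid":
        return [1.0 / (1.0 + math.exp(-l)) for l in logits]
    raise ValueError(f"unknown activation: {activation}")

logits = [2.0, 0.0, -2.0]
print(apply_activation(logits))             # unchanged logits
print(apply_activation(logits, "softmax"))  # probabilities summing to 1
print(apply_activation(logits, "sigmoid"))  # element-wise probabilities
```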

Bug fixes

regression_operator

The operator was a sum instead of a mean (for the MAE). It has been fixed.

HSIC attribution

  • Fixed the documentation of HSICEstimator's call
  • Added @tf.function

Documentation

  • Enhanced the documentation overall
  • Added documentation for the operators
  • Added explanations concerning the model and the API
  • Added a CI pipeline for the documentation

v1.0 Operator major release

29 May 15:47

This release introduces operators for attribution methods, allowing attribution methods to be applied to a larger variety of use cases. For more detail, one can refer to PRs #124 and #125.

What changes in the use of Xplique?

For regular users, this release should be transparent. However, if you want to apply attribution methods to non-classification tasks, it will now be easier.

What is an operator?

The idea is as follows: to define an attribution method, we need any function that takes the model (f), a series of inputs (x), and labels (y), and returns a scalar in R.

g(f, x, y) -> R

This function, called an operator, can be defined by the user (or by us) and provides a common interface for all attribution methods, which will call it (or compute its gradient). The goal is for attribution methods to have this function as an attribute (in more detail, this gives self.inference_function = operator at some point).

Some Examples of Operators

  1. The most trivial operator is perhaps the classification one: it consists of taking a particular logit to explain a class. With a model f: R^n -> R^c (c being the number of classes) and y being one-hot vectors, our operator simply boils down to:

def g(f, x, y):
    return tf.reduce_sum(f(x) * y, -1)

  2. Regarding regression, with a model f: R^n -> R^m and the targets y being the initial prediction of the model, the operator will be:

def g(f, x, y):
    return tf.reduce_sum(tf.abs(f(x) - y), axis=-1)

  3. Regarding bounding boxes, an operator has already been defined in the literature with the D-RISE article. It consists of combining the three scores -- IOU, objectness, and box classification -- to form... a scalar!

  4. To explain concepts, for example with a model f = c ⚬ g(x), with a = g(x) and a factorizer that allows interpreting a in a reduced-dimension space u = factorizer(a), we can very well define the following operator:

def g(c, u, y):
    a = factorizer.inverse(u)
    y_pred = c(a)
    return tf.reduce_sum(y_pred * y, -1)

As you can see, many cases can be handled in this manner!

Implementation

Regarding implementation, a series of operators is available in commons/operators, and the most important part -- the operator plug -- is located in the attributions/base.py file. As discussed with @fel-thomas, @AntoninPoche, and @lucashervier, the PyTorch implementation is not far off and would be located here!

Once this was done, we simply added the argument to all the attribution methods defined in the library; some related metrics naturally inherited the parameter.

Finally, the two metrics InsertionTS and DeletionTS were deleted, as they are now redundant: with the new implementation, metrics are no longer limited to classification.

v0.4.3

19 Dec 14:09
Data unrolling: better handling of batched tf.Dataset.