
Add JPQD evaluation notebook #231

Draft · wants to merge 1 commit into main from jpqd-notebook
Conversation

helena-intel (Collaborator):

Add a JPQD evaluation notebook. Since JPQD QA takes about 12 hours to train, it doesn't make sense to run training in a notebook: if the browser crashes or the computer goes to sleep, training would stop. So I refer to the existing example for training and use the notebook to evaluate the resulting model.

This makes the notebook similar to the PTQ QA notebook. I thought about removing the duplication, but duplication in examples is not so bad, at least for now: it's nice that examples are standalone.

Since JPQD starts from a plain bert-base-uncased model, I fine-tuned bert-base-uncased following the Transformers run_qa.py example to have a baseline for the performance comparison.

Instead of making this a JPQD-specific notebook, it could make more sense to make it a generic QA INT8 evaluation notebook. On the other hand, it's an example, so people can easily adapt it for similar purposes, and it's nice to promote JPQD.
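For reference, the scoring half of such an evaluation notebook boils down to SQuAD-style exact-match over the model's predicted answers. The snippet below is a minimal, self-contained stand-in for that metric (the helper names are illustrative, not the notebook's actual code):

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD convention)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(predictions, references) -> float:
    """Percentage of predictions that match their reference after normalization."""
    hits = sum(
        normalize_answer(p) == normalize_answer(r)
        for p, r in zip(predictions, references)
    )
    return 100.0 * hits / len(predictions)


# "The Eiffel Tower" matches "Eiffel Tower" after normalization; "1969" != "in 1970".
print(exact_match(["The Eiffel Tower", "1969"], ["Eiffel Tower", "in 1970"]))  # 50.0
```

In the real notebook this role would be played by an off-the-shelf SQuAD metric; the sketch only shows what is being measured when the FP32 baseline and the JPQD model are compared.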

TODO: the intro text at the top needs to explain a bit more about JPQD.

Colab link: https://colab.research.google.com/github/helena-intel/optimum-intel/blob/jpqd-notebook/notebooks/openvino/question_answering_quantization_jpqd.ipynb (performance is probably bad on Colab because there is no AVX512/VNNI).

@vuiseng9

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@helena-intel force-pushed the jpqd-notebook branch 2 times, most recently from 8f7b129 to 82dc6a6 on March 14, 2023 at 23:18
@AlexKoff88 (Collaborator):

I think it would be more useful to show the performance and accuracy trade-offs for three models:

  • Original Transformer model (fp32)
  • Quantized model (PTQ/QAT)
  • Pruned and quantized (JPQD, distillation is an auxiliary method here)
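A three-way comparison like this is typically a small loop that times each pipeline on the same inputs. The sketch below shows that structure only; the lambda "models" are placeholders for the actual fp32, quantized, and JPQD pipelines, and all names are illustrative:

```python
import time


def benchmark(predict, inputs, warmup=2, runs=10):
    """Average latency of predict per sample, in milliseconds."""
    for _ in range(warmup):          # warm-up iterations are not timed
        for x in inputs:
            predict(x)
    start = time.perf_counter()
    for _ in range(runs):
        for x in inputs:
            predict(x)
    return (time.perf_counter() - start) / (runs * len(inputs)) * 1000.0


# Placeholders; in the notebook these would be the three actual pipelines.
models = {
    "fp32": lambda x: x,
    "int8 (PTQ/QAT)": lambda x: x,
    "jpqd": lambda x: x,
}

inputs = list(range(8))
for name, predict in models.items():
    print(f"{name}: {benchmark(predict, inputs):.3f} ms/sample")
```

Pairing each latency number with the accuracy of the same model would give the trade-off table suggested above.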

@ljaljushkin (Contributor):

@yujiepan-work and @vuiseng9 implemented very nice lightweight tests for JPQD training: 9 epochs take just a few seconds on a single card. I'd reuse them for this notebook.
https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/sparsity/movement/test_training.py#L237

If we need very good accuracy/performance results, there are longer tests to consider:
https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/sparsity/movement/test_training.py#L318
If I am not mistaken, those take minutes; @yujiepan-work could probably say the exact time.
