ModelASV

This is a project in developing.

ModelASV is a tool for analyzing models including large language models, multi-modality models and image/video generation models. It has two main functions: Analyze model, Simulate model, and Visualize model (web browser).

Because the training analyzing is becoming more and more important, ModelASV targets analyzing both the training and the infernce of models.

Install

pip install modelasv

Usage

You can see example in examples folder.

To analyze the performance of a deep learning model, you can use the LMViewAnalyzer, RooflineModel, and LMSimulator classes from the modelasv module.

The LMViewAnalyzer is used to collect information, such as the number of operations, and input and output shapes, for a given model. The RooflineModel is used to estimate the theoretical performance limit of a operation based on the hardware specifications, such as memory bandwidth and compute capability. The LMSimulator is then used to simulate the performance of the model with different bit-widths, taking into account the performance data collected by the analyzer and the theoretical performance limit estimated by the roofline model.

Here's an example of how to use these classes to analyze the performance of the Stable Diffusion model:

import torch
from diffusers import StableDiffusionPipeline
from modelasv import LMViewAnalyzer, RooflineModel, LMSimulator

# Load the Stable Diffusion model
model_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Initialize the performance analysis tools
analyzer = LMViewAnalyzer()
bandwidth = 100e9  # Memory bandwidth of hardware in bytes/s
compute_capacity = {8: 100e12}  
# compute_capacity: the compute capacity of the accelerator, a dict with keys as bitwidth and values as FLOPS/OPS
#    The bitwidth is the bitwidth of the data type, e.g., 32 for float32 and 16 for float16
rf = RooflineModel(bandwidth, compute_capacity)  # Create a roofline model

simulator = LMSimulator(rf)

# Analyze the performance of different sub-model of the Stable Diffusion model
with torch.no_grad():
    for model in [pipe.text_encoder, pipe.unet, pipe.vae]:
        # Generate an image using the Stable Diffusion model
        prompt = "a photo of an astronaut riding a horse on mars"

        # Use this context to enable analyze
        with analyzer.analyze(model):
            image = pipe(prompt, num_inference_steps=20).images[0]

        # Collect performance data and simulate the model with different bit-widths
        analyze_result = analyzer.accumulate_report(model)
        report, tot_report = simulator.simulate(analyze_result, a_bit=8, w_bit=8)
        print(tot_report)

In this example, we first load the Stable Diffusion model using the StableDiffusionPipeline class from the diffusers library. We then create an LMViewAnalyzer to collect performance data, a RooflineModel to estimate the theoretical performance limit, and an LMSimulator to simulate the model's performance with different bit-widths.

We loop through the key components of the Stable Diffusion model (the text encoder, U-Net, and VAE) and generate an image using the model. During the image generation, the LMViewAnalyzer collects performance data, which is then passed to the LMSimulator to simulate the model's performance with 8-bit activations and weights.

The tot_report variable contains the simulated performance results, which can be used to analyze the model's efficiency and identify potential bottlenecks.

API description

Analyze model

ModelASV uses torch.nn module wrappers to wrap all kinds of structures. And it generates the analyze_report on each module after run inference.

The analyze_report is a dict (if module itself "", else sub-module name) contains list of dict with these information:

operations
weights_shape: dict
inputs_shape: dict
outputs_shape: dict
info: dict, some useful information need to show, such as num_key_value_groups, load_kv_cache, store_kv_cache If there are sub-module, there are more than one dict in the analyze_report dict. If we execute the module several times, there should be multiple dict with these information.

Simulate model

ModelASV provides the Simulator to estimate how fast a module runs at a given hardware device.

RooflineModelSimulator

Visualize model

Comming soon.

Name	Name	Last commit message	Last commit date
Latest commit hahnyuan plot tool Aug 6, 2024 d0f4733 · Aug 6, 2024 History 20 Commits
examples	examples	plot tool	Aug 6, 2024
modelasv	modelasv	plot tool	Aug 6, 2024
.gitignore	.gitignore	first commit	Jul 30, 2024
LICENSE	LICENSE	Create LICENSE	Aug 1, 2024
README.md	README.md	modify api	Aug 1, 2024
setup.py	setup.py	plot tool	Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ModelASV

Install

Usage

API description

Analyze model

Simulate model

Visualize model

About

Releases

Packages

Languages

License

hahnyuan/ModelASV

Folders and files

Latest commit

History

Repository files navigation

ModelASV

Install

Usage

API description

Analyze model

Simulate model

Visualize model

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages