Implementation of Batching, Enabling HPU Graphs and FP8 quantization for SD3 Pipeline #1345

deepak-gowda-narayana · 2024-09-19T18:33:06Z

Added batch-wise inference support and enabled HPU Graphs integration and FP8 quantization for the SD3 pipeline, to improve pipeline performance on Gaudi 2.
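
A minimal sketch of batch-wise prompt handling, not the code from this PR. It assumes that padding the final batch to a fixed size is acceptable so that every pipeline call sees the same input shape, which is generally what graph capture benefits from. The function name `split_into_batches` is hypothetical.

```python
from typing import List, Tuple


def split_into_batches(prompts: List[str], batch_size: int) -> Tuple[List[List[str]], int]:
    """Split prompts into equally sized batches, padding the last batch.

    Returns the list of batches and the number of padded (dummy) prompts,
    so the caller can drop the corresponding outputs afterwards.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    if not prompts:
        return [], 0

    # Slice the prompt list into consecutive chunks of at most batch_size items.
    batches = [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

    # Pad the last chunk by repeating its final prompt (an assumption of this
    # sketch) so that every batch has exactly batch_size entries.
    num_pad = batch_size - len(batches[-1])
    batches[-1] = batches[-1] + [batches[-1][-1]] * num_pad
    return batches, num_pad


batches, num_pad = split_into_batches(["a", "b", "c", "d", "e"], batch_size=2)
```

The caller would discard the last `num_pad` outputs of the final batch.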

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

The number of samples is counted as 1 (the whole batch) so that the performance measurement is consistent with the industry standard, for easier comparison.
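
One possible reading of this note, sketched under assumptions: the two ways of counting differ only in whether a batch contributes one unit or `batch_size` units to the throughput numerator. The function and parameter names below are hypothetical and do not come from the PR.

```python
def throughput(num_batches: int, batch_size: int, elapsed_s: float,
               count_batch_as_one_sample: bool) -> float:
    """Return units processed per second.

    If count_batch_as_one_sample is True, each batch counts as a single
    sample; otherwise every image in the batch counts separately.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    units = num_batches if count_batch_as_one_sample else num_batches * batch_size
    return units / elapsed_s


# Same run, two ways of counting: 4 batches of 8 images in 2 seconds.
per_batch = throughput(4, 8, 2.0, count_batch_as_one_sample=True)
per_image = throughput(4, 8, 2.0, count_batch_as_one_sample=False)
```

The two figures differ by a factor of `batch_size`, so a comparison is only meaningful when both sides use the same convention.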

mkpatel3-github · 2024-10-01T19:27:54Z

@dsocek @skavulya requesting your review and feedback.

dsocek

Please see the inline comments and requested changes. Also, make sure at least one CI test for SD3 with batching is added to test_diffusers.

optimum/habana/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py

skavulya

LGTM. Please add a test for batch sizes to tests/test_diffusers.py.

examples/stable-diffusion/quantization/measure_config.json

examples/stable-diffusion/quantization/quant_config.json

deepak-gowda-narayana · 2024-10-07T18:45:52Z

LGTM. Please add a test for batch sizes to tests/test_diffusers.py.

Added the tests for batch size.
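
A hedged sketch of what a batch-size test can check, not the test added in this PR. `FakePipeline` is a hypothetical stand-in that returns placeholder strings instead of images, so the example runs without a device or model weights. The property tested is that the number of outputs equals the number of prompts times `num_images_per_prompt`, whatever the batch size.

```python
import unittest
from typing import List, Union


class FakePipeline:
    """Hypothetical stand-in for a text-to-image pipeline."""

    def __call__(self, prompt: Union[str, List[str]],
                 num_images_per_prompt: int = 1, batch_size: int = 1) -> List[str]:
        prompts = [prompt] if isinstance(prompt, str) else list(prompt)
        # One placeholder "image" per (prompt, repetition) pair; batch_size only
        # affects how work would be grouped, not how many outputs are returned.
        return [f"image:{p}:{i}" for p in prompts for i in range(num_images_per_prompt)]


class BatchSizeTest(unittest.TestCase):
    def test_batch_sizes(self):
        pipe = FakePipeline()
        prompts = ["a photo of a cat", "a photo of a dog", "a photo of a bird"]
        for batch_size in (1, 2, 4):
            with self.subTest(batch_size=batch_size):
                images = pipe(prompts, num_images_per_prompt=2, batch_size=batch_size)
                self.assertEqual(len(images), len(prompts) * 2)
```

A real test would replace `FakePipeline` with the actual pipeline built from small dummy components.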

deepak-gowda-narayana · 2024-10-07T18:47:05Z

@skavulya @dsocek Requesting review. I have made the requested changes.

dsocek

Looks good. There are some extra spaces in the SD3 pipeline file; run make style to fix them.

examples/stable-diffusion/quantization/SD3/measure_config.json

dsocek

Please add an example of running FP8 mode to the README.

examples/stable-diffusion/quantization/stable-diffusion-3/measure_config.json

examples/stable-diffusion/quantization/stable-diffusion-3/quantize_config.json

examples/stable-diffusion/text_to_image_generation.py

deepak-gowda-narayana · 2024-10-07T23:23:19Z

Please add an example of running FP8 mode to the README.

Added the example

deepak-gowda-narayana · 2024-10-08T17:47:48Z

Looks good. There are some extra spaces in the SD3 pipeline file; run make style to fix them.

Fixed

deepak-gowda-narayana · 2024-10-08T17:48:53Z

@dsocek Please provide feedback on the changes.

examples/stable-diffusion/README.md

deepak-gowda-narayana · 2024-10-15T20:00:18Z

@libinta Requesting review of the PR so that it can be merged.

emascarenhas · 2024-10-18T23:45:23Z

@deepak-gowda-narayana,
Please also run the fast tests (tests/ci/fast_tests*.sh) and the slow tests in test_diffusers.py, and post a summary of the results here.
Also run make style and fix any errors.

deepak-gowda-narayana · 2024-10-22T03:20:25Z

@emascarenhas

@deepak-gowda-narayana, Please also run the fast tests (tests/ci/fast_tests*.sh) and the slow tests in test_diffusers.py, and post a summary of the results here. Also run make style and fix any errors.

Result Summary of fast_tests.sh

python -m pytest tests/test_gaudi_configuration.py tests/test_trainer_distributed.py tests/test_trainer.py tests/test_trainer_seq2seq.py
================================================================================================================= test session starts =================================================================================================================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0
rootdir: /mnt/deepak_sd3
configfile: setup.cfg
collected 89 items

tests/test_gaudi_configuration.py .. [ 2%]
tests/test_trainer_distributed.py .. [ 4%]
tests/test_trainer.py .............................s....................................sssssss.......... [ 97%]
tests/test_trainer_seq2seq.py .. [100%]

================================================================================================================== warnings summary ===================================================================================================================
../../usr/lib/python3.10/inspect.py:288
/usr/lib/python3.10/inspect.py:288: FutureWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/transformers/deepspeed.py:24
/usr/local/lib/python3.10/dist-packages/transformers/deepspeed.py:24: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(

tests/test_trainer.py::GaudiTrainerIntegrationPrerunTest::test_cosine_with_min_lr_scheduler
/usr/local/lib/python3.10/dist-packages/torch/optim/lr_scheduler.py:216: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn(

tests/test_trainer.py: 12 warnings
/mnt/deepak_sd3/optimum/habana/transformers/trainer.py:1494: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
torch.load(os.path.join(checkpoint, OPTIMIZER_NAME), map_location=map_location)

tests/test_trainer.py: 12 warnings
/mnt/deepak_sd3/optimum/habana/transformers/trainer.py:1305: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
checkpoint_rng_state = torch.load(rng_file)

tests/test_trainer.py::GaudiTrainerIntegrationTest::test_load_best_model_from_safetensors
/mnt/deepak_sd3/tests/test_trainer.py:491: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
state_dict = torch.load(os.path.join(checkpoint, WEIGHTS_NAME))

tests/test_trainer.py::GaudiTrainerIntegrationTest::test_load_best_model_with_save
tests/test_trainer.py::GaudiTrainerIntegrationTest::test_load_best_model_with_save
/mnt/deepak_sd3/optimum/habana/transformers/training_args.py:366: FutureWarning: evaluation_strategy is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use eval_strategy instead
warnings.warn(

tests/test_trainer.py::GaudiTrainerIntegrationTest::test_multiple_peft_adapters
/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py:579: UserWarning: pad_token_id should be positive but got -1. This will cause errors when batch generating, if there is padding. Please set pad_token_id explicitly as model.generation_config.pad_token_id=PAD_TOKEN_ID to avoid errors in generation
warnings.warn(

tests/test_trainer.py::GaudiTrainerIntegrationTest::test_multiple_peft_adapters
/usr/local/lib/python3.10/dist-packages/transformers/data/datasets/language_modeling.py:119: FutureWarning: This dataset will be removed from the library soon, preprocessing should be handled with the 🤗 Datasets library. You can have a look at this example script for pointers: https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_mlm.py
warnings.warn(

tests/test_trainer.py::GaudiTrainerIntegrationTest::test_resume_training_with_safe_checkpoint
tests/test_trainer.py::GaudiTrainerIntegrationTest::test_resume_training_with_safe_checkpoint
/mnt/deepak_sd3/tests/test_trainer.py:537: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
state_dict = loader(weights_file)

tests/test_trainer.py::GaudiTrainerOptimizerChoiceTest::test_optim_supported_0_adamw_hf
tests/test_trainer.py::GaudiTrainerOptimizerChoiceTest::test_optim_supported_1_adamw_hf
/usr/local/lib/python3.10/dist-packages/transformers/optimization.py:591: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
warnings.warn(

tests/test_trainer_seq2seq.py::GaudiSeq2seqTrainerTester::test_bad_generation_config_fail_early
/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py:606: UserWarning: do_sample is set to False. However, top_p is set to 0.9 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset top_p. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
warnings.warn(

tests/test_trainer_seq2seq.py::GaudiSeq2seqTrainerTester::test_finetune_t5
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:797: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(

tests/test_trainer_seq2seq.py::GaudiSeq2seqTrainerTester::test_finetune_t5
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py:2618: UserWarning: Moving the following attributes in the config to the generation config: {'max_length': 128}. You are seeing this warning because you've set generation parameters in the model config, as opposed to in the generation config.
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================================================== 81 passed, 8 skipped, 39 warnings in 109.40s (0:01:49) ================================================================================================

Result Summary of fast_tests_diffusers.sh

python -m pytest tests/test_diffusers.py
===================================================================================================== test session starts ======================================================================================================
platform linux -- Python 3.10.12, pytest-7.4.4, pluggy-1.5.0
rootdir: /mnt/deepak_sd3
configfile: setup.cfg
collected 159 items

tests/test_diffusers.py .......sssss..........s.......s........................s.ss.sssssssssssss.............sss.......s....s.....ss.s.ssssss......s...s......s.sss........s..s.s....s [100%]

======================================================================================================= warnings summary =======================================================================================================
../../usr/lib/python3.10/inspect.py:288
/usr/lib/python3.10/inspect.py:288: FutureWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:20
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:20: FutureWarning: VQEncoderOutput is deprecated and will be removed in version 0.31. Importing VQEncoderOutput from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQEncoderOutput, instead.
deprecate("VQEncoderOutput", "0.31", deprecation_message)

../../usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:25
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:25: FutureWarning: VQModel is deprecated and will be removed in version 0.31. Importing VQModel from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQModel, instead.
deprecate("VQModel", "0.31", deprecation_message)

tests/test_diffusers.py: 50 warnings
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:797: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(

tests/test_diffusers.py::GaudiPipelineUtilsTester::test_save_pretrained
/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py:251: FutureWarning: The configuration file of the unet has set the default sample_size to smaller than 64 which seems highly unlikely. If your checkpoint is a fine-tuned version of any of the following:

CompVis/stable-diffusion-v1-4
CompVis/stable-diffusion-v1-3
CompVis/stable-diffusion-v1-2
CompVis/stable-diffusion-v1-1
runwayml/stable-diffusion-v1-5
runwayml/stable-diffusion-inpainting
you should change 'sample_size' to 64 in the configuration file. Please make sure to update the config accordingly as leaving sample_size=32 in the config might lead to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the unet/config.json file
deprecate("sample_size<64", "1.0.0", deprecation_message, standard_warn=False)

tests/test_diffusers.py: 10 warnings
/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py:201: FutureWarning: The configuration file of this scheduler: GaudiDDIMScheduler {
"_class_name": "GaudiDDIMScheduler",
"_diffusers_version": "0.29.2",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"clip_sample": false,
"clip_sample_range": 1.0,
"dynamic_thresholding_ratio": 0.995,
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"rescale_betas_zero_snr": false,
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"steps_offset": 0,
"thresholding": false,
"timestep_spacing": "leading",
"trained_betas": null
}
is outdated. steps_offset should be set to 1 instead of 0. Please make sure to update the config accordingly as leaving steps_offset might led to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the scheduler/scheduler_config.json file
deprecate("steps_offset!=1", "1.0.0", deprecation_message, standard_warn=False)

tests/test_diffusers.py: 79 warnings
/usr/local/lib/python3.10/dist-packages/diffusers/models/unets/unet_2d_blocks.py:1369: FutureWarning: scale is deprecated and will be removed in version 1.0.0. The scale argument is deprecated and will be ignored. Please remove it, as passing it will raise an error in the future. scale should directly be passed while calling the underlying pipeline component i.e., via cross_attention_kwargs.
deprecate("scale", "1.0.0", deprecation_message)

tests/test_diffusers.py: 79 warnings
/usr/local/lib/python3.10/dist-packages/diffusers/models/unets/unet_2d_blocks.py:2628: FutureWarning: scale is deprecated and will be removed in version 1.0.0. The scale argument is deprecated and will be ignored. Please remove it, as passing it will raise an error in the future. scale should directly be passed while calling the underlying pipeline component i.e., via cross_attention_kwargs.
deprecate("scale", "1.0.0", deprecation_message)

tests/test_diffusers.py: 13 warnings
/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:628: FutureWarning: the output_type numpy is outdated and has been set to np. Please make sure to set it to one of these instead: pil, np, pt, latent
deprecate("Unsupported output_type", "1.0.0", deprecation_message, standard_warn=False)

tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_batch_sizes
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_bf16
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_default
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_hpu_graphs
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_num_images_per_prompt
/mnt/deepak_sd3/tests/test_diffusers.py:1606: FutureWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
torch.nn.init.normal(m.weight)

tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_batch_sizes
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_bf16
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_default
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_hpu_graphs
tests/test_diffusers.py::GaudiStableDiffusionControlNetPipelineTester::test_stable_diffusion_controlnet_num_images_per_prompt
tests/test_diffusers.py::StableDiffusionXLInpaintPipelineFastTests::test_stable_diffusion_xl_inpaint_2_images
/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:528: FutureWarning: Passing image as a list of 4d torch.Tensor is deprecated.Please concatenate the list along the batch dimension and pass it as a single 4d torch.Tensor
warnings.warn(

tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_batch_sizes
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_bf16
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_default
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_hpu_graphs
tests/test_diffusers.py::GaudiStableDiffusionMultiControlNetPipelineTester::test_stable_diffusion_multicontrolnet_num_images_per_prompt
/mnt/deepak_sd3/tests/test_diffusers.py:1869: FutureWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
torch.nn.init.normal(m.weight)

tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_batch
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_bf16
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_default
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_hpu_graphs
/usr/local/lib/python3.10/dist-packages/transformers/models/dpt/feature_extraction_dpt.py:28: FutureWarning: The class DPTFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use DPTImageProcessor instead.
warnings.warn(

tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_batch
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_bf16
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_default
tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_hpu_graphs
/usr/local/lib/python3.10/dist-packages/torch/amp/autocast_mode.py:303: UserWarning: In HPU autocast, but the target dtype is not supported. Disabling autocast.
HPU Autocast only supports dtypes of torch.bfloat16 and torch.float16 currently.
warnings.warn(error_message)

tests/test_diffusers.py::GaudiStableDiffusionDepth2ImgPipelineTester::test_depth2img_pipeline_batch
/mnt/deepak_sd3/optimum/habana/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_depth2img.py:175: FutureWarning: You have passed 2 text prompts (prompt), but only 1 initial images (image). Initial images are now duplicating to match the number of text prompts. Note that this behavior is deprecated and will be removed in a version 1.0.0. Please make sure to update your script to pass as many initial images as text prompts to suppress this warning.
deprecate("len(prompt) != len(image)", "1.0.0", deprecation_message, standard_warn=False)

tests/test_diffusers.py: 14 warnings
/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py:306: FutureWarning: The configuration file of this scheduler: PNDMScheduler {
"_class_name": "PNDMScheduler",
"_diffusers_version": "0.29.2",
"beta_end": 0.02,
"beta_schedule": "linear",
"beta_start": 0.0001,
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 0,
"timestep_spacing": "leading",
"trained_betas": null
}
is outdated. steps_offset should be set to 1 instead of 0. Please make sure to update the config accordingly as leaving steps_offset might led to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the scheduler/scheduler_config.json file
deprecate("steps_offset!=1", "1.0.0", deprecation_message, standard_warn=False)

tests/test_diffusers.py::StableDiffusionInpaintPipelineFastTests::test_karras_schedulers_shape
/usr/local/lib/python3.10/dist-packages/torchsde/_brownian/brownian_interval.py:608: UserWarning: Should have tb<=t1 but got tb=25.1461181640625 and t1=25.146116.
warnings.warn(f"Should have {tb_name}<=t1 but got {tb_name}={tb} and t1={self._end}.")

tests/test_diffusers.py::StableDiffusionInpaintPipelineFastTests::test_karras_schedulers_shape
/usr/local/lib/python3.10/dist-packages/torchsde/_brownian/brownian_interval.py:599: UserWarning: Should have ta>=t0 but got ta=0.0291675366461277 and t0=0.029168.
warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")

tests/test_diffusers.py::StableDiffusionXLInpaintPipelineFastTests::test_stable_diffusion_xl_inpaint_2_images
/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:582: FutureWarning: Passing image as torch tensor with value range in [-1,1] is deprecated. The expected value range for image tensor is [0,1] when passing as pytorch tensor or numpy Array. You passed image with value range [-0.9999732375144958,0.999970555305481]
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================== 112 passed, 47 skipped, 277 warnings in 946.38s (0:15:46) ===================================================================================

Result Summary of slow_tests_diffusers.sh

tests/test_diffusers.py:872: AssertionError
======================================================================================================= warnings summary =======================================================================================================
../../usr/lib/python3.10/inspect.py:288
/usr/lib/python3.10/inspect.py:288: FutureWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
return isinstance(object, types.FunctionType)

../../usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:20
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:20: FutureWarning: VQEncoderOutput is deprecated and will be removed in version 0.31. Importing VQEncoderOutput from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQEncoderOutput, instead.
deprecate("VQEncoderOutput", "0.31", deprecation_message)

../../usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:25
/usr/local/lib/python3.10/dist-packages/diffusers/models/vq_model.py:25: FutureWarning: VQModel is deprecated and will be removed in version 0.31. Importing VQModel from diffusers.models.vq_model is deprecated and this will be removed in a future version. Please use from diffusers.models.autoencoders.vq_model import VQModel, instead.
deprecate("VQModel", "0.31", deprecation_message)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=================================================================================================== short test summary info ====================================================================================================
FAILED tests/test_diffusers.py::GaudiStableDiffusionPipelineTester::test_textual_inversion - AssertionError: 10.952491965134426 not greater than or equal to 116.6072956525933
================================================================================== 1 failed, 158 deselected, 3 warnings in 157.57s (0:02:37) ===================================================================================
make: *** [Makefile:99: slow_tests_diffusers] Error 1
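
The failure message above has the form produced by `unittest.TestCase.assertGreaterEqual` when a measured value falls below a required minimum. The sketch below reproduces that message shape with the two numbers from the log. Which quantity the real test measures, and how its threshold is derived, are not shown in this thread, so `check_threshold` and its argument names are hypothetical.

```python
import unittest


def check_threshold(measured: float, threshold: float) -> str:
    """Return "ok" if measured >= threshold, else the assertion message."""
    case = unittest.TestCase()
    try:
        case.assertGreaterEqual(measured, threshold)
    except AssertionError as err:
        return str(err)
    return "ok"


# Values copied from the failing run in the log above.
message = check_threshold(10.952491965134426, 116.6072956525933)
print(message)
```

The first number in the message is the measured value and the second is the minimum the test required.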

deepak-gowda-narayana added 3 commits September 17, 2024 12:05

Update pipeline_stable_diffusion_3.py

2f9f53e

Update pipeline_stable_diffusion_3.py

53aae6b

Update pipeline_stable_diffusion_3.py

b343f35

deepak-gowda-narayana requested a review from regisss as a code owner September 19, 2024 18:33

deepak-gowda-narayana added 4 commits September 24, 2024 08:50

Update pipeline_stable_diffusion_3.py

0899f07

Considering number of samples as 1 ( whole batch ) to have perf measure consistent with industry standard for easier comparison

Update pipeline_stable_diffusion_3.py

8c0a1b6

Update pipeline_stable_diffusion_3.py

cae57f7

Merge branch 'huggingface:main' into main

3b24e90

dsocek suggested changes Oct 1, 2024

skavulya reviewed Oct 2, 2024

deepak-gowda-narayana added 6 commits October 2, 2024 12:48

Update pipeline_stable_diffusion_xl.py

18690d9

Update README.md

5220aab

Create measure_config.json

079c759

Create quant_config.json

32564c4

Update pipeline_stable_diffusion_xl.py

595641c

Update pipeline_stable_diffusion_3.py

3b1dc48

deepak-gowda-narayana changed the title ~~Implementation of Batching and Enabling HPU Graphs Integration for SD3 Pipeline~~ Implementation of Batching, Enabling HPU Graphs and FP8 quantization Integration for SD3 Pipeline Oct 2, 2024

deepak-gowda-narayana changed the title ~~Implementation of Batching, Enabling HPU Graphs and FP8 quantization Integration for SD3 Pipeline~~ Implementation of Batching, Enabling HPU Graphs and FP8 quantization for SD3 Pipeline Oct 2, 2024

Merge branch 'huggingface:main' into main

a372536

dsocek reviewed Oct 4, 2024

examples/stable-diffusion/quantization/measure_config.json (outdated)

examples/stable-diffusion/quantization/quant_config.json (outdated)

deepak-gowda-narayana and others added 9 commits October 7, 2024 09:43

Update README.md

cb91354

sync with original branch

1645d81

Update README.md

3b00c52

Create measure_config.json

5dd3246

Create quantize_config.json

20b8352

Delete examples/stable-diffusion/quantization/measure_config.json

cfe29ec

Delete examples/stable-diffusion/quantization/quant_config.json

2ee0849

Add quant mode arguments for Stable Diffusion

86f66d0

Merge branch 'huggingface:main' into main

999b527

Add tests for SD3 - batch size

c5c8ceb

deepak-gowda-narayana requested review from dsocek and skavulya October 7, 2024 20:23

dsocek approved these changes Oct 7, 2024

examples/stable-diffusion/quantization/SD3/measure_config.json (outdated)

deepak-gowda-narayana added 3 commits October 7, 2024 14:50

Rename measure_config.json to measure_config.json

f485278

Create quantize_config.json

8154614

Delete examples/stable-diffusion/quantization/SD3/quantize_config.json

64e09e0

dsocek reviewed Oct 7, 2024

deepak-gowda-narayana added 5 commits October 7, 2024 16:09

Update measure_config.json

865d950

Update quantize_config.json

1989c7f

Update text_to_image_generation.py

aaec9f6

Update README.md

d55f8d4

update correct file name

3f41f45

Merge branch 'huggingface:main' into main

0983284

Run make style

335d4f2

dsocek reviewed Oct 8, 2024

examples/stable-diffusion/README.md

Update README.md

eabf4d9

Merge branch 'huggingface:main' into main

63882fb

deepak-gowda-narayana and others added 2 commits October 21, 2024 20:06

Update sample size , add quant_mode argument to pipeline input

0611cb0

make style to format code style

65b5dba
