[BUG: RuntimeError: Boolean value of Tensor with more than one value is ambiguous] #225

Open
siwer opened this issue Sep 26, 2024 · 0 comments
Labels
bug Something isn't working

siwer commented Sep 26, 2024

Python -VV

File "/opt/anaconda/envs/transformers/lib/python3.11/site-packages/mistral_inference/transformer.py", line 162, in forward_partial
    if self.vision_encoder is not None and images:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

Pip Freeze

aiohappyeyeballs @ file:///croot/aiohappyeyeballs_1725434011349/work
aiohttp @ file:///croot/aiohttp_1725527756643/work
aiosignal @ file:///tmp/build/80754af9/aiosignal_1637843061372/work
annotated-types==0.7.0
attrs @ file:///croot/attrs_1695717823297/work
Bottleneck @ file:///croot/bottleneck_1707864210935/work
Brotli @ file:///work/ci_py311/brotli-split_1676830125088/work
certifi @ file:///croot/certifi_1725551672989/work/certifi
cffi @ file:///croot/cffi_1700254295673/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
cryptography @ file:///croot/cryptography_1702070282333/work
datasets @ file:///croot/datasets_1716911606380/work
dill @ file:///croot/dill_1715094664823/work
docstring_parser==0.16
filelock @ file:///croot/filelock_1700591183607/work
fire==0.6.0
frozenlist @ file:///croot/frozenlist_1698702560391/work
fsspec @ file:///croot/fsspec_1714461537038/work
gmpy2 @ file:///work/ci_py311/gmpy2_1676839849213/work
huggingface_hub @ file:///croot/huggingface_hub_1724853938404/work
idna @ file:///work/ci_py311/idna_1676822698822/work
Jinja2 @ file:///work/ci_py311/jinja2_1676823587943/work
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
MarkupSafe @ file:///croot/markupsafe_1704205993651/work
mistral_common==1.4.3
mistral_inference==1.4.0
mkl-fft @ file:///croot/mkl_fft_1695058164594/work
mkl-random @ file:///croot/mkl_random_1695059800811/work
mkl-service==2.4.0
mpmath @ file:///croot/mpmath_1690848262763/work
multidict @ file:///croot/multidict_1701096859099/work
multiprocess @ file:///croot/multiprocess_1692294385131/work
networkx @ file:///croot/networkx_1690561992265/work
numexpr @ file:///croot/numexpr_1696515281613/work
numpy @ file:///croot/numpy_and_numpy_base_1704311704800/work/dist/numpy-1.26.3-cp311-cp311-linux_x86_64.whl#sha256=10a078151ecec16bafb535f7487635217625fa06536dec8509e514648c78d626
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.68
nvidia-nvtx-cu12==12.1.105
opencv-python-headless==4.10.0.84
packaging @ file:///croot/packaging_1720101850331/work
pandas @ file:///croot/pandas_1718308974269/work/dist/pandas-2.2.2-cp311-cp311-linux_x86_64.whl#sha256=3c7ce50f9f519c785bd4cdb28a0ca71f85a541f3d27b25aa9da770f953e7f2e9
pillow==10.4.0
pyarrow @ file:///croot/pyarrow_1721664224170/work/python
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.9.2
pydantic_core==2.23.4
pyOpenSSL @ file:///croot/pyopenssl_1690223430423/work
PySocks @ file:///work/ci_py311/pysocks_1676822712504/work
python-dateutil @ file:///croot/python-dateutil_1716495738603/work
pytz @ file:///croot/pytz_1713974312559/work
PyYAML @ file:///croot/pyyaml_1698096049011/work
referencing==0.35.1
regex @ file:///croot/regex_1723064389032/work
requests @ file:///croot/requests_1690400202158/work
rpds-py==0.20.0
safetensors @ file:///croot/safetensors_1724853960118/work
sentencepiece==0.2.0
simple-parsing==0.1.6
six @ file:///tmp/build/80754af9/six_1644875935023/work
sympy @ file:///croot/sympy_1701397643339/work
termcolor==2.4.0
tiktoken==0.7.0
tokenizers @ file:///croot/tokenizers_1721139552427/work
torch==2.4.1
torchaudio==2.1.2
torchvision==0.16.2
tqdm @ file:///croot/tqdm_1724853939799/work
transformers @ file:///home/conda/feedstock_root/build_artifacts/transformers_1724403320167/work
triton==3.0.0
typing_extensions @ file:///croot/typing_extensions_1715268824938/work
tzdata @ file:///croot/python-tzdata_1690578112552/work
urllib3 @ file:///croot/urllib3_1698257533958/work
xformers==0.0.28.post1
xxhash @ file:///work/ci_py311/python-xxhash_1676842384694/work
yarl @ file:///croot/yarl_1725976495189/work

Reproduction Steps

Calling forward_partial() on Pixtral with an image input raises the error shown above. The script I used:

import torch
from pathlib import Path
from mistral_inference.transformer import Transformer
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage, TextChunk, ImageURLChunk
from mistral_common.protocol.instruct.request import ChatCompletionRequest

mistral_models_path = Path.home().joinpath('mistral_models', 'Pixtral')
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tekken.json")
model = Transformer.from_folder(mistral_models_path,device="cuda:0")

# Run the model 
url = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
prompt = "Describe the image."

completion_request = ChatCompletionRequest(messages=[UserMessage(content=[ImageURLChunk(image_url=url), TextChunk(text=prompt)])])

encoded = tokenizer.encode_chat_completion(completion_request)

images = encoded.images
tokens = encoded.tokens

tokens = torch.tensor(tokens).to(model.device)
images = torch.cuda.BFloat16Tensor(images).to(model.device)

with torch.no_grad():
    res = model.forward_partial(input_ids=tokens,seqlens=[len(tokens)],images=images)

Expected Behavior

I expected model.forward_partial() to return the vector representations of the input tokens.

Additional Context

No response

Suggested Solutions

Change line 162 in mistral-inference/blob/main/src/mistral_inference/transformer.py

current:
if self.vision_encoder is not None and images:

proposed solution:
if self.vision_encoder is not None and images is not None:

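With this change the script runs correctly. The current condition evaluates images for truthiness, which PyTorch refuses to do for a tensor holding more than one element; comparing against None only checks whether images were passed at all.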
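To illustrate the difference between the two conditions without loading the model, here is a minimal sketch. FakeTensor is a hypothetical stand-in that copies only the truthiness rule of torch.Tensor for the multi-element case, and the two helper functions are illustrative names, not part of mistral_inference.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; models only its truthiness rule."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # torch.Tensor raises when asked to reduce several elements to one bool.
        if len(self.values) > 1:
            raise RuntimeError(
                "Boolean value of Tensor with more than one value is ambiguous"
            )
        return bool(self.values[0]) if self.values else False


def should_encode_current(vision_encoder, images):
    # Condition quoted in the issue: `images` is evaluated for truthiness.
    return bool(vision_encoder is not None and images)


def should_encode_proposed(vision_encoder, images):
    # Proposed condition: only checks whether `images` was passed.
    return vision_encoder is not None and images is not None


if __name__ == "__main__":
    encoder = object()  # placeholder for a loaded vision encoder
    images = FakeTensor([0.1, 0.2, 0.3])

    try:
        should_encode_current(encoder, images)
    except RuntimeError as err:
        print("current condition:", err)

    print("proposed condition:", should_encode_proposed(encoder, images))
```

One difference to keep in mind: an empty list is falsy under the current condition but passes the is not None check, so the two conditions are not equivalent for callers that pass an empty list of images.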

siwer added the bug label Sep 26, 2024
siwer added a commit to siwer/mistral-inference that referenced this issue Sep 27, 2024
Proposed fix for issue mistralai#225