Fix embed not using cuda as default device when available 2.11 (#941)
What is the current behavior? (You can also link to an open issue here)
When `device` is unset in the embed request, embed defaults to CPU even when CUDA is available. Other functionality, such as search and add documents, is unaffected and defaults to CUDA when available.

What is the new behavior (if this is a feature change)?
The default device for embed is CUDA when it is available.
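
As a rough illustration of the new behavior, the fallback can be sketched as below. `resolve_device` is a hypothetical helper written for this note, not a Marqo function, and the `"cpu"` fallback when the variable is missing is an assumption of the sketch; the actual change calls `utils.read_env_vars_and_defaults("MARQO_BEST_AVAILABLE_DEVICE")`.

```python
import os
from typing import Mapping, Optional


def resolve_device(device: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: prefer the explicit device, else the best available one."""
    if device is not None:
        # An explicit device in the request always wins.
        return device
    # Assumption: MARQO_BEST_AVAILABLE_DEVICE holds "cuda" or "cpu";
    # the "cpu" fallback here is only for this sketch.
    return env.get("MARQO_BEST_AVAILABLE_DEVICE", "cpu")
```

With this shape, a request that omits `device` picks up whatever the environment variable holds, while a request that names a device is left untouched.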
RaynorChavez authored Aug 20, 2024
1 parent 1f36bfd commit b43bab5
Showing 5 changed files with 57 additions and 5 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -149,4 +149,6 @@ dump.rdb
.DS_Store

# Tester app for unit tests
-scripts/vespa_local/vespa_tester_app.zip
+scripts/vespa_local/vespa_tester_app.zip
+
+src/marqo/tensor_search/cache_dir/*
10 changes: 10 additions & 0 deletions RELEASE.md
@@ -1,3 +1,8 @@
# Release 2.11.2

## Bug fixes and minor changes
- Fix an issue where CUDA was not automatically selected as the default device for the `embed` endpoint, even when available [#941](https://github.com/marqo-ai/marqo/pull/941).

# Release 2.11.1

## Bug fixes and minor changes
@@ -22,6 +27,11 @@
- Huge shoutout to all our 4.4k stargazers! We’ve come a long way as a team and as a community, so a huge thanks to everyone who continues to support Marqo.
- Feel free to keep on sharing questions and feedback on our [forum](https://community.marqo.ai/) and [Slack channel](https://marqo-community.slack.com/join/shared_invite/zt-2b4nsvbd2-TDf8agPszzWH5hYKBMIgDA#/shared-invite/email)! If you have any more inquiries or thoughts, please don’t hesitate to reach out.

# Release 2.10.2

## Bug fixes and minor changes
- Fix an issue where CUDA was not automatically selected as the default device for the `embed` endpoint, even when available [#941](https://github.com/marqo-ai/marqo/pull/941).

# Release 2.10.1

## Bug fixes and minor changes
6 changes: 3 additions & 3 deletions src/marqo/core/embed/embed.py
@@ -13,6 +13,7 @@
from marqo.tensor_search.tensor_search_logging import get_logger
from marqo.core.utils.prefix import determine_text_prefix, DeterminePrefixContentType
from marqo.vespa.vespa_client import VespaClient
from marqo.tensor_search import utils

logger = get_logger(__name__)

@@ -61,11 +62,10 @@ def embed_content(
temp_config = config.Config(
vespa_client=self.vespa_client,
)

# Set default device if not provided
if device is None:
-device = self.default_device
+device = utils.read_env_vars_and_defaults("MARQO_BEST_AVAILABLE_DEVICE")

# Content validation is done in API model layer
t0 = timer()
2 changes: 1 addition & 1 deletion src/marqo/version.py
@@ -1,4 +1,4 @@
-__version__ = "2.11.1"
+__version__ = "2.11.2"

def get_version() -> str:
return f"{__version__}"
40 changes: 40 additions & 0 deletions tests/tensor_search/integ_tests/test_embed.py
@@ -20,6 +20,7 @@
from marqo.vespa.models.query_result import Root, Child, RootFields
from marqo.tensor_search.models.private_models import S3Auth, ModelAuth, HfAuth
from marqo.api.models.embed_request import EmbedRequest
from marqo.tensor_search import utils
import os
import pprint
import unittest
@@ -150,6 +151,45 @@ def tearDown(self) -> None:
super().tearDown()
self.device_patcher.stop()

def test_embed_content_cuda_device_as_default(self):
"""
Test that embed_content uses the default device when no device is specified.
"""
for index in [self.unstructured_default_text_index, self.structured_default_text_index]:
with self.subTest(index=index.type):
expected_devices = ["cuda", "cpu"]
for expected_device in expected_devices:
with patch.dict(os.environ, {"MARQO_BEST_AVAILABLE_DEVICE": expected_device}):
with patch('marqo.tensor_search.tensor_search.run_vectorise_pipeline') as mock_vectorise:
mock_vectorise.return_value = {0: [0.1, 0.2, 0.3]}

embed_res = embed(
marqo_config=self.config,
index_name=index.name,
embedding_request=EmbedRequest(
content=["This is a test document"]
),
device=None
)

# Check that run_vectorise_pipeline was called
mock_vectorise.assert_called_once()

# Get the arguments passed to run_vectorise_pipeline
args, kwargs = mock_vectorise.call_args

# Print the args and kwargs for debugging
print(f"args passed to run_vectorise_pipeline: {args}")
print(f"kwargs passed to run_vectorise_pipeline: {kwargs}")

# Check that the device passed to run_vectorise_pipeline matches the expected value
self.assertEqual(args[2], expected_device)

# Check the result
self.assertEqual(embed_res["content"], ["This is a test document"])
self.assertIsInstance(embed_res["embeddings"][0], list)
self.assertEqual(embed_res["embeddings"][0], [0.1, 0.2, 0.3])

def test_embed_equivalent_to_add_docs(self):
"""
Ensure that the embedding returned by embed endpoint matches the one created by add_docs.
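
The test added in `test_embed.py` patches `MARQO_BEST_AVAILABLE_DEVICE` and then inspects the positional arguments of the mocked `run_vectorise_pipeline`. A self-contained sketch of that pattern follows. `embed_sketch` and `check_default_device` are hypothetical stand-ins written for illustration, not Marqo code, and the position of the device argument simply mirrors `args[2]` in the test.

```python
import os
from unittest.mock import MagicMock, patch


def embed_sketch(content, device, vectorise):
    """Hypothetical stand-in for the embed path; not Marqo's implementation."""
    if device is None:
        # Assumption: fall back to the environment variable, as in the fix.
        device = os.environ.get("MARQO_BEST_AVAILABLE_DEVICE", "cpu")
    # The device goes in as the third positional argument (args[2]).
    return vectorise(None, content, device)


def check_default_device(expected_device):
    """Return the device the mocked pipeline received when none was requested."""
    mock_vectorise = MagicMock(return_value={0: [0.1, 0.2, 0.3]})
    with patch.dict(os.environ, {"MARQO_BEST_AVAILABLE_DEVICE": expected_device}):
        embed_sketch(["This is a test document"], None, mock_vectorise)
    mock_vectorise.assert_called_once()
    args, _kwargs = mock_vectorise.call_args
    return args[2]
```

Because `patch.dict` restores `os.environ` on exit, the same check can be run for `"cuda"` and `"cpu"` in turn without one case leaking into the other.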