New ‘embed’ endpoint (POST /indexes/{index_name}/embed) (#803). Marqo can now perform inference and return the embeddings for a single piece of content or a list of content, where each piece of content can be either a string or a weighted dictionary of strings. See usage here.
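As a rough sketch, the request body for the embed endpoint can be built like this. The field name "content" is an assumption for illustration; check the linked usage docs for the exact schema.

```python
import json

# Build a request body for POST /indexes/{index_name}/embed.
# The "content" field name is assumed here, not confirmed by the notes.
def build_embed_body(content):
    """content: a string, a list of strings, or a weighted dict of strings."""
    return {"content": content}

# A plain string, and a weighted dictionary of strings:
simple = build_embed_body("red running shoes")
weighted = build_embed_body({"red shoes": 1.0, "sneakers": 0.5})

print(json.dumps(simple))
print(json.dumps(weighted))
```

The weighted-dictionary form lets a single embedding blend several phrases, with each weight scaling that phrase's contribution.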
New ‘recommend’ endpoint (POST /indexes/{index_name}/recommend) (#816). Given a list of existing document IDs, Marqo can now recommend similar documents by performing a search on interpolated vectors from the documents. See usage here.
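A minimal sketch of a recommend request body, assuming the endpoint takes the existing document IDs under a "documents" field; the field names here are illustrative, so consult the linked usage docs for the real schema.

```python
import json

# Build a request body for POST /indexes/{index_name}/recommend.
# "documents" is an assumed field name for the list of existing doc IDs
# whose vectors Marqo interpolates before searching.
def build_recommend_body(document_ids):
    return {"documents": document_ids}

body = build_recommend_body(["doc-1", "doc-2"])
print(json.dumps(body))
```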
Add Inference Cache to speed up frequent search and embed requests (#802). Marqo now caches embeddings generated during inference. The cache size and type can be configured with MARQO_INFERENCE_CACHE_SIZE and MARQO_INFERENCE_CACHE_TYPE. See configuration instructions here.
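For example, the cache could be configured via environment variables before starting Marqo. The values below are placeholders, not defaults; the accepted cache types are listed in the configuration instructions.

```shell
# Example values only -- see the Marqo configuration docs for defaults
# and the set of supported cache types.
export MARQO_INFERENCE_CACHE_SIZE=20
export MARQO_INFERENCE_CACHE_TYPE=LRU
```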
Add configurable search timeout (#813). Backend timeout now defaults to 1s, but can be configured with the environment variable VESPA_SEARCH_TIMEOUT_MS. See configuration instructions here.
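To raise the backend timeout above the 1s default, set the variable before starting Marqo; the value is in milliseconds.

```shell
# Raise the Vespa backend search timeout from the 1s default to 2s.
export VESPA_SEARCH_TIMEOUT_MS=2000
```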
More informative get_cuda_info response (#811). New keys utilization and memory_used_percent have been added for easier tracking of CUDA device status. See here for more information.
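An illustrative per-device entry is shown below; the values and the other field names are made up for demonstration, and only the two new keys come from these notes.

```python
# Hypothetical shape of one device entry in a get_cuda_info response.
device_info = {
    "device_name": "Tesla T4",       # illustrative, not from the notes
    "utilization": "4.0%",           # new key (#811)
    "memory_used_percent": "12.3%",  # new key (#811)
}

# The new keys make it straightforward to poll device load:
print(device_info["utilization"], device_info["memory_used_percent"])
```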
Bug fixes and minor changes
Upgraded open_clip_torch, timm, and safetensors for access to new models (#810)
Contributor shout-outs
Shoutout to all our 4.1k stargazers! Thanks for continuing to use our product and helping Marqo grow.
Keep on sharing your questions and feedback on our forum and Slack channel! If you have any more inquiries or thoughts, please don’t hesitate to reach out.