Releases: marqo-ai/marqo

Release 1.5.1

15 Dec 03:10
a7200df

1.5.1

Bug fixes and minor changes

  • Adding no_model to MARQO_MODELS_TO_PRELOAD no longer causes an error on startup. The preloading process is simply skipped for this model (#657).

Release 1.5.0

04 Dec 07:23
702b668

1.5.0

New Features

  • Separate model for search and add documents (#633). Using the search_model and search_model_properties keys in index_defaults allows you to specify a model to be used specifically for searching. This is useful when search should use a different model from the one used for add_documents. Learn how to use search_model here.
  • Prefixes for text chunks and queries enabled to improve retrieval for specific models (#643). These prefixes are defined at the model_properties level, but can be overridden at index creation, add documents, or search time. Learn how to use prefixes for add_documents here and search here.
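
As a rough sketch of the search_model feature, index settings with a separate search model might be assembled as below. The index_defaults, search_model, and search_model_properties keys come from the note above; the model key, the helper name, and both model names are assumptions made for illustration and should be checked against the Marqo documentation.

```python
from typing import Optional


def build_index_settings(
    model: str,
    search_model: str,
    search_model_properties: Optional[dict] = None,
) -> dict:
    """Build a settings dict with one model for indexing and another for search.

    Hypothetical helper: only the key names search_model,
    search_model_properties and index_defaults are taken from the release note.
    """
    index_defaults = {"model": model, "search_model": search_model}
    if search_model_properties is not None:
        index_defaults["search_model_properties"] = search_model_properties
    return {"index_defaults": index_defaults}


# Placeholder model names, not entries from the Marqo model registry.
settings = build_index_settings("hypothetical-doc-model", "hypothetical-query-model")
```

The resulting dict would then be passed as the settings when creating the index.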

Bug fixes and minor changes

Contributor shout-outs

  • A huge thank you to all our 3.7k stargazers!
  • Thanks everyone for continuing to participate in our forum! Keep all your insights, questions, and feedback coming!

Release 1.4.0

28 Oct 00:00
e15470b

1.4.0

Breaking Changes

  • Configurable document count limit for add_documents() calls (#592). This reduces the risk of Marqo being overloaded
    by add_documents requests with a very high number of documents. If you are adding documents in batches larger than the default (64), you will now
    receive an error. You can ensure your add_documents request complies with this limit by setting the Python client’s client_batch_size or changing this
    limit via the MARQO_MAX_ADD_DOCS_COUNT variable. Read more on configuring the doc count limit here.
  • Default refresh value for add_documents() and delete_documents() set to false (#601). This prevents
    unnecessary refreshes, which can negatively impact search and add_documents performance, especially for applications that are
    constantly adding or deleting documents. If you search or get documents immediately after adding or deleting documents, you may still get some extra
    or missing documents. To see results of these operations more immediately, simply set the refresh parameter to true. Read more on this parameter
    here.
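
For callers that build requests themselves rather than relying on client_batch_size, the document count limit can be respected by splitting the documents client-side. The following is a minimal sketch in plain Python; the helper name and the sample documents are illustrative, and 64 is the default limit quoted above.

```python
from typing import Iterator, List

DEFAULT_MAX_DOCS = 64  # default add_documents limit stated in the release note


def batched(docs: List[dict], batch_size: int = DEFAULT_MAX_DOCS) -> Iterator[List[dict]]:
    """Yield consecutive slices of docs, each holding at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


# Illustrative documents; each batch would be sent in its own add_documents call.
docs = [{"_id": str(i), "text": f"doc {i}"} for i in range(150)]
batch_sizes = [len(batch) for batch in batched(docs)]
```

With 150 documents and the default limit, this yields three batches of 64, 64, and 22 documents.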

New Features

  • Custom vector field type added (#610). You can now add externally generated vectors to Marqo documents! See
    usage here.
  • no_model option added for index creation (#617). This allows for indexes that do no vectorisation, providing
    easy use of custom vectors with no risk of accidentally mixing them up with Marqo-generated vectors. See usage here.
  • The search endpoint's q parameter is now optional if context vectors are provided (#617). This is
    particularly useful when using context vectors to search across your documents that have custom vector fields. See usage here.
  • Configurable retries added to backend requests (#623). This makes add_documents() and search() requests
    more resilient to transient network errors. Use with caution, as retries in Marqo will change the consistency guarantees for these endpoints. For more
    control over retry error handling, you can leave retry attempts at the default value (0) and implement your own backend communication error handling.
    See retry configuration instructions and how it impacts these endpoints' behaviour here.
  • More informative delete_documents() response (#619). The response object now includes a list of document
    ids, status codes, and results (success or reason for failure). See delete documents usage here.
  • Friendlier startup experience (#600). Startup output has been condensed, with unhelpful log messages removed.
    More detailed logs can be accessed by setting MARQO_LOG_LEVEL to debug.
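
For the option of leaving Marqo's retry attempts at 0 and handling backend communication errors yourself, a client-side wrapper could look roughly like this. It is a generic sketch, not Marqo's retry mechanism: the function name, the exception types treated as transient, and the backoff values are all assumptions.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exception types with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # wait 0.1s, 0.2s, 0.4s, ...
    raise AssertionError("unreachable")
```

A caller would wrap a single request, for example call_with_retries(lambda: send_request(payload)), where send_request is whatever function performs the call. Retrying a write that may have partially succeeded can apply it twice, which is the kind of consistency trade-off the note above warns about.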

Bug fixes and minor changes

  • Updated README: added Haystack integration, tips, and fixed links (#593, #602, #616).
  • Stabilized test suite by adding score modifiers search tests (#596) and migrating test images to S3 (#594).
  • bulk added as an illegal index name (#598). This prevents conflicts with the /bulk endpoint.
  • Unnecessary reputation field removed from backend call (#609).
  • Fixed typo in error message (#615).

Contributor shout-outs

  • A huge thank you to all our 3.7k stargazers!
  • Shoutout to @TuanaCelik for helping out with the Haystack integration!
  • Thanks everyone for keeping our forum busy. Don't hesitate to keep posting your insights, questions, and feedback!

Release 1.3.0

04 Sep 05:33
d5a4008

1.3.0

New features

  • New E5 models added to model registry (#568). E5 V2 and Multilingual E5 models are now available for use. The new E5 V2 models outperform their E5 counterparts in the BEIR benchmark, as seen here. See all available models here.
  • Dockerfile optimisation (#569). A pre-built Marqo base image results in reduced image layers and increased build speed, meaning neater docker pulls and an overall better development experience.

Bug fixes and minor changes

  • Major README overhaul (#573). The README has been revamped with up-to-date examples and easier to follow instructions.
  • New security policy (#574).
  • Improved testing pipeline (#582 & #586). Tests now trigger on pull request updates. This results in safer and easier merges to mainline.
  • Updated requirements files. requirements.dev.txt should now be used to install requirements for development environments (#569). Version pins for protobuf & onnx have been removed, while a version pin for anyio has been added (#581 & #589).
  • General readability improvements (#577, #578, #587, & #580)

Contributor shout-outs

  • A huge thank you to all our 3.5k stargazers!
  • Shoutout to @vladdoster for all the useful spelling and grammar edits!
  • Thanks everyone for keeping our forum bustling. Don't hesitate to keep posting your insights, questions, and feedback!

Release 1.2.0

14 Aug 07:53
1bf9977

1.2.0

New features

  • Storage status in health check endpoint (#555 & #559). The GET /indexes/{index-name}/health endpoint's backend object will now return the boolean storage_is_available, to indicate if there is remaining storage space. If space is not available, health status will now return yellow. See here for detailed usage.

  • Score Modifiers search optimization (#566). This optimization reduces latency for searches with the score_modifiers parameter when field names or weights are changed. See here for detailed usage.
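
A client might read the new storage flag from the health response along these lines. Only the backend object and its storage_is_available boolean are named above; the top-level status key and the sample payload are assumptions about the response layout.

```python
def storage_available(health: dict) -> bool:
    """Return True only if the backend explicitly reports available storage."""
    backend = health.get("backend") or {}
    return backend.get("storage_is_available") is True


# Hypothetical payload shaped after the description in the release note.
sample = {
    "status": "yellow",
    "backend": {"status": "green", "storage_is_available": False},
}
needs_attention = not storage_available(sample)
```

Treating a missing flag as unavailable is a conservative choice for this sketch; older Marqo versions would not return the field at all.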

Bug fixes and minor changes

  • Improved error message for full storage (#555 & #559). When storage is full, Marqo will return 400 Bad Request instead of 429 Too Many Requests.
  • Searching with a zero vector now returns an empty list instead of an internal error (#562).

Contributor shout-outs

  • A huge thank you to all our 3.3k stargazers!
  • Thank you for all the continued discussion in our forum. Keep all the insights, questions, and feedback coming!

Release 1.1.0

31 Jul 12:06
f477273

1.1.0

New features

  • New field numberOfVectors in the get_stats response object (#553).
    This field counts all vectors from all documents in a given index. See here for detailed usage.

  • New per-index health check endpoint GET /indexes/{index-name}/health (#552).
    This replaces the cluster-level health check endpoint, GET /health,
    which is deprecated and will be removed in Marqo 2.0.0. See here for detailed usage.

Bug fixes and minor changes

  • Improved image download validation and resource management (#551). Image downloading in Marqo is more stable and resource-efficient now.

  • Adding documents now returns an error when tensorFields is not specified explicitly (#554). This prevents users accidentally creating unwanted tensor fields.

Contributor shout-outs

  • Thank you for the vibrant discussion in our forum. We love hearing your questions and about your use cases.

Release 1.0.0

24 Jul 09:54
ca136d2

1.0.0

Breaking Changes

  • New parameter tensor_fields will replace non_tensor_fields in the add_documents endpoint (#538). Only fields in tensor_fields will have embeddings generated, offering more granular control over which fields are vectorised. See here for the full list of add_documents parameters and their usage. The non_tensor_fields parameter is deprecated and will be removed in a future release. Calls to add_documents with neither of these parameters specified will now fail.

  • Multiple tensor field optimisation (#530). This optimisation results in faster and more stable searches across multiple tensor fields. Please note that indexed documents will now have a different internal document structure, so documents indexed with previous Marqo versions cannot be searched with this version, and vice versa.

  • The add_documents endpoint's request body is now an object, with the list of documents under the documents key (#535). The query parameters use_existing_tensors, image_download_headers, model_auth, and mappings have been moved to the body as optional keys, and support for these parameters in the query string is deprecated. This change results in shorter URLs and better readability, as values for these parameters no longer need to be URL-encoded. See here for the new add_documents API usage. Backwards compatibility is supported at the moment but will be removed in a future release.

  • Better validation for index creation with custom models (#530). When creating an index with a model not in the registry, Marqo will check if model_properties is specified with a proper dimension, and raise an error if not. See here for a guide on using custom models. This validation is now done at index creation time, rather than at add documents or search time.

  • Stricter filter_string syntax for search (#530). The filter_string parameter must have special Lucene characters escaped with a backslash (\) to filter as expected. This will affect filtering on field names or content that contains special characters. See here for more information on special characters and see here for a guide on using Marqo filter strings.

  • Removed server-side batching (batch_size parameter) for the add_documents endpoint (#527). Instead, client-side batching is encouraged (use client_batch_size instead of server_batch_size in the Python client).
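
To illustrate the stricter filter_string syntax, the sketch below backslash-escapes the standard Lucene query-syntax special characters in a field name or value before it is placed in a filter. The helper name and the example field are hypothetical, and the exact set of characters Marqo requires escaped (including how whitespace is handled) should be confirmed in its filtering guide.

```python
# Characters with special meaning in Lucene query syntax:
# + - && || ! ( ) { } [ ] ^ " ~ * ? : \ /
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term: str) -> str:
    """Prefix every Lucene special character in term with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


# Escape the field name and the value separately, so the colon that
# joins them keeps its meaning as the field/value separator.
filter_string = f"{escape_term('price(usd)')}:{escape_term('10-20')}"
```

Here filter_string holds price\(usd\):10\-20, with the parentheses and hyphen escaped and the separating colon left intact.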

New Features

  • Multi-field pagination (#530). The offset parameter in search can now be used to paginate through results spanning multiple searchable_attributes. This works for both TENSOR and LEXICAL search. See here for a guide on pagination.
  • Optimised default index configuration (#540).
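
Pagination with offset can be sketched as a small helper that turns a zero-based page number into search parameters. The offset parameter is named above; pairing it with a limit parameter, and the helper itself, are assumptions for illustration.

```python
def page_params(page: int, page_size: int) -> dict:
    """Translate a zero-based page number into limit/offset search parameters."""
    if page < 0:
        raise ValueError("page must be non-negative")
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return {"limit": page_size, "offset": page * page_size}


# Parameters for the third page (index 2) at ten results per page.
third_page = page_params(page=2, page_size=10)
```

The returned dict would be merged into the arguments of a search call.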

Bug Fixes & Minor Changes

  • Removed or updated all references to outdated features in the examples and the README (#529).
  • Enhanced bulk search test stability (#544).

Contributor shout-outs

  • Thank you to our 3.2k stargazers!
  • We've finally come to our first major release, Marqo 1.0.0! Thanks to all our users and contributors, new and old, for your feedback and support to help us reach this huge milestone. We're excited to continue building Marqo with you. Happy searching!

Release 0.1.0

04 Jul 05:27
a9c63f2

0.1.0

New features

  • Telemetry. Marqo now includes various timing metrics for the search, bulk_search and add_documents endpoints when the query parameter telemetry=True is specified (#506). The metrics will be returned in the response body and provide a breakdown of latencies for various stages of the API call.
  • Consolidate default device to CUDA when available (#508). By default, Marqo now uses CUDA devices for search and indexing if available. See here for more information. This helps ensure you get the best indexing and search experience without having to explicitly add the device parameter to search and add_documents calls.
  • Model download integrity verification (#502). Model files are validated and removed if corrupted during download. This helps ensure that models are not loaded if they are corrupted.

Breaking changes

  • Remove deprecated add_or_update_documents endpoint (#517).
  • Disable automatic index creation. Marqo will no longer automatically create an index if it does not exist (#516). Attempting to add documents to a non-existent index will now result in an error. This helps provide more certainty about the properties of the index you are adding documents to, and also helps prevent accidental indexing to the wrong index.
  • Remove parallel indexing (#523). Marqo no longer supports server-side parallel indexing. This helps deliver a more stable and efficient indexing experience. Parallelisation can still be implemented by the user.
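
Since parallelisation is now left to the user, one possible client-side approach is to send pre-split batches from a thread pool. This is a sketch using only the Python standard library; send_batch stands in for whatever function performs the actual add_documents call and is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

R = TypeVar("R")


def index_in_parallel(
    batches: List[List[dict]],
    send_batch: Callable[[List[dict]], R],
    max_workers: int = 4,
) -> List[R]:
    """Send each batch on a worker thread and return results in batch order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises the first worker exception.
        return list(pool.map(send_batch, batches))


# Stand-in sender that just counts documents instead of calling Marqo.
results = index_in_parallel([[{"_id": "1"}, {"_id": "2"}], [{"_id": "3"}]], len)
```

Threads suit this case because each call spends most of its time waiting on the network; max_workers should be kept modest to avoid overloading the server.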

Bug fixes and minor changes

  • Improve error messages (#494, #499).
  • Improve API request validation (#495).
  • Add new multimodal search example (#503).
  • Remove autocast for CPU to speed up vectorisation on ARM64 machines (#491).
  • Enhance test stability (#514).
  • Ignore .kibana index (#512).
  • Improve handling of whitespace when indexing documents (#521).
  • Update CUDA version to 11.4.3 (#525).

Contributor shout-outs

  • Thank you to our 3.1k stargazers!

Release 0.0.21

31 May 09:02
ba2e99a

0.0.21

New features

  • Load custom SBERT models from cloud storage with authentication (#474). Marqo now supports fetching your fine-tuned public and private SBERT models from Hugging Face and AWS S3. Learn more about using your own SBERT model here. For instructions on loading a private model using authentication, check model auth during search (https://docs.marqo.ai/0.0.19/API-Reference/search/#model-auth) and model auth during add_documents.

  • Bulk search score modifier and context vector support (#469). Support has been added for score modifiers and context vectors to our bulk search API. This can help enhance throughput and performance for certain workloads. Please see documentation for usage.

Bug fixes and minor changes

Contributor shout-outs

  • A special thank you to our 3.0k stargazers!

Release 0.0.20

18 May 10:18
9eb4561

0.0.20

New features

  • Custom model pre-loading (#475). Public CLIP and OpenCLIP models specified by URL can now be loaded on Marqo startup via the MARQO_MODELS_TO_PRELOAD environment variable. These must be formatted as JSON objects with model and model_properties.
    See here (configuring pre-loaded models) for usage.
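
A value for MARQO_MODELS_TO_PRELOAD containing a custom model could be produced as in this sketch. The model and model_properties keys come from the note above; wrapping the object in a JSON list, the keys inside model_properties, the model name, and the URL are assumptions for illustration.

```python
import json

# Hypothetical custom model entry; only the two top-level keys are
# confirmed by the release note.
custom_model = {
    "model": "my-custom-clip",
    "model_properties": {
        "name": "ViT-B/32",
        "dimensions": 512,
        "url": "https://example.com/my-model.pt",
        "type": "clip",
    },
}

# Serialise to the string that would be assigned to the environment variable.
env_value = json.dumps([custom_model])
```

The string in env_value would then be supplied as MARQO_MODELS_TO_PRELOAD when starting the Marqo container.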

Bug fixes and minor changes

  • Fixed arm64 build issue caused by package version conflicts (#478)