Releases: opensanctions/yente

v4.2.0

02 Dec 11:36

This release updates various dependencies, introducing new fields in the followthemoney schema and bringing in security patches for the web stack dependencies.

We're also implementing the Reconciliation API's Data Extension protocol for the first time, allowing users to enrich OpenRefine tables with new columns using the API.
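As a rough sketch of what a data extension request looks like on the wire: in the Reconciliation API specification, as we understand it, the client sends a JSON object with `ids` and `properties` in a form field named `extend`. The entity IDs and property names below are hypothetical placeholders, and the helper `build_extend_form` is ours, not part of yente.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_extend_form(entity_ids, property_ids):
    """Encode a data extension query as a form body with an `extend` field."""
    query = {
        "ids": list(entity_ids),
        "properties": [{"id": pid} for pid in property_ids],
    }
    return urlencode({"extend": json.dumps(query)})


# Hypothetical entity IDs and property names, for illustration only.
form_body = build_extend_form(["entity-1", "entity-2"], ["birthDate", "nationality"])

# Round-trip the form body to confirm the payload survives encoding.
decoded = json.loads(parse_qs(form_body)["extend"][0])
print(decoded["ids"])
print(decoded["properties"])
```

OpenRefine builds this request itself when a user adds columns from reconciled values, so the sketch is mainly useful for debugging or for calling the service from a script.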

Full Changelog: v4.1.0...v4.2.0

v4.1.0

01 Oct 06:27

This release improves error handling in the dataset indexer, and updates many dependencies.

A small breaking change: the results.xxx.total.value response field in the /match query API now contains the number of matching entities, not the number of candidates that have been scored. The new behavior is the meaningful one; the previous one was more of a bug.
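A minimal sketch of how a client might read this field, assuming the per-query object (the `xxx` part of the path) carries a `total` object with a `value` key next to a `results` list. The sample payload and the helper name `summarize_match` are hypothetical.

```python
def summarize_match(query_result):
    """Return (total matching entities, number of results in this response)."""
    total = query_result["total"]["value"]
    returned = len(query_result["results"])
    return total, returned


# Hypothetical per-query response object, for illustration only.
sample = {
    "total": {"value": 12, "relation": "eq"},
    "results": [{"id": "a"}, {"id": "b"}],
}

total, returned = summarize_match(sample)
print(f"{returned} of {total} matching entities returned")
```

Clients that previously treated the total as a count of scored candidates may want to review any pagination or alerting logic built on it.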

Full Changelog: v4.0.0...v4.1.0

v4.0.0

23 Jul 14:18

This is a major release of yente which changes the data indexing and search backend systems. It adds support for incremental data updates (delta updater) and for OpenSearch as a search index provider. Read more in our announcement blog post:

https://www.opensanctions.org/articles/2024-07-24-yente4/

This release does not change the scoring and matching systems.

v3.8.10

04 Jul 07:18

Full Changelog: v3.8.9...v3.8.10

v3.8.9

09 May 09:27

Full Changelog: v3.8.8...v3.8.9

v3.8.8

17 Apr 08:23

New features

Dependency updates

Full Changelog: v3.8.4...v3.8.8

v3.8.4

27 Feb 16:26

This is a maintenance release which addresses a potential vulnerability in orjson. It does not change any scoring behaviour.

Full Changelog: v3.8.3...v3.8.4

v3.8.3

05 Feb 16:06

This release includes two changes to the match API:

  • Fix a bug where custom datasets that are much smaller than the OpenSanctions data were not scored correctly in search results and were therefore not returned even if they were a good match for the query.
  • Fix the phonetics matcher to cut off results where the raw (Levenshtein) edit distance between the proposed match and the query exceeds a threshold.
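The edit-distance cutoff described above can be illustrated with a small sketch. This is not yente's implementation: the threshold value, the helper names, and the sample names are hypothetical, and the distance function is a plain textbook Levenshtein.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def within_cutoff(query, candidates, max_distance):
    """Keep only candidates whose raw edit distance to the query is small enough."""
    return [c for c in candidates if levenshtein(query, c) <= max_distance]


MAX_DISTANCE = 3  # hypothetical threshold, not the value yente uses
names = ["Jon Smith", "John Smyth", "Joan Schmidt"]
kept = within_cutoff("John Smith", names, MAX_DISTANCE)
print(kept)
```

A cutoff like this acts as a guard after phonetic candidate generation: names that sound alike but are spelled very differently are dropped before scoring.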

Full Changelog: v3.8.2...v3.8.3

v3.8.2

16 Jan 16:42

This release makes functional changes in response to user feedback, in particular the following:

  • Indexer stability: the indexer process has been struggling with interrupted downloads of source data, in part due to the growth of our database (error: "Payload not completed"). We've now switched to a different HTTP client library and added support for HTTP/2 binary streams in an effort to make this process more stable. We've also disabled the option to run multiple indexing jobs at the same time.
  • Phonetic search yields overly broad results: this also results in missed matches due to an abnormally large number of match candidates being generated. We've further limited the way that phonetic search works in an effort to reduce false positives.
  • Default data update checks (YENTE_CRONTAB) are now conducted every two hours.
  • Improved handling of exceptions from the search index.
  • Introduced a new index_stale boolean flag in /catalog for monitoring purposes.
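For monitoring, a health check could read the index_stale flag from the /catalog response. The sketch below assumes the flag sits at the top level of the JSON document; the sample payload and the helper name `needs_attention` are hypothetical.

```python
import json


def needs_attention(catalog: dict) -> bool:
    """Return True when the catalog reports a stale index."""
    # A missing flag is treated as "not stale" so older servers do not alert.
    return bool(catalog.get("index_stale", False))


# Hypothetical /catalog payload, trimmed to the field of interest.
sample_catalog = json.loads('{"datasets": [], "index_stale": true}')

if needs_attention(sample_catalog):
    print("index is stale, check the indexer logs")
```

In practice the dictionary would come from an HTTP GET against the running service, and the print call would be replaced by whatever alerting mechanism is already in place.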

v3.8.0

04 Dec 09:13

This release brings a number of improvements:

  • Updated nomenklatura matching model (logic-v1) which now does SWIFT BIC matching and handles names with different tokenization better ("Jean-Paul Sartre" == "JeanPaul Sartre").
  • logic-v1 is now the default algorithm for the match API.
  • The match API now supports a topics argument that can be used to match only entities with a particular topic tag (e.g. role.pep, sanction).
  • The /catalog endpoint now carries freshness data, giving the index_version for each dataset, and listing an array of all current and outdated datasets in the index.
  • Various dependency upgrades.
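A sketch of a match request restricted by topic is shown below. The notes above confirm the topics argument and the logic-v1 algorithm name. The base URL, the `/match/<dataset>` path, the `algorithm` parameter name, passing topics as a repeated query parameter, and the request body layout are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # hypothetical local deployment


def build_match_request(dataset, name, topics):
    """Build the URL and JSON body for a topic-filtered match query."""
    # doseq=True emits one `topics=` pair per list element.
    params = urlencode({"algorithm": "logic-v1", "topics": topics}, doseq=True)
    url = f"{BASE_URL}/match/{dataset}?{params}"
    body = {
        "queries": {
            "q1": {"schema": "Person", "properties": {"name": [name]}},
        }
    }
    return url, json.dumps(body)


url, body = build_match_request("default", "Jean-Paul Sartre", ["role.pep", "sanction"])
print(url)
print(body)
```

The sketch only constructs the request; sending it with an HTTP client and a JSON content type is left to the caller.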