Performance, caching optimizations #235

naddeoa · 2024-02-17T20:20:13Z

No description provided.

Experimenting with performance improvements.
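The first commit in this PR adds an LRU cache to the transformer embedding. A minimal sketch of that idea follows, assuming the embedding is computed by a pure function of the input text. The names `embed`, `_encode_uncached`, and `CALLS`, and the cache size, are hypothetical and are not taken from this repository. A toy function stands in for the real model.

```python
from functools import lru_cache
from typing import Dict, Tuple

# Counts how often the (hypothetical) model is actually invoked.
CALLS: Dict[str, int] = {"n": 0}


def _encode_uncached(text: str) -> Tuple[float, ...]:
    """Stand-in for an expensive transformer forward pass (hypothetical)."""
    CALLS["n"] += 1
    # Toy "embedding"; a real encoder would return a dense vector.
    return (float(len(text)), float(sum(map(ord, text)) % 97))


@lru_cache(maxsize=1024)  # cache size is an assumption; tune it per workload
def embed(text: str) -> Tuple[float, ...]:
    # A tuple is returned so the cached value is immutable and safe to share.
    return _encode_uncached(text)


embed("hello world")
embed("hello world")  # served from the cache, the model is not called again
print(embed.cache_info())
```

Caching on the raw string only helps when identical inputs recur, so the benefit depends on the workload.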

Presidio was doing some init work outside of the init method without caching. Also, workflows now return metadata about performance times.
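As a rough illustration of both points, the sketch below builds an expensive object once behind a cached factory and returns per-step timings as a separate dict alongside the primary results. Everything here is an assumption made for illustration: `get_analyzer`, `run_workflow`, and the `perf_info` key names are hypothetical, and the real Presidio setup is replaced by a short sleep.

```python
import time
from functools import lru_cache
from typing import Any, Callable, Dict, List, Tuple


@lru_cache(maxsize=1)
def get_analyzer() -> Dict[str, Any]:
    """Hypothetical stand-in for an expensive engine setup; built once, then reused."""
    time.sleep(0.01)  # simulate slow initialization
    return {"ready": True}


Step = Tuple[str, Callable[[str], Any]]


def run_workflow(text: str, steps: List[Step]) -> Tuple[Dict[str, Any], Dict[str, float]]:
    """Run each named step and return (results, perf_info)."""
    results: Dict[str, Any] = {}
    perf_info: Dict[str, float] = {}  # seconds per step, kept apart from the results
    for name, step in steps:
        start = time.perf_counter()
        results[name] = step(text)
        perf_info[name] = time.perf_counter() - start
    return results, perf_info


steps = [("length", len), ("analyzer_ready", lambda _t: get_analyzer()["ready"])]
results, perf_info = run_workflow("some text", steps)
print(results, perf_info)
```

Returning the timings as a named dict next to the results keeps it obvious that they are metadata and not the primary output.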

We need to be able to not download torch so we can control it from consuming applications.
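One way a library can tolerate torch being absent is to import it lazily, only in the code paths that need it. This is a sketch under that assumption and not necessarily what this PR does; the helper name `require_optional` and the extra name passed to it are hypothetical.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or explain how to obtain it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"'{module_name}' is not installed. Install it in the consuming "
            f"application, or use the '{extra}' extra (hypothetical name)."
        ) from e


def embed_with_torch(values):
    # torch is resolved only when this function runs, never at import time.
    torch = require_optional("torch", "torch")
    return torch.tensor(values)
```

With this shape, a consuming application can install whichever torch build it wants, and users who never call `embed_with_torch` do not need torch at all.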

Making everything explicit since it isn't clear that this object isn't the primary output of the api.

Apparently the new version of spaCy no longer pulls down packages automatically via .load(); you now have to download the package explicitly first. There is no API intended for use from Python, but the CLI is written in Python and can still be called directly. There isn't a nice way of pinning the version yet, though, short of copying the GitHub URL where the wheels are stored.
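A minimal sketch of that workaround, assuming spaCy 3.x: `spacy.cli.download` is the function behind the `spacy download` command, and `spacy.load` raises `OSError` when the package is missing. The helper names below are hypothetical. The wheel URL pattern is the one spaCy model releases appeared to use at the time of writing and should be verified before relying on it.

```python
def model_wheel_url(name: str, version: str) -> str:
    """Build the GitHub release URL for a pinned spaCy model wheel (assumed pattern)."""
    tag = f"{name}-{version}"
    return (
        "https://github.com/explosion/spacy-models/releases/download/"
        f"{tag}/{tag}-py3-none-any.whl"
    )


def load_model(name: str):
    """Load a spaCy pipeline, downloading the package first if it is missing."""
    import spacy  # imported lazily so the URL helper works without spaCy
    from spacy.cli import download

    try:
        return spacy.load(name)
    except OSError:  # raised by spaCy when the package is not installed
        download(name)  # same code path as `python -m spacy download <name>`
        return spacy.load(name)


# Model name and version are examples only, not the ones used in this PR.
print(model_wheel_url("en_core_web_sm", "3.7.1"))
```

The printed URL can be passed to `pip install` or listed as a dependency to pin the model version. In some environments a freshly downloaded package is only importable after the process restarts, so downloading at build time is the safer option.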

Anthony Naddeo added 14 commits February 15, 2024 22:03

Add an LRU cache to transformer embedding

f898dd7

Experimenting with performance improvements.

bump version

a8907a9

Tweak caches

d2f45b4

bump version

71fadd8

Add perf metadata, fix some metric inits

e1cdedd

Presidio was doing some init work outside of the init method without caching. Also, workflows now return metadata about performance times.

bump version

9c82f55

Don't include torch in all extras

1f0cbe7

We need to be able to not download torch so we can control it from consuming applications.

bump version

16c6c41

Fix formatting on performance metrics

448c537

bump version

6a4cf8a

Make perf info a dict for readability

a4029c7

bump version

c440c42

Update perf info name api

954c3e0

Making everything explicit since it isn't clear that this object isn't the primary output of the api.

bump version

c26ca54

naddeoa force-pushed the perf branch from 36e02a2 to c26ca54 on February 17, 2024 20:32

Anthony Naddeo added 4 commits February 17, 2024 15:04

bump version

b1f92c3

Add SSN and bank account to the default pii list

b57db35

bump version

59a1a04

naddeoa merged commit 81af2de into workflow Feb 18, 2024
2 checks passed

naddeoa deleted the perf branch February 18, 2024 01:47
