Commit

Change nltk and pillow versions
wanliAlex committed Oct 7, 2024
1 parent 70fb8f6 commit 8142bbc
Showing 2 changed files with 8 additions and 2 deletions.
4 changes: 2 additions & 2 deletions requirements.dev.txt
@@ -23,10 +23,10 @@ huggingface-hub==0.25.0
 more_itertools
 boto3==1.25.4
 botocore==1.28.4
-nltk==3.7
+nltk==3.9.1
 torch==1.12.1
 torchvision==0.13.1
-Pillow==9.3.0
+Pillow==10.4.0
 numpy==1.23.4
 validators==0.20.0
 sentence-transformers==2.2.2
6 changes: 6 additions & 0 deletions src/marqo/s2_inference/processing/text.py
@@ -30,6 +30,12 @@ def _splitting_functions(split_by: str, language: str='english') -> FunctionType
     except LookupError:
         nltk.download("punkt")
 
+    # Punkt_tab needs to be downloaded after NLTK 3.8 and later
+    try:
+        nltk.data.find("tokenizers/punkt_tab")
+    except LookupError:
+        nltk.download("punkt_tab")
+
     MAPPING = {
         'character':list,
         'word': partial(word_tokenize, language=language),
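
The change to text.py repeats the same find-then-download pattern for a second tokenizer resource. A minimal sketch of how that pattern could be factored into one helper is shown here. The names `ensure_resources` and `PUNKT_RESOURCES` are hypothetical and are not part of this commit or of marqo; only `nltk.data.find` and `nltk.download` come from NLTK. The `find` and `download` parameters are an assumption added so the helper can be exercised without network access.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical table: (path checked with nltk.data.find, package passed to nltk.download).
PUNKT_RESOURCES: Tuple[Tuple[str, str], ...] = (
    ("tokenizers/punkt", "punkt"),
    ("tokenizers/punkt_tab", "punkt_tab"),
)


def ensure_resources(
    resources: Iterable[Tuple[str, str]] = PUNKT_RESOURCES,
    find: Optional[Callable[[str], object]] = None,
    download: Optional[Callable[[str], object]] = None,
) -> List[str]:
    """Download each missing resource and return the package names that were downloaded."""
    if find is None or download is None:
        # Imported lazily so the helper can be tested with stand-in callables.
        import nltk

        find = find or nltk.data.find
        download = download or nltk.download

    downloaded: List[str] = []
    for path, package in resources:
        try:
            find(path)  # raises LookupError when the resource is not installed
        except LookupError:
            download(package)
            downloaded.append(package)
    return downloaded
```

Called with no arguments, the helper would perform the same two checks as the diff above. Keeping the resources in one table means a future tokenizer dependency could be added as a single tuple rather than another try/except block.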
