
TEI Process dying on Sagemaker Endpoint with g4dn.xlarge #429

Open
3 of 4 tasks
BebehCodes opened this issue Oct 24, 2024 · 1 comment
System Info

Hello,

I'm using the AWS TEI Docker image (2.0.1-tei1.4.0-gpu-py310-cu122-ubuntu22.04) for text embeddings inference. When I deploy it on a SageMaker g4dn.xlarge instance, the process stops working after just a couple of requests. Strangely, the same setup runs smoothly on a g5 instance without any issues.

It looks like after a few inference requests on g4dn.xlarge, the processes that serve the models just die.

Any idea why this is happening on this specific instance type?
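
For reference, this is roughly how the endpoint is set up (a minimal sketch; the model id, endpoint name, and the SDK helper used to resolve the image URI are illustrative, not my exact production setup):

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TEI GPU image; the DLC mentioned above (2.0.1-tei1.4.0-gpu-py310-cu122-ubuntu22.04)
# can also be referenced directly by its full ECR URI instead of this helper.
image_uri = get_huggingface_llm_image_uri("huggingface-tei", version="1.4.0")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={"HF_MODEL_ID": "BAAI/bge-base-en-v1.5"},  # illustrative embedding model
)

# The same deployment runs fine on ml.g5.xlarge; the process only dies on ml.g4dn.xlarge.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name="tei-g4dn-repro",  # illustrative endpoint name
)
```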

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Deploy a model with TEI on a g4dn instance and send a few hundred to a few thousand requests.
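
A load-test sketch along these lines triggers it (endpoint name and payload are illustrative):

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
payload = json.dumps({"inputs": "The quick brown fox jumps over the lazy dog."})

for i in range(1000):
    response = runtime.invoke_endpoint(
        EndpointName="tei-g4dn-repro",   # illustrative endpoint name
        ContentType="application/json",
        Body=payload,
    )
    embedding = json.loads(response["Body"].read())
    # On g4dn.xlarge the endpoint stops responding correctly after a few hundred
    # calls; on g5 the same loop completes without issues.
    if i % 100 == 0:
        print(i, len(embedding[0]) if embedding and embedding[0] else "empty")
```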

Expected behavior

I would expect the processes not to die.


rvdmarck commented Jan 9, 2025

Hello,

Encountering the same issue here. To add to this, for some inputs I get embeddings consisting only of [None, ...] values just after deploying the endpoint, and inconsistently so (for the same input I sometimes get that and sometimes a valid embedding).
Then, after some load testing, the model always returns arrays of [None, ...], so it seems to have died for good.
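
Roughly how I spot the [None, ...] responses (a sketch; the endpoint name is illustrative):

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def embed(text, endpoint_name="tei-g4dn-repro"):  # illustrative endpoint name
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )
    vectors = json.loads(resp["Body"].read())
    # Right after deployment this only fails for some inputs, and not consistently;
    # after load testing, every response comes back as [None, None, ...].
    if any(v is None for vec in vectors for v in vec):
        raise ValueError("endpoint returned None values instead of floats")
    return vectors
```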

Any follow-up on this?

Regards
