[Q] Wrap nrtsearch into an AWS Lambda function #452
Comments
Hi Prakesh, I'm not super familiar with Lambda, but I will try to answer your questions. By the way, I see that Lambdas have a maximum execution time of 15 minutes. As I said, I'm not too familiar with Lambdas, but if this means that after 15 minutes the replica would shut down and a new replica would come up from scratch, the overhead of downloading the index every 15 minutes might make running it in a Lambda not worthwhile. Feel free to correct me if my understanding is wrong, though. Best of luck with your work!
Hi @sarthakn7, let me give you more info about my idea and how AWS Lambda functions work. AWS Lambda functions are executed in a lightweight container. The container is created during the first execution of the function and is frozen right after. If the function is not used for a long time, the container is destroyed and needs to be bootstrapped again on the next function call (this is called a "cold boot"). I'm not going to run the function for 15 minutes; it wouldn't be cost-effective. The function must run only when there's an index update and/or a search to execute. Since the execution context is disposable, I need to:
Is it possible to load the index updates from S3 without having to download all the index files in S3? Thank you for your hints! 🙏
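The container reuse described above can be sketched as follows. This is only a minimal illustration of the freeze/reuse behaviour, not nrtsearch code: `load_index` is a hypothetical placeholder for whatever would restore the index (for example from S3), and the only real convention used is the standard Python Lambda entry point `handler(event, context)`.

```python
import time

# Module-level state is initialised once per container and is still present
# on later invocations as long as the container is reused (a "warm" call).
_index_state = {"loaded_at": None, "loads": 0}


def load_index():
    """Hypothetical placeholder: restore the index into the container.

    In a real function this is where the expensive download would happen.
    """
    _index_state["loaded_at"] = time.time()
    _index_state["loads"] += 1


def handler(event, context):
    """Lambda-style entry point that pays the load cost only on a cold boot."""
    cold_boot = _index_state["loaded_at"] is None
    if cold_boot:
        load_index()
    return {"cold_boot": cold_boot, "loads": _index_state["loads"]}
```

Calling `handler` twice in the same process mimics a cold boot followed by a warm invocation: only the first call triggers `load_index`. Anything that must survive the destruction of the container still has to live outside it.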
Hi @prakeshdp, right now it is not possible to load index updates from S3 after the replica has started. The replica needs to talk to a primary for that. Even if it is frozen, whenever it's "unfrozen" it should be able to get the latest segments from the primary, provided the primary indexes something while the replica is running so that an NRT point is published. Also, thanks for the details about AWS Lambda. I think for it to be effective, the following conditions must be true for the use case:
Neither of the above conditions is satisfied for any of our use cases, but do go ahead if this works for you. Let us know how it goes!
I came across the nrtsearch project and I'd love to use it for one of my side projects!
I think it would be great if replica nodes can run within an AWS Lambda function, so AWS can take care of the horizontal scaling.
I think running a node in an AWS Lambda function should be quite easy (including the index backup/restore to/from S3), but gRPC won't work, so the replication system would have to be reimplemented using REST API calls or something similar.
Do you have any hints on what to do or how to start working on a task like this?
Thank you so much for your hard work and for open sourcing such great software! 🚀