
Post-Mortem: Changes to Prepare for Next Week #126

Open
mdr223 opened this issue Oct 26, 2023 · 4 comments · Fixed by #141
mdr223 (Collaborator) commented Oct 26, 2023

Summarizing the steps to begin hardening our system for next week. Please add anything I may have missed from our discussion.

  1. Merge PR Feature/simplify services #119 to move the Data Manager into the chat & cleo services
  2. Merge the PR adding archi code to sources in meta archi (#109) to finish the updates to A2rchi Meta
  3. Create a PR to move the lock off of guarding the OpenAI calls so that it only guards update(s) to the vector store
  4. Merge that PR to main
  5. Write a simple script with a for loop that submits N API calls to our Flask app, then measure and plot a histogram of latencies for N = [10, 100, 1000, 10000]
  • Call A2rchi at t3desk19.mit.edu:7683 ~5 times using the PSET 7 question to get a sense of the latency distribution
  • Simulate N calls to the Flask app by deploying the DumbLLM in dev but on t3desk19, with time.sleep(np.random.normal(mean, std)) where mean and std are guesstimates of the latency parameters based on the calls to t3desk19.mit.edu
    • Running the experiment on t3desk is important because the parallelism will differ from submit06
  • Record and store the results
  6. Depending on the performance results from 5., we may have to consider load-balancing with multiple containers
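A minimal sketch of the step-5 benchmark script. The `fake_call` stub, its mean/std, and the loop bounds are placeholders, not measured values; in the real run, `fake_call` would be replaced by a POST to the Flask app:

```python
import random
import statistics
import time

def measure_latencies(call_fn, n):
    """Submit n sequential calls and return a list of per-call latencies (seconds)."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call_fn()
        latencies.append(time.perf_counter() - start)
    return latencies

def fake_call():
    # Stand-in for the DumbLLM: sleep for a normally distributed delay, as the
    # issue suggests with time.sleep(np.random.normal(mean, std)). The mean/std
    # here are guesses; clamp at zero so a sampled negative delay doesn't raise.
    time.sleep(max(0.0, random.gauss(0.002, 0.0005)))

for n in [10, 100]:  # extend to 1000, 10000 for the real experiment
    lat = measure_latencies(fake_call, n)
    print(f"N={n}: mean={statistics.mean(lat):.4f}s max={max(lat):.4f}s")
```

For the histogram, the recorded latency lists can be fed to e.g. matplotlib's `plt.hist`; keeping the raw lists also covers the "record and store the results" step.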

And separately, but also importantly:
  7. Remove raise e from inside our main try-except block in ChatWrapper
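A hypothetical sketch of why step 7 matters (ChatWrapper's real code differs; `handle_message` and the fallback text are invented for illustration): re-raising inside the except block defeats the handler, so the exception still propagates and can take down the request.

```python
def handle_message_bad(llm_call, msg):
    try:
        return llm_call(msg)
    except Exception as e:
        print(f"chain failed: {e}")
        raise e  # step 7: removing this line lets the app degrade gracefully

def handle_message(llm_call, msg):
    try:
        return llm_call(msg)
    except Exception as e:
        # Log and return a fallback response instead of propagating the error.
        print(f"chain failed: {e}")
        return "Sorry, something went wrong. Please try again."
```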

julius-heitkoetter (Collaborator)

One edit here: Ludo raised an issue (#115) that the DumbLLM is no longer compatible. We need to resolve that issue with a PR before we get to step 5.

mdr223 (Collaborator, Author) commented Oct 26, 2023

One more addition we will likely want is merging in my work on the db backend once it's ready. Without it we would also need to lock the conversations.json file, which is plausible but could hurt performance.
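If the conversations.json lock does become necessary before the db backend lands, a minimal sketch (the function name and record shape are assumptions) could serialize writers with a process-local threading.Lock:

```python
import json
import threading
from pathlib import Path

_conversations_lock = threading.Lock()

def append_conversation(path, record):
    # Hold the lock across the whole read-modify-write so concurrent Flask
    # worker threads cannot interleave and corrupt the JSON file.
    with _conversations_lock:
        p = Path(path)
        data = json.loads(p.read_text()) if p.exists() else []
        data.append(record)
        p.write_text(json.dumps(data))
```

Note that a threading.Lock only protects threads within one process; multiple containers (step 6) would need file- or db-level locking, which is part of why the db backend is preferable.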

ludomori99 (Collaborator)

I believe we also discussed introducing timeouts in the requests in the app. Not sure if we want to open a separate issue for that.

mdr223 (Collaborator, Author) commented Nov 7, 2023

With the new PR for request timeouts (#141), I believe we can close this issue once that PR is merged.

mdr223 linked a pull request Nov 7, 2023 that will close this issue