Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add maxTokens to serve mode #1280

Merged
merged 8 commits into from
Nov 12, 2024

Conversation

samirtahir91
Copy link
Contributor

@samirtahir91 samirtahir91 commented Oct 23, 2024

📑 Description

We need to set maxTokens for serve mode, this is not set by default like it is with k8sgpt auth, which defaults to 2048.
Backends like google gemini and googlevertexai fail without setting maxTokens when running in serve mode (so including k8sgpt operator).

This PR adds a new environment variable K8SGPT_MAX_TOKENS which you can set for serve, the default is set to 2048 if unset

✅ Checks

  • My pull request adheres to the code style of this project
  • My code requires changes to the documentation
  • I have updated the documentation as required
  • All the tests have passed

ℹ Additional Information

The k8sgpt operator will need to be updated to set maxTokens and providerId also in the k8sgpt CRD spec - injecting the env vars - see k8sgpt-ai/k8sgpt-operator#545

@samirtahir91 samirtahir91 requested review from a team as code owners October 23, 2024 13:53
@samirtahir91 samirtahir91 force-pushed the fix/add-max-tokens-to-serve branch from c5473f7 to 7305e8a Compare October 23, 2024 13:57
@samirtahir91
Copy link
Contributor Author

@AlexsJones Could you review please?

cmd/serve/serve.go Outdated Show resolved Hide resolved
@samirtahir91 samirtahir91 force-pushed the fix/add-max-tokens-to-serve branch from 52b3784 to e7067b8 Compare October 24, 2024 06:25
@samirtahir91
Copy link
Contributor Author

@AlexsJones - Are you happy to merge?

Copy link
Member

@AlexsJones AlexsJones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, I think it's a good first step

@samirtahir91
Copy link
Contributor Author

Thanks for this, I think it's a good first step

No worries 😃, once merged can you review the change for the operator to adopt this in k8sgpt-ai/k8sgpt-operator#545

@samirtahir91
Copy link
Contributor Author

@AlexsJones Can you merge this please?

Copy link

codecov bot commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 34.65%. Comparing base (173e4dc) to head (9f68d30).
Report is 27 commits behind head on main.

Files with missing lines Patch % Lines
cmd/serve/serve.go 0.00% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1280      +/-   ##
==========================================
- Coverage   34.76%   34.65%   -0.12%     
==========================================
  Files          94       95       +1     
  Lines        6342     6417      +75     
==========================================
+ Hits         2205     2224      +19     
- Misses       4046     4100      +54     
- Partials       91       93       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@AlexsJones AlexsJones merged commit a50375c into k8sgpt-ai:main Nov 12, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants