fix: add maxTokens to serve mode #1280
@AlexsJones Could you review this, please?
Signed-off-by: samir-tahir <[email protected]>
@AlexsJones Are you happy to merge?
Thanks for this. I think it's a good first step.
No worries 😃. Once this is merged, can you review the operator change that adopts it in k8sgpt-ai/k8sgpt-operator#545?
@AlexsJones Can you merge this, please?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1280      +/-   ##
==========================================
- Coverage   34.76%   34.65%   -0.12%
==========================================
  Files          94       95       +1
  Lines        6342     6417      +75
==========================================
+ Hits         2205     2224      +19
- Misses       4046     4100      +54
- Partials       91       93       +2
📑 Description
We need to set maxTokens for serve mode. Unlike k8sgpt auth, which defaults to 2048, serve mode does not set it by default.
Backends such as Google Gemini (google) and Google Vertex AI (googlevertexai) fail when maxTokens is not set in serve mode, which also affects the k8sgpt operator.
This PR adds a new environment variable, K8SGPT_MAX_TOKENS, which you can set for serve mode. It defaults to 2048 if unset.
✅ Checks
ℹ Additional Information
The k8sgpt operator will also need to be updated so that maxTokens and providerId can be set in the K8sGPT CRD spec and injected as environment variables; see k8sgpt-ai/k8sgpt-operator#545.