💅 Update `text2vec-azure-openai` to utilize `isAzure: true` flag and mark `resourceName` + `deploymentId` as optional #196
…rk `resourceName` + `deploymentId` as optional This relates to the changes in weaviate/weaviate#5776
Great to see you again! Thanks for the contribution.
Thanks for the contribution! Left a few comments mainly around house-keeping otherwise the PR looks great 😁
/** Will automatically be set to true. You don't need to set this manually. */
isAzure?: true;
If this is always `true`, does it still need the `?` operator? If not, can we remove it?
So the `true` is always set by the `text2VecAzureOpenAI` function internally and should not need to be passed by the user. But due to how the types are structured in general, I could not easily 1) remove it completely from the config object, nor 2) remove the optional operator, since then the user would be required to supply `isAzure: true` manually 🤔
Ahhh, okay I see now. I think this makes sense if the user makes use of the `.azureOpenAI` method, but part of the API is still to allow users to work with the raw types if they so wish. As such, if a user did:
generative: {
name: 'generative-openai',
config: {
deploymentId: config.deploymentId,
resourceName: config.resourceName,
baseURL: config.baseURL,
}
}
then the type system would allow it since `isAzure` is optional, yet the runtime would interpret this as `isAzure: undefined`, which is a falsy value.
I like the idea of introducing the `isAzure` flag to the TS client but I think it may be better placed as a pure internal, i.e. not exposed in the user types, wdyt?
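To illustrate the concern, here is a minimal sketch. The `AzureOpenAIConfig` type and the `usesAzureLogic` check are hypothetical simplifications, not the client's or server's actual code; they only show how an optional `isAzure?: true` lets a raw config compile while the flag is missing at runtime.

```typescript
// Hypothetical, simplified config type mirroring the fields discussed above.
type AzureOpenAIConfig = {
  deploymentId?: string;
  resourceName?: string;
  baseURL?: string;
  /** Optional literal type: may be `true` or omitted, but never `false`. */
  isAzure?: true;
};

// Hypothetical stand-in for whatever runtime check decides to use the Azure logic.
function usesAzureLogic(config: AzureOpenAIConfig): boolean {
  return Boolean(config.isAzure); // `undefined` is falsy
}

// A user working with the raw types can omit the flag and still type-check.
const rawConfig: AzureOpenAIConfig = {
  deploymentId: 'my-deployment',
  resourceName: 'my-resource',
};

// A helper that sets the flag internally produces the intended config.
const helperConfig: AzureOpenAIConfig = { ...rawConfig, isAzure: true };

console.log(usesAzureLogic(rawConfig), usesAzureLogic(helperConfig));
```

Under these assumptions the raw config is silently treated as non-Azure, which is the mismatch between types and runtime described above.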
My thinking is that `isAzure` should be removed from the `...Config` types and instead interpreted by the client's runtime itself depending on the name of the module. So this would most likely require the addition of `generative-azure-openai`, alongside `text2vec-azure-openai`, that is then parsed appropriately in the collection creation logic.
There we'd have some boolean clauses to determine whether the module is an Azure one, based on the name, and then inject `isAzure: true` into the config appropriately. IMO, this would be the most consistent for the client/server relationship as I'm sure there will be future refactoring of the server that changes this behaviour. Then, we'd only break the internal relationship rather than the public API.
We already do something similar here, wdyt about extending this logic as described above?
If you'd rather not then that's fine, I can add it to my backlog 😁 Also, sorry for the spaghetti of the `collection.create` method, I've not had the chance to refactor it into a better structure 😅
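A rough sketch of the name-based injection proposed above could look like the following. Everything here is an assumption for illustration: `ModuleConfig`, `AZURE_MODULE_NAMES` and `resolveModule` are hypothetical names, and mapping the Azure module name back to the plain `-openai` name is a guess at how the server expects it, not the client's actual `collection.create` logic.

```typescript
// Hypothetical shape of a module entry as passed to collection creation.
type ModuleConfig = { name: string; config?: Record<string, unknown> };

// Client-side module names that should be treated as Azure variants (assumed).
const AZURE_MODULE_NAMES = new Set<string>([
  'text2vec-azure-openai',
  'generative-azure-openai',
]);

// Inject `isAzure: true` based purely on the module name, so the flag
// never has to appear in the user-facing config types.
function resolveModule(mod: ModuleConfig): ModuleConfig {
  if (!AZURE_MODULE_NAMES.has(mod.name)) {
    return mod; // non-Azure modules pass through untouched
  }
  return {
    // Assumption: the server-side module is the plain OpenAI one.
    name: mod.name.replace('-azure-openai', '-openai'),
    config: { ...mod.config, isAzure: true },
  };
}

const resolved = resolveModule({
  name: 'text2vec-azure-openai',
  config: { baseURL: 'https://example.invalid' },
});
console.log(resolved);
```

With this shape, a server-side change to how Azure is signalled would only touch `resolveModule`, not the public types.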
Got it - this definitely makes more sense 👍 I didn't look into this part.
I'm drowning a bit in other work right now, but I should be able to look into this in more detail hopefully next week or so 🙏
If I get round to it this week, I'll ping you on here to let you know. Thanks for your help so far!
…passed at all Since the only indicator for the azureOpenAI config is now the isAzure: true flag, which is set in the vectorizer setup directly, no config object is necessary for it.
With this adjustment, devs can use the `text2VecAzureOpenAI` vectorizer without specifying `deploymentId` or `resourceName` upfront for their collection.
Instead, they may provide the headers `X-Azure-Deployment-Id` and `X-Azure-Resource-Name` in their requests to set these.
Internally, using `text2VecAzureOpenAI` will set an `isAzure: true` flag for the OpenAI vectorizer, so it understands that the Azure logic must be used.
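As a small sketch of the header-based approach, the helper below builds the two headers named above. `azureRequestHeaders` is a hypothetical function, and how the resulting object is attached to requests (for example through the client's connection options) is left out here, since that depends on the client setup.

```typescript
// Hypothetical helper: builds the per-request Azure headers described above.
function azureRequestHeaders(
  deploymentId: string,
  resourceName: string
): Record<string, string> {
  return {
    'X-Azure-Deployment-Id': deploymentId,
    'X-Azure-Resource-Name': resourceName,
  };
}

// Placeholder values; real ones come from the Azure OpenAI resource in use.
const headers = azureRequestHeaders('my-deployment', 'my-resource');
console.log(headers);
```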