-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade Azure Storage SDK to a modern version #629
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments. I know you plan to do more testing, so I'll take another look once that is done.
const properties = await this.queueClient.getProperties() | ||
return { count: properties.approximateMessagesCount } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This likely applies in other places as well. In the original code, error handling logged the error. In the new code, it appears that error handling depends on how await
handles exceptions instead of using try
and catch
blocks to log the error in the same way as the original.
const properties = await this.queueClient.getProperties() | |
return { count: properties.approximateMessagesCount } | |
try { | |
const properties = await this.queueClient.getProperties() | |
return { count: properties.approximateMessagesCount } | |
} catch (error) { | |
this.logger.error(error) // Log the error | |
return null // Handle the error by returning null | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's a good point. There was one more place like this, I've updated it as well.
if (connectionString) { | ||
this.client = QueueServiceClient.fromConnectionString(connectionString, pipelineOptions) | ||
} else { | ||
this.client = new QueueServiceClient( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice update here for maintaining the old approach and setting up a more flexible long term approach.
const retryOperations = new AzureStorage.ExponentialRetryPolicyFilter() | ||
this.client = AzureStorage.createQueueService(connectionString).withFilter(retryOperations) | ||
constructor(connectionString, options) { | ||
const pipelineOptions = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the use of pipelineOptions
. Makes it easier to see the configs related to the queues.
} | ||
|
||
// This API can only be used for the 'deadletter' store because we cannot look up documents by type performantly | ||
async count(type, force = false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you talk about the reason this is deleted completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was an oversight, I've brought it back. Thanks for catching it!
3180953
to
d4ba41c
Compare
d4ba41c
to
b038b72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for upgrading the sdk! A couple of comments for you to consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for incorporating the changes.
Upon further testing, an error was observed when writing the Expected: Observed: Steps to reproduce:
|
Error writing deadletters can be addressed in a separate PR. |
So we have the ability to have an harvest connection with connections string and queues with azure SPN
…om-harvest Separate crawler queue connection from harvest
After I added a commit to change queue configuration I've deployed to dev and run the integration tests. https://github.com/clearlydefined/operations/actions/runs/13264887396/job/37029783039#step:7:355 The failure looks ok to me as caused by a recent release to a package checked in the test |
@ljones140 Thank you for your testing! Please note that our integration tests are designed for sanity checks and do not cover all of our use cases. During testing, I observed that the crawler queue was not cleared after all requests were completed (see below, 10 hidden messages in the queue). I suspect that this issue was related to the previous code change where the message receipt was not returned in the |
@qtomlinson I ran the Integration tests again and the queues cleared. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please address #629 (comment) in a subsequent PR.
Needs to be a string
@qtomlinson adding the fix to this PR as it was very simple edc8bb2 But did take a long time toe debug. Used the Azure docker env which was very helpful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fix! I am glad that the azurite setup makes debugging easier.
@@ -638,7 +638,7 @@ class Crawler { | |||
metadata.errorMessage = request._error.message | |||
metadata.errorStack = request._error.stack | |||
} | |||
metadata.version = "1" | |||
metadata.version = '1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be updated to '1.1' as we are changing the file format for the deadletter here?
- Deadletters are internal exception information. The change here looks acceptable to me.
- Comparing the metadata structure of a deadletter and a harvested result (both are persisted documents in storage), there is no metadata.version in recent harvest results. Older harvest results prior to 2019 have metadata.version as a string. In the future, we may want to align with the metadata in harvested results and consider using schemaVersion instead.
This change is a prerequisite to enable token-based authentication for the Azure Storage operations (blobs and queues). This change leaves the options to use the existing, connection string-based authentication though.