Lots of class C transactions #9

This remote uses lots of class C transactions to the B2 API, which can be quite expensive. I think this is mostly due to the calls to ListFileNames() for each operation. Could it be possible to replace them with "simple" calls to GetFileInfo(), a class B operation? Thanks a lot for your work!
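For context, here is a sketch of the difference between the two calls. The client interface and helper below are assumptions for illustration, not this remote's actual code; what is true of the B2 API is that list calls are class C, while b2_get_file_info is class B and takes a fileId rather than a file name.

```go
// Sketch of the two lookups being compared. The interface and names
// are assumptions, not this remote's code; the class assignments
// (list calls are class C, b2_get_file_info is class B) come from
// Backblaze's transaction pricing tiers.
package lookup

type fileInfo struct {
	ID   string
	Name string
}

type b2Client interface {
	// b2_list_file_names: class C. Looks files up by name or prefix.
	ListFileNames(bucketID, startName string, maxCount int) ([]fileInfo, error)
	// b2_get_file_info: class B, but it takes a fileId rather than a
	// file name, so it can only replace the list call when the fileId
	// is already known (e.g. remembered from the original upload).
	GetFileInfo(fileID string) (fileInfo, error)
}

// keyPresent checks whether a key is stored, preferring the cheaper
// class B call when a fileId was recorded at upload time. A real
// implementation would also need to treat "file not found" errors
// from GetFileInfo as absence rather than failure.
func keyPresent(c b2Client, bucketID, key, knownFileID string) (bool, error) {
	if knownFileID != "" {
		fi, err := c.GetFileInfo(knownFileID) // class B
		if err != nil {
			return false, err
		}
		return fi.Name == key, nil
	}
	files, err := c.ListFileNames(bucketID, key, 1) // class C
	if err != nil {
		return false, err
	}
	return len(files) > 0 && files[0].Name == key, nil
}
```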
Comments
I think the original reason for doing that is that there was no other way to get all versions of the file, which was needed in some error recovery cases. That was only required on writes, so I think it could be improved during reads, if that's not already true. Also, I'll look into this more this weekend when I have some time.
Hi, I just started using this yesterday and already burned through the class C operations included in the free tier after only 2 megabytes of data. Have you found any time to look at this? Unfortunately I don't know golang at all.
It doesn't look like GetFileInfo() will work as a replacement. That said, there are workarounds: […]. I think there are so many class C calls because […]. Since these are helpful, non-default, and non-obvious, I'll update the readme to mention them.
@timsomers Could you try the workaround of trusting the b2 repo?
Hi @encryptio. My full command is already "git annex copy --to b2 --not --in b2 --trust b2", so additionally trusting that repo should not make a difference. I've changed it anyway to make sure, but we'll have to wait until tomorrow to see the result. Can't you just cache the output of ListFileNames()?
Trusting the repo changed nothing. Checking the reports page, it seems two list calls are made for each upload. Is this necessary? Can't we at least reduce this to a single call?
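To make that pattern concrete, here is a hypothetical shape of an upload path that would produce two list calls per key. The interface and function names are illustrative assumptions, not the remote's actual code.

```go
// Hypothetical shape of an upload path that costs two class C
// transactions per key: one presence check before the upload and one
// list afterwards. All names are illustrative, not the remote's code.
package upload

import "errors"

type b2Bucket interface {
	// One class C transaction (b2_list_file_names) per call.
	ListFileNames(prefix string, maxCount int) ([]string, error)
	Upload(name string, data []byte) error
}

func storeKey(b b2Bucket, key string, data []byte) error {
	// Class C call #1: is the key already present?
	names, err := b.ListFileNames(key, 1)
	if err != nil {
		return err
	}
	if len(names) > 0 && names[0] == key {
		return nil // already stored, nothing to do
	}

	if err := b.Upload(key, data); err != nil {
		return err
	}

	// Class C call #2: list again, e.g. to pick up the stored file's
	// metadata or to confirm the upload is visible.
	names, err = b.ListFileNames(key, 1)
	if err != nil {
		return err
	}
	if len(names) == 0 || names[0] != key {
		return errors.New("uploaded file not visible in listing")
	}
	return nil
}
```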
There's a possible race condition if multiple processes upload the same file at the same time, but that race condition already existed and does not cause any data loss (instead, it causes a small waste of space in B2) and the inconsistency is corrected after removal by a git annex fsck. I think people will naturally avoid the race condition. Fixing it would require some extra API calls to ListFileVersions during removal. Improves #9
@timsomers Added a cache to the ListFileNames() results. I had to think pretty hard about making sure it's actually safe to do so, and ended up concluding that it's not significantly worse than the existing race condition (see 2bf053c for details on the race).
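As a rough picture of the idea (a minimal sketch under assumed names and an assumed interface, not the actual change; see 2bf053c for the real diff):

```go
// Minimal sketch of a short-lived cache in front of the list call, so
// that one upload's repeated lookups cost a single class C
// transaction. The interface, names, and TTL are assumptions.
package listcache

import (
	"sync"
	"time"
)

type lister interface {
	ListFileNames(prefix string, maxCount int) ([]string, error)
}

type cachedLister struct {
	inner lister
	ttl   time.Duration

	mu sync.Mutex
	// Keyed by prefix only; this simplification assumes maxCount is
	// constant for a given prefix.
	entries map[string]entry
}

type entry struct {
	names   []string
	fetched time.Time
}

func newCachedLister(inner lister, ttl time.Duration) *cachedLister {
	return &cachedLister{inner: inner, ttl: ttl, entries: make(map[string]entry)}
}

func (c *cachedLister) ListFileNames(prefix string, maxCount int) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[prefix]; ok && time.Since(e.fetched) < c.ttl {
		return e.names, nil // cache hit: no class C transaction
	}
	names, err := c.inner.ListFileNames(prefix, maxCount)
	if err != nil {
		return nil, err
	}
	c.entries[prefix] = entry{names: names, fetched: time.Now()}
	return names, nil
}
```

The safety argument from the comment above maps onto this sketch directly: a stale cache entry is at worst the same exposure as the concurrent-upload race described in 2bf053c, i.e. an extra file version wasting a little space, not data loss.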
I've built this and it does indeed improve things: now I manage to upload about 2.7k files with 2.5k transactions. I wanted to try a longer cache time (I don't believe the race condition you mentioned applies to me, as I only push from a single repo), but I didn't immediately find how to rebuild my custom code, just the […].
If you'd like to adjust and rebuild, edit the source in […]. A longer cache time wouldn't improve things, though, since the thing it's caching is only used for a single upload. Getting better than one ListFileNames() call per upload […].
I've been using this and noticed the high number of class C transactions, so I modified it to cache the full bucket contents in memory for an entire invocation, or for a duration that can be set by the user: meristo@7ef35c6
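For readers who don't want to read the diff, the general shape of that approach might look like the sketch below. The interface, names, and environment variable are assumptions for illustration; meristo@7ef35c6 has the actual code.

```go
// Sketch of caching the entire bucket listing in memory, either for
// the whole invocation or until a user-set duration expires. The
// interface, names, and env var are assumptions, not the real patch.
package bucketcache

import (
	"os"
	"strconv"
	"sync"
	"time"
)

type bucket interface {
	// Assumed helper that pages through the whole bucket via repeated
	// b2_list_file_names calls (each page is one class C transaction).
	ListAllFileNames() ([]string, error)
}

type fullListingCache struct {
	inner bucket

	mu      sync.Mutex
	names   map[string]bool
	fetched time.Time
	ttl     time.Duration // 0 means "keep for the whole invocation"
}

func newFullListingCache(inner bucket) *fullListingCache {
	c := &fullListingCache{inner: inner}
	// Hypothetical knob for the user-settable duration.
	if s := os.Getenv("B2_LIST_CACHE_SECONDS"); s != "" {
		if secs, err := strconv.Atoi(s); err == nil {
			c.ttl = time.Duration(secs) * time.Second
		}
	}
	return c
}

// Contains answers presence queries from the cached listing,
// refetching the whole bucket only on first use or after expiry.
func (c *fullListingCache) Contains(name string) (bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	expired := c.ttl > 0 && time.Since(c.fetched) >= c.ttl
	if c.fetched.IsZero() || expired {
		all, err := c.inner.ListAllFileNames()
		if err != nil {
			return false, err
		}
		c.names = make(map[string]bool, len(all))
		for _, n := range all {
			c.names[n] = true
		}
		c.fetched = time.Now()
	}
	return c.names[name], nil
}
```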
Has anyone used @meristo's patch? I'll try giving it a go in the next week and report back.