Backfill DB with legacy permits #109

Open
Keyrxng opened this issue May 18, 2024 · 9 comments · May be fixed by #112

Comments

Keyrxng (Contributor) commented May 18, 2024

Required by ubiquity/work.ubq.fi#46

Using the CLI as-is or with required changes:

a) Collect permit data across ubiquity and ubiquibot orgs
b) Insert this data in the permits table (idempotently; see the sketch below)

original comment ubiquity/work.ubq.fi#46 (comment)

time estimate: < 4 hrs
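
A minimal sketch of what the idempotent insert could look like, assuming the permits live in a Supabase `permits` table and that `nonce` uniquely identifies a row; the table name, column names, and env vars here are illustrative, not the actual schema:

```typescript
// Hedged sketch: idempotent backfill into an assumed Supabase `permits` table.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

type LegacyPermit = {
  nonce: string;
  amount: string;
  beneficiary: string;
  tx_hash: string | null;
};

async function backfillPermits(permits: LegacyPermit[]) {
  // `upsert` with `ignoreDuplicates` makes the backfill safe to re-run:
  // rows whose `nonce` already exists are skipped instead of erroring.
  const { error } = await supabase
    .from("permits")
    .upsert(permits, { onConflict: "nonce", ignoreDuplicates: true });
  if (error) throw error;
}
```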
Keyrxng (Contributor, Author) commented May 18, 2024

I was able to collect about 500 confirmed txHashes using data from Dune Analytics yesterday.

Off the top of my head, the CLI pulls about double that in permits. Some permits in the DB right now don't have a txHash; that's either because they haven't been claimed, or because it's acceptable for claimed permits to not have a txHash.

Not all have a location_id either; I plan to just use whatever location_id the user entry has.

An arbitrary length of time can pass between a permit being generated and it being claimed, so trying to match block_time with comment.timestamp doesn't work very well.

There's still an angle or two I haven't tried yet, but for the sake of reference: for legacy permits where I cannot find a txHash and isNonceClaimed() === true, should that be regarded as an invalidatedPermit, and should any where isNonceClaimed() === false without a txHash be regarded as yet to be claimed?
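
For context, this is roughly what a nonce-claimed check against Permit2's nonceBitmap looks like; the RPC endpoint and helper name are assumptions, and the CLI's actual isNonceClaimed() may be implemented differently:

```typescript
// Hedged sketch of a nonce-claimed check against Permit2's nonceBitmap (ethers v5).
import { ethers } from "ethers";

const PERMIT2_ADDRESS = "0x000000000022D473030F116dDEE9F6B43aC78BA3"; // canonical Permit2
const PERMIT2_ABI = ["function nonceBitmap(address, uint256) view returns (uint256)"];

async function isNonceClaimed(owner: string, nonce: string): Promise<boolean> {
  const provider = new ethers.providers.JsonRpcProvider("https://rpc.gnosischain.com"); // assumed RPC
  const permit2 = new ethers.Contract(PERMIT2_ADDRESS, PERMIT2_ABI, provider);

  // Permit2 packs 256 nonces per bitmap word: the high bits select the word,
  // the low 8 bits select the bit inside that word.
  const nonceBn = ethers.BigNumber.from(nonce);
  const wordPos = nonceBn.shr(8);
  const bitPos = nonceBn.and(0xff);
  const bitmap: ethers.BigNumber = await permit2.nonceBitmap(owner, wordPos);

  return !bitmap.shr(bitPos.toNumber()).and(1).isZero();
}
```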

rndquu (Member) commented May 23, 2024

> I was able to collect about 500 confirmed txHashes using data from Dune Analytics yesterday.

Permits have been generated from different signers at different moments in time, so using only on-chain data might not be very accurate. It makes sense to utilize the GitHub API for fetching permits.

> or because it's acceptable for claimed permits to not have a txHash

I don't think claimed permits should be missing a tx hash.

> Not all have a location_id either; I plan to just use whatever location_id the user entry has.

You may leave the permits.location_id field empty, since the locations table is deprecated and will be removed.

> An arbitrary length of time can pass between a permit being generated and it being claimed, so trying to match block_time with comment.timestamp doesn't work very well.

The GitHub API might have this info.

> For legacy permits where I cannot find a txHash and isNonceClaimed() === true, should that be regarded as an invalidatedPermit, and should any where isNonceClaimed() === false without a txHash be regarded as yet to be claimed?

I think most of them should have a txHash. Try fetching with the GitHub API.
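
A rough sketch of what fetching permit comments through the GitHub API could look like with Octokit; the org/repo iteration and the `claim=` filter are assumptions for illustration, not the CLI's actual implementation:

```typescript
// Hedged sketch: pull issue comments for every repo in an org and keep the
// ones that contain a claim URL, so permits can be extracted off-platform.
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function fetchPermitComments(org: string) {
  const repos = await octokit.paginate(octokit.rest.repos.listForOrg, { org, per_page: 100 });
  const comments: { repo: string; body: string; created_at: string }[] = [];

  for (const repo of repos) {
    const repoComments = await octokit.paginate(octokit.rest.issues.listCommentsForRepo, {
      owner: org,
      repo: repo.name,
      per_page: 100,
    });
    for (const c of repoComments) {
      // Crude filter: anything mentioning a claim query param is a candidate permit comment.
      if (c.body?.includes("claim=")) {
        comments.push({ repo: repo.name, body: c.body, created_at: c.created_at });
      }
    }
  }
  return comments;
}
```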

Keyrxng (Contributor, Author) commented May 23, 2024

> Permits have been generated from different signers

As far as I can tell from the CLI output, it has been four addresses, and I used all four when parsing on-chain.

> Try fetching with the GitHub API

I assume you mean the comment metadata here, yeah? The CLI collects permits via the claim_url in comments, since the metadata is relatively new.

ty for the clarification on the rest

rndquu (Member) commented May 23, 2024

By the CLI you mean https://github.com/ubiquity/.github/tree/main/airdrop-cli, right? How does it work exactly? Does it parse only on-chain transactions from the permit signer, or does it use the GitHub API to fetch all permit info from the claim_url query param?

Keyrxng (Contributor, Author) commented May 23, 2024

It parses comments from the bot and pav, extracting the permit data using the claim_url; it only verifies whether a permit has been claimed via isNonceClaimed.

It uses four different regexes to capture the various formats that have existed, and likely needs to be updated again to match the current format.
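
For reference, a minimal sketch of one such extraction pass, assuming the newer claim_url shape (pay.ubq.fi with a base64-encoded `claim` query param); the regex and the decoded field names are illustrative, and the CLI's other regexes cover older formats not shown here:

```typescript
// Hedged sketch: extract and decode claim URLs from a comment body.
const CLAIM_URL_REGEX = /https:\/\/pay\.ubq\.fi\/?\?claim=([A-Za-z0-9+/=_%-]+)/g;

type DecodedPermit = { nonce?: string; amount?: string; [key: string]: unknown };

function extractPermits(commentBody: string): DecodedPermit[] {
  const permits: DecodedPermit[] = [];
  for (const match of commentBody.matchAll(CLAIM_URL_REGEX)) {
    try {
      // The claim param is assumed to be (possibly percent-encoded) base64 JSON.
      const base64 = decodeURIComponent(match[1]);
      const json = Buffer.from(base64, "base64").toString("utf8");
      permits.push(JSON.parse(json));
    } catch {
      // Malformed or truncated payloads are skipped rather than guessed at.
    }
  }
  return permits;
}
```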

I collected data from Dune manually using the four payment addresses to the 60-odd hunters in the DB. Using the txHash, I unbundled the permit data and was able to pair those up. Those come with a timestamp, but it is hard to pin one against its comment that way because of the time-til-claim; using the nonce etc. is easier.

The tally from Dune alone is far lower than it is expected to be.

You can only trace back 50k blocks (might be 5k, off the top of my head) at a time using a provider, and scripting that for each hunter wasn't really feasible, so Dune was the better option.

I can't think how you'd get the txHash using just the permit data / permit comment other than scanning 5/50k blocks at a time from the comment timestamp until that amount is transferred into the user's wallet, then unbundling the permit and tracking by nonce and amount.
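
For illustration, a hedged sketch of that unbundling step: decoding permitTransferFrom calldata from a claim transaction with ethers to recover the nonce and amount for matching. The ABI fragment follows Permit2's SignatureTransfer interface; the function name and return shape of this helper are illustrative:

```typescript
// Hedged sketch: decode a claim transaction's calldata to recover permit fields.
import { ethers } from "ethers";

const PERMIT2_INTERFACE = new ethers.utils.Interface([
  "function permitTransferFrom(((address token, uint256 amount) permitted, uint256 nonce, uint256 deadline) permit, (address to, uint256 requestedAmount) transferDetails, address owner, bytes signature)",
]);

function unbundleClaimTx(txData: string) {
  const decoded = PERMIT2_INTERFACE.parseTransaction({ data: txData });
  const { permit, transferDetails, owner } = decoded.args;
  return {
    nonce: permit.nonce.toString(),
    amount: permit.permitted.amount.toString(),
    token: permit.permitted.token,
    beneficiary: transferDetails.to,
    owner,
  };
}
```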

Any suggestions? It's doable anyway; I just needed those points clarified.

rndquu (Member) commented May 24, 2024

> It parses comments from the bot and pav, extracting the permit data using the claim_url; it only verifies whether a permit has been claimed via isNonceClaimed.
>
> It uses four different regexes to capture the various formats that have existed, and likely needs to be updated again to match the current format.
>
> I collected data from Dune manually using the four payment addresses to the 60-odd hunters in the DB. Using the txHash, I unbundled the permit data and was able to pair those up. Those come with a timestamp, but it is hard to pin one against its comment that way because of the time-til-claim; using the nonce etc. is easier.
>
> The tally from Dune alone is far lower than it is expected to be.
>
> You can only trace back 50k blocks (might be 5k, off the top of my head) at a time using a provider, and scripting that for each hunter wasn't really feasible, so Dune was the better option.
>
> I can't think how you'd get the txHash using just the permit data / permit comment other than scanning 5/50k blocks at a time from the comment timestamp until that amount is transferred into the user's wallet, then unbundling the permit and tracking by nonce and amount.
>
> Any suggestions? It's doable anyway; I just needed those points clarified.

We had a similar task of matching GitHub issues with on-chain transactions at https://github.com/ubiquity/audit.ubq.fi and used the Etherscan API, although I don't remember the particular API method that was used (this one or this one or some other).

Keyrxng (Contributor, Author) commented May 24, 2024

I wasn't aware of that issue, or that Etherscan had those endpoints, actually. I checked and Gnosisscan has them also. They don't seem to be limited by the same block limit either, and they use the same method. Awesome.
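
For reference, a minimal sketch of pulling a hunter's transactions through the Etherscan-compatible `account`/`txlist` endpoint on Gnosisscan and keeping only the Permit2 calls; the API key variable, the filtering, and the assumption that claims are direct calls to Permit2 are all illustrative:

```typescript
// Hedged sketch: fetch a hunter's transactions via Gnosisscan and keep Permit2 calls.
const GNOSISSCAN_API = "https://api.gnosisscan.io/api";
const PERMIT2 = "0x000000000022d473030f116ddee9f6b43ac78ba3";

async function fetchClaimTxs(hunter: string): Promise<{ hash: string; input: string }[]> {
  const url =
    `${GNOSISSCAN_API}?module=account&action=txlist&address=${hunter}` +
    `&startblock=0&endblock=99999999&sort=asc&apikey=${process.env.GNOSISSCAN_API_KEY}`;
  const res = await fetch(url);
  const { result } = (await res.json()) as { result: { hash: string; to: string; input: string }[] };

  // Candidate claim transactions: calls into Permit2, whose calldata can then
  // be decoded to recover nonce/amount for matching against permit comments.
  return result.filter((tx) => tx.to?.toLowerCase() === PERMIT2);
}
```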

Keyrxng (Contributor, Author) commented Jun 1, 2024

I've been working on this and will have an open PR ready tonight or tomorrow.

Keyrxng (Contributor, Author) commented Jun 5, 2024

I've had to pretty much refactor the CLI entirely, for the following reasons:

  • It was not a very well-written or maintainable tool.

  • Originally, I guess I misinterpreted the spec and built it as a literal CLI tool using CLIME etc., which was not and is not required.

  • The data it wrote was not very workable when attempting to validate and cover the many edge cases.

  • The direction now is that all permits will be stored in the DB, making any future use of this tool obsolete beyond using the output for manual verification.

  • It was only using permits found in issues and was not validating them beyond a nonceBitmap check. The problem with that is that an invalidated permit would appear the same as a claimed permit without the additional on-chain verification added through the refactor (see the sketch at the end of this comment).

  • This refactor was necessary to improve the reliability of the information, especially because we are now seeding the prod DB with these permits, as opposed to the tool's original purpose of just accumulating contributions for an airdrop.


I spent far more hours than estimated. I wanted to be confident that the data was as verifiable as I could make it and not push a half-arsed approach.
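
As referenced in the list above, a minimal sketch of the classification rule that the extra on-chain verification enables; the two inputs (the nonceBitmap check and the claim-tx lookup) are the kinds of steps sketched earlier in the thread, and the state names are illustrative:

```typescript
// Hedged sketch: a used nonce with no matching claim transaction is treated
// as invalidated rather than claimed.
type PermitState = "claimed" | "unclaimed" | "invalidated";

// `nonceUsed` comes from the nonceBitmap check; `claimTxHash` from the
// txlist scan + calldata decode (null when no matching tx was found).
function classifyPermit(nonceUsed: boolean, claimTxHash: string | null): PermitState {
  if (!nonceUsed) return "unclaimed";             // nonce never consumed on-chain
  return claimTxHash ? "claimed" : "invalidated"; // used nonce but no claim tx
}
```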
