Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: bump icicle to v2.8.0 #595

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

chore: bump icicle to v2.8.0 #595

wants to merge 10 commits into from

Conversation

mrain
Copy link
Contributor

@mrain mrain commented May 29, 2024

no issue related.

This PR:

  • bump icicle deps to v2.8.0
    • HostOrDeviceSlice is now a trait. Occurrences are changed to concrete types now.
    • For Bn254 curve impl
      • commit_on_gpu now assume that the input scalars are in normal form instead of montgomery form.
      • now load_poly_to_gpu has one memory copy instead of free as before
      • these are because, icicle field impl now uses u32 limbs instead of u64 limbs as in arkworks.
  • add GPU stress test for memory leakage/fragmentation

This PR does not:

Key places to review:


Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.

  • Targeted PR against correct branch (main)
  • Linked to GitHub issue with discussion and accepted design OR have an explanation in the PR that describes this work.
  • Wrote unit tests
  • Updated relevant documentation in the code
  • Added relevant changelog entries to the CHANGELOG.md of touched crates.
  • Re-reviewed Files changed in the GitHub PR explorer

@mrain mrain changed the title chore: bump icicle to v2.3.1 chore: bump icicle to v2.4.0 Jun 10, 2024
@alxiong
Copy link
Contributor

alxiong commented Jul 3, 2024

thanks @mrain for taking the initiative to upgrade our icicle dep! 👍

@mrain mrain changed the title chore: bump icicle to v2.6.0 chore: bump icicle to v2.8.0 Aug 15, 2024
@mrain mrain marked this pull request as ready for review August 15, 2024 15:23
@@ -38,7 +38,7 @@ ark-ed-on-bls12-377 = "0.4.0"
ark-ed-on-bls12-381 = "0.4.0"
ark-ed-on-bn254 = "0.4.0"
criterion = "0.5.1"
jf-pcs = { git = "https://github.com/EspressoSystems/jellyfish", tag = "0.4.5", features = ["test-srs"] }
jf-pcs = { path = "../pcs", features = ["test-srs"] }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use path b/c some updates are not reflected in the github yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you intend to switch back before merge?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think he can. One of the perils of separate versioning is multi-step upgrade here.
changing back need another PR: a new tag will be added once this PR is merged, then next one would switch back.

Copy link
Contributor

@ggutoski ggutoski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits. I'd like another pair of eyes before approval.

fn ark_affine_to_icicle(p: ark_bn254::G1Affine) -> icicle_bn254::curve::G1Affine {
icicle_bn254::curve::G1Affine {
x: icicle_bn254::curve::BaseField::from(p.x.0 .0),
y: icicle_bn254::curve::BaseField::from(p.y.0 .0),
x: unsafe { ark_std::mem::transmute(p.x.into_bigint()) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: refactor this recurring line of unsafe code into a separate function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and did we add test for this unsafe transmute anywhere?

also can you add comment that even though arkworks uses u64 limb (as opposed to u32 in icicle), they are both little-endian encoded, thus agree on a byte-level, which is why this unsafe transmute is correct.

^^ am I right?

Copy link
Contributor Author

@mrain mrain Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test is here: https://github.com/EspressoSystems/jellyfish/pull/595/files#diff-1b7c209306eb56d628e64ac6bc03978824af57767468dc3c618243860d04a6e3L1415

But it's only testing the ScalarField conversion. Maybe it's okay to assume that BaseField conversion should pass if this passes.

Comment on lines 896 to 898
/// - default implementation assume normal(non-montgomery) affine for
/// bases and scalars, consider overwrite this function if you want
/// otherwise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the PR description:

commit_on_gpu now assume that the input scalars are in normal form instead of montgomery form.

Why switch from montgomery to normal?

I'm surprised you don't need to update this comment. This comment suggests it was always the case that scalars are in normal form. So was it wrong before and is now correct??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this is always correct. The generic implementation assumes that the scalars are normal. Only for Bn254 we take montgomery scalars. Now they have to be normal in all cases.

Comment on lines 439 to +442
pub fn warmup_new_stream() -> anyhow::Result<CudaStream> {
let stream = CudaStream::create().map_err(|e| anyhow!("{:?}", e))?;
let _warmup_bytes = HostOrDeviceSlice::<'_, u8>::cuda_malloc_async(1024, &stream)
.map_err(|e| anyhow!("{:?}", e))?;
let _warmup_bytes =
DeviceVec::<u8>::cuda_malloc_async(1024, &stream).map_err(|e| anyhow!("{:?}", e))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -38,7 +38,7 @@ ark-ed-on-bls12-377 = "0.4.0"
ark-ed-on-bls12-381 = "0.4.0"
ark-ed-on-bn254 = "0.4.0"
criterion = "0.5.1"
jf-pcs = { git = "https://github.com/EspressoSystems/jellyfish", tag = "0.4.5", features = ["test-srs"] }
jf-pcs = { path = "../pcs", features = ["test-srs"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think he can. One of the perils of separate versioning is multi-step upgrade here.
changing back need another PR: a new tag will be added once this PR is merged, then next one would switch back.

Comment on lines +54 to +55
/// Stress test with varied payload sizes for GPU memory
/// leakage/fragmentation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this test leakage or fragmentation?

iiuc, this only put lots of running instances, and in case we didn't clean up (after use) the memory, it would overflow the memory?

but how does it test fragmentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested by Salman. It really depends on the GPU memory management. Hypothesis is that if you make many GPU memory allocations and they are not cleaned up fast enough, then there're potentially fragmentations that prevent you from allocating even if the GPU memory is not full.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz put this explanation in a comment 🙏

fn ark_affine_to_icicle(p: ark_bn254::G1Affine) -> icicle_bn254::curve::G1Affine {
icicle_bn254::curve::G1Affine {
x: icicle_bn254::curve::BaseField::from(p.x.0 .0),
y: icicle_bn254::curve::BaseField::from(p.y.0 .0),
x: unsafe { ark_std::mem::transmute(p.x.into_bigint()) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and did we add test for this unsafe transmute anywhere?

also can you add comment that even though arkworks uses u64 limb (as opposed to u32 in icicle), they are both little-endian encoded, thus agree on a byte-level, which is why this unsafe transmute is correct.

^^ am I right?

Comment on lines +66 to 67
DeviceVec<IcicleAffine<<UnivariateKzgPCS<E> as GPUCommittable<E>>::IC>>,
CudaStream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we strictly need type AdvzGPU? the 'srs lifetime can be removed at least?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we could remove the lifetime specifier. I'm not sure what do you mean, AdvzGPU takes additional type generics for on GPU srs and CudaStream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AdvzGPU takes additional type generics for on GPU srs and CudaStream.

i forgot about this. nvm. maybe just remove the lifetime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants