[Review Request] AVX512 Implementation of Keccak-f1600 #531

Open
zplzpl opened this issue Jan 22, 2025 · 1 comment

zplzpl commented Jan 22, 2025

Hi,

I've implemented an AVX512 version of the Keccak-f1600 permutation function and would appreciate a review from the community to ensure correctness and optimal performance.

Key implementation details:

  • Uses AVX512 instructions for SIMD parallelization
  • Processes 8 states in parallel (using ZMM registers)
  • Follows similar structure to the existing AVX2 implementation
  • Includes a "turbo" mode that skips the first 12 rounds for certain use cases
  • Memory alignment requirement: 64-byte aligned state

Would particularly appreciate feedback on:

  1. Are the rotation values and state transformations correct?
  2. Is the register allocation optimal?
  3. Are there any potential performance improvements?
  4. Any concerns about the memory alignment requirements?
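One way to cross-check the rotation values from question 1 is to regenerate the rho offsets from the rule in FIPS 202 and compare them with the constants used in the assembly. The following is a minimal sketch that is independent of the posted implementation; the function name `rhoOffsets` and the `x + 5*y` lane indexing are assumptions chosen for illustration.

```go
package main

import "fmt"

// rhoOffsets returns the 25 rho rotation amounts of Keccak-f[1600],
// indexed as x + 5*y, generated from the rule given in FIPS 202.
// The name and the indexing are illustrative, not taken from the issue.
func rhoOffsets() [25]uint {
	var r [25]uint // lane (0,0) is never rotated and stays 0
	x, y := 1, 0
	for t := 0; t < 24; t++ {
		// Offset for step t is the triangular number (t+1)(t+2)/2 mod 64.
		r[x+5*y] = uint((t+1)*(t+2)/2) % 64
		// Move to the next lane: (x, y) -> (y, 2x + 3y mod 5).
		x, y = y, (2*x+3*y)%5
	}
	return r
}

func main() {
	r := rhoOffsets()
	// Print one row (fixed y) per line for comparison with the assembly.
	for y := 0; y < 5; y++ {
		fmt.Println(r[5*y : 5*y+5])
	}
}
```

Assembly implementations often fuse the rho and pi steps, so the constants may appear in a different order there than in this lane-indexed table; the set of values per lane should still match.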

The full implementation can be found in the code snippet below [link to code].

Thank you in advance for your time and expertise!

Source code:

Current issue:

  • Running the example test case ends in a segmentation fault (trace below).
  • All other test cases pass.

`unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x50f7e7]

goroutine 1 gp=0xc0000061c0 m=0 mp=0x6580e0 [running]:
runtime.throw({0x549d01?, 0x0?})
/usr/lib/golang/src/runtime/panic.go:1023 +0x5c fp=0xc0000c8fd0 sp=0xc0000c8fa0 pc=0x438e5c
runtime.sigpanic()
/usr/lib/golang/src/runtime/signal_unix.go:895 +0x285 fp=0xc0000c9030 sp=0xc0000c8fd0 pc=0x44ff45
github.com/cloudflare/circl/simd/keccakf1600.f1600x8AVX512(0xc0000c91a0, 0x63d520, 0x0)
/home/ec2-user/go/src/circl/simd/keccakf1600/f1600x8_amd64.s:20 +0x27 fp=0xc0000c9038 sp=0xc0000c9030 pc=0x50f7e7
github.com/cloudflare/circl/simd/keccakf1600.permuteSIMDx8(...)
/home/ec2-user/go/src/circl/simd/keccakf1600/f1600x4_amd64.go:9
github.com/cloudflare/circl/simd/keccakf1600.(*StateX8).Permute(0x0?)
/home/ec2-user/go/src/circl/simd/keccakf1600/f1600x.go:180 +0x73 fp=0xc0000c9068 sp=0xc0000c9038 pc=0x50bfd3
github.com/cloudflare/circl/simd/keccakf1600_test.Example()
/home/ec2-user/go/src/circl/simd/keccakf1600/example_test.go:63 +0x5af fp=0xc0000c9b10 sp=0xc0000c9068 pc=0x51150f
testing.runExample({{0x54a05c, 0x7}, 0x556230, {0x550df2, 0x21}, 0x0})
/usr/lib/golang/src/testing/run_example.go:63 +0x2de fp=0xc0000c9c08 sp=0xc0000c9b10 pc=0x4c941e
testing.runExamples(0x57f780?, {0x64ef40, 0x1, 0x6?})
/usr/lib/golang/src/testing/example.go:40 +0x126 fp=0xc0000c9ca0 sp=0xc0000c9c08 pc=0x4c4e06
testing.(*M).Run(0xc0000720a0)
/usr/lib/golang/src/testing/testing.go:2029 +0x75d fp=0xc0000c9ed0 sp=0xc0000c9ca0 pc=0x4cf33d
main.main()
_testmain.go:67 +0x16c fp=0xc0000c9f50 sp=0xc0000c9ed0 pc=0x51182c
runtime.main()
/usr/lib/golang/src/runtime/proc.go:271 +0x29d fp=0xc0000c9fe0 sp=0xc0000c9f50 pc=0x43b91d
runtime.goexit({})
/usr/lib/golang/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000c9fe8 sp=0xc0000c9fe0 pc=0x46ff01`

armfazh commented Feb 4, 2025

It seems that the memory is not aligned (see the AVX512 requirements for memory alignment).

My guess is that this code must change for AVX512.
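If the first argument shown in the trace, 0xc0000c91a0, is the state pointer, it is a multiple of 32 but not of 64, which would be consistent with a 32-byte alignment step being reused for 64-byte ZMM loads. A minimal sketch of how a 64-byte-aligned window could be carved out of a Go array follows. The type `stateX8`, its `lanes` method, and the helper `isAligned64` are hypothetical names for illustration and are not the code from the issue.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stateX8 is a hypothetical container, not the type from the issue.
// It holds 8 states x 25 lanes = 200 uint64s, plus 7 words of headroom
// so that a 64-byte-aligned window of 200 words always fits.
type stateX8 struct {
	buf [8*25 + 7]uint64
}

// lanes returns the 200-word window of buf that starts on a 64-byte
// boundary. The offset is recomputed on every call because the address
// of buf changes whenever the struct value is copied.
func (s *stateX8) lanes() []uint64 {
	addr := uintptr(unsafe.Pointer(&s.buf[0]))
	// On amd64 a uint64 array is 8-byte aligned, so rem is in 0..7.
	rem := int(addr&63) >> 3
	off := 0
	if rem != 0 {
		off = 8 - rem
	}
	return s.buf[off : off+8*25]
}

// isAligned64 reports whether the first element of p sits on a 64-byte boundary.
func isAligned64(p []uint64) bool {
	return uintptr(unsafe.Pointer(&p[0]))&63 == 0
}

func main() {
	s := new(stateX8)
	fmt.Println(isAligned64(s.lanes()))
}
```

A check like `isAligned64` placed just before the assembly call can turn a hard fault into a clear panic message while debugging. Alternatively, the assembly could use unaligned loads and stores, at a possible cost in performance.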
