Commit 1092693

feat: sha256 (#86)
* feat: main hashing logic + test
* feat: working sha256 digest
* clean: docs and lint
* clean: remaining docs and code comments
* Update README.md
1 parent 13239a0 commit 1092693

6 files changed

+322
-0
lines changed

Cargo.lock

+79
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

+2
@@ -10,7 +10,9 @@ version ="0.1.0"
 [dependencies]
 rand ="0.8.5"
 itertools="0.13.0"
+hex ="0.4.3"
 
 [dev-dependencies]
 rstest ="0.21.0"
 pretty_assertions="1.4.0"
+sha2 ="0.10.8"

src/hashes/README.md

+27
@@ -0,0 +1,27 @@
# Hash Functions

A hash function is a function that takes an input of arbitrary length and produces a fixed-length output.
Moreover, a hash function should be deterministic, meaning that the same input always produces the same output.
We also require that the outputs be uniformly distributed over the output space, meaning that, over inputs chosen at random, every output is (approximately) equally likely.
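
To make the first two properties concrete, here is a minimal sketch using 64-bit FNV-1a. FNV-1a is a simple non-cryptographic hash chosen only for brevity; it is an assumption of this example and is not used anywhere else in this repository. Whatever the input length, the output is always a `u64`, and hashing the same bytes twice gives the same value.

```rust
/// 64-bit FNV-1a: a small, NON-cryptographic hash, used here only for illustration.
fn fnv1a_64(input: &[u8]) -> u64 {
  // Standard FNV-1a parameters for a 64-bit output.
  const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
  const PRIME: u64 = 0x0000_0100_0000_01b3;

  let mut hash = OFFSET_BASIS;
  for &byte in input {
    // Mix in one byte, then multiply (wrapping keeps the output at 64 bits).
    hash ^= u64::from(byte);
    hash = hash.wrapping_mul(PRIME);
  }
  hash
}
```

Calling `fnv1a_64(b"abc")` twice returns the same value, and an empty input returns the offset basis unchanged because the loop body never runs.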

Intuitively, the job of a hash function is to take some arbitrary data and produce an "identifier" or "fingerprint" for that data.
Provided the output space is large enough (and some other conditions, such as collision resistance, are met), these fingerprints are unique for all practical purposes.
Strictly speaking, collisions must exist because there are more possible inputs than outputs, but for a good hash function they are infeasible to find, so in effect every piece of data we care to identify hashes to its own output value.
For instance, given a complete library of works of literature, hashing the contents of each book yields a distinct identifier for each book.
A common output size is 256 bits, which gives 2^256 possible outputs.
To put this in perspective, the number of atoms in the observable universe is estimated to be around 10^80, which is within a few orders of magnitude of 2^256.
For back-of-the-envelope calculations, note that 2^10 is about 1000 = 10^3, so 2^256 = 2^6 * (2^10)^25 is about 64 * 10^75, or roughly 10^77, which is close to the estimated number of atoms in the observable universe.
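
The estimate above can be checked numerically. The sketch below compares the exact base-10 logarithm of 2^bits with the "2^10 is about 10^3" shortcut; both function names are hypothetical helpers introduced for this illustration.

```rust
/// Exact (up to floating-point error) value of log10(2^bits).
fn log10_of_pow2(bits: u32) -> f64 {
  f64::from(bits) * 2.0_f64.log10()
}

/// The rough rule: count each full 2^10 factor as 10^3 and
/// handle the leftover (fewer than 10) bits exactly.
fn rough_log10_of_pow2(bits: u32) -> f64 {
  let full_factors = bits / 10; // number of 2^10 factors
  let leftover = bits % 10; // remaining bits, always less than 10
  f64::from(full_factors) * 3.0 + f64::from(1u32 << leftover).log10()
}
```

For 256 bits the exact logarithm is about 77.06 and the rough rule gives about 76.8, so both agree that 2^256 is on the order of 10^77.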

## SHA-2
[SHA-2](https://en.wikipedia.org/wiki/SHA-2) is a family of hash functions that includes SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256.
The SHA-2 family of hash functions is widely used and considered secure.
However, you should not authenticate data by hashing a secret directly with SHA-2 (for example, as `H(secret || message)`), because SHA-256 and SHA-512 are susceptible to length-extension attacks; use HMAC or SHA-3 instead.
As with many cryptographic primitives, SHA-2 is standardized by NIST (FIPS 180-4).
It is used in many protocols, such as TLS, SSL, PGP, and SSH.

The hash function itself is based on the [Merkle-Damgård construction](https://en.wikipedia.org/wiki/Merkle–Damgård_construction): the padded input is split into fixed-size blocks, and a compression function folds each block, in order, into a running internal state.
The state after the final block is the hash of the data, which is a fixed-length output.
In our case, we use SHA-256, which processes 512-bit blocks and produces a 256-bit output.
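
The shape of this construction can be sketched in a few lines. Everything below is a toy under stated assumptions: the 8-byte block size, the initial value, the padding layout, and the `toy_compress` function are all invented for illustration and are not secure; real SHA-256 uses 64-byte blocks, its own padding rule, and a much more involved compression function.

```rust
/// Block size in bytes for this toy example (an arbitrary choice).
const BLOCK_SIZE: usize = 8;

/// A toy compression function that mixes one block into the 64-bit state.
/// Hypothetical and NOT cryptographically secure.
fn toy_compress(state: u64, block: &[u8; BLOCK_SIZE]) -> u64 {
  let word = u64::from_be_bytes(*block);
  (state ^ word).rotate_left(13).wrapping_mul(0x9e37_79b9_7f4a_7c15)
}

/// Toy padding: a 0x80 byte, zeros up to a block boundary, then the
/// original length in bits as one extra big-endian block.
fn md_pad(input: &[u8]) -> Vec<u8> {
  let mut message = input.to_vec();
  message.push(0x80);
  while message.len() % BLOCK_SIZE != 0 {
    message.push(0);
  }
  message.extend_from_slice(&((input.len() as u64) * 8).to_be_bytes());
  message
}

/// Iterates the compression function over every padded block.
fn toy_merkle_damgard(input: &[u8]) -> u64 {
  // Arbitrary initial value for the running state.
  let mut state: u64 = 0x6a09_e667_f3bc_c908;
  for chunk in md_pad(input).chunks_exact(BLOCK_SIZE) {
    let mut block = [0u8; BLOCK_SIZE];
    block.copy_from_slice(chunk);
    state = toy_compress(state, &block);
  }
  state
}
```

The output size is fixed by the state size (64 bits here), no matter how many blocks the input spans. `Sha256::digest` follows the same outline with a 256-bit state and 512-bit blocks.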

For more detail on the implementation of SHA-256, see [this resource](https://helix.stormhub.org/papers/SHA-256.pdf).
You can also find JavaScript code and a working applet for SHA-256 [here](https://www.movable-type.co.uk/scripts/sha256.html).
Our implementation can be found in the `src/hashes/sha256.rs` file with detailed documentation and comments.

src/hashes/mod.rs

+6
@@ -0,0 +1,6 @@
//! Hashing algorithms
//!
//! This module contains implementations of various hashing algorithms.
//! Currently, the only supported algorithm is SHA-256.

pub mod sha256;

src/hashes/sha256.rs

+207
@@ -0,0 +1,207 @@
//! An implementation of the SHA-256 hash function.
//! This module provides an implementation of the SHA-256 hash function, which is a widely-used
//! cryptographic hash function that produces a 256-bit hash value from an input message.

/// The round constants used in the SHA-256 hash computation.
/// These constants are not random: they are the first 32 bits of the fractional parts of the cube
/// roots of the first 64 prime numbers.
const K: [u32; 64] = [
  0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
  0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
  0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
  0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
  0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
  0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
  0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
  0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
];

/// The initial hash values for SHA-256.
/// These are the first 32 bits of the fractional parts of the square roots of the first 8 prime
/// numbers.
const H: [u32; 8] =
  [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19];

/// A rotation function that rotates a 32-bit word to the right by `N` bits.
/// The plain shifts fill the vacated bits with zeroes, so OR-ing the two shifted halves together
/// yields the rotation. `N` must satisfy `0 < N < 32`, otherwise the left shift overflows.
pub const fn rotate_right<const N: usize>(x: u32) -> u32 { (x >> N) | (x << (32 - N)) }

/// The [Σ0](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is one of the functions used in the compression rounds of the hash computation.
pub const fn sigma_0(x: u32) -> u32 {
  rotate_right::<2>(x) ^ rotate_right::<13>(x) ^ rotate_right::<22>(x)
}

/// The [Σ1](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is one of the functions used in the compression rounds of the hash computation.
pub const fn sigma_1(x: u32) -> u32 {
  rotate_right::<6>(x) ^ rotate_right::<11>(x) ^ rotate_right::<25>(x)
}

/// The [σ0](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is one of the message schedule functions used in the hash computation.
pub const fn small_sigma_0(x: u32) -> u32 {
  rotate_right::<7>(x) ^ rotate_right::<18>(x) ^ (x >> 3)
}

/// The [σ1](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is one of the message schedule functions used in the hash computation.
pub const fn small_sigma_1(x: u32) -> u32 {
  rotate_right::<17>(x) ^ rotate_right::<19>(x) ^ (x >> 10)
}

/// The [Ch](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is a logical function used in the hash computation to "choose" between `y` and `z`
/// bit by bit, with `x` as the condition: where a bit of `x` is 1 the bit of `y` is taken,
/// otherwise the bit of `z`.
pub const fn ch(x: u32, y: u32, z: u32) -> u32 { (x & y) ^ (!x & z) }

/// The [Maj](https://en.wikipedia.org/wiki/SHA-2) function used in SHA-256.
/// This is a logical function used in the hash computation to select, bit by bit, the "majority"
/// of the values of `x`, `y`, and `z`.
pub const fn maj(x: u32, y: u32, z: u32) -> u32 { (x & y) ^ (x & z) ^ (y & z) }

/// An empty struct to encapsulate the SHA-256 hash function.
pub struct Sha256;

impl Sha256 {
  /// The SHA-256 hash function.
  /// This function takes an input byte slice and returns a 32-byte array representing the hash
  /// of the input.
  ///
  /// # Arguments
  /// * `input` - A byte slice representing the input to the hash function.
  ///
  /// # Returns
  /// A 32-byte array representing the hash of the input.
  ///
  /// # Example
  /// ```
  /// use ronkathon::hashes::sha256::Sha256;
  ///
  /// let input = b"abc";
  /// let output = Sha256::digest(input);
  /// assert_eq!(
  ///   hex::encode(output),
  ///   "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
  /// );
  /// ```
  pub fn digest(input: &[u8]) -> [u8; 32] {
    ///////////////////////////////////////////////////////////////////////////
    // Set up initial data structures.
    //
    // Initialize the hash values.
    // This will be the variable we update as we process the input.
    let mut hash = H;

    // Initialize the array of message words which will be used in the hash computation.
    let mut words = [0u32; 64];

    ///////////////////////////////////////////////////////////////////////////
    // Padding
    //
    // The message is padded so that its length is congruent to 448 modulo 512.
    // The padding consists of a single 1 bit followed by zeros, and the length
    // of the message in bits is appended to the end as a 64-bit big-endian integer.
    let len = input.len() as u64 * 8;
    let len_with_1_appended = len + 1;
    let len_mod = len_with_1_appended % 512;
    let zero_padding = if len_mod > 448 { 512 + 448 - len_mod } else { 448 - len_mod };
    // Total padded length in bytes, including the 8-byte length field.
    let len_padded = (len_with_1_appended as usize + zero_padding as usize) / 8 + 8;

    // Create the padded message from the input.
    let mut message = Vec::with_capacity(len_padded);
    message.extend_from_slice(input);

    // Push on the 1 bit followed by seven zero bits.
    message.push(0x80);

    // Push on the remaining zero bytes. Seven of the `zero_padding` bits were already
    // supplied by the 0x80 byte above, so this works for inputs of any length
    // (the previous `56 - len / 8 - 1` underflowed for inputs of 56 bytes or more).
    message.resize(message.len() + (zero_padding as usize - 7) / 8, 0);

    // Push on the length of the message in bits.
    message.extend_from_slice(&len.to_be_bytes());
    ///////////////////////////////////////////////////////////////////////////

    ///////////////////////////////////////////////////////////////////////////
    // Hashing
    //
    // Process the message in 512-bit blocks.
    for chunk in message.chunks(64) {
      // Copy the bytes into the words array to fill the first 16 words.
      for i in 0..16 {
        words[i] =
          u32::from_be_bytes([chunk[i * 4], chunk[i * 4 + 1], chunk[i * 4 + 2], chunk[i * 4 + 3]]);
      }

      // Use the message schedule functions to extend the first 16 words into
      // the remaining 48 words.
      for i in 16..64 {
        words[i] = small_sigma_1(words[i - 2])
          .wrapping_add(words[i - 7])
          .wrapping_add(small_sigma_0(words[i - 15]))
          .wrapping_add(words[i - 16]);
      }

      // Initialize the working variables.
      let mut a = hash[0];
      let mut b = hash[1];
      let mut c = hash[2];
      let mut d = hash[3];
      let mut e = hash[4];
      let mut f = hash[5];
      let mut g = hash[6];
      let mut h = hash[7];

      // Perform the main hash computation (64 compression rounds).
      for i in 0..64 {
        let temp1 = h
          .wrapping_add(sigma_1(e))
          .wrapping_add(ch(e, f, g))
          .wrapping_add(K[i])
          .wrapping_add(words[i]);
        let temp2 = sigma_0(a).wrapping_add(maj(a, b, c));

        h = g;
        g = f;
        f = e;
        e = d.wrapping_add(temp1);
        d = c;
        c = b;
        b = a;
        a = temp1.wrapping_add(temp2);
      }

      // Update the hash values.
      hash[0] = hash[0].wrapping_add(a);
      hash[1] = hash[1].wrapping_add(b);
      hash[2] = hash[2].wrapping_add(c);
      hash[3] = hash[3].wrapping_add(d);
      hash[4] = hash[4].wrapping_add(e);
      hash[5] = hash[5].wrapping_add(f);
      hash[6] = hash[6].wrapping_add(g);
      hash[7] = hash[7].wrapping_add(h);
    }
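
    // Convert the hash words to a big-endian byte array and return it.
    // This produces the same bytes as the earlier `to_be` + `transmute` approach
    // without needing `unsafe`.
    let mut output = [0u8; 32];
    for (bytes, word) in output.chunks_exact_mut(4).zip(hash.iter()) {
      bytes.copy_from_slice(&word.to_be_bytes());
    }
    output
  }
}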

#[cfg(test)]
mod tests {

  use super::*;
  #[test]
  fn sha256_hash() {
    let input = b"abc";
    let output = Sha256::digest(input);
    assert_eq!(
      hex::encode(output),
      "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
    );
  }
}
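
The doc comments on `K` and `H` say the constants come from the cube roots and square roots of small primes. As a rough sanity check, the sketch below recomputes such values with `f64` arithmetic. The helper names `first_32_fractional_bits`, `k_from_prime`, and `h_from_prime` are hypothetical and not part of this commit. Relying on `f64` precision is an assumption that holds comfortably for the first primes but is no substitute for the values in the standard.

```rust
/// Returns the first 32 bits of the fractional part of `x`.
/// Hypothetical helper for illustration; accuracy is limited by `f64` precision.
fn first_32_fractional_bits(x: f64) -> u32 {
  // Scale the fractional part by 2^32, then truncate toward zero.
  (x.fract() * 4_294_967_296.0) as u32
}

/// Candidate round constant derived from the cube root of `prime`.
fn k_from_prime(prime: u32) -> u32 {
  first_32_fractional_bits(f64::from(prime).cbrt())
}

/// Candidate initial hash value derived from the square root of `prime`.
fn h_from_prime(prime: u32) -> u32 {
  first_32_fractional_bits(f64::from(prime).sqrt())
}
```

For the first prime, `k_from_prime(2)` reproduces `K[0]` (`0x428a2f98`) and `h_from_prime(2)` reproduces `H[0]` (`0x6a09e667`).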
