**Zero overhead:** EE provides the same latency as plaintext inference, with no slowdowns.

**128k factorial:** EE involves massive combinatorial complexity, on the order of 128k factorial, which contributes to strong security guarantees.
## Our Journey to Encryption
We investigated multiple methodologies to ensure end-to-end data privacy within the Nesa network. **Differential privacy** seeks to obscure sensitive details by adding statistical noise, but it cannot fully prevent inference on raw data once it is processed by a model. **Homomorphic encryption** (HE), on the other hand, is mathematically elegant: it permits computations directly on encrypted data. This is achieved through operations that are homomorphic to addition and multiplication, enabling algebraic manipulation of ciphertexts that, once decrypted, yield the correct plaintext results. Such a property is exceptionally appealing in scenarios like outsourced cloud computations, where one can perform inference off-site without revealing the sensitive inputs.

However, standard HE schemes are tailored around arithmetic operations. Neural networks, especially those with layers like attention mechanisms, activation functions, or normalization steps, do not map cleanly onto ring or field operations alone. Adapting HE to these complex transformations typically incurs prohibitive computational costs, slowing inference to impractical speeds.

Despite this, the conceptual promise of HE, running inference on encrypted data without decryption, prompted us to seek an alternative. We aimed to preserve the protective qualities of encrypted computation while working around the bottlenecks introduced by non-linear neural functions.
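
To make the homomorphic property discussed above concrete, here is a minimal sketch using textbook RSA, which happens to be multiplicatively homomorphic. This is an illustration chosen for brevity, not a scheme the Nesa documentation refers to. The tiny parameters and the helper names `encrypt` and `decrypt` are assumptions, and unpadded RSA with such small primes offers no real security.

```python
# Toy illustration of a homomorphic operation using textbook (unpadded) RSA.
# Assumption: parameters are deliberately tiny and insecure; for intuition only.
P, Q = 61, 53
N = P * Q                            # public modulus (3233)
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (requires Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N with the public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(ciphertext, D, N)


a, b = 7, 6

# The server multiplies ciphertexts without ever seeing a or b.
product_ciphertext = (encrypt(a) * encrypt(b)) % N

# Decrypting the product of ciphertexts yields the product of plaintexts.
assert decrypt(product_ciphertext) == (a * b) % N
```

Only multiplication carries over in this toy. Nothing comparable exists here for an operation such as ReLU or softmax, which is the kind of gap the preceding paragraphs describe for neural networks.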
@@ -37,9 +37,9 @@ Rather than relying exclusively on arithmetic operations compatible with HE, EE
Formally, given a plaintext $p_i$ and a ciphertext $c_i$ with $p_i = \mathrm{decrypt}(c_i)$, our EE framework ensures that $\mathrm{decrypt}(\mathrm{nonlinear}(c_1, c_2)) = \mathrm{nonlinear}(p_1, p_2)$, where "nonlinear" represents a specific set of non-linear neural functions.

Crucially, the complexity of inference under EE does not surpass that of the unencrypted version. Each forward pass through the network involves approximately the same computational cost. Thus, **inference latency remains unchanged**, a significant advantage compared to conventional HE-based techniques.
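
The equivariance property stated above can be checked on a toy case. The sketch below assumes, purely for illustration, that encryption is a secret permutation of vector coordinates. The source does not specify the actual transformation EE uses, so `make_key`, `encrypt`, `decrypt`, and `gated_product` are hypothetical names. Element-wise non-linearities commute with such a permutation, so decrypting the output matches the plaintext computation exactly, and the non-linear function performs the same amount of work on either input.

```python
import random


def relu(vector):
    """Element-wise ReLU, a typical non-linear neural function."""
    return [max(0.0, x) for x in vector]


def gated_product(a, b):
    """Two-argument non-linearity: element-wise a * relu(b)."""
    return [x * max(0.0, y) for x, y in zip(a, b)]


def make_key(size, seed):
    """Hypothetical secret key: a random permutation of coordinate positions."""
    rng = random.Random(seed)
    key = list(range(size))
    rng.shuffle(key)
    return key


def encrypt(plaintext, key):
    """Move plaintext coordinate i to position key[i]."""
    ciphertext = [0.0] * len(plaintext)
    for i, value in enumerate(plaintext):
        ciphertext[key[i]] = value
    return ciphertext


def decrypt(ciphertext, key):
    """Read coordinate i back from position key[i]."""
    return [ciphertext[key[i]] for i in range(len(key))]


key = make_key(size=4, seed=0)
p1 = [1.5, -2.0, 0.25, -0.5]
p2 = [-1.0, 3.0, 2.0, -4.0]
c1, c2 = encrypt(p1, key), encrypt(p2, key)

# decrypt(nonlinear(c1, c2)) == nonlinear(p1, p2)
assert decrypt(gated_product(c1, c2), key) == gated_product(p1, p2)
assert decrypt(relu(c1), key) == relu(p1)
```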

To illustrate this with a tangible example, consider transformer-based models like ChatGPT, Claude, or Llama. These models employ tokenizers to convert text into discrete tokens, each mapped to an integer token ID. Under EE, we implement a specialized tokenizer that produces a different, encrypted set of token IDs. The network, now adapted to EE, treats these encrypted token IDs as standard inputs. It processes them identically to how it would process normal tokens, ultimately returning encrypted output tokens that can be decrypted locally by the user. The following diagram outlines this workflow:
<div align="center">
  <img src="tokenizer.png" alt="tokenizer diagram">
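
The tokenizer workflow described above can be sketched end to end with a toy model. This is a minimal sketch under a loud assumption: the encrypted tokenizer is modeled as a secret permutation of token IDs, which the source does not confirm. The names `make_tokenizer_key`, `make_model`, `adapt_model`, and `next_token` are hypothetical, and an 8-entry vocabulary provides no real security.

```python
import math
import random

VOCAB_SIZE = 8  # hypothetical toy vocabulary; real tokenizers are far larger


def make_tokenizer_key(seed):
    """Hypothetical private key: a secret permutation of token IDs."""
    rng = random.Random(seed)
    to_encrypted = list(range(VOCAB_SIZE))
    rng.shuffle(to_encrypted)
    to_plain = [0] * VOCAB_SIZE
    for plain_id, encrypted_id in enumerate(to_encrypted):
        to_plain[encrypted_id] = plain_id
    return to_encrypted, to_plain


def make_model(seed):
    """Toy 'model': scores[i][j] is the raw score of token j following token i."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(VOCAB_SIZE)]
            for _ in range(VOCAB_SIZE)]


def adapt_model(scores, to_encrypted):
    """Re-index the model once so it consumes and emits encrypted token IDs."""
    adapted = [[0.0] * VOCAB_SIZE for _ in range(VOCAB_SIZE)]
    for i in range(VOCAB_SIZE):
        for j in range(VOCAB_SIZE):
            adapted[to_encrypted[i]][to_encrypted[j]] = scores[i][j]
    return adapted


def next_token(scores, token_id):
    """Forward pass: table lookup, tanh non-linearity, then argmax."""
    activations = [math.tanh(s) for s in scores[token_id]]
    return max(range(VOCAB_SIZE), key=lambda j: activations[j])


def run(scores, token_ids):
    """Predict one output token per input token."""
    return [next_token(scores, t) for t in token_ids]


to_encrypted, to_plain = make_tokenizer_key(seed=7)
plain_model = make_model(seed=1)
server_model = adapt_model(plain_model, to_encrypted)

prompt_ids = [3, 0, 5, 5, 2]
encrypted_prompt = [to_encrypted[t] for t in prompt_ids]     # client side
encrypted_output = run(server_model, encrypted_prompt)       # server side
decrypted_output = [to_plain[t] for t in encrypted_output]   # client side

# The decrypted result matches plaintext inference exactly.
assert decrypted_output == run(plain_model, prompt_ids)
```

In this sketch the server-side forward pass executes the same operations as the plaintext one, which mirrors the unchanged-latency claim, and the server only ever handles permuted IDs.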
@@ -54,7 +54,7 @@ Below is a more detailed breakdown of how Equivariant Encryption matches or outp
| **Data Confidentiality (Server Blindness)** | The server never sees plaintext data. | The server never sees plaintext data. |
| **End-to-End Encrypted Computation** | Operations should be fully on encrypted data, with no intermediate decryptions. | EE models run directly on encrypted tokens. No intermediate decryptions are required. |
| **User-Controlled Encryption** | Users should hold keys and control encryption/decryption. | Only the user can map plaintext to transformed tokens using the EE tokenizer as a private key. |
| **Preservation of Accuracy** | The decrypted output should match the result of plaintext inference. | EE ensures final results are identical to plaintext inference outputs, with no accuracy loss. |
| **Support for Arbitrary Model Structures** | HE struggles with non-linearities and complex NN layers. | EE is designed for modern neural architectures and preserves non-linearities. |