Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Add support for encrypted images #2297

Draft
wants to merge 12 commits into
base: criu-dev
Choose a base branch
from

Conversation

rst0git
Copy link
Member

@rst0git rst0git commented Nov 7, 2023

This pull request extends CRIU with support for encrypted images. A new cli option, -e|--encrypt, is used to enable this functionality with the dump command.

The implementation is based on the existing integration with GnuTLS, using ChaCha20-Poly1305 for protobuf and raw images, and AES-XTS for memory pages. The symmetric keys used for encryption are randomly generated, encrypted with a public key loaded from X.509 certificate and stored in cipher.img. During restore, if cipher.img exists, CRIU will load a corresponding private key from a PEM file and decrypt the symmetric keys.

Usage example:

criu dump -e ...
criu restore ...

The following figure shows the results of performance evaluation, where CRIUsec includes the changes in this pull request, CRIU is used without encryption as a baseline, and GnuPG, OpenSSL, age are alternative solutions used with post-dump action-script for comparison.

criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Fixed Show fixed Hide fixed
criu/tls.c Dismissed Show dismissed Hide dismissed
Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

@rst0git rst0git force-pushed the encrypted-images branch 2 times, most recently from be3855a to 541a702 Compare November 7, 2023 10:30
criu/tls.c Fixed Show fixed Hide fixed
@rst0git rst0git force-pushed the encrypted-images branch 2 times, most recently from 8adc076 to 5a3d7f9 Compare November 7, 2023 11:29
@codecov-commenter
Copy link

codecov-commenter commented Nov 7, 2023

Codecov Report

Attention: 687 lines in your changes are missing coverage. Please review.

Comparison is base (b17a73b) 70.62% compared to head (a58f851) 69.24%.

❗ Current head a58f851 differs from pull request most recent head 42c52ff. Consider uploading reports for the commit 42c52ff to get more accurate results

Files Patch % Lines
criu/tls.c 4.68% 549 Missing ⚠️
criu/image.c 29.78% 33 Missing ⚠️
criu/protobuf.c 25.00% 30 Missing ⚠️
criu/page-xfer.c 24.13% 22 Missing ⚠️
criu/util.c 22.72% 17 Missing ⚠️
criu/mem.c 26.31% 14 Missing ⚠️
criu/pagemap.c 16.66% 5 Missing ⚠️
criu/cr-dump.c 50.00% 4 Missing ⚠️
criu/unittest/mock.c 0.00% 4 Missing ⚠️
criu/config.c 25.00% 3 Missing ⚠️
... and 3 more
Additional details and impacted files
@@             Coverage Diff              @@
##           criu-dev    #2297      +/-   ##
============================================
- Coverage     70.62%   69.24%   -1.38%     
============================================
  Files           134      134              
  Lines         33316    34038     +722     
============================================
+ Hits          23528    23570      +42     
- Misses         9788    10468     +680     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@adrianreber
Copy link
Member

Hello @ueno, @rst0git mentioned that he talked to you about the ciphers used in this PR. Can you take a look at this PR regarding the used ciphers and if they make sense. Unfortunately most of us here do not know that much about encryption and if you could take a look that would be great. Thanks.

@adrianreber
Copy link
Member

@rst0git What about using constructs like cleanup_free at some places? Could simplify some code paths a bit.

@adrianreber
Copy link
Member

I think we are trying to avoid C++ comments with //. I see a couple of them.

@adrianreber
Copy link
Member

What happens if CRIU finds an encrypted image? Reading the code and the documentation it will look for a PEM file, right? How will CRIU react if the PEM file doesn't exist?

@rst0git
Copy link
Member Author

rst0git commented Dec 6, 2023

What about using constructs like cleanup_free at some places? Could simplify some code paths a bit.

Sounds like a good idea!

I think we are trying to avoid C++ comments with //. I see a couple of them.

Thanks, I've updated the comments to use /*.

What happens if CRIU finds an encrypted image? Reading the code and the documentation it will look for a PEM file, right?

When criu dump is used with --encrypt, it generates an image file called cipher.img. Subsequently, criu restore checks if cipher.img exists in the directory with image files. This file serves as an indicator that all other CRIU images are encrypted. In such case, CRIU loads a private key from the default location, which is /etc/pki/criu/private/key.pem, or from an alternative path specified using the --tls-key option.

How will CRIU react if the PEM file doesn't exist?

If the PEM file with private key doesn't exist, CRIU fails with the following error:

(00.001370) tls: Loading private key from /etc/pki/criu/private/key.pem
(00.001405) Error (criu/tls.c:585): tls: Can't stat /etc/pki/criu/private/key.pem: No such file or directory

criu/image.c Outdated
void *buf;
/* 96-bits nonce and 128-bits tag for ChaCha20-Poly1305 */
uint8_t nonce_data[12];
uint8_t tag_data[16];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe it might make sense to define these constants as a macro?

/* The data must be small enough to use plain RSA
* https://github.com/gnutls/nettle/blob/fe7ae87d/pkcs1-encrypt.c#L66
*/
max_block_size = key_len / 8 - 11;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a bit of a concern on using RSA encryption with PKCS#1 padding for new application, though nettle/gnutls implementation should not be vulnerable to known timing attacks (e.g., Marvin). If possible, I would recommend using alternatives like RSA-OAEP or some sort of KEM, though neither of them are implemented yet in nettle.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As stated in the Internet Draft: https://datatracker.ietf.org/doc/draft-kario-rsa-guidance/ new deployments MUST NOT use PKCS#1v1.5 encryption padding

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, RSA-OAEP is now available in gnutls >= 3.8.4 (I recommend 3.8.5 for a minor glitch). To use:

gnutls_x509_spki_t spki;

gnutls_x509_spki_init(&spki);
gnutls_x509_spki_set_rsa_oaep_params(spki, dig, label);
gnutls_pubkey_set_spki(pubkey, spki, 0); // pubkey is a regular RSA key

gnutls_pubkey_encrypt_data(pubkey, 0, &plaintext, &ciphertext);

The maximum size of plaintext can be calculated with: k - 2 * hLen - 2, where k is the size of the RSA key in bytes, while hLen is the output size of the hash function dig. Decryption also needs a similar preparation.

ciphertext.data = ce->token.data;
ciphertext.size = ce->token.len;

ret = gnutls_privkey_decrypt_data(privkey, 0, &ciphertext, &decrypted_token);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't apply to the use-case here, but it might make sense to avoid side channels by removing the size check below (and possibly memcpy). See also https://gitlab.com/gnutls/gnutls/-/blob/3f42ae70a1672673cb8f27c2dd3da1a34d1cbdd7/lib/auth/rsa.c#L194


ret = gnutls_pubkey_init(&pubkey);
if (ret < 0) {
tls_perror("Failed to initialize public key", ret);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cert_data.data is leaking? Maybe good to use cleanup_free as suggested already. Also you could use the __attribute__((cleanup)) for other gnutls data types, you can find some examples here.

crit/crit/__main__.py Fixed Show fixed Hide fixed

try:
img = pycriu.images.load(inf(opts), opts['pretty'], opts['nopl'])
img = pycriu.images.load(inf(opts), opts['pretty'], opts['nopl'], token=token)

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
@@ -101,10 +154,11 @@

def explore_ps(opts):
pss = {}
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'))
token = get_cipher_token(opts)
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'), token=token)

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
for p in ps_img['entries']:
core = pycriu.images.load(
dinf(opts, 'core-%d.img' % get_task_id(p, 'pid')))
dinf(opts, 'core-%d.img' % get_task_id(p, 'pid')), token=token)

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
@@ -154,7 +210,7 @@
return None

if ft['img'] is None:
ft['img'] = pycriu.images.load(dinf(opts, img))['entries']
ft['img'] = pycriu.images.load(dinf(opts, img), token=token)['entries']

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
@@ -258,11 +315,12 @@


def explore_mems(opts):
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'))
token = get_cipher_token(opts)
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'), token=token)

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
vids = vma_id()
for p in ps_img['entries']:
pid = get_task_id(p, 'pid')
mmi = pycriu.images.load(dinf(opts, 'mm-%d.img' % pid))['entries'][0]
mmi = pycriu.images.load(dinf(opts, 'mm-%d.img' % pid), token=token)['entries'][0]

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
@@ -311,12 +369,13 @@


def explore_rss(opts):
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'))
token = get_cipher_token(opts)
ps_img = pycriu.images.load(dinf(opts, 'pstree.img'), token=token)

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
pid))['entries'][0]['vmas']
pms = pycriu.images.load(dinf(opts, 'pagemap-%d.img' % pid))['entries']
vmas = pycriu.images.load(
dinf(opts, 'mm-%d.img' % pid), token=token)['entries'][0]['vmas']

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
pms = pycriu.images.load(dinf(opts, 'pagemap-%d.img' % pid))['entries']
vmas = pycriu.images.load(
dinf(opts, 'mm-%d.img' % pid), token=token)['entries'][0]['vmas']
pms = pycriu.images.load(dinf(opts, 'pagemap-%d.img' % pid), token=token)['entries']

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
@@ -26,11 +37,37 @@

#define tls_perror(msg, ret) pr_err("%s: %s\n", msg, gnutls_strerror(ret))

#define cleanup_gnutls_datum __attribute__((cleanup(_cleanup_gnutls_datum)))
static inline void _cleanup_gnutls_datum(gnutls_datum_t *p)

Check notice

Code scanning / CodeQL

Unused static function Note

Static function _cleanup_gnutls_datum is unreachable
Copy link

A friendly reminder that this PR had no activity for 30 days.

@rst0git rst0git added no-auto-close Don't auto-close as a stale issue and removed stale-pr labels Feb 10, 2024
goto err;
}
if (ret == 0) {
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we return an error if bytes_read isn't equal to data_size?

}

/* Write ciphertext data */
ret = write(fd_out, buf, data_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it mean that a reader needs needs to read chunks of the same size?

@rst0git rst0git force-pushed the encrypted-images branch from 6045665 to 3936cbf Compare July 10, 2024 12:52
if args.type == 'rsa':
generate_rsa_key(args.bits)
elif args.type == 'ec':
generate_ec_key()

Check failure

Code scanning / CodeQL

Wrong number of arguments in a call Error

Call to
function generate_ec_key
with too few arguments; should be no fewer than 1.
* restorer can wait() for it when the restore stage is done.
*/
ta->helpers = (pid_t *)rst_mem_align_cpos(RM_PRIVATE);
child = rst_mem_alloc(sizeof(*child), RM_PRIVATE);

Check failure

Code scanning / CodeQL

Inconsistent nullness check Error

The result of this call to rst_mem_alloc is not checked for null, but 91% of calls to rst_mem_alloc check for null.
This patch extends CRIU dump with support for encryption of images
using ChaCha20-Poly1305 authenticated-encryption in combination with
X.509 certificates.

The '--encrypt' option can be used with the dump/pre-dump commands to
enable this functionality. When this option has been specified during
dump, the GnuTLS library will be used to load a public key from X.509
certificate, and to generate a 256-bit random `token`. The token's
value is then encrypted with the public key and the corresponding
ciphertext is saved in `cipher.img`. During restore, if cipher.img
exists in the images directory, the GnuTLS library will be used to
load a private key from a corresponding PEM file to decrypt the token
value.

The token value is used with ChaCha20-Poly1305 to encrypt/decrypt
all other CRIU images. The 256-bit token is used in combination with
96-bits `nonce` and 128-bits `tag` to protect data confidentiality
and provide message authentication for each data entry.

Example:
	criu dump --encrypt ...
	criu restore ...

Signed-off-by: Radostin Stoyanov <[email protected]>
This patch extends ZDTM to run `criu dump` with the `--encrypt`
option to test the encryption functionality of CRIU images.

Signed-off-by: Radostin Stoyanov <[email protected]>
'opts' is defined in cr_options.h. This header will be included in a
subsequent patch. We rename the local variable 'opts' to 'bpfmap_opts'
to avoid variable shadowing.

Signed-off-by: Radostin Stoyanov <[email protected]>
We calculate the total memory size needed for both keys and values and
allocate a single contiguous memory region using a single mmap call.
In a subsequent patch, this change would enable encrypting the combined
memory region using a single pair of ChaCha20-Poly1305 tag and nonce.

Signed-off-by: Radostin Stoyanov <[email protected]>
This patch extends dump_one_bpfmap_data() with support for encryption.

Signed-off-by: Radostin Stoyanov <[email protected]>
During checkpoint, the contents of ghost images and pipe data is
splice()-ed between file descriptors. To enable encryption for this data
we introduce `tls_encrypt_file_data()` and `tls_decrypt_file_data()`.
These functions read data from input file descriptor, perform
encryption/decryption of the data, and write it to the corresponding
output file descriptor.

Signed-off-by: Radostin Stoyanov <[email protected]>
This patch extends CRIT with the ability to decode encrypted images.
When `cipher.img` is present, crit will load the corresponding private
key (from /etc/pki/criu/private/key.pem), decrypt the cipher token and
use it to decrypt the protobuf entries in the image that is being
decoded.

Signed-off-by: Radostin Stoyanov <[email protected]>
cr_system() and cr_system_userns() are used to run external executables
such as tar, ip, and iptables. These external tools are used to create
image files in 3rd party format (i.e., raw images). In order to encrypt
the output of these tools, and to decrypt their input, we replace the
corresponding input/output file descriptor with a pipe, and perform
encryption/decryption of the data.

Signed-off-by: Radostin Stoyanov <[email protected]>
We use the AES-XTS block cipher to encrypt memory pages as it is
designed to encrypt blocks of data with fixed-size (e.g. memory pages),
allows the use of hardware acceleration available in modern CPUs, and
uses a single initialization vector (IV), instead of per-page nonce,
to ensure that encrypting the same plaintext with the same key results
in different ciphertexts.

In particular, XTS uses two 256-bits AES keys. One key is used to
perform block encryption, and the other is used to encrypt a so-called
"tweak value". The encrypted tweak value is further modified (with a
Galois polynomial function) and XOR-ed with both the plaintext and
ciphertext of each block. This method ensures that encrypting multiple
blocks with identical data will produce different ciphertext.

Since CRIU restores memory pages in the restorer context, this PIE code
cannot be linked with libraries such as GnuTLS to perform decryption.
Instead, we introduce a helper process to decrypt memory pages data.
The restorer context communicates with this helper process using PIPEs.
It sends the function arguments be used by preadv() and receives back
its return value. The decrypted data is transferred to the target
address space with process_vm_writev.

Suggested-by: Daiki Ueno <[email protected]>
Signed-off-by: Radostin Stoyanov <[email protected]>
The AES-XTS cipher does not provide integrity verification.
In this patch we add a verification mechanism based on the
HMAC-SHA-256 algorithm.

In order to support iterative checkpointing and memory deduplication
with encrypted memory, and to avoid storing HMAC for each memory page,
we compute XOR for of HMAC value for all memory pages and store this
value in cipher.img

The XOR computation also allows us to address the problem that memory
pages are read during restore in a different order then they are written
during checkpoint. In addition, to ensure that memory pages are restored
in correct order, we include the PID and VMA address associated with each
page in the HMAC computation.

The following example illustrates the HMAC value computation:

	H_n = HMAC(PID + VMA + MEMORY + KEY)
	hmac_value = H_1 ^ H_2 ^ ... ^ H_n

- PID: PID associated with the memory page
- VMA: virtual memory address associated with memory page
- KEY: secret key
- H_n: n-th memory page
- hmac_value: value stored in cipther.img during checkpoint, and used
              for integrity verification during restore

Signed-off-by: Radostin Stoyanov <[email protected]>
Measure the time for data encryption and decryption with
stream and block ciphers.

Signed-off-by: Radostin Stoyanov <[email protected]>
This script, similar to ssh-keygen and certtool, makes it easier
to generate and install certificate and key to enable encryption
support with CRIU.

Signed-off-by: Radostin Stoyanov <[email protected]>

if files_img is None:
try:
files_img = pycriu.images.load(dinf(opts, "files.img"))['entries']
files_img = pycriu.images.load(dinf(opts, "files.img"), token=token)['entries']

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-auto-close Don't auto-close as a stale issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants