src: add INLINE_CONTENT_MAX constant
We've been doing this incorrectly by storing all non-empty files
externally.  Add a constant and use it internally.

This means that currently-existing splitstreams need to be regenerated:
they'll have also stored small files as external references.  Add some
extra checks at the splitstream-to-image stage that verify that the
splitstream has followed the rules correctly: this will help identify
older streams that were built with incorrect rules.  This is another
"delete your repository and start over" change.
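
The rule check described above can be sketched in isolation. This is a minimal illustration, not the code from the commit: the `Stored` enum and `check_regular_file` are hypothetical names, and a plain `String` error stands in for the `anyhow` errors used in the real code.

```rust
/// Mirrors the constant this commit adds to src/lib.rs.
const INLINE_CONTENT_MAX: usize = 64;

/// Hypothetical stand-in for how a regular file was recorded in a splitstream.
#[derive(Debug, Clone, Copy)]
enum Stored {
    Inline,
    External,
}

/// Reject entries that break the inline/external rule.
/// Streams written before this commit are expected to fail the first arm,
/// since they stored every non-empty file externally.
fn check_regular_file(stored: Stored, size: usize) -> Result<(), String> {
    match stored {
        Stored::External if size <= INLINE_CONTENT_MAX => {
            Err(format!("small ({size} byte) file stored externally"))
        }
        Stored::Inline if size > INLINE_CONTENT_MAX => {
            Err(format!("large ({size} byte) file stored inline"))
        }
        _ => Ok(()),
    }
}
```

Under this sketch, a 10-byte file recorded as external is reported as an error, which is how an old splitstream would be detected.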

We *could* provide bridging code here: in case of too-small external
files in the splitstream, we could read them from the repository and
convert them to inline, but let's save ourselves the bother.
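
For reference, the bridging approach that this commit deliberately does not implement might look roughly like the following. Everything here is an assumption made for illustration: `Data` is a simplified stand-in for the splitstream data type, a `String` stands in for an object ID, and `fetch` is a hypothetical callback that reads an object from the repository.

```rust
use std::io;

/// Mirrors the constant this commit adds to src/lib.rs.
const INLINE_CONTENT_MAX: usize = 64;

/// Hypothetical, simplified view of one piece of splitstream data.
#[derive(Debug, PartialEq)]
enum Data {
    Inline(Vec<u8>),
    /// A `String` stands in for the real object ID type.
    External(String),
}

/// Convert a too-small external reference into inline data.
/// `fetch` is a hypothetical lookup that reads the object from the repository.
/// Anything that already follows the rules is passed through unchanged.
fn bridge<F>(data: Data, size: usize, fetch: F) -> io::Result<Data>
where
    F: FnOnce(&str) -> io::Result<Vec<u8>>,
{
    match data {
        Data::External(id) if size <= INLINE_CONTENT_MAX => Ok(Data::Inline(fetch(&id)?)),
        other => Ok(other),
    }
}
```

The cost of this approach is an extra repository read for every small file in an old stream, which is part of the bother the commit avoids by requiring regeneration instead.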

Closes #26

Signed-off-by: Allison Karlitskaya <[email protected]>
allisonkarlitskaya committed Nov 15, 2024
1 parent de17f01 commit b480adc
Showing 2 changed files with 18 additions and 3 deletions.
5 changes: 5 additions & 0 deletions src/lib.rs
@@ -9,3 +9,8 @@ pub mod repository;
pub mod selabel;
pub mod splitstream;
pub mod util;

/// All files that contain 64 or fewer bytes (size <= INLINE_CONTENT_MAX) should be stored inline
/// in the erofs image (and also in splitstreams). All files with 65 or more bytes (size > MAX)
/// should be written to the object storage and referred to from the image (and splitstreams).
pub const INLINE_CONTENT_MAX: usize = 64;
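
The boundary described in the doc comment can be illustrated with a small standalone sketch. The `Placement` enum and `placement_for` function are hypothetical names introduced here for illustration; the commit itself only adds the constant and compares against it directly.

```rust
/// Mirrors the constant added in src/lib.rs.
pub const INLINE_CONTENT_MAX: usize = 64;

/// Hypothetical helper type, not part of the commit.
#[derive(Debug, PartialEq, Eq)]
pub enum Placement {
    Inline,
    External,
}

/// Decide where the content of a regular file of `size` bytes belongs.
/// Sizes up to and including INLINE_CONTENT_MAX stay inline; larger ones go external.
pub fn placement_for(size: usize) -> Placement {
    if size > INLINE_CONTENT_MAX {
        Placement::External
    } else {
        Placement::Inline
    }
}
```

Note that the comparison is strict: a file of exactly 64 bytes is still inline, and empty files are inline as they were before this change.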
16 changes: 13 additions & 3 deletions src/oci/tar.rs
@@ -7,7 +7,7 @@ use std::{
path::PathBuf,
};

-use anyhow::{bail, Result};
+use anyhow::{bail, ensure, Result};
use rustix::fs::makedev;
use tar::{EntryType, Header, PaxExtensions};
use tokio::io::{AsyncRead, AsyncReadExt};
@@ -17,6 +17,7 @@ use crate::{
image::{LeafContent, Stat},
splitstream::{SplitStreamData, SplitStreamReader, SplitStreamWriter},
util::{read_exactish, read_exactish_async},
INLINE_CONTENT_MAX,
};

fn read_header<R: Read>(reader: &mut R) -> Result<Option<Header>> {
@@ -55,7 +56,7 @@ pub fn split<R: Read>(tar_stream: &mut R, writer: &mut SplitStreamWriter) -> Res
let mut buffer = vec![0u8; storage_size];
tar_stream.read_exact(&mut buffer)?;

-if header.entry_type() == EntryType::Regular && storage_size > 0 {
+if header.entry_type() == EntryType::Regular && actual_size > INLINE_CONTENT_MAX {
// non-empty regular file: store the data in the object store
let padding = buffer.split_off(actual_size);
writer.write_external(&buffer, padding)?;
@@ -85,7 +86,7 @@ pub async fn split_async(
let mut buffer = vec![0u8; storage_size];
tar_stream.read_exact(&mut buffer).await?;

-if header.entry_type() == EntryType::Regular && storage_size > 0 {
+if header.entry_type() == EntryType::Regular && actual_size > INLINE_CONTENT_MAX {
// non-empty regular file: store the data in the object store
let padding = buffer.split_off(actual_size);
writer.write_external(&buffer, padding)?;
@@ -175,6 +176,10 @@ pub fn get_entry<R: Read>(reader: &mut SplitStreamReader<R>) -> Result<Option<Ta
let item = match reader.read_exact(size as usize, ((size + 511) & !511) as usize)? {
SplitStreamData::External(id) => match header.entry_type() {
EntryType::Regular | EntryType::Continuous => {
ensure!(
size as usize > INLINE_CONTENT_MAX,
"Splitstream incorrectly stored a small ({size} byte) file external"
);
TarItem::Leaf(LeafContent::ExternalFile(id, size))
}
_ => bail!(
@@ -213,6 +218,11 @@ pub fn get_entry<R: Read>(reader: &mut SplitStreamReader<R>) -> Result<Option<Ta
}
EntryType::Directory => TarItem::Directory,
EntryType::Regular | EntryType::Continuous => {
ensure!(
content.len() <= INLINE_CONTENT_MAX,
"Splitstream incorrectly stored a large ({} byte) file inline",
content.len()
);
TarItem::Leaf(LeafContent::InlineFile(content))
}
EntryType::Link => TarItem::Hardlink({