Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file_known_and_unchanged is wasteful and should be converted to inodes #909

Open
verygreen opened this issue Apr 15, 2016 · 11 comments
Open

Comments

@verygreen
Copy link
Contributor

Moved from #908
I guess another change is to turn cache.file_known_and_unchanged into just inode_known_and_unchanged.
Who cares what's the actual file name, since it's always stored in the archive anyway I believe?
Additional benefit, the "files_cache" huge file in .cache might become smaller as it no longer needs to hold path names.

This poses some difficulties for filesystems without inode numbers like vfat and on Windows, as mentioned by @ThomasWaldmann

Windows (assuming they don't have inodes - I don't know much about ntfs) - some other unique identifier might be used, I guess. For filesystems that have no symlinks (like vfat) - file path might be unique enough. Still some races with content changing underneath, but at least nothing security critical (also hopefully nobody resurrects umsdos fs) (it's tempting to use filesize + create date (+modification date?) as unique id for vfat to handle renames more optimally there, but that might not be safe).
On nfs there's "file handle" that's unique (no idea if it's easily visible from userspace).
fuse and friends are harder, I guess.
ntfs has symlinks and hardlinks, so they must have have unique file ids similar to inode numbers of some sort.
These strategies could be determined by filesystem type (available from statfs(2) and could be called once we cross a mountpoint boundary assuming this is enabled, also separately checked for every path).
It should be noted that since inodes are not unique outside of a normal filesystem, really inode in the table should be some combination of actual inode number and filesystem id (similar to how nfs creates file handles). Filesystem id might be anything from mount path (unsafe) to underlying block device id (unsafe) to fs uuid (not bullet proof either) to some other options.

Using file path as generic fallback for inode number for cases when inode numbers are ignored for other reasons (commandline switch) is also possible, though it should be marked as "potentially suboptimal".

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Nov 8, 2016

Currently, a mapping is used from key path_hash -> value FileCacheEntry (size, mtime, inode, age, chunk_ids).

key: path_hash = self.key.id_hash(safe_encode(os.path.join(self.cwd, path)))

So, the key hash could be just computed from different input data or get replaced by a tuple.

Algorithm:

if has_stable_inodes(path, st):
    key = (st.st_dev, st.st_ino)
else:
    key = id_hash(path)  # fallback, see orig. code

@enkore
Copy link
Contributor

enkore commented Nov 8, 2016

A hash isn't even necessary - we can just pack the device number and the inode number in 16 bytes(!) each. But since we in fact don't even have the requirement of len(key) == 32 here, like we do for objects, we could just as well use (dev, ino) as the key directly.

Maybe that even uses less memory (but certainly less storage when msgpacked) than the 32 byte string. Edit: Uh, well maybe not :D

>>> sys.getsizeof((2065, 557311989))
64
>>> sys.getsizeof(2065)
28

= 120 bytes per key. While bytes(32).

>>> sys.getsizeof(bytes(32))
65

Storage, however:

>>> len(msgpack.packb((2065, 557311989)))
9

But we can combine both: Store tuples in the file = smaller storage. When reading the file, keys that are tuples are converted to a 32 byte string (= same memory footprint). When writing the file, the reverse is performed. This works, because when writing we know we want to operate in use_inodes_for_cache mode, and when reading, the file just tells us.

This is is almost compatible: Old cold doesn't touch it. It would just insert all the tuples as keys. The only problem is that msgpack deserializes tuples as lists, which are mutable, which are therefore not hashable, and therefore don't qualify as keys. Hm.

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Nov 9, 2016

Trying to list filesystems with known-stable inode numbers (white list):

  • all native linux, local filesystems(?): ext2/3/4, btrfs, xfs, zfs, ...
  • cifs with unix extensions and "serverino" option

Trying to list filesystems with known-unstable inode numbers (black list):

  • non-linux/unix filesystems: fat, ntfs
  • FUSE filesystems: ntfs, sshfs
  • network filesystems: smbfs / cifs (w/o serverino), nfs

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Nov 13, 2016

Crap, os.statvfs() does not return f_type.

Calling stat(v)fs ourselves seem to require some platform-specific code as it is slightly different on different platforms (data type of fs type, include files). Also, we would have to classify a lot of filesystems.

Maybe just let the user specify whether inodes shall be considered stable? We already have a commandline option to ignore inodes (used when they are unstable or unsupported) - add another one like --use-inodes to trigger that borg uses (st_dev, st_ino) instead of the fs item path as key in the files cache?

@enkore
Copy link
Contributor

enkore commented Nov 13, 2016

Yes I think the latter would be a better approach to this. Only some 'special' cases profit from this (but if they do, then quite a lot I suspect); hardlinked files in the same archive are already indexed by (dev, ino) anyway.

@ThomasWaldmann
Copy link
Member

#1842 (comment) see there.

@enkore
Copy link
Contributor

enkore commented Nov 13, 2016

Re. compatibility with older Borg:

in ino-mode, truncate/leave files (this is valid). Store ino files cache as files.ino or so. No problems. This also needs next to no code changes in the loading/saving logic, only the filename has to be changed depending on mode.

With some added code the cache could be automatically online, as files are seen during backups, as well.

@ThomasWaldmann
Copy link
Member

The code will dynamically switch between modes, so we need both in parallel.
Guess nothing bad will happen with old borg. The keys are bytes in both cases and they are both only computed in a one-way manner.

@ThomasWaldmann
Copy link
Member

#1842 currently blocked by some C/Cython problem (access fsid) - can anybody help with that?

@ThomasWaldmann
Copy link
Member

anyone?

@ThomasWaldmann
Copy link
Member

Small related update:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants