Split plugin #381

Open
wants to merge 8 commits into
base: master

Conversation

@olihey olihey commented Aug 17, 2016

Hi,

I would like to contribute a plugin that handles big files (mainly disk images) better.
The problem with disk images is that the whole file has to be transferred again whenever it changes.

To improve that, I created this little plugin, which I use to upload and download my DMG files (Mac disk images). It splits the local file into chunks of a user-defined size and uploads them to ACD as separate files. The download function then joins them back together.
The advantage is that only the changed chunks of the file are uploaded/downloaded. This can speed up backups of partitions and disk images dramatically.

The plugin adds two new commands, split-upload and split-download.

The command
acd_cli split-upload <local-file> <remote-file> <chunk-size>
(example)
acd_cli split-upload ~/Downloads/Plain.dmg /plain.dmg 524288
will upload the local file Plain.dmg to /plain.dmg on ACD. A new folder "plain.dmg" will be created in the root of ACD, and it will contain the chunk files of 512 KiB (524288 bytes) each.
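
To illustrate the splitting step, here is a minimal sketch of reading a local file in fixed-size chunks and computing an MD5 per chunk. It is not the plugin's actual code: the function names and the `chunk_NNNNNNNN` naming scheme are hypothetical, chosen only for this example.

```python
import hashlib
import os
import tempfile


def split_with_md5(path, chunk_size):
    """Yield (index, md5_hex, data) for each chunk_size-byte piece of a file."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield index, hashlib.md5(data).hexdigest(), data
            index += 1


def chunk_name(index):
    # Hypothetical naming scheme; the plugin's real chunk names may differ.
    return "chunk_{:08d}".format(index)


def describe_chunks(path, chunk_size):
    """Return [(name, md5_hex, size)] for every chunk of the file."""
    return [(chunk_name(i), digest, len(data))
            for i, digest, data in split_with_md5(path, chunk_size)]


# Demo: a 1,200,000-byte file split into 512 KiB chunks gives three pieces
# (two full chunks of 524288 bytes and a final one of 151424 bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 1200000)
try:
    for name, digest, size in describe_chunks(tmp.name, 524288):
        print(name, digest, size)
finally:
    os.remove(tmp.name)
```

Because the chunk boundaries are fixed byte offsets, a change in the middle of the image only alters the hashes of the chunks it touches, as long as the data after it does not shift.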

With
acd_cli split-download <remote-file> <local-file> <chunk-size>
(example)
acd_cli split-download /plain.dmg ~/Downloads/Plain2.dmg 524288
the data is downloaded again into a different local file Plain2.dmg.

Only changes will be transferred on download and upload. So if an earlier version of the local file already exists when "split-download" is called on it, chunks with matching MD5 hashes will not be downloaded.
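
The skip logic can be sketched roughly as follows, assuming chunks are compared position by position using their MD5 hashes. The helper names are hypothetical and the comparison strategy is an assumption; the plugin may organize this differently.

```python
import hashlib


def chunk_md5s(data, chunk_size):
    """MD5 hex digest of each chunk_size-byte slice of data."""
    return [hashlib.md5(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def chunks_to_transfer(source_md5s, target_md5s):
    """Indices of source chunks that are missing or different on the target side."""
    return [i for i, digest in enumerate(source_md5s)
            if i >= len(target_md5s) or target_md5s[i] != digest]


# Demo: flip one byte in the second 1 KiB chunk and append a few bytes.
old = bytes(4096)
new = bytearray(old)
new[1500] = 1
new = bytes(new) + b"x" * 10

print(chunks_to_transfer(chunk_md5s(new, 1024), chunk_md5s(old, 1024)))
```

In the demo only chunk 1 (the modified one) and chunk 4 (the newly appended tail) need to be transferred; chunks 0, 2 and 3 are skipped.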

Additionally, split-upload can optionally compress and encrypt each chunk.
The parameter "-lc / --lzma-compress" compresses each chunk using the lzma module. This should not be used on already encrypted files like VeraCrypt containers, since encrypted data does not compress well.
To encrypt the chunks, use the "-p / --password" option. This encrypts each chunk with the specified password and a random salt using AES-256 from PyCrypto. When using "split-download", the password needs to be specified as well.
The chunks are saved in a standard OpenSSL format so that they can be decrypted using openssl.
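
As a rough illustration of the per-chunk compression, the sketch below uses Python's standard `lzma` module with its default settings. The function names are hypothetical, and the plugin may use different lzma options.

```python
import lzma
import os


def pack_chunk(data, compress=True):
    """Optionally LZMA-compress a single chunk."""
    return lzma.compress(data) if compress else data


def unpack_chunk(blob, compressed=True):
    """Reverse of pack_chunk."""
    return lzma.decompress(blob) if compressed else blob


# Demo: a chunk of zeros shrinks a lot, while random bytes (which look like
# encrypted data) do not shrink at all.
zeros = b"\x00" * 524288
random_like = os.urandom(524288)

print(len(pack_chunk(zeros)), "bytes after compressing 512 KiB of zeros")
print(len(pack_chunk(random_like)), "bytes after compressing 512 KiB of random data")
assert unpack_chunk(pack_chunk(zeros)) == zeros
```

This is why compression is pointless for content that is already encrypted: the compressed chunk ends up no smaller than the input, and the CPU time is wasted.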

Example: upload a hard disk, compressed and encrypted:
acd_cli split-upload -lc -p mysecretpassword /dev/rdisk2 /backups/macHD.img 8388608
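
For reference, the salted file format written by `openssl enc` starts with the 8-byte magic `Salted__`, followed by an 8-byte salt and then the ciphertext. The key and IV are derived from the password and salt with OpenSSL's `EVP_BytesToKey`. The sketch below shows only that framing and key derivation with the standard library. It assumes MD5 as the digest (the OpenSSL default before version 1.1.0) and a 32-byte key plus a 16-byte IV for AES-256; whether the plugin uses exactly these settings is an assumption. The AES step itself is omitted, and the helper names are hypothetical.

```python
import hashlib
import os

MAGIC = b"Salted__"


def evp_bytes_to_key(password, salt, key_len=32, iv_len=16):
    """Derive (key, iv) like OpenSSL's EVP_BytesToKey with MD5 and one iteration."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest plus password plus salt.
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


def frame(salt, ciphertext):
    """Prepend the OpenSSL salted header to already encrypted data."""
    return MAGIC + salt + ciphertext


def parse(blob):
    """Split a salted blob into (salt, ciphertext)."""
    if blob[:8] != MAGIC:
        raise ValueError("not an OpenSSL salted blob")
    return blob[8:16], blob[16:]


# Demo: derive key material for a fresh random salt.
salt = os.urandom(8)
key, iv = evp_bytes_to_key(b"mysecretpassword", salt)
print(len(key), len(iv))
```

A fresh salt per chunk means the same password produces a different key and IV for every chunk, so identical plaintext chunks do not produce identical ciphertext.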

@bgemmill

This feature set looks pretty cool @olihey. If I can get the fuse work submitted in #374 I'll see if I can integrate this; at the moment ecryptfs is 1:1 with files and chunking would be a great solution.

@ebyrne242

@bgemmill That is how https://www.cryfs.org/ works. Instead of writing whole files, it writes 32K chunks. I tried it with acd_cli fuse a while back and it didn't work, but I suspect (hope) the rsync fixes in #374 might fix that, so I'll have to try it again in the near future.

@bgemmill

@sketch242 cryfs will work with that PR, but I couldn't get the performance to the point where it would back up and restore in a reasonable amount of time; once CryFS goes stable that may change. In the meantime, this looks interesting as another way forward.

3 participants