Commit 1180fd6

Merge pull request #1402 from trapexit/docs
Doc and mover script updates
2 parents: 8f355d7 + 19509e3

6 files changed: +162 -66 lines

mkdocs/docs/faq/reliability_and_scalability.md

+13 -3

@@ -26,6 +26,16 @@ Users have pooled everything from USB thumb drives to enterprise NVME
 SSDs to remote filesystems and rclone mounts.
 
 The cost of many calls can be `O(n)` meaning adding more branches to
-the pool will increase the cost of certain functions but there are a
-number of caches and strategies in place to limit overhead where
-possible.
+the pool will increase the cost of certain functions, such as reading
+directories or finding files to open, but there are a number of caches
+and strategies in place to limit overhead where possible.
+
+
+## Are there any limits?
+
+There is no maximum capacity beyond what is imposed by the operating
+system itself. Any limit is practical rather than technical. As
+explained in the question about scale, mergerfs is mostly limited by
+the tolerated cost of aggregating branches and the cost associated
+with interacting with them. If you pool slow network filesystems then
+that will naturally impact performance more than low-latency SSDs.

mkdocs/docs/media_and_publicity.md

+10 -1

@@ -20,12 +20,16 @@
 - 2020-08-20 - [Setting up Rclone, Mergerfs and Crontab for automated cloud storage](https://bytesized-hosting.com/pages/setting-up-rclone-mergerfs-and-crontab-for-automated-cloud-storage)
 - 2020-11-22 - [Introducing… MergerFS – My FREE UNRAID alternative](https://supertechfreaks.com/introducing-mergerfs-free-unraid-alternative/)
 - 2020-12-30 - [Perfect Media Server](https://perfectmediaserver.com) (a new site with docs fully fleshing out the 'Perfect Media Server' blog series)
+- 2021-07-24 - [Building the Ultimate Linux Home Server - Part 1: Intro, MergerFS, and SnapRAID](https://blog.karaolidis.com/ultimate-home-server-part-1/)
 - 2021-10-31 - [Better Home Storage: MergerFS + SnapRAID on OpenMediaVault](https://blog.sakuragawa.moe/better-home-storage-mergerfs-snapraid-on-openmediavault/)
 - 2021-11-28 - [Linux Magazine: Come Together - Merging file systems for a simple NAS with MergerFS](https://www.linux-magazine.com/Issues/2022/254/MergerFS)
 - 2022-06-04 - [MergerFS + SnapRaid Study](https://crashlaker.github.io/2022/06/04/mergerfs_+_snapraid_study.html)
 - 2022-12-31 - [Merge Storages in CasaOS: A secret beta feature you know now](https://blog.casaos.io/blog/13.html)
 - 2023-02-03 - [(MergerFS + SnapRAID) is the new RAID 5](https://thenomadcode.tech/mergerfs-snapraid-is-the-new-raid-5)
 - 2024-02-07 - [Designing & Deploying MANS - A Hybrid NAS Approach with SnapRAID, MergerFS, and OpenZFS](https://blog.muffn.io/posts/part-3-mini-100tb-nas)
+- 2024-03-11 - [Using MergerFS to combine multiple hard drives into one unified media storage](https://fullmetalbrackets.com/blog/two-drives-mergerfs/)
+- 2024-12-20 - [Pooling multiple drives on my Raspberry Pi with mergerfs](https://sebi.io/posts/2024-12-20-pooling-multiple-drives-with-mergerfs/)
+
 
 ## Videos
 
@@ -56,16 +60,21 @@
 - 2023-06-26 - [How to install and setup MergerFS](https://www.youtube.com/watch?v=n7piuhTXeG4)
 - 2023-07-31 - [How to recover a dead drive using Snapraid](https://www.youtube.com/watch?v=fmuiRLPcuJE)
 - 2024-01-05 - [OpenMediaVault MergerFS Tutorial (Portuguese)](https://www.youtube.com/watch?v=V6Yw86dRUPQ)
+- 2024-02-19 - [Setup and Install MergerFS and SnapRAID (Part 1)](https://noted.lol/mergerfs-and-snapraid-setup-1/)
+- 2024-02-22 - [Setup and Install MergerFS and SnapRAID (Part 2)](https://noted.lol/mergerfs-and-snapraid-setup-part-2/)
 - 2024-11-15 - [Meu servidor NAS - Parte 18: Recuperando um HD, recuperando o MergerFS e os próximos passos do NAS!](https://www.youtube.com/watch?v=5fy98kPzE3s)
 
+
 ## Podcasts
 
 - 2019-11-04 - [Jupiter Extras: A Chat with mergerfs Developer Antonio Musumeci | Jupiter Extras 28](https://www.youtube.com/watch?v=VmJUAyyhSPk)
 - 2019-11-07 - [Jupiter Broadcasting: ZFS Isn’t the Only Option | Self-Hosted 5](https://www.youtube.com/watch?v=JEW7UuKhMJ8)
 - 2023-10-08 - [Self Hosted Episode 105 - Sleeper Storage Technology](https://selfhosted.show/105)
 
+
 ## Social Media
 
 - [Reddit](https://www.reddit.com/search/?q=mergerfs&sort=new)
-- [Twitter](https://twitter.com/search?q=mergerfs&src=spelling_expansion_revert_click&f=live)
+- [X](https://x.com/search?q=mergerfs&src=spelling_expansion_revert_click&f=live)
 - [YouTube](https://www.youtube.com/results?search_query=mergerfs&sp=CAI%253D)
+- [ServeTheHome Forum](https://forums.servethehome.com/index.php?search/3105813/&q=mergerfs&o=date)

mkdocs/docs/related_projects.md

+25 -8

@@ -21,16 +21,33 @@
 
 ## Software and services commonly used with mergerfs
 
-* [snapraid](https://www.snapraid.it/)
-* [rclone](https://rclone.org/)
-  * rclone's [union](https://rclone.org/union/) feature is based on
-    mergerfs policies
-* [ZFS](https://openzfs.org/): Common to use ZFS w/ mergerfs
+* [snapraid](https://www.snapraid.it/): a backup program designed for
+  disk arrays, storing parity information for data recovery in the
+  event of up to six disk failures.
+* [rclone](https://rclone.org/): a command-line program to manage
+  files on cloud storage. It is a feature-rich alternative to cloud
+  vendors' web storage interfaces. rclone's
+  [union](https://rclone.org/union/) feature is based on mergerfs
+  policies.
+* [ZFS](https://openzfs.org/): Common to use ZFS w/ mergerfs. ZFS for
+  important data and a mergerfs pool for replaceable media.
 * [UnRAID](https://unraid.net): While UnRAID has its own union
   filesystem it isn't uncommon to see UnRAID users leverage mergerfs
-  given the differences in the technologies.
-* For a time there were a number of Chia miners recommending mergerfs
-* [cloudboxes.io](https://cloudboxes.io/wiki/how-to/apps/set-up-mergerfs-using-ssh)
+  given the differences in the technologies. There is a [plugin
+  available by
+  Rysz](https://forums.unraid.net/topic/144999-plugin-mergerfs-for-unraid-support-topic/)
+  to ease installation and setup.
+* [TrueNAS](https://www.truenas.com): Some users are requesting
+  mergerfs be [made part
+  of](https://forums.truenas.com/t/add-unionfs-or-mergerfs-and-rdam-enhancement-then-beat-all-other-nas-systems/23218)
+  TrueNAS.
+* For a time there were a number of Chia miners recommending mergerfs.
+* [cloudboxes.io](https://cloudboxes.io): a VPS provider. Includes
+  details [on their
+  wiki](https://cloudboxes.io/wiki/how-to/apps/set-up-mergerfs-using-ssh)
+  on how to set up mergerfs.
+* [QNAP](https://www.myqnap.org/product/mergerfs-apache83/): Someone
+  has created builds of mergerfs for different QNAP devices.
 
 
 ## Distributions including mergerfs

mkdocs/docs/usage_patterns.md

+68 -37

@@ -29,60 +29,91 @@ across filesystems (see the mergerfs.dup tool) and setting
 `func.open=rand`, using `symlinkify`, or using dm-cache or a similar
 technology to add tiered cache to the underlying device itself.
 
-With #2 one could use dm-cache as well but there is another solution
-which requires only mergerfs and a cronjob.
-
-1. Create 2 mergerfs pools. One which includes just the slow branches
-   and one which has both the fast branches (SSD,NVME,etc.) and slow
-   branches. The 'base' pool and the 'cache' pool.
-2. The 'cache' pool should have the cache branches listed first in
-   the branch list.
-3. The best `create` policies to use for the 'cache' pool would
-   probably be `ff`, `epff`, `lfs`, `msplfs`, or `eplfs`. The latter
-   three under the assumption that the cache filesystem(s) are far
-   smaller than the backing filesystems. If using path preserving
-   policies remember that you'll need to manually create the core
-   directories of those paths you wish to be cached. Be sure the
-   permissions are in sync. Use `mergerfs.fsck` to check / correct
-   them. You could also set the slow filesystems mode to `NC` though
-   that'd mean if the cache filesystems fill you'd get "out of space"
-   errors.
-4. Enable `moveonenospc` and set `minfreespace` appropriately. To
-   make sure there is enough room on the "slow" pool you might want
-   to set `minfreespace` to at least as large as the size of the
-   largest cache filesystem if not larger. This way in the worst case
-   the whole of the cache filesystem(s) can be moved to the other
-   drives.
-5. Set your programs to use the 'cache' pool.
-6. Save one of the below scripts or create you're own. The script's
-   responsibility is to move files from the cache filesystems (not
-   pool) to the 'base' pool.
-7. Use `cron` (as root) to schedule the command at whatever frequency
-   is appropriate for your workflow.
+With #2 one could use a block cache solution such as that available
+via LVM and dm-cache, but there is another solution requiring only
+mergerfs, a script to move files around, and a cron job to run it.
+
+* Create two mergerfs pools. One which includes just the **slow**
+  branches and one which has both the **fast** branches
+  (SSD,NVME,etc.) and **slow** branches. The **base** pool and the
+  **cache** pool.
+* The **cache** pool should have the cache branches listed first in
+  the branch list in order to make it easier to prioritize them.
+* The best `create` policies to use for the **cache** pool would
+  probably be `ff`, `lus`, or `lfs`. The latter two under the
+  assumption that the cache filesystem(s) are far smaller than the
+  backing filesystems.
+* You can also set the **slow** filesystems' mode to `NC`, which
+  would let you use other `create` policies, though if the cache
+  filesystems fill you'd then get "out of space" errors. That,
+  however, may be useful as it would indicate the script moving
+  files around is not configured properly.
+* Set your programs to use the **cache** pool.
+* Configure the **base** pool with whichever `create` policy you
+  prefer for laying out files.
+* Save one of the below scripts or create your own. The script's
+  responsibility is to move files from the **cache** branches (not
+  pool) to the **base** pool.
+* Use `cron` (as root) to schedule the command at whatever frequency
+  is appropriate for your workflow.
 
 
 ### time based expiring
 
-Move files from cache to base pool based only on the last time the
-file was accessed. Replace `-atime` with `-amin` if you want minutes
-rather than days. May want to use the `fadvise` / `--drop-cache`
-version of rsync or run rsync with the tool
-[nocache](https://github.com/Feh/nocache).
+Move files whose access time is older than the supplied number of
+days from the cache filesystem to the base pool. Replace `-atime`
+with `-amin` in the script if you want minutes rather than days.
 
 **NOTE:** The arguments to these scripts include the cache
 **filesystem** itself. Not the pool with the cache filesystem. You
 could have data loss if the source is the cache pool.
 
 [mergerfs.time-based-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.time-based-mover?raw=1)
 
+Download:
+```
+curl -o /usr/local/bin/mergerfs.time-based-mover https://raw.githubusercontent.com/trapexit/mergerfs/refs/heads/latest-release/tools/mergerfs.time-based-mover
+```
+
+crontab entry:
+```
+# m h dom mon dow command
+0 * * * * /usr/local/bin/mergerfs.time-based-mover /mnt/ssd/cache00 /mnt/base-pool 1
+```
+
+If you have more than one cache filesystem then simply add a cron
+entry for each.
+
+If you want to move files only from a subdirectory then use the
+subdirectories: `/mnt/ssd/cache00/foo` and `/mnt/base-pool/foo`
+respectively.
+
 
 ### percentage full expiring
 
-Move the oldest file from the cache to the backing pool. Continue till
-below percentage threshold.
+While the cache filesystem's percentage full is above the provided
+value, move the oldest file from the cache filesystem to the base pool.
 
 **NOTE:** The arguments to these scripts include the cache
 **filesystem** itself. Not the pool with the cache filesystem. You
 could have data loss if the source is the cache pool.
 
 [mergerfs.percent-full-mover](https://github.com/trapexit/mergerfs/blob/latest-release/tools/mergerfs.percent-full-mover?raw=1)
+
+Download:
+```
+curl -o /usr/local/bin/mergerfs.percent-full-mover https://raw.githubusercontent.com/trapexit/mergerfs/refs/heads/latest-release/tools/mergerfs.percent-full-mover
+```
+
+crontab entry:
+```
+# m h dom mon dow command
+0 * * * * /usr/local/bin/mergerfs.percent-full-mover /mnt/ssd/cache00 /mnt/base-pool 80
+```
+
+If you have more than one cache filesystem then simply add a cron
+entry for each.
+
+If you want to move files only from a subdirectory then use the
+subdirectories: `/mnt/ssd/cache00/foo` and `/mnt/base-pool/foo`
+respectively.
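As a concrete sketch of the two-pool layout described above, the `/etc/fstab` entries might look like the following. The branch paths, mountpoints, and option values here are illustrative assumptions, not taken from this commit:

```
# base pool: slow branches only
/mnt/hdd0:/mnt/hdd1 /mnt/base-pool mergerfs category.create=mfs,minfreespace=100G,fsname=base-pool 0 0

# cache pool: fast branch listed first, then the same slow branches
/mnt/ssd/cache00:/mnt/hdd0:/mnt/hdd1 /mnt/cache-pool mergerfs category.create=lfs,minfreespace=10G,fsname=cache-pool 0 0
```

Applications read and write through `/mnt/cache-pool`, while the mover script migrates aged files from `/mnt/ssd/cache00` (the filesystem, not the pool) into `/mnt/base-pool`.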

tools/mergerfs.percent-full-mover

+26 -10

@@ -1,21 +1,37 @@
 #!/usr/bin/env sh
 
 if [ $# != 3 ]; then
-  echo "usage: $0 <cache-fs> <backing-pool> <percentage>"
-  exit 1
+    echo "usage: $0 <cache-fs> <base-pool> <percentage>"
+    exit 1
 fi
 
-CACHE="${1}"
-BACKING="${2}"
+CACHEFS="${1}"
+BASEPOOL="${2}"
PERCENTAGE=${3}
 
 set -o errexit
-while [ $(df --output=pcent "${CACHE}" | grep -v Use | cut -d'%' -f1) -gt ${PERCENTAGE} ]
+while [ $(df "${CACHEFS}" | tail -n1 | awk '{print $5}' | cut -d'%' -f1) -gt ${PERCENTAGE} ]
 do
-  FILE=$(find "${CACHE}" -type f -printf '%A@ %P\n' | \
-         sort | \
-         head -n 1 | \
-         cut -d' ' -f2-)
-  test -n "${FILE}"
-  rsync -axqHAXWESR --preallocate --relative --remove-source-files "${CACHE}/./${FILE}" "${BACKING}/"
+    # Find the file with the oldest access time
+    FILE=$(find "${CACHEFS}" -type f -printf '%A@ %P\n' | \
+               sort | \
+               head -n 1 | \
+               cut -d' ' -f2-)
+    # If no file found, exit
+    test -n "${FILE}" || exit 0
+    # Move file
+    rsync \
+        --archive \
+        --acls \
+        --xattrs \
+        --atimes \
+        --hard-links \
+        --one-file-system \
+        --quiet \
+        --preallocate \
+        --remove-source-files \
+        --relative \
+        --log-file=/tmp/mergerfs-cache-rsync.log \
+        "${CACHEFS}/./${FILE}" \
+        "${BASEPOOL}/"
 done

tools/mergerfs.time-based-mover

+20 -7

@@ -1,13 +1,26 @@
 #!/usr/bin/env sh
 
 if [ $# != 3 ]; then
-  echo "usage: $0 <cache-fs> <backing-pool> <days-old>"
-  exit 1
+    echo "usage: $0 <cache-fs> <base-pool> <days-old>"
+    exit 1
 fi
 
-CACHE="${1}"
-BACKING="${2}"
-N=${3}
+CACHEFS="${1}"
+BASEPOOL="${2}"
+DAYS_OLD=${3}
 
-find "${CACHE}" -type f -atime +${N} -printf '%P\n' | \
-  rsync --files-from=- -axqHAXWES --preallocate --remove-source-files "${CACHE}/" "${BACKING}/"
+find "${CACHEFS}" -type f -atime +${DAYS_OLD} -printf '%P\n' | \
+    rsync \
+        --files-from=- \
+        --archive \
+        --acls \
+        --xattrs \
+        --atimes \
+        --hard-links \
+        --one-file-system \
+        --quiet \
+        --preallocate \
+        --remove-source-files \
+        --log-file=/tmp/mergerfs-cache-rsync.log \
+        "${CACHEFS}/" \
+        "${BASEPOOL}/"
