| 1 | +## Using Linux in-memory filesystems: tmpfs, ramfs, shmfs |
| 2 | + |
| 3 | +### Author |
| 4 | +digoal |
| 5 | + |
| 6 | +### Date |
| 7 | +2019-02-11 |
| 8 | + |
| 9 | +### Tags |
| 10 | +PostgreSQL , hugetlbfs , hugepage , memory filesystem , ramfs , tmpfs , shmfs |
| 11 | + |
| 12 | +---- |
| 13 | + |
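| 14 | +## Background |
| 15 | +When running tests on a machine with slow I/O devices, you can put the test files on an in-memory filesystem so that I/O overhead does not skew the results. |
| 16 | + |
| 17 | +Usage is simple: |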
| 18 | + |
| 19 | +### tmpfs or shmfs |
| 20 | +Mount a shmfs of a given size on /dev/shm and set the correct permissions. |
| 21 | + |
| 22 | +For tmpfs you do not need to specify a size; by default the kernel caps it at half of physical RAM. Memory allocated by tmpfs or shmfs is pageable. |
| 23 | + |
| 24 | +For example: |
| 25 | + |
| 26 | +Mounting shmfs: |
| 27 | + |
| 28 | +``` |
| 29 | +# mount -t shm shmfs -o size=20g /dev/shm |
| 30 | + |
| 31 | +Edit /etc/fstab: |
| 32 | + |
| 33 | +shmfs /dev/shm shm size=20g 0 0 |
| 34 | +``` |
| 35 | + |
| 36 | +Or |
| 37 | + |
| 38 | +Mounting tmpfs: |
| 39 | + |
| 40 | +``` |
| 41 | +# mount -t tmpfs tmpfs /dev/shm |
| 42 | + |
| 43 | +Edit /etc/fstab: |
| 44 | + |
| 45 | +none /dev/shm tmpfs defaults 0 0 |
| 46 | +``` |
| 47 | + |
| 48 | +### ramfs |
| 49 | +ramfs is similar to shmfs, except that its pages are neither pageable nor swappable. |
| 50 | + |
| 51 | +Because its pages are never swapped out, data on ramfs stays in RAM at all times, which is usually the desired effect. Create a ramfs mount with: |
| 52 | + |
| 53 | +``` |
| 54 | +umount /dev/shm |
| 55 | + |
| 56 | +mount -t ramfs ramfs /dev/shm |
| 57 | +``` |
| 58 | + |
| 59 | +## Example |
| 60 | + |
| 61 | +``` |
| 62 | +[root@pg11-test ~]# mkdir /mnt/tmpfs |
| 63 | +[root@pg11-test ~]# mkdir /mnt/ramfs |
| 64 | +``` |
| 65 | + |
| 66 | +1. tmpfs |
| 67 | + |
| 68 | +``` |
| 69 | +mount -t tmpfs tmpfs /mnt/tmpfs -o size=10G,noatime,nodiratime,rw |
| 70 | +mkdir /mnt/tmpfs/a |
| 71 | +chmod 777 /mnt/tmpfs/a |
| 72 | +``` |
| 73 | + |
| 74 | +2. ramfs |
| 75 | + |
| 76 | +``` |
| 77 | +mount -t ramfs ramfs /mnt/ramfs -o noatime,nodiratime,rw,data=writeback,nodelalloc,nobarrier |
| 78 | +mkdir /mnt/ramfs/a |
| 79 | +chmod 777 /mnt/ramfs/a |
| 80 | +``` |
| 81 | + |
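| 82 | +ramfs cannot be limited in size at mount time; a size option has no effect even if you supply one. The mount point also does not appear in df output, although it is in fact mounted. |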
| 83 | + |
| 84 | +``` |
| 85 | +[root@pg11-test ~]# mount |
| 86 | +tmpfs on /mnt/tmpfs type tmpfs (rw,noatime,nodiratime,size=10485760k) |
| 87 | +ramfs on /mnt/ramfs type ramfs (rw,noatime,nodiratime,data=writeback,nodelalloc,nobarrier) |
| 88 | + |
| 89 | +[root@pg11-test ~]# df -h |
| 90 | +Filesystem Size Used Avail Use% Mounted on |
| 91 | +/dev/vda1 197G 17G 171G 9% / |
| 92 | +devtmpfs 252G 0 252G 0% /dev |
| 93 | +tmpfs 252G 936K 252G 1% /dev/shm |
| 94 | +tmpfs 252G 676K 252G 1% /run |
| 95 | +tmpfs 252G 0 252G 0% /sys/fs/cgroup |
| 96 | +/dev/mapper/vgdata01-lv03 4.0T 549G 3.5T 14% /data03 |
| 97 | +/dev/mapper/vgdata01-lv02 4.0T 335G 3.7T 9% /data02 |
| 98 | +/dev/mapper/vgdata01-lv01 4.0T 1.5T 2.6T 37% /data01 |
| 99 | +tmpfs 51G 0 51G 0% /run/user/0 |
| 100 | +/dev/mapper/vgdata01-lv04 2.0T 621G 1.3T 32% /data04 |
| 101 | +tmpfs 10G 0 10G 0% /mnt/tmpfs |
| 102 | +``` |
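Since df leaves out the ramfs mount, one way to list every memory-backed mount is to read /proc/mounts directly. The following is a minimal sketch, not part of the original post: the function name `list_mem_mounts` and its optional file argument are hypothetical, added only so the function can be run against a saved copy of the mount table.

```shell
#!/bin/sh
# list_mem_mounts [MOUNTS_FILE]
# Print "mountpoint fstype options" for every tmpfs or ramfs entry.
# MOUNTS_FILE defaults to /proc/mounts; the argument is an assumption of
# this sketch so that the function can be tested on a saved file.
list_mem_mounts() {
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk '$3 == "tmpfs" || $3 == "ramfs" { print $2, $3, $4 }' "${1:-/proc/mounts}"
}
```

Calling `list_mem_mounts` with no argument reads the live mount table, so a ramfs mount such as /mnt/ramfs should be listed even though df omits it.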
| 103 | + |
| 104 | +### In-memory filesystem performance |
| 105 | +#### Measuring fsync performance on the in-memory filesystems with PostgreSQL's pg_test_fsync |
| 106 | + |
| 107 | +``` |
| 108 | +su - digoal |
| 109 | + |
| 110 | + |
| 111 | +digoal@pg11-test-> pg_test_fsync -f /mnt/tmpfs/a/1 |
| 112 | +5 seconds per test |
| 113 | +O_DIRECT supported on this platform for open_datasync and open_sync. |
| 114 | + |
| 115 | +Compare file sync methods using one 8kB write: |
| 116 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 117 | + open_datasync n/a* |
| 118 | + fdatasync 1137033.436 ops/sec 1 usecs/op |
| 119 | + fsync 1146431.736 ops/sec 1 usecs/op |
| 120 | + fsync_writethrough n/a |
| 121 | + open_sync n/a* |
| 122 | +* This file system and its mount options do not support direct |
| 123 | + I/O, e.g. ext4 in journaled mode. |
| 124 | + |
| 125 | +Compare file sync methods using two 8kB writes: |
| 126 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 127 | + open_datasync n/a* |
| 128 | + fdatasync 622763.705 ops/sec 2 usecs/op |
| 129 | + fsync 625990.998 ops/sec 2 usecs/op |
| 130 | + fsync_writethrough n/a |
| 131 | + open_sync n/a* |
| 132 | +* This file system and its mount options do not support direct |
| 133 | + I/O, e.g. ext4 in journaled mode. |
| 134 | + |
| 135 | +Compare open_sync with different write sizes: |
| 136 | +(This is designed to compare the cost of writing 16kB in different write |
| 137 | +open_sync sizes.) |
| 138 | + 1 * 16kB open_sync write n/a* |
| 139 | + 2 * 8kB open_sync writes n/a* |
| 140 | + 4 * 4kB open_sync writes n/a* |
| 141 | + 8 * 2kB open_sync writes n/a* |
| 142 | + 16 * 1kB open_sync writes n/a* |
| 143 | + |
| 144 | +Test if fsync on non-write file descriptor is honored: |
| 145 | +(If the times are similar, fsync() can sync data written on a different |
| 146 | +descriptor.) |
| 147 | + write, fsync, close 317779.892 ops/sec 3 usecs/op |
| 148 | + write, close, fsync 317769.037 ops/sec 3 usecs/op |
| 149 | + |
| 150 | +Non-sync'ed 8kB writes: |
| 151 | + write 529490.541 ops/sec 2 usecs/op |
| 152 | + |
| 153 | +digoal@pg11-test-> pg_test_fsync -f /mnt/ramfs/a/1 |
| 154 | +5 seconds per test |
| 155 | +O_DIRECT supported on this platform for open_datasync and open_sync. |
| 156 | + |
| 157 | +Compare file sync methods using one 8kB write: |
| 158 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 159 | + open_datasync n/a* |
| 160 | + fdatasync 1146515.453 ops/sec 1 usecs/op |
| 161 | + fsync 1149912.760 ops/sec 1 usecs/op |
| 162 | + fsync_writethrough n/a |
| 163 | + open_sync n/a* |
| 164 | +* This file system and its mount options do not support direct |
| 165 | + I/O, e.g. ext4 in journaled mode. |
| 166 | + |
| 167 | +Compare file sync methods using two 8kB writes: |
| 168 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 169 | + open_datasync n/a* |
| 170 | + fdatasync 621456.930 ops/sec 2 usecs/op |
| 171 | + fsync 624811.200 ops/sec 2 usecs/op |
| 172 | + fsync_writethrough n/a |
| 173 | + open_sync n/a* |
| 174 | +* This file system and its mount options do not support direct |
| 175 | + I/O, e.g. ext4 in journaled mode. |
| 176 | + |
| 177 | +Compare open_sync with different write sizes: |
| 178 | +(This is designed to compare the cost of writing 16kB in different write |
| 179 | +open_sync sizes.) |
| 180 | + 1 * 16kB open_sync write n/a* |
| 181 | + 2 * 8kB open_sync writes n/a* |
| 182 | + 4 * 4kB open_sync writes n/a* |
| 183 | + 8 * 2kB open_sync writes n/a* |
| 184 | + 16 * 1kB open_sync writes n/a* |
| 185 | + |
| 186 | +Test if fsync on non-write file descriptor is honored: |
| 187 | +(If the times are similar, fsync() can sync data written on a different |
| 188 | +descriptor.) |
| 189 | + write, fsync, close 314754.770 ops/sec 3 usecs/op |
| 190 | + write, close, fsync 314509.045 ops/sec 3 usecs/op |
| 191 | + |
| 192 | +Non-sync'ed 8kB writes: |
| 193 | + write 517299.869 ops/sec 2 usecs/op |
| 194 | +``` |
| 195 | + |
| 196 | +#### Local disk performance, for comparison: |
| 197 | + |
| 198 | +``` |
| 199 | +digoal@pg11-test-> pg_test_fsync -f /data01/digoal/1 |
| 200 | +5 seconds per test |
| 201 | +O_DIRECT supported on this platform for open_datasync and open_sync. |
| 202 | + |
| 203 | +Compare file sync methods using one 8kB write: |
| 204 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 205 | + open_datasync 46574.176 ops/sec 21 usecs/op |
| 206 | + fdatasync 40183.743 ops/sec 25 usecs/op |
| 207 | + fsync 36875.852 ops/sec 27 usecs/op |
| 208 | + fsync_writethrough n/a |
| 209 | + open_sync 42927.560 ops/sec 23 usecs/op |
| 210 | + |
| 211 | +Compare file sync methods using two 8kB writes: |
| 212 | +(in wal_sync_method preference order, except fdatasync is Linux's default) |
| 213 | + open_datasync 17121.111 ops/sec 58 usecs/op |
| 214 | + fdatasync 26438.641 ops/sec 38 usecs/op |
| 215 | + fsync 24562.907 ops/sec 41 usecs/op |
| 216 | + fsync_writethrough n/a |
| 217 | + open_sync 15698.199 ops/sec 64 usecs/op |
| 218 | + |
| 219 | +Compare open_sync with different write sizes: |
| 220 | +(This is designed to compare the cost of writing 16kB in different write |
| 221 | +open_sync sizes.) |
| 222 | + 1 * 16kB open_sync write 28793.172 ops/sec 35 usecs/op |
| 223 | + 2 * 8kB open_sync writes 15720.156 ops/sec 64 usecs/op |
| 224 | + 4 * 4kB open_sync writes 10007.818 ops/sec 100 usecs/op |
| 225 | + 8 * 2kB open_sync writes 5698.259 ops/sec 175 usecs/op |
| 226 | + 16 * 1kB open_sync writes 3116.232 ops/sec 321 usecs/op |
| 227 | + |
| 228 | +Test if fsync on non-write file descriptor is honored: |
| 229 | +(If the times are similar, fsync() can sync data written on a different |
| 230 | +descriptor.) |
| 231 | + write, fsync, close 33399.473 ops/sec 30 usecs/op |
| 232 | + write, close, fsync 33216.001 ops/sec 30 usecs/op |
| 233 | + |
| 234 | +Non-sync'ed 8kB writes: |
| 235 | + write 376584.982 ops/sec 3 usecs/op |
| 236 | +``` |
| 237 | + |
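| 238 | +The difference is clear: a single 8kB fdatasync runs at about 1.14 million ops/sec on both tmpfs and ramfs, against about 40 thousand ops/sec on the local disk, roughly 28 times faster. |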
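To put a number on the gap, the ops/sec figures from the pg_test_fsync runs above can be divided directly. This is a small sketch of that arithmetic; the helper name `speedup` is hypothetical and the inputs are simply copied from the output above.

```shell
#!/bin/sh
# speedup FAST_OPS SLOW_OPS
# Print FAST_OPS / SLOW_OPS rounded to one decimal place.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

# Single 8kB write, fdatasync: tmpfs vs. local disk
speedup 1137033.436 40183.743
# Single 8kB write, fsync: tmpfs vs. local disk
speedup 1146431.736 36875.852
# Non-sync'ed 8kB writes: tmpfs vs. local disk
speedup 529490.541 376584.982
```

The sync-based tests come out around 28x to 31x, while plain non-sync'ed writes differ by only about 1.4x, which suggests that most of the gain comes from sync calls no longer having to wait on the storage device.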
| 239 | + |
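| 240 | +## Other |
| 241 | +hugetlbfs can also be mounted. It is a filesystem backed by huge pages, but it does not support the read and write interfaces, so files on it must be accessed through mmap. |
| 242 | + |
| 243 | +For details, see: |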
| 244 | + |
| 245 | +https://www.ibm.com/developerworks/cn/linux/l-cn-hugetlb/index.html |
| 246 | + |
| 247 | +## References |
| 248 | +https://docs.oracle.com/cd/E11882_01/server.112/e10839/appi_vlm.htm#UNXAR397 |
| 249 | + |
| 250 | +http://www.cnblogs.com/jintianfree/p/3993893.html |
| 251 | + |
| 252 | +https://lwn.net/Articles/376606/ |
| 253 | + |
| 254 | +https://www.ibm.com/developerworks/cn/linux/l-cn-hugetlb/index.html |
| 255 | + |
| 256 | +[PostgreSQL Huge Page usage recommendations: notes for large-memory hosts and instances](../201803/20180325_02.md) |