.\".so aufs.tmac
.
.eo
.de TQ
.br
.ns
.TP \$1
..
.de Bu
.IP \(bu 4
..
.ec
.\" end of macro definitions
.
.\" ----------------------------------------------------------------------
.TH aufs 5 \*[AUFS_VERSION] Linux "Linux Aufs User's Manual"
.SH NAME
aufs \- advanced multi layered unification filesystem. version \*[AUFS_VERSION]
.\" ----------------------------------------------------------------------
.SH DESCRIPTION
Aufs is a stackable unification filesystem, like Unionfs, which unifies
several directories and provides a single merged directory.
In its early days, aufs was an entirely re-designed and re-implemented
version of the Unionfs Version 1.x series. After
many original ideas, approaches and improvements, it
has become totally different from Unionfs while keeping the basic features.
See the Unionfs Version 1.x series for the basic features.
More recently, the Unionfs Version 2.x series began adopting some of the same
approaches as aufs.
.\" ----------------------------------------------------------------------
.SH MOUNT OPTIONS
At mount time, options are interpreted in the following order:
.RS
.Bu
simple flags, except xino/noxino and udba=notify
.Bu
branches
.Bu
xino/noxino
.Bu
udba=notify
.RE
At remount time,
the options are interpreted in the given order,
i.e. left to right.
.RS
.Bu
create or remove
whiteout\-base(\*[AUFS_WH_BASE]) and
whplink\-dir(\*[AUFS_WH_PLINKDIR]) if necessary
.RE
.
.TP
.B br:BRANCH[:BRANCH ...] (dirs=BRANCH[:BRANCH ...])
Adds new branches.
(cf. Branch Syntax).
Aufs rejects a branch which is an ancestor or a descendant of another
branch. Such branches are called overlapped. When the branch is a
loopback\-mounted directory, aufs also checks the source fs\-image file of
the loopback device. If the source file is a descendant of another branch,
the branch is rejected too.
After mounting aufs or adding a branch, if you move a branch under
another branch and make it a descendant of that branch, aufs will not
work correctly. By default (since linux\-3.2 until linux\-3.18\-rc1), aufs
prohibits such an operation internally,
but a way to do it remains.
(cf. Branch Syntax).
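.IP
As a minimal sketch of how a `br:' option string is composed, the
following shell function joins branch paths with colons. The function
name aufs_br_opt and the paths are hypothetical; they are not part of
aufs or aufs\-util.
.IP
.nf
```shell
# Hypothetical helper: join branch paths into one br: option string.
aufs_br_opt() {
    opt="br"
    for b in "$@"; do
        opt="$opt:$b"
    done
    echo "$opt"
}

# Possible use (needs root and the aufs module; paths are placeholders):
#   mount -t aufs -o "$(aufs_br_opt /tmp/rw /tmp/ro)" none /mnt/aufs
```
.fi
.IP
The first path given becomes branch index 0, the highest branch.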
.
.TP
.B [ add | ins ]:index:BRANCH
Adds a new branch.
The index begins with 0.
Aufs creates
whiteout\-base(\*[AUFS_WH_BASE]) and
whplink\-dir(\*[AUFS_WH_PLINKDIR]) if necessary.
If a file with the same name exists on a lower branch (larger index),
aufs will hide the lower file.
You can only see the highest file.
The result may be confusing if the added branch has whiteouts (including
diropq), since they may or may not hide the lower entries.
.\" It is recommended to make sure that the added branch has no whiteout.
(cf. DIAGNOSTICS).
Even if a process has once mapped a file by mmap(2) with MAP_SHARED
and the same named file exists on the lower branch,
the process still refers to the file on the lower (hidden)
branch after the branch is added.
If you want to update the contents of a process address space after
adding, you need to restart your process or open/mmap the file again.
.\" Usually, such files are executables or shared libraries.
(cf. Branch Syntax).
.
.TP
.B del:dir
Removes a branch.
Aufs does not remove
whiteout\-base(\*[AUFS_WH_BASE]) and
whplink\-dir(\*[AUFS_WH_PLINKDIR]) automatically.
For example, when you add a RO branch which was previously unified as RW, you
will see whiteout\-base or whplink\-dir on the added RO branch.
If a process is referencing a file/directory on the branch being deleted
(by open, mmap, current working directory, etc.), aufs will return an
EBUSY error. In this case, the `aubusy' script (in aufs\-util.git and aufs2\-util.git) is
useful to identify which process (and which file) makes the branch busy.
.
.TP
.B mod:BRANCH
Modifies the permission flags of the branch.
Aufs creates or removes
whiteout\-base(\*[AUFS_WH_BASE]) and/or
whplink\-dir(\*[AUFS_WH_PLINKDIR]) if necessary.
If the branch permission is being changed from `rw' to `ro', and a process
has mapped a file by mmap(2)
.\" with MAP_SHARED
on the branch, the process may or may not
be able to modify its mapped memory region after the branch
permission flags are modified.
Additionally, when you enable CONFIG_IMA (in linux\-2.6.30 and later), IMA
may produce some wrong messages. But this is the same as when a
filesystem is switched to `ro' in an emergency.
(cf. Branch Syntax).
.
.TP
.B append:BRANCH
Equivalent to `add:(last index + 1):BRANCH'.
(cf. Branch Syntax).
.
.TP
.B prepend:BRANCH
Equivalent to `add:0:BRANCH'.
(cf. Branch Syntax).
.
.TP
.B xino=filename
Use the external inode number bitmap and translation table.
When CONFIG_AUFS_EXPORT is enabled, the external inode generation table
is used too.
It is set to
<FirstWritableBranch>/\*[AUFS_XINO_FNAME] by default, or
\*[AUFS_XINO_DEFPATH].
A comma character is not allowed in the filename.
The files are created per aufs mount and per branch filesystem, and then
unlinked. So you
cannot find these files, but they exist and are read/written frequently by
aufs.
When the specified file already exists, mount(8) returns an error.
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
If you enable CONFIG_SYSFS, the path of the xino files is not shown in
/proc/mounts (and /etc/mtab); instead it is shown in
<sysfs>/fs/aufs/si_<id>/xi_path.
Otherwise, it is shown in /proc/mounts unless it is the default
path.
.
.TP
.B noxino
Stop using the external inode number bitmap and translation table.
If you use this option,
some applications will not work correctly.
.\" And pseudo link feature will not work after the inode cache is
.\" shrunk.
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
.
.TP
.B trunc_xib
Truncate the external inode number bitmap file. The truncation is done
automatically when you delete a branch unless you specify the
`notrunc_xib' option.
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
.
.TP
.B notrunc_xib
Stop truncating the external inode number bitmap file when you delete
a branch.
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
.
.TP
.B trunc_xino_path=BRANCH | itrunc_xino=INDEX
Truncate the external inode number translation table per branch. The
branch can be specified by path or by index (counted from 0).
Sometimes the xino file for a tmpfs branch grows very large. If
you want to avoid this, try "mount \-o
remount,trunc_xino_path=BRANCH /your/aufs" (or itrunc_xino=INDEX). It
will shrink the xino file for BRANCH. These options are one\-time
actions, so the size may grow again. To make the truncation happen
automatically when necessary, try the trunc_xino option.
These options are already implemented, but their design is not yet fixed
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
.
.TP
.B trunc_xino | notrunc_xino
Enable (or disable) the automatic truncation of xino files.
The truncation is done by discarding the internal "hole" (unused blocks).
.\" When the number of blocks by the xino file for the branch exceeds
.\" the predefined upper limit, the automatic truncation begins. If the xino
.\" files contain few holes and the result size is still exceeds the upper
.\" limit, then the upper limit is added by \*[AUFS_XINO_TRUNC_STEP] blocks. The
.\" initial upper limit is \*[AUFS_XINO_TRUNC_INIT] blocks.
.\" Currently the type of branch fs supported by this automatic truncation
.\" is tmpfs or ramfs only.
The default is notrunc_xino.
These options are already implemented, but their design is not yet fixed
(cf. External Inode Number Bitmap, Translation Table and Generation Table).
TODO: customizable two values for upper limit
.\" .
.\" .TP
.\" .B trunc_xino_v=n:n
.
.TP
.B acl
.TQ
.B noacl
Enable or disable POSIX Access Control Lists. This feature depends
entirely on the branch fs. If your branch fs doesn't support POSIX
ACL, these options are meaningless. CONFIG_FS_POSIX_ACL is required.
.
.TP
.B create_policy | create=CREATE_POLICY
.TQ
.B copyup_policy | copyup | cpup=COPYUP_POLICY
Policies to select one among multiple writable branches. The default
values are `create=tdp' and `cpup=tdp'.
The link(2) and rename(2) system calls are exceptions. In aufs, they
try to keep their operations in the branch where the source exists.
(cf. Policies to Select One among Multiple Writable Branches).
.
.TP
.B dio
Enable Direct I/O support (including Linux AIO), and always make open(2)
with O_DIRECT succeed. But if your branch filesystem doesn't support
it, then the subsequent I/O will fail
(cf. Direct I/O).
.
.TP
.B nodio
Disable Direct I/O (including Linux AIO), and always make open(2) with
O_DIRECT fail.
This is the default
(cf. Direct I/O).
.
.TP
.B verbose | v
Print some information.
Currently, it only reports the busy file (or inode) when deleting a branch.
.
.TP
.B noverbose | quiet | q | silent
Disable `verbose' option.
This is the default.
.
.TP
.B sum
df(1)/statfs(2) returns the total number of blocks and inodes of
all branches.
When the block sizes of the branches are not all equal, aufs chooses the
smallest one and recalculates the number of blocks (including bavail and
bfree).
Note that system calls may return ENOSPC even if
df(1)/statfs(2) shows that aufs has some free space/inodes.
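.IP
The block\-size normalization described above can be sketched as
follows. This is illustrative arithmetic only, not aufs's
implementation; the function name sum_blocks and the
"BLOCKSIZE:BLOCKS" argument format are assumptions made for this
example.
.IP
.nf
```shell
# Illustrative arithmetic only, not aufs's implementation.
# Each argument is "BLOCKSIZE:BLOCKS" for one branch (hypothetical format).
sum_blocks() {
    min=0
    # First pass: find the smallest block size among the branches.
    for pair in "$@"; do
        bsize=${pair%%:*}
        if [ "$min" -eq 0 ] || [ "$bsize" -lt "$min" ]; then
            min=$bsize
        fi
    done
    # Second pass: convert every branch to units of the smallest size.
    total=0
    for pair in "$@"; do
        bsize=${pair%%:*}
        blocks=${pair##*:}
        total=$((total + blocks * bsize / min))
    done
    echo "$min $total"
}
```
.fi
.IP
For example, a branch with 100 blocks of 4096 bytes and a branch with 50
blocks of 1024 bytes are reported as 450 blocks of 1024 bytes.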
.
.TP
.B nosum
Disable `sum' option.
This is the default.
.
.TP
.B dirwh=N
Watermark for actually removing a dir at rmdir(2) and rename(2).
If the target dir which is being removed or renamed (destination dir)
has a huge number of whiteouts, i.e. the dir is empty logically but not
physically, the cost to remove/rename the single
dir may be very high.
Aufs is
required to unlink all of the whiteouts internally before issuing
rmdir/rename to the branch.
To reduce the cost of a single system call,
aufs renames the target dir to a whiteout-ed temporary name and
invokes a pre\-created
kernel thread to remove the whiteout-ed children and the target dir.
The rmdir/rename system call returns just after kicking the thread.
When the number of whiteout-ed children is less than the value of
dirwh, aufs removes them within the single system call instead of passing
the work to another thread.
This value is ignored when the branch is NFS.
The default value is \*[AUFS_DIRWH_DEF].
.\" .
.\" .TP
.\" .B rdcache=N
.
.TP
.B rdblk=N
Specifies the size, in bytes, of the internal VDIR block which is allocated
at a time.
The VDIR block will be allocated several times when necessary. If your
directory has tens of thousands of files, you may want to expand this size.
The default value is defined as \*[AUFS_RDBLK_DEF].
The size has to be larger than NAME_MAX (usually 255) and kmalloc\-able
(the maximum limit depends on your system; at least 128KB is available
on every system).
If you set it to zero, then the internal estimation of the directory
size is turned on, and aufs sets the value for each directory individually.
Sometimes the estimated value may be inappropriate since the estimation
is not very clever. Setting zero is useful when you use RDU
(cf. VDIR/readdir(3) in user\-space (RDU)).
Otherwise it may put pressure on kernel memory space.
You can reset the value to the default at any time by specifying rdblk=def.
(cf. Virtual or Vertical Directory Block).
.
.TP
.B rdhash=N
Specifies the size of the internal VDIR hash table which is used to compare
the file names under the same named directory on multiple branches.
The VDIR hash table will be allocated in readdir(3)/getdents(2),
rmdir(2) and rename(2) for the existing target directory. If your
directory has tens of thousands of files, you may want to expand this size.
The default value is defined as \*[AUFS_RDHASH_DEF].
The size has to be larger than zero, and it will be multiplied by 4 or 8
(for 32\-bit and 64\-bit respectively, currently). The result must be
kmalloc\-able
(the maximum limit depends on your system; at least 128KB is available
on every system).
If you set it to zero, then the internal estimation for the directory
is turned on, and aufs sets the value for each directory individually.
Sometimes the estimated value may be inappropriate since the estimation
is not very clever. Setting zero is useful when you use RDU
(cf. VDIR/readdir(3) in user\-space (RDU)).
Otherwise it may put pressure on kernel memory space.
You can reset the value to the default at any time by specifying rdhash=def.
(cf. Virtual or Vertical Directory Block).
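.IP
The multiplication described above can be sketched as a small shell
function. The name rdhash_bytes and its arguments are hypothetical; the
function only reproduces the arithmetic stated in the text and does not
query the kernel.
.IP
.nf
```shell
# Illustrative only: hash table size in bytes for rdhash=N.
# Arguments: N, then the word size in bits (32 or 64).
rdhash_bytes() {
    case "$2" in
        32) echo $(($1 * 4)) ;;
        64) echo $(($1 * 8)) ;;
        *) return 1 ;;
    esac
}
```
.fi
.IP
For example, rdhash=16384 on a 64\-bit system gives 131072 bytes, which
is the 128KB mentioned above as available on every system.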
.
.TP
.B plink
.TQ
.B noplink
Specifies whether to use the `pseudo link' feature.
The default is `plink', which means this feature is used.
(cf. Pseudo Link).
.
.TP
.B clean_plink
Removes all pseudo\-links in memory.
In order to make pseudo\-links permanent, use the
`auplink' utility just before one of these operations:
unmounting aufs,
using the `ro' or `noplink' mount option,
deleting a branch from aufs,
adding a branch into aufs,
or changing your writable branch to readonly.
If you installed both /sbin/mount.aufs and /sbin/umount.aufs, and your
mount(8) and umount(8) support them,
the `auplink' utility will be executed automatically and flush pseudo\-links.
(cf. Pseudo Link).
.
.TP
.B udba=none | reval | notify
Specifies the level of UDBA (User's Direct Branch Access) test.
(cf. User's Direct Branch Access and Inotify Limitation).
.
.TP
.B diropq=whiteouted | w | always | a
Specifies whether mkdir(2) and rename(2) (in the directory case) make the
created directory `opaque' or not.
In other words, whether or not to create `\*[AUFS_WH_DIROPQ]' under the
created or renamed directory.
When you specify diropq=w or diropq=whiteouted, aufs will not create
it if the
directory was not whiteout-ed or opaqued. If the directory was whiteout-ed
or opaqued, the created or renamed directory will be opaque.
When you specify diropq=a or diropq=always, aufs will always create
it regardless of whether
the directory was whiteout-ed/opaqued or not.
The default value is diropq=w, which means it is not created when unnecessary.
.\" If you define CONFIG_AUFS_COMPAT at aufs compiling time, the default will be
.\" diropq=a.
.\" You need to consider this option if you are planning to add a branch later
.\" since `diropq' affects the same named directory on the added branch.
.
.TP
.B warn_perm
.TQ
.B nowarn_perm
When adding a branch, aufs will issue a warning about the uid/gid/permission of
the branch directory being added
when they differ from the existing branch's. This difference may or
may not impose a security risk.
If you are sure that there is no problem and want to stop the warning,
use the `nowarn_perm' option.
The default is `warn_perm' (cf. DIAGNOSTICS).
.
.TP
.B shwh
.TQ
.B noshwh
By default (noshwh), aufs doesn't show the whiteouts and
they just hide the same named entries in the lower branches. The
whiteouts themselves never appear either.
If you enable CONFIG_AUFS_SHWH and specify the `shwh' option, aufs
will show you the names of the whiteouts
while keeping their ability to hide the lower entries.
Honestly speaking, I am rather confused by these `visible whiteouts'.
But a user who originally requested this feature wrote a nice how\-to
document about this feature. See the Tips file in the aufs CVS tree.
.
.TP
.B dirperm1
.TQ
.B nodirperm1
By default (nodirperm1), aufs respects the directory permission bits on
all branches equally, which means that if the permission bits for a directory
on a lower readonly branch prohibit you from reading it, then you cannot read it
even if you run "chmod a+rx" (and aufs copies it up).
With this option (dirperm1), the behavior changes: aufs checks the
permission bits of the directory on the topmost branch, and the
permission bits on all lower branches are ignored.
In other words, you can read a directory even if the lower readonly branch
fs prohibits it by its permission bits.
This feature may invite a security risk similar to that of a world\-writable
upper branch. As in that case, the dirperm1 option will produce a warning too.
.
.TP
.B dirren
.TQ
.B nodirren
Activates or deactivates the special handling for renaming a directory
(the DIRREN feature).
In order to use this feature, CONFIG_AUFS_DIRREN has to be enabled and
the `dirren' mount option has to be specified too.
By default (nodirren), aufs returns an EXDEV error for
rename(2) of a directory which exists on multiple branches. Note that
DIRREN is slow (I have not yet measured it though) since it loads and
saves the list of the inode\-numbers per branch and the detailed
information per branch.
Note that the `udba=notify' option may not work with DIRREN, since it is
based upon the name, while DIRREN handles both the before\- and
after\-renamed names. The internal name comparison may not work
correctly. In this case, aufs behaves as if the default `udba=reval' were
specified.
.\" ----------------------------------------------------------------------
.SH Module Parameters
.TP
.B brs=1 | 0
Specifies whether to use the branch path data file under sysfs.
If the number of your branches is large or their paths are long
and you hit the limitation of mount(8) or /etc/mtab, you need to
enable CONFIG_SYSFS and set the aufs module parameter brs=1.
When this parameter is set to 1, aufs does not show the `br:' (or dirs=)
mount option through /proc/mounts (and /etc/mtab). So you can
avoid the page limitation of
mount(8) or /etc/mtab.
Aufs shows branch paths through <sysfs>/fs/aufs/si_XXX/brNNN.
Actually the file under sysfs also has a size limitation, but I don't
think it is harmful.
There is one more side effect of setting this parameter to 1.
If you rename your branch, the branch path written in /etc/mtab becomes
obsolete and a future remount will fail with an error due to the
unmatched parameters (remember that mount(8) may take the options from
/etc/mtab and pass them to the system call).
If you set 1, /etc/mtab will not hold the branch path and you will not
run into such trouble. The entries for the
branch path under sysfs are generated dynamically, so they never become obsolete.
But I don't think users want to rename branches so often.
If CONFIG_SYSFS is disabled, this parameter is always set to 0.
.
.TP
.B allow_userns= Y | N
Allows an unprivileged mount under a user namespace.
A userns mount to put AUFS into a chroot environment can be useful,
but it is also a security concern. This parameter sets the internal flag
FS_USERNS_MOUNT and allows userns unconditionally.
See the discussion in
http://www.mail\-archive.com/aufs\-users@lists.sourceforge.net/msg04266.html
and its thread.
The default is `N'.
If CONFIG_USER_NS is disabled, this parameter is meaningless.
.
.TP
.B sysrq=key
Specifies the MagicSysRq key for debugging aufs.
You need to enable both CONFIG_MAGIC_SYSRQ and CONFIG_AUFS_DEBUG.
Currently this is for developers only.
The default is `a'.
.
.TP
.B debug= 0 | 1
Disables (0) or enables (1) debug printing in aufs.
This parameter can be changed dynamically.
You need to enable CONFIG_AUFS_DEBUG.
Currently this is for developers only.
The default is `0' (disable).
.\" ----------------------------------------------------------------------
.SH Entries under Sysfs and Debugfs
See linux/Documentation/ABI/*/{sys,debug}fs\-aufs.
.\" ----------------------------------------------------------------------
.SH Gain the performance in return for the features
There are a few steps you can take to gain better performance. They
essentially drop features from aufs and gain performance in
return. You don't have to drop all of them; that may be too
much. Try them step by step, measuring the performance you want using
your typical workload.
.SS Patch file
.
As my recommendation, there is one patch file in the aufs[34]\-standalone.git
tree, tmpfs\-idr.patch. It introduces IDR for
tmpfs inode\-number management, and prevents the
size of aufs's XINO/XIB files from growing rapidly. If you don't use TMPFS as
your branch, the patch won't be necessary.
.SS Configuration
.
Disable all unnecessary options except CONFIG_AUFS_RDU (readdir in
user\-space). RDU requires an external user\-space library libau.so, but
it is very effective, particularly for a directory which has tens of
thousands of files. To use RDU, users have to set the LD_PRELOAD environment
variable. If they don't set it, this configuration will do no harm. The
size of the aufs module will be a little larger, but the time\-performance
(speed) won't be damaged.
.SS Mount option
.
As a first step, I'd recommend you try `dirperm1', `udba=none' and
`nodirren'.
`dirperm1' stops aufs from digging down into the lower branches when checking the
directory permission bits, and `udba=none' makes aufs not watch for
external modification, i.e. by\-passing aufs (users' direct branch access).
These options can be changed and restored at any time.
For the second step, try `notrunc_xino' and `notrunc_xib'.
They are not always very effective. In particular, if you have
applied tmpfs\-idr.patch, the effect is small since most
of the effect is achieved by the patch. But they do have some effect. In
this case, the size of XINO and XIB will only grow, and will not be truncated. In
other words, it is a time\-vs\-space issue.
For the third and last step, try `noplink' and `noxino'.
With these options, aufs behaves a little un\-POSIX\-ly, which means losing
the features that maintain the hard\-link (pseudo\-link) and the inode
numbers. Some behaviours may surprise users, but many internal processes
will be skipped and the resulting performance will be better.
For your convenience, mount.aufs(8) provides the `\*[DROPLVL]=N'
mount option. `N' is the level (see above), and you can specify either 1, 2
or 3 (or their negative values, described below). It is not a real mount option,
which means it is not interpreted
by kernel\-space. When this option is given, mount.aufs(8)
translates it into several other (real) mount options, and passes them to
kernel\-space as if they had been specified originally. Currently there are 3 levels.
.RS
.nr step 1 1
.IP \n[step] 4
\*[DROPLVL1]
.IP \n+[step]
\*[DROPLVL2]
.IP \n+[step]
\*[DROPLVL3]
.RE
For example, when you give `\*[DROPLVL]=3', mount.aufs(8) converts
it to `\*[DROPLVL1],\*[DROPLVL2],\*[DROPLVL3]'.
mount.aufs(8) also provides the `ephemeral' mount option, which is equivalent
to `\*[DROPLVL]=3'.
For further convenience, mount.aufs(8) provides negative values
for each level. Note that there is no level 0, and no difference between
2 and \-2.
The options in `\*[DROPLVL2]' are already implemented, but their design is not fixed
(cf. External Inode Number Bitmap, Translation Table and Generation
Table). And the current default value is `\*[DROPLVL2]', so technically
speaking `\*[DROPLVL2]' is less important.
.RS
.nr step 1 1
.IP \-\n[step] 4
\*[DROPLVL1R]
.IP \-\n+[step]
\*[DROPLVL2R]
.IP \-\n+[step]
\*[DROPLVL3R]
.RE
The purpose of the negative values is to revert the effect of the
positive values (counter\-level). Note the XINO path in
`\-3'. In order to revert `noxino' in `\*[DROPLVL]=3', you
need to specify the actual XINO path, but that depends entirely on
your environment; mount.aufs(8) doesn't know it and can only
provide the default path. So generally it will be necessary to append
`xino=<your XINO path>' to `\*[DROPLVL]=\-3'.
Reverting `noatime' to `relatime' is rather tricky, due to the
behaviours of mount(8) and mount(2). You need to run
`remount,strictatime' before `remount,\*[DROPLVL]=\-1'.
Also note the order of the mount options. For example, if you want to drop some
features but keep the UDBA level at its default, you can specify
`\*[DROPLVL]=1,udba=reval'. If you write them in the reverse order, as
`udba=reval,\*[DROPLVL]=1', then `udba=none' in `\*[DROPLVL]=1' takes
effect and the udba level specified before \*[DROPLVL] is overridden.
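As a sketch of the last two points, the first two commands revert level 1
in the required order, and the third one drops level 1 while keeping the
default UDBA level. The mount point /your/aufs/root is only a placeholder.
.nf
# mount -o remount,strictatime /your/aufs/root
# mount -o remount,\*[DROPLVL]=-1 /your/aufs/root
# mount -o remount,\*[DROPLVL]=1,udba=reval /your/aufs/root
.fi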
.\" ----------------------------------------------------------------------
.SH Git work\-tree and aufs
Git has a cache called the `index' file. This cache holds the
identity of each file. Here `identity' means the pair
struct stat.st_{dev,ino}. (Git may consider other stat members too, but
the essential part of the identity is still dev and ino.)
Since aufs is a virtual filesystem and manages the inode numbers, it
provides its own st_dev and st_ino. They differ from those recorded in the `index' cache
in git, and some git operations have to refresh the `index' cache, which
may take a long time.
For instance,
.RS
.Bu
/branch/ro has 0x0801 for its st_dev
.Bu
/branch/ro/proj.git/fileA has 100 for its st_ino
.Bu
/branch/ro/proj.git/.git/index contains {0x0801,100} as fileA's
identity
.Bu
mount /u as /branch/rw + /branch/ro, /u is aufs
.Bu
we can see the contents of /u/proj.git/.git/index is equivalent to
/branch/ro/proj.git/.git/index
.RE
In this case, aufs provides {0x0802,110} (for example) as fileA's
identity, which is different from that of /branch/ro/proj.git/fileA.
If you run git\-diff or something similar, the
behaviour of git differs a little.
.RS
.Bu
git issues stat(2) and gets st_{dev,ino} pair.
.Bu
git compares the obtained pair with the one in the index file.
.Bu
when they differ, git opens the file, reads all the
data, compares it with the cached data, and finds that nothing has
changed.
.Bu
if the obtained pair is equal to the one in the index file, the
open/read/compare steps are skipped.
.RE
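The identity which git relies on can be inspected from the shell. The
following is a minimal sketch, assuming GNU stat(1); the function name
file_identity is made up for this illustration.
.nf
# Print the st_dev:st_ino pair of FILE (GNU stat is assumed).
file_identity() {
    stat -c %d:%i "$1"
}
.fi
Comparing the output for /branch/ro/proj.git/fileA with the output for
/u/proj.git/fileA should show two different pairs while /u is mounted
as aufs.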
The same issue can happen when you copy a git working tree
somewhere else. The identity of every file is changed by the copy, and the cached
identities in the index file become obsolete.
Once you complete git\-status or something similar, the index file is
updated, and the full open/read/compare steps do not happen anymore.
This behaviour of git can be controlled by git's configuration variable
core.checkstat.
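Git's documentation describes a `minimal' value for this variable, which
excludes the inode and device numbers (among other fields) from the
check. A sketch, to be run inside the work\-tree; whether it suits your
workflow is for you to judge.
.nf
$ git config core.checkStat minimal
.fi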
.\" ----------------------------------------------------------------------
.SH Branch Syntax
.TP
.B dir_path[ =permission [ + attribute ] ]
.TQ
.B permission := rw | ro | rr
.TQ
.B attribute := wh | nolwh | unpin | coo_reg | coo_all | moo | fhsm | icexsec | icexsys | icextr | icexusr | icexoth | icex
dir_path is a directory path.
The keyword after `dir_path=' is the
permission flag for that branch.
A comma, a colon or the permission flag string (including `=') is not
allowed in the path.
Any (ordinary) filesystem can be a branch, but some are not accepted, such as
sysfs, procfs and unionfs.
If you specify such a filesystem as an aufs branch, aufs returns an error
saying it is unsupported.
Also aufs expects the writable branch filesystem to support a maximum
filename length of NAME_MAX. The limit on a readonly branch filesystem can be
shorter.
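The branch syntax can be assembled mechanically. The following is a
minimal sketch and not part of aufs: the function name aufs_branch_spec
is made up, and rejecting every `=' in the path is a deliberately
conservative reading of the rule about the permission flag string.
.nf
# Print DIR_PATH=PERMISSION, refusing paths aufs cannot parse.
aufs_branch_spec() {
    case "$1" in
    *,* | *:* | *=*)
        echo "unsupported character in path: $1" >&2
        return 1
        ;;
    esac
    printf "%s=%s" "$1" "$2"
}
.fi
For example, `aufs_branch_spec /branch/ro ro+wh' prints /branch/ro=ro+wh,
and several such strings can be joined with colons after `br:' in the
mount options.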
Cramfs in the linux stable release has strange inodes which confuse
aufs. For example,
.nf
$ mkdir -p w/d1 w/d2
$ > w/z1
$ > w/z2
$ mkcramfs w cramfs
$ sudo mount -t cramfs -o ro,loop cramfs /mnt
$ find /mnt -ls
76 1 drwxr-xr-x 1 jro 232 64 Jan 1 1970 /mnt
1 1 drwxr-xr-x 1 jro 232 0 Jan 1 1970 /mnt/d1
1 1 drwxr-xr-x 1 jro 232 0 Jan 1 1970 /mnt/d2
1 1 -rw-r--r-- 1 jro 232 0 Jan 1 1970 /mnt/z1
1 1 -rw-r--r-- 1 jro 232 0 Jan 1 1970 /mnt/z2
.fi
Both directories and both files have the same inode number, each with a
link count of one. Aufs cannot handle such inodes correctly.
Currently, aufs includes a tiny workaround for such inodes. But some
applications may not work correctly, since the aufs inode number for such
an inode will change silently.
If you do not have any empty files, empty directories or special files,
the inodes on cramfs will all be fine.
A writable branch should not be shared between multiple
aufs mounts. A readonly branch can be shared.
The maximum number of branches is configurable at compile time (127 by
default).
When an unknown permission or attribute is given, aufs silently sets that
branch to ro.
.SS Permission
.
.TP
.B rw
Readable and writable branch. Set as default for the first branch.
If the branch filesystem is mounted as readonly, you cannot set it `rw.'
.\" A filesystem which does not support link(2) and i_op\->setattr(), for
.\" example FAT, will not be used as the writable branch.
.
.TP
.B ro
Readonly branch which has no whiteouts on it.
Set as default for all branches except the first one. Aufs never issues
a write operation or a whiteout lookup to this branch.
.
.TP
.B rr
Real readonly branch, a special case of `ro', for a natively readonly
branch. Assuming the branch is natively readonly, aufs can optimize
some internal operations. For example, if you specify the `udba=notify'
option, aufs does not set fsnotify or inotify for the things on an rr branch.
Set by default for a branch whose fs\-type is either `iso9660',
`cramfs' or `romfs' (and `squashfs' for linux\-2.6.29 and later).
When your branch is on a slower device and you have some spare
capacity on your hdd, you may want to try the ulobdev tool in the ULOOP sample.
It can cache the contents of the real device on another, faster device,
so you get better access performance.
The ulobdev tool is for a generic block device, and ulohttp is for a
filesystem image on an http server.
If you want to spin down your hdd to save
battery life or something, you may want to use ulobdev to reduce
access to the hdd, too.
See $AufsCVS/sample/uloop for details.
.SS Attribute
.
.TP
.B wh
Readonly branch which has, or might have, whiteouts on it.
Aufs never issues a write operation to this branch, but does look up whiteouts.
Use this as `<branch_dir>=ro+wh'.
.
.TP
.B nolwh
Usually, aufs creates a whiteout as a hardlink on a writable
branch. This attribute prohibits aufs from creating hardlinked
whiteouts, including the source file of all hardlinked whiteouts
(\*[AUFS_WH_BASE].)
If you do not like hardlinks, or your writable branch does not support
link(2), then use this attribute.
But I am afraid a filesystem which does not support link(2) natively
will fail in other places such as copy\-up.
Use this as `<branch_dir>=rw+nolwh'.
You may also want to try the `noplink' mount option, although it is not recommended.
.
.TP
.B unpin
By default, aufs sets `pin' on the branch dir, which means that users
can neither remove nor rename the branch top dir, as if it were a mount\-point.
In some cases users may need to rename the branch top dir, so
this attribute is implemented. If you specify `unpin' as a branch
attribute, the branch top dir stops behaving as a mount\-point and you can rename
it.
Needless to say, if you remove the branch top dir, then aufs cannot work.
Since linux\-3.18\-rc1, this attribute has been meaningless. It is simply
ignored, and every branch top dir behaves as if this attribute were always
specified.
.
.TP
.B coo_reg | coo_all
Copy\-up on open.
By default the internal copy\-up is executed only when it is really necessary:
not when a file is opened for writing, but when data is actually
written.
These attributes apply not only to readonly branches but
also to writable branches. `coo_reg' handles regular files only and
`coo_all' handles regular files plus directories. Special
files and symlinks are never copied\-up.
Additionally, an NFS server may not issue open(2) when an NFS client issues
open(2). This means that the file may not be copied\-up when
the NFS client issues open(2).
The internal copy\-up operation triggered by these attributes is unrelated to
the COPYUP_POLICY
(cf. Policies to Select One among Multiple Writable Branches),
which means `copy\-up on open' always chooses the nearest upper writable
branch.
Even if these attributes are set on multiple writable branches,
the internal copy\-up operation is done once, not recursively.
Users who have many (over 100) branches may want to know and analyze
when and which files are copied\-up. Inserting a new upper branch which
contains only such files may improve the performance of aufs.
`Copy\-up on open' by itself may not be very attractive, but it becomes
useful in combination with the FHSM feature (File\-based Hierarchy Storage
Management).
.
.TP
.B moo
Move\-up on open.
Very similar to coo, except that moo unlinks the copy\-up source
after a successful operation. This attribute handles regular files
only, and obviously cannot be specified for a readonly branch.
Users can specify all these attributes for a single writable branch, but
only the last one specified takes effect. The other coo/moo attributes are
silently ignored.
`Move\-up on open' by itself may not be very attractive, but it becomes
useful in combination with the FHSM feature (File\-based Hierarchy Storage
Management).
.
.TP
.B fhsm
File\-based Hierarchy Storage Management.
Specifies that this branch is a participant of aufs FHSM. Refer to
.B
aufs_fhsm(5)
for details.
.
.TP
.B icexsec | icexsys | icextr | icexusr | icexoth | icex
Ignore errors on copying\-up/down XATTRs.
When an internal copy\-up/down happens, aufs tries to copy all XATTRs.
An error can happen here because the XATTR support on the dst
branch may differ from that on the src branch. If you know how each branch
supports or does not support XATTRs, you can specify these attributes.
`icexsec' means to ignore an error on copying\-up/down XATTRs categorized
as "security" (for LSM and capability). `icexsys,' `icextr,' and
`icexusr,' are for the "system" (for posix ACL), "trusted" and "user"
categories respectively.
`icexoth' is for any other category. For convenience, `icex' sets them
all.
See also linux/Documentation/filesystems/aufs/design/06xattr.txt.
These attributes are essentially for the writable branches. But when you
use
.B
aufs_fhsm(5),
you may want to
specify them to the readonly branches too. So they are available for the
readonly branches.
.\" .SS FUSE as a branch
.\" A FUSE branch needs special attention.
.\" The struct fuse_operations has a statfs operation. It is OK, but the
.\" parameter is struct statvfs* instead of struct statfs*. So almost
.\" all user\-space implementation will call statvfs(3)/fstatvfs(3) instead of
.\" statfs(2)/fstatfs(2).
.\" In glibc, [f]statvfs(3) issues [f]statfs(2), open(2)/read(2) for
.\" /proc/mounts,
.\" and stat(2) for the mountpoint. With this situation, a FUSE branch will
.\" cause a deadlock in creating something in aufs. Here is a sample
.\" scenario,
.\" .\" .RS
.\" .\" .IN -10
.\" .Bu
.\" create/modify a file just under the aufs root dir.
.\" .Bu
.\" aufs acquires a write\-lock for the parent directory, ie. the root dir.
.\" .Bu
.\" A library function or fuse internal may call statfs for a fuse branch.
.\" The create=mfs mode in aufs will surely call statfs for each writable
.\" branches.
.\" .Bu
.\" FUSE in kernel\-space converts and redirects the statfs request to the
.\" user\-space.
.\" .Bu
.\" the user\-space statfs handler will call [f]statvfs(3).
.\" .Bu
.\" the [f]statvfs(3) in glibc will access /proc/mounts and issue
.\" stat(2) for the mountpoint. But those require a read\-lock for the aufs
.\" root directory.
.\" .Bu
.\" Then a deadlock occurs.
.\" .\" .RE 1
.\" .\" .IN
.\"
.\" In order to avoid this deadlock, I would suggest not to call
.\" [f]statvfs(3) from fuse. Here is a sample code to do this.
.\" .nf
.\" struct statvfs stvfs;
.\"
.\" main()
.\" {
.\" statvfs(..., &stvfs)
.\" or
.\" fstatvfs(..., &stvfs)
.\" stvfs.f_fsid = 0
.\" }
.\"
.\" statfs_handler(const char *path, struct statvfs *arg)
.\" {
.\" struct statfs stfs
.\"
.\" memcpy(arg, &stvfs, sizeof(stvfs))
.\"
.\" statfs(..., &stfs)
.\" or
.\" fstatfs(..., &stfs)
.\"
.\" arg->f_bfree = stfs.f_bfree
.\" arg->f_bavail = stfs.f_bavail
.\" arg->f_ffree = stfs.f_ffree
.\" arg->f_favail = /* any value */
.\" }
.\" .fi
.\" ----------------------------------------------------------------------
.SH External Inode Number Bitmap, Translation Table and Generation Table (xino)
By default, aufs uses one external bitmap file per aufs mount and one
external inode number translation table file per branch
filesystem.
Additionally, when CONFIG_AUFS_EXPORT is enabled, one external inode
generation table is added.
The bitmap (and the generation table) is for recycling aufs inode numbers,
and the others
are tables for converting an inode number on a branch to
an aufs inode number. The default path
is `first writable branch'/\*[AUFS_XINO_FNAME].
If there is no writable branch, the
default path
will be \*[AUFS_XINO_DEFPATH].
.\" A user who executes mount(8) needs the privilege to create xino
.\" file.
If you enable CONFIG_SYSFS, the path of the xino files is not shown in
/proc/mounts (and /etc/mtab); instead it is shown in
<sysfs>/fs/aufs/si_<id>/xi_path.
Otherwise, it is shown in /proc/mounts unless it is the default
path.
Those files are always open, and are read and written by aufs frequently.
If your writable branch is on a flash memory device, it is recommended
to put the xino files somewhere other than flash memory by specifying the `xino='
mount option.
The
maximum file size of the bitmap is, basically, the total
number of files on all branches divided by 8 (the number of
bits in a byte).
For example, on a system with a 4KB page size, if you have 32,768 (or
2,599,968) files in the aufs world,
then the maximum file size of the bitmap is 4KB (or 320KB).
The
maximum file size of the table will
be `max inode number on the branch x size of an inode number'.
For example, in a 32bit environment,
.nf
$ df -i /branch_fs
/dev/hda14 2599968 203127 2396841 8% /branch_fs
.fi
and /branch_fs is a branch of the aufs. When the inode numbers are
assigned contiguously (without `holes'), the maximum xino file size for
/branch_fs will be 2,599,968 x 4 bytes = about 10 MB. But not all of
the disk blocks may actually be allocated.
When the inode numbers are assigned discontiguously, the maximum size of
the xino file will be the largest inode number on the branch x 4 bytes.
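Both estimates can be reproduced with shell arithmetic. This is only a
sketch: the function names are made up for this illustration, and
rounding the bitmap up to a whole page is an assumption inferred from
the 4KB and 320KB figures in the bitmap example.
.nf
# xib_max_bytes NFILES PAGESIZE: one bit per file, rounded up to pages.
xib_max_bytes() {
    echo $(( (($1 + 7) / 8 + $2 - 1) / $2 * $2 ))
}
# xino_max_bytes MAX_INO WIDTH: one WIDTH-byte entry per inode number.
xino_max_bytes() {
    echo $(( $1 * $2 ))
}
.fi
With the numbers used in the examples, `xib_max_bytes 2599968 4096' gives
327680 (320KB) and `xino_max_bytes 2599968 4' gives 10399872 (about 10 MB).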
Additionally, the file size is limited to LLONG_MAX or the s_maxbytes
in the filesystem's superblock (s_maxbytes may be smaller than
LLONG_MAX). So the
largest supportable inode number on a branch is less than
2305843009213693950 (LLONG_MAX/4\-1).
This is the current limitation of aufs.
Note that the xino\-array feature, available in aufs4.14 and
later, made this limitation obsolete.
In a 64bit environment, this limitation becomes stricter and the
largest supported inode number is less than LLONG_MAX/8\-1.
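The two limits can be checked with shell arithmetic. A minimal sketch,
assuming a shell with 64bit integer arithmetic; the function name
max_ino_limit is hypothetical.
.nf
# max_ino_limit WIDTH: LLONG_MAX / WIDTH - 1, WIDTH in bytes.
max_ino_limit() {
    echo $(( 9223372036854775807 / $1 - 1 ))
}
.fi
Here `max_ino_limit 4' prints 2305843009213693950, and
`max_ino_limit 8' prints 1152921504606846974.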
In order to estimate the size of the table for your readonly branch fs,
try
.nf
$ echo $((4 * $(sudo find /branch_fs -xdev -printf "%i\\n" |
sort -n | tail -n 1)))
.fi
In a 64bit environment, replace 4 by 8 in the above expression.
The xino files are always hidden, i.e. removed. So you cannot
do `ls \-l xino_file'.
If you enable CONFIG_DEBUG_FS, you can check this information through
<debugfs>/aufs/<si_id>/{xib,xi[0\-9]*,xigen}. xib is for the bitmap file,
xi0 is for the first branch, and xi1 is for the next. xigen is for the
generation table.
xib and xigen are in the format of,
.nf
<blocks>x<block size> <file size>
.fi
Note that a filesystem usually has a
feature called pre\-allocation, which means a number of
blocks are allocated automatically, and then deallocated
silently when the filesystem thinks they are unnecessary.
Do not be surprised by sudden changes in the number of
blocks when the filesystem on which the xino files are placed supports the
pre\-allocation feature.
The rest are the hidden xino file information, in the format of,
.nf
<file count>, <blocks>x<block size> <file size>
.fi
If the file count is larger than 1, it means some of your branches are
on the same filesystem and the xino file is shared by them.
Note that the file size may not be equal to the blocks actually consumed,
since the xino file is a sparse file, i.e. it may contain holes which do not
consume any disk blocks.
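The difference between the file size and the consumed blocks can be
observed on any sparse file. The following is a minimal sketch, assuming
GNU stat(1); the function name sparse_report is hypothetical. It cannot
be applied to the xino files themselves since they are hidden.
.nf
# Print the apparent size and the allocated size of FILE, in bytes.
sparse_report() {
    apparent=$(stat -c %s "$1")
    allocated=$(( $(stat -c %b "$1") * $(stat -c %B "$1") ))
    echo "apparent=$apparent allocated=$allocated"
}
.fi
For a file created by `truncate \-s 1M', the allocated value is usually
far smaller than the apparent one, although this depends on the
filesystem.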
Once you unmount aufs, the xino files for that aufs are totally gone.
It means that the inode numbers are not persistent across umount or
shutdown.
The xino files should be created on a filesystem other than NFS.
If your first writable branch is NFS, you will need to specify an xino
file path which is not on NFS.
Also, if you are going to remove the branch where the xino files exist or
change that branch's permission to readonly, you need to use the xino option
before you del/mod the branch.
The bitmap file and the table can be truncated.
For example, if you delete a branch which has a huge number of files,
many inode numbers will be recycled and the bitmap will be truncated
to a smaller size. Aufs does this automatically when a branch is
deleted.
You can truncate it anytime you like by specifying the `trunc_xib' mount
option. But when the accessed inode number was not deleted, nothing
will be truncated.
The truncation is essentially equivalent to
.nf
$ cp --sparse=always <current xino file> <new xino file> &&
rm <current xino file>
.fi
It means that you have two xino files during the copy, and you should
pay attention to the free space of the filesystem where the xino file is
located.
If the free space is not large enough to hold two xino files temporarily
during the copy, then the truncation fails and the xino file goes on
growing. In such a case, you should move the xino file to another, larger
partition, and move it back to where it was (if you want). To do this,
use the `xino=' mount option. During this move, the xino file is truncated
automatically.
If you do not want to truncate it (it may be slow) when you delete a
branch, specify `notrunc_xib' after the `del' mount option.
For the table, see the trunc_xino_path=BRANCH, itrunc_xino=INDEX, trunc_xino
and notrunc_xino options.
If you do not want to use xino, use noxino mount option. Use this
option with care, since the inode number may be changed silently and
unexpectedly anytime.
For example,
rmdir failure, recursive chmod/chown/etc to a large and deep directory
or anything else.
And some applications will not work correctly.
.\" When the inode number has been changed, your system
.\" can be crazy.
If you want to change the xino default path, use the xino mount option.
After you add branches, the persistence of inode numbers may not be
guaranteed.
At remount time, cached but unused inodes are discarded.
A newly appeared inode may have a different inode number at the
next access time. The inodes in use keep their inode numbers.
When aufs has assigned an inode number to a file, and you create a
file of the same name on the upper branch directly, then the next time you
access the file, aufs may assign another inode number to it even
if you use the xino option.
Some applications may treat a file whose inode number has
changed as a totally different file.
.\" ----------------------------------------------------------------------
.SH Pseudo Link (hardlink over branches)
Aufs supports the `pseudo link', which is a logical hard\-link over
branches (cf. ln(1) and link(2)).
In other words, it covers a file copied\-up by link(2) and a copied\-up file which was
hard\-linked on a readonly branch filesystem.
When you have files named fileA and fileB which are
hardlinked on a readonly branch, and you write something into fileA,
aufs copies\-up fileA to a writable branch, and write(2)s the originally
requested data to the copied\-up fileA. On the writable branch,
fileA is not hardlinked.
But aufs remembers it was hardlinked, and handles fileB as if it existed
on the writable branch, by referencing fileA's inode on the writable
branch as fileB's inode.
Once you unmount aufs, the plink info for that aufs, which is kept in memory, is totally
gone.
It means that the pseudo\-link is not permanent.
If you want to make plinks permanent, run the `auplink' utility just before
one of these operations:
unmounting your aufs,
using the `ro' or `noplink' mount option,
deleting a branch from aufs,
adding a branch into aufs,
or changing your writable branch to readonly.
This utility reproduces all real hardlinks on a writable branch by linking
them, and removes the pseudo\-link info in memory and the temporary links on the
writable branch.
Since this utility accesses your branches directly, you cannot hide them by
`mount \-\-bind /tmp /branch' or something.
If you intend to rebuild your aufs with the same branches later, you
should use the auplink utility before you umount your aufs.
If you have installed both /sbin/mount.aufs and /sbin/umount.aufs, and your
mount(8) and umount(8) support them, the
`auplink' utility is executed automatically and flushes the pseudo\-links.
While this utility is running, it puts aufs into the pseudo\-link
maintenance mode. In this mode, only the process which began the
maintenance mode (and its child processes) is allowed to operate in
aufs. Some other processes which are not related to the pseudo\-link are
allowed to run too, but the rest have to return an error or wait
until the maintenance mode ends. If a process has already acquired an inode
mutex (in VFS), it has to return an error.
Due to the fact that the pseudo\-link maintenance mode is operated via
procfs, the pseudo\-link feature itself (including the related mount
options) depends upon CONFIG_PROC_FS too.
.nf
# auplink /your/aufs/root flush
# umount /your/aufs/root
or
# auplink /your/aufs/root flush
# mount -o remount,mod:/your/writable/branch=ro /your/aufs/root
or
# auplink /your/aufs/root flush
# mount -o remount,noplink /your/aufs/root
or
# auplink /your/aufs/root flush
# mount -o remount,del:/your/aufs/branch /your/aufs/root
or
# auplink /your/aufs/root flush
# mount -o remount,append:/your/aufs/branch /your/aufs/root
.fi
The plinks are kept both in memory and on disk. When they consume too many
resources on your system, you can run the `auplink' utility at any time and
throw away the unnecessary pseudo\-links safely.
Additionally, the `auplink' utility is very useful for some security reasons.
For example, suppose you have a directory whose permission flags
are 0700, and a file whose flags are 0644 under that directory. Usually,
all files under a 0700 directory are private and no one else can see
the file. But when the directory is 0711 and someone else knows the name of
the 0644 file, they can read the file.
Basically, the aufs pseudo\-link feature creates a temporary link under a
directory whose owner is root and whose permission flags are 0700.
But when the writable branch is NFS, aufs sets the directory to 0711.
When the 0644 file is pseudo\-linked, the temporary link, whose
contents are of course identical to the file, is created under the
0711 directory. The filename is generated from its inode number.
While it is hard to know the generated filename, someone else may try peeping at
the temporary pseudo\-linked file with a software tool which tries the names
from one to MAX_INT or something.
In this case, the 0644 file will be read unexpectedly.
I am afraid that leaving the temporary pseudo\-links can be a security hole.
It makes sense to execute `auplink /your/aufs/root flush'
periodically when your writable branch is NFS.
When your writable branch is not NFS, or all users are careful enough to set 0600
on their private files, you do not have to worry about this issue.
If you do not want this feature, use the `noplink' mount option.
.SS The behaviors of plink and noplink
This sample shows that the `f_src_linked2' with `noplink' option cannot follow
the link.
.nf
none on /dev/shm/u type aufs (rw,xino=/dev/shm/rw/.aufs.xino,br:/dev/shm/rw=rw:/dev/shm/ro=ro)
$ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
ls: ./copied: No such file or directory
15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
22 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked
22 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked2
$ echo FOO >> f_src_linked
$ cp f_src_linked copied
$ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
15 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
36 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ../rw/f_src_linked
53 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ./copied
22 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked
22 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked2
$ cmp copied f_src_linked2
$
none on /dev/shm/u type aufs (rw,xino=/dev/shm/rw/.aufs.xino,noplink,br:/dev/shm/rw=rw:/dev/shm/ro=ro)
$ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
ls: ./copied: No such file or directory
17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
23 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked
23 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ./f_src_linked2
$ echo FOO >> f_src_linked
$ cp f_src_linked copied
$ ls -li ../r?/f_src_linked* ./f_src_linked* ./copied
17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked
17 -rw-r--r-- 2 jro jro 2 Dec 22 11:03 ../ro/f_src_linked2
36 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ../rw/f_src_linked
53 -rw-r--r-- 1 jro jro 6 Dec 22 11:03 ./copied
23 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked
23 -rw-r--r-- 2 jro jro 6 Dec 22 11:03 ./f_src_linked2
$ cmp copied f_src_linked2
cmp: EOF on f_src_linked2
$
.fi
.\"
.\" If you add/del a branch, or link/unlink the pseudo-linked
.\" file on a branch
.\" directly, aufs cannot keep the correct link count, but the status of
.\" `pseudo-linked.'
.\" Those files may or may not keep the file data after you unlink the
.\" file on the branch directly, especially the case of your branch is
.\" NFS.
If you add a branch which has fileA or fileB, aufs does not follow the
pseudo link. The file on the added branch has no relation to the same
named file(s) on the lower branch(es).
If you use the noxino mount option, pseudo links will not work after the
kernel shrinks the inode cache.
This feature will not work for squashfs before version 3.2 since its
inodes are tricky.
When an inode is hardlinked, the squashfs inodes have the same inode
number and the correct link count, but the inode memory objects are
different. Squashfs inodes (before v3.2) are generated for each name, even
if they are hardlinked.
.\" ----------------------------------------------------------------------
.SH User's Direct Branch Access (UDBA)
UDBA means a modification to a branch filesystem made manually or directly,
e.g. bypassing aufs.
While aufs is designed and implemented to be safe after UDBA,
UDBA can confuse both you and your aufs, and some information such as the
aufs inode will be incorrect.
For example, if you rename a file on a branch directly, the file on
aufs may
or may not be accessible through both the old and the new name,
because aufs caches various information about the files on
branches, and the cache still remains after UDBA.
Aufs has a mount option named `udba' which specifies the level of testing done at
access time to find out whether UDBA has happened or not.
.
.TP
.B udba=none
Aufs trusts the dentry and the inode cache on the system, and never
tests for UDBA. With this option aufs runs fastest, but it may show
you incorrect data.
Additionally, if you often modify a branch
directly, aufs will not be able to trace the changes of inodes on the
branch. This can cause wrong behavior, a deadlock or anything else.
It is recommended to use this option only when you are sure that
nobody accesses a file on a branch directly.
It might be difficult to achieve a real `no UDBA' world when you
cannot stop your users from doing `find / \-ls' or something.
If you really want to forbid all UDBA by your users, here is a trick
for it.
With this trick, users cannot see the
branches directly and aufs runs with no problem, except for the `auplink' utility.
But if you are not familiar with aufs, this trick may
confuse you.
.nf
# d=/tmp/.aufs.hide
# mkdir $d
# for i in $branches_you_want_to_hide
> do
> mount -n --bind $d $i
> done
.fi
When you unmount the aufs, delete/modify the branch by remount, or you
want to show the hidden branches again, unmount the bound
/tmp/.aufs.hide.
.nf
# umount -n $branches_you_want_to_unbound
.fi
If you use a FUSE filesystem which supports hardlinks as an aufs branch,
you should not set this option, since FUSE makes a separate inode object for
each hardlink (at least in linux\-2.6.23). When your FUSE filesystem
maintains them at link/unlink time, it is equivalent
to `direct branch access' for aufs.
.
.TP
.B udba=reval
Aufs tests only the existence of a file which existed before. If
such a file has been removed on the branch directly, aufs
discards the cache about the file and
looks it up again. So the data will be updated.
This test is kept to the minimum level needed to preserve performance while
ensuring the existence of a file.
This is the default, and aufs still runs fast.
This rule leads to some unexpected situations, but I hope they are
harmless. They depend entirely upon the cache. Here are just a few
examples.
.
.RS
.Bu
If the file is cached as negative or
non\-existent, aufs does not test it, and the file is still handled as
negative after a user creates the file on a branch directly. If the
file is not cached, aufs looks it up normally and finds the file.
.
.Bu
When the file is cached as positive or existing, and a user creates a
file of the same name directly on the upper branch, aufs detects that the cached
inode of the file still exists and shows you the old (cached)
file which is on the lower branch.
.
.Bu
When the file is cached as positive or existing, and a user renames the
file by rename(2) directly, aufs detects that the inode of the file
still exists. You may or may not see both the old and the new files.
TODO: If aufs also tested the name, it could detect this case.
.RE
If your outer modification (UDBA) is rare and you can ignore the
temporary and minor differences between the virtual aufs world and the real
branch filesystem, then try this mount option.
.
.TP
.B udba=notify
Aufs sets either `fsnotify' or `inotify' on all the accessed directories on its branches
and receives the events about each dir and its children. It consumes
resources, cpu and memory. I am afraid that the performance will be
hurt, but it is the strictest test level.
There are some limitations in linux inotify; see also Inotify
Limitation.
So it is recommended to leave udba at its default usually, and set it
to notify by remount when you need it.
When a user accesses the file which was notified UDBA before, the cached data
about the file will be discarded and aufs re-lookup it. So the data will
be updated.
When an error condition occurs between UDBA and aufs operation, aufs
will return an error, including EIO.
To use this option, you need to enable CONFIG_INOTIFY and
CONFIG_AUFS_HINOTIFY.
In linux\-2.6.31, CONFIG_FSNOTIFY was introduced and CONFIG_INOTIFY was
listed in Documentation/feature\-removal\-schedule.txt. In aufs2\-31 and
later (until CONFIG_INOTIFY is removed actually), you can choose either
`fsnotify' or `inotify' in configuration. Whichever you choose, specify
`udba=notify', and aufs interprets it as an abstract name.
Renaming or removing a directory on a branch directly may reveal the same named
directory on the lower branch. Aufs tries re-looking up the renamed
directory and the revealed directory and assigning different inode
numbers to them. But the inode numbers, including those of their children, can be a
problem. The inode numbers will be changed silently, and
aufs may produce a warning. If you rename a directory repeatedly and
reveal/hide the lower directory, then aufs may confuse their inode
numbers too. It depends upon the system cache.
When you make a directory in aufs and mount another filesystem on it,
the directory in aufs cannot be removed, as expected, because it is a
mount point. But the same named directory on the writable branch can
be removed if someone wants to, since there it is just an empty directory instead
of a mount point.
Aufs cannot stop such a direct rmdir, but produces a warning about it.
If a pseudo\-linked file is hardlinked or unlinked on the branch
directly, its inode link count in aufs may be incorrect. It is
recommended to flush the pseudo\-links with the auplink script.
In linux\-4.2 (and later, probably), for an exported aufs, NFS doesn't
show the changes at once
and returns ESTALE even if you set udba=notify.
This is the natural behaviour of linux NFS,
and aufs can do nothing about it. Probably a simple "sleep 1" will help.
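As a minimal sketch of switching the test level by remount, assume aufs
is mounted on /aufs and a file is touched directly on a writable branch
/rw_branch (both paths are hypothetical). You might raise the level to
notify only while you modify the branch, and then return to the default.
.nf
# mount -o remount,udba=notify /aufs
# touch /rw_branch/fileA
# mount -o remount,udba=reval /aufs
.fi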
.\" ----------------------------------------------------------------------
.SH Linux Inotify Limitation
Unfortunately, current inotify (linux\-2.6.18) has some limitations,
and aufs inherits them.
\" .SS IN_ATTRIB, updating atime
\" When a file/dir on a branch is accessed directly, the inode atime (access
\" time, cf. stat(2)) may or may not be updated. In some cases, inotify
\" does not fire this event. So the aufs inode atime may remain old.
\" .SS IN_ATTRIB, updating nlink
\" When the link count of a file on a branch is incremented by link(2)
\" directly,
\" inotify fires IN_CREATE to the parent
\" directory, but IN_ATTRIB to the file. So the aufs inode nlink may
\" remain old.
.SS IN_DELETE, removing file on NFS
When a file on an NFS branch is deleted directly, inotify may or may
not fire
the IN_DELETE event. It depends upon the status of the dentry
(DCACHE_NFSFS_RENAMED flag).
When the event is not fired, the file still appears to exist on aufs. Aufs and any user can see
the file.
Since linux\-3.15\-rc1, this behavior has been changed and NFS fires the
event by itself.
.SS IN_IGNORED, deleted rename target
When a file/dir on a branch is unlinked by rename(2) directly, inotify
fires IN_IGNORED which means the inode is deleted. Actually, in some
cases, the inode survives, for example when the rename target is hardlinked or
opened. In this case, the inotify watch set by aufs is removed by VFS and
inotify,
and aufs cannot receive the events anymore. So aufs may show you
incorrect data about the file/dir.
.\" ----------------------------------------------------------------------
.SH Virtual or Vertical Directory Block (VDIR)
In order to provide the merged view of file listing, aufs builds
an internal directory block in memory. For readdir, aufs performs readdir()
internally for each dir on the branches, merges their entries while
eliminating the whiteout\-ed ones, and sets the result to the opened file (dir)
object. So the file object has its entry list until it is closed. The
entry list will be updated when the file position is zero (by
rewinddir(3)) and the list has become obsolete.
The merged result is cached in the corresponding inode object and
maintained by a customizable life-time option.
Note: the mount option `rdcache=<sec>' is still under consideration and
its description is hidden from this manual.
Some people may say it can be a security hole or invite a DoS attack,
since the opened and once readdir\-ed dir (file object) holds its entry
list and puts pressure on system memory. But I would say it is similar
to files under /proc or /sys. The virtual files in them also hold a
memory page (generally) while they are opened. When an idea to reduce
memory for them is introduced, it will be applied to aufs too.
The dynamically allocated memory block for the names of entries has a
unit of \*[AUFS_RDBLK_DEF] bytes by default.
While building dir blocks, aufs creates hash lists (hashed and divided by
\*[AUFS_RDHASH_DEF] by default) and judges whether
the entry is whiteout-ed by its upper branch or already listed.
These values are suitable for normal environments. But you may have
tens of thousands of files or very long filenames under a single directory. For
such cases, you may need to customize these values by specifying the rdblk=
and rdhash= aufs mount options.
For instance, there are 97 files under my /bin, and the total name
length is 597 bytes.
.nf
$ \\ls -1 /bin | wc
97 97 597
.fi
Strictly speaking, 97 end\-of\-line codes are
included. But it is OK since aufs VDIR also stores the name length in 1
byte. In this case, you do not need to customize the default values. The 597 bytes
of filenames will be stored in 2 VDIR memory blocks (597 <
\*[AUFS_RDBLK_DEF] x 2).
And the 97 filenames are distributed among \*[AUFS_RDHASH_DEF] lists, so one
list will point to 4 names on average. To judge whether a name is whiteout-ed or
not, the number of comparisons will be 4. 2 memory allocations
and 4 comparisons cost little (even if the directory is opened for a long
time). So you do not need to customize.
If your directory has tens of thousands of files, then you will need to specify
rdblk= and rdhash=.
.nf
$ ls -U /mnt/rotating-rust | wc -l
1382438
.fi
In this case, assuming the average length of filenames is 6, in order to
get better time performance I would
recommend setting $((128*1024)) or $((64*1024)) for rdblk, and
$((8*1024)) or $((4*1024)) for rdhash.
You can change these values of the active aufs mount by "mount \-o
remount".
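For example, a sketch of such a remount, assuming the huge directory
lives in an aufs mounted on /aufs (a hypothetical mount point) and
taking the larger of the values mentioned here, might look like this.
The values are only illustrative and should be tuned for your own
directory.
.nf
# mount -o remount,rdblk=$((128*1024)),rdhash=$((8*1024)) /aufs
.fi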
This customization is not for
reducing the memory space, but for reducing the time spent on memory
allocations and name comparisons. A larger value is faster, in
general. Of course, you will need system memory. This is a generic
"time\-vs\-space" problem.
.\" ----------------------------------------------------------------------
.SH Using libau.so
There is a dynamic shared object library called libau.so in the aufs\-util
or aufs2\-util
GIT tree. This library provides several useful functions which wrap
standard library functions such as,
.RS
.Bu
readdir, readdir_r, closedir
.Bu
pathconf, fpathconf
.RE
To use libau.so,
.RS
.Bu
install by "make install_ulib" under aufs\-util (or aufs2\-util) GIT tree
.Bu
set the environment variable "LD_PRELOAD=libau.so", or configure
/etc/ld.so.preload
.Bu
set the environment variable "\*[LibAuEnv]=all"
.Bu
and run your application.
.RE
If you use pathconf(3)/fpathconf(3) with _PC_LINK_MAX for aufs, you need
to use libau.so.
.SS VDIR/readdir(3) in user\-space (RDU)
If you have a directory which has tens of thousands of files, aufs VDIR consumes
much memory. So a program which reads a huge directory may produce an
"out of memory" or "page allocation failure" message in the syslog, due to
memory fragmentation or real starvation.
In this case, RDU (readdir(3) in user\-space) may help you,
because kernel memory is not swappable and consuming much of it
is pure memory pressure, while this is not true in user\-space.
If you enable CONFIG_AUFS_RDU when compiling aufs, install libau.so, and
set some environment variables, then you can use RDU.
Just simply run your application.
The dynamic link library libau.so implements
another readdir routine, and all readdir(3) calls in your application
will be handled by libau.so.
For setting the environment variables, you may want to use a shell function
or alias like this.
.nf
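$ auls()
> (
> # export the variables so that ls inherits them;
> # the subshell keeps them out of your interactive shell
> export LD_PRELOAD=/your/path/to/libau.so
> export \*[LibAuEnv]=all
> #export AUFS_RDU_BLK= set if you want
> ls "$@"
> )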
$ alias auls="LD_PRELOAD=/your/path/to/libau.so \*[LibAuEnv]=all ls"
.fi
When you call readdir(3), the dynamic linker calls readdir in libau.so.
If it finds the passed dir is NOT aufs, it calls the usual readdir(3).
If the dir is aufs, then libau.so gets all filenames under the dir by
aufs specific ioctl(2)s, instead of regular readdir(3), and merges them
by itself.
In other words,
libau.so moves the memory consumption in kernel\-space to user\-space.
While it is good to stop consuming much memory in kernel\-space,
sometimes the speed performance may be damaged a little as a side
effect.
It is just a little, I hope. At the same time, I won't be surprised if
readdir(3) runs faster.
It is recommended to specify rdblk=0 when you use this library.
If your directory is not so huge and you don't meet the out of memory
situation, probably you don't need this library. The original VDIR in
kernel\-space is still alive, and you can live without libau.so.
.SS pathconf(_PC_LINK_MAX)
Since some implementations of pathconf(3) (and fpathconf(3)) for
_PC_LINK_MAX decide the target filesystem type and return a
pre\-defined constant value, when aufs is unknown to the library, it
will return the default value (127).
Actually the maximum link count in aufs is inherited from the
topmost writable
branch filesystem. But the standard pathconf(3) will not return the
correct value.
To support such a case, libau.so provides a wrapper for pathconf(3) (and
fpathconf(3)). When the parameter is _PC_LINK_MAX, the wrapper checks
whether the given path (or file descriptor) refers to aufs or not. If it is aufs, then
it will get the maximum link count from the topmost writable branch
internally. Otherwise, it behaves as normal pathconf(3) transparently.
.SS Note
Since this is a dynamically linked library, it is unavailable if your
application is statically linked. And ld.so(8) ignores LD_PRELOAD when
the application is setuid/setgid\-ed unless the library is
setuid/setgid\-ed too. This is a generic rule of
dynamically linked libraries.
Additionally the functions in libau.so are unavailable in these cases
too.
.RS
.Bu
the application or library issues getdents(2) instead of readdir(3).
.Bu
a library calls readdir(3) internally, e.g. scandir(3).
.Bu
a library calls pathconf(3) internally.
.RE
.\" ----------------------------------------------------------------------
.SH Copy On Write, or aufs internal copyup and copydown
Every stackable filesystem which implements copy\-on\-write supports the
copyup feature. The feature is to copy a file/dir from the lower branch
to the upper internally. When you have one readonly branch and one
upper writable branch, and you append a string to a file which exists on
the readonly branch, then aufs will copy the file from the readonly
branch to the writable branch with its directory hierarchy. It means one
write(2) involves several logical/internal mkdir(2), creat(2), read(2),
write(2) and close(2) systemcalls
before the actual expected write(2) is performed. Sometimes it may take
a long time, particularly when the file is very large.
If CONFIG_AUFS_DEBUG is enabled, aufs produces a message saying `copying
a large file.'
You may see the message when you change the xino file path or
truncate the xino/xib files. Sometimes those files can be large, and
handling them may take a long time.
\" .SS a regular file in HFSPLUS
\" HFSPLUS acquires an inode mutex lock at closing a file. This behavior
\" is not a problem, but aufs doesn't expect such behavior and it had
\" caused a deadlock. So aufs added a special handling to copy\-up a
\" regular file in HFSPLUS, eg. opens the file internally twice. It means
\" there exists an additional overhead in copying a regular file in HFSPLUS.
.\" ----------------------------------------------------------------------
.SH Policies to Select One among Multiple Writable Branches
Aufs has some policies to select one among multiple writable branches
when you are going to write/modify something. There are two kinds of
policies: one is for creating something new and the other is for
internal copy\-up.
You can select them by specifying the mount option `create=CREATE_POLICY'
or `cpup=COPYUP_POLICY.'
These policies have no meaning when you have only one writable
branch. If they had any effect there, it would only hurt the performance.
.SS Exceptions for Policies
In every case below, even if the policy says that the branch where a
new file should be created is /rw2, the file will be created on /rw1.
.
.Bu
If there is a readonly branch with the `wh' attribute above the
policy\-selected branch and the parent dir is marked as opaque,
or the target (creating) file is whiteout-ed on the ro+wh branch, then
the policy will be ignored and the target file will be created on the
nearest writable branch above the ro+wh branch.
.RS
.nf
/aufs = /rw1 + /ro+wh/diropq + /rw2
/aufs = /rw1 + /ro+wh/wh.tgt + /rw2
.fi
.RE
.
.Bu
If there is a writable branch above the policy\-selected branch and the
parent dir is marked as opaque or the target file is whiteout-ed on that
branch, then the policy will be ignored and the target file will be
created on the highest one among the upper writable branches which has
the diropq or whiteout. In the case of a whiteout, aufs removes it as usual.
.RS
.nf
/aufs = /rw1/diropq + /rw2
/aufs = /rw1/wh.tgt + /rw2
.fi
.RE
.
.Bu
link(2) and rename(2) systemcalls are exceptions in every policy.
They try selecting the branch where the source exists whenever possible, since
copying up a large file will take a long time. If that is not possible, i.e. the
branch where the source exists is readonly, then they will follow the
copyup policy.
.
.Bu
There is an exception for rename(2) when the target exists.
If the rename target exists, aufs compares the indexes of the branches
where the source and the target exist and selects the higher
one. If the selected branch is readonly, then aufs follows the copyup
policy.
.SS Policies for Creating
.
.TP
.B create=tdp | top\-down\-parent
Select the highest branch where the parent dir exists. If this
branch is not writable, internal copyup will happen.
The policy for this copyup is always `bottom\-up.'
This is the default policy.
.
.TP
.B create=tdmfs:low[:second]
Select the highest writable branch regardless of the existence of the
parent dir. If the free space of this branch is less than `low' bytes,
then the next highest writable branch will be selected.
If the free space of all writable branches is less than `low' bytes,
then the create=mfs policy is applied.
For the duration (`second') parameter, see create=mfs[:second] below.
FHSM (File\-based Hierarchical Storage Management) may bring you a very
similar result, and is more flexible than this policy.
.
.TP
.B create=rr | round\-robin
Selects a writable branch in round robin. When you have two writable
branches and create 10 new files, 5 files will be created on each
branch.
mkdir(2) systemcall is an exception. When you create 10 new directories,
all are created on the same branch.
.
.TP
.B create=mfs[:second] | most\-free\-space[:second]
Selects the writable branch which has the most free space. In order to keep
the performance, you can specify the duration (`second') which makes
aufs hold the index of the last selected writable branch until the
specified number of seconds expires. The duration can be up to \*[AUFS_MFS_MAX_SEC]
seconds.
The first time you create something in aufs
after the specified seconds have expired, aufs checks the amount of free
space of all writable branches by an internal statfs call
and the held branch index will be updated.
The default value is \*[AUFS_MFS_DEF_SEC] seconds.
.
.TP
.B create=mfsrr:low[:second]
Selects a writable branch in most\-free\-space mode first, and then
round\-robin mode. If the selected branch has less free space than the
specified value `low' in bytes, then aufs re-tries in round\-robin mode.
.\" `G', `M' and `K' (case insensitive) can be followed after `low.' Or
Try the arithmetic expansion of the shell, which is defined by POSIX.
For example, $((10 * 1024 * 1024)) for 10M.
You can also specify the duration (`second') which is equivalent to
the `mfs' mode.
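As a sketch only, assuming two writable branches /rw1 and /rw2 and a
mount point /aufs (all hypothetical paths), a threshold of 10M could be
passed like this. The optional duration is left out here.
.nf
# mount -t aufs -o br:/rw1=rw:/rw2=rw,create=mfsrr:$((10*1024*1024)) none /aufs
.fi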
.
.TP
.B create=pmfs[:second]
Selects a writable branch where the parent dir exists, as in tdp
mode. When the parent dir exists on multiple writable branches, aufs
selects the one which has the most free space, as in mfs mode.
.
.TP
.B create=pmfsrr:low[:second]
First selects a writable branch as in the `pmfs' mode.
If there are less than `low' bytes available on all branches where the
parent dir exists, aufs selects the one which has the most free space
regardless of the parent dir.
.SS Policies for Copy\-Up
.
.TP
.B cpup=tdp | top\-down\-parent
Equivalent to the same named policy for create.
This is the default policy.
.
.TP
.B cpup=bup | bottom\-up\-parent
Selects the writable branch where the parent dir exists and which
is the nearest one above the copyup\-source.
.
.TP
.B cpup=bu | bottom\-up
Selects the nearest writable branch above the copyup\-source,
regardless of the existence of the parent dir.
.\" ----------------------------------------------------------------------
.SH Exporting Aufs via NFS
Aufs supports NFS\-exporting.
Since aufs has no actual block device, you need to add the NFS `fsid' option when
exporting. Refer to the NFS manual for the details of this option.
There are some limitations or requirements.
.RS
.Bu
The branch filesystem must support NFS\-exporting.
.Bu
NFSv2 is not supported. When you mount the exported aufs from your NFS
client, you will need to specify some NFS options like v3 or nfsvers=3,
especially if it is nfsroot.
.Bu
If the size of the NFS file handle on your branch filesystem is large,
aufs will
not be able to handle it. The maximum size of an NFSv3 file
handle for a filesystem is 64 bytes. Aufs uses 24 bytes of it on a 32bit
system, plus 12 more bytes on a 64bit system. The rest is room for the file
handle of a branch filesystem.
.Bu
The External Inode Number Bitmap, Translation Table and Generation Table
(xino) is
required since NFS file
handle is based upon inode number. The mount option `xino' is enabled
by default.
The external inode generation table and its debugfs entry
(<debugfs>/aufs/si_*/xigen) is created when CONFIG_AUFS_EXPORT is
enabled even if you don't export aufs actually.
The external inode generation table only grows and is never
truncated. You might need to pay attention to the free space of the
filesystem where the xino files are placed. By default, it is the first
writable branch.
.Bu
The branch filesystems must be accessible, which means `not hidden.'
It means you need to `mount \-\-move' when you use initramfs and
switch_root(8), or chroot(8).
.Bu
Since aufs has several filename prefixes reserved, the maximum filename
length is shorter than the ordinary 255. Actually it is \*[AUFS_MAX_NAMELEN]
(defined as ${AUFS_MAX_NAMELEN}). This
value should be specified as `namlen=' when you mount NFS.
The name of the branch top directory has another
limit. When you set the module parameter `brs' to 1 (default), then you
can see the branch pathname via /sys/fs/aufs/si_XXX/brNNN. Here it is
printed with its branch attributes such as `=rw' or `=ro+wh'. Since all
the sysfs entries have a size limit of 4096 bytes, the length of the
branch path must be shorter than 4096. Actually you can specify any
branch with a much longer name, but you will run into trouble when you
remount later because remounting runs the aufs mount helper internally
and it tries reading /sys/fs/aufs/si_XXX/brNNN.
.RE
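The following is a sketch of these requirements put together, under
several assumptions: the aufs is mounted on /aufs on a host named
`server', the fsid value 1 and the client address range are arbitrary,
and /mnt is the client mount point. None of these names come from aufs
itself. First, the line in /etc/exports on the server,
.nf
/aufs 192.168.0.0/24(rw,fsid=1)
.fi
and then the mount on the client.
.nf
# mount -t nfs -o nfsvers=3,namlen=\*[AUFS_MAX_NAMELEN] server:/aufs /mnt
.fi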
.\" ----------------------------------------------------------------------
.SH Direct I/O
The Direct I/O (including Linux AIO) is a
filesystem (and its backend block device) specific feature.
And there is a minor problem around the aufs internal
copyup. Assume you have two branches, a lower RO ext2 and an upper RW tmpfs. As
you know, ext2 supports Direct I/O, but tmpfs doesn't. When
`fileA' exists in the lower ext2, and you write something into it after
opening it with
O_DIRECT, then aufs behaves like this if the mount option `dio' is specified.
.RS
.Bu
The application issues open(O_DIRECT);
Aufs opens the file in the lower ext2 and succeeds.
.Bu
The application issues write("something");
Aufs copies\-up the file from the lower ext2 to the upper tmpfs, and re-opens the
file in tmpfs with O_DIRECT. It fails and returns an error.
.RE
This behavior may be a problem since an application expects the error
to be returned from the first open(2) instead of the later
write(2), when the filesystem doesn't support Direct I/O.
(But, in the real world, I don't think there is an application
which doesn't check the error from write(2). So it won't be a big
problem actually).
If the file exists in the upper tmpfs, the first open(2) will fail
as expected. So there is no problem in this case. But the problem may
happen when the internal copyup happens and the behavior of the branches
differs from each other. As long as the feature depends upon the filesystem, this
problem will not be solved. So aufs sets `nodio' by default, which means
all Direct I/O are disabled, and open(2) with O_DIRECT always fails. If
you want to use Direct I/O
AND all your writable branches support it, then specify the `dio' option to
put it into effect.
For a similar reason, fcntl(F_SETFL, O_DIRECT) will not work for an aufs
file descriptor.
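As a rough illustration, assuming /aufs is your aufs mount point and
every writable branch supports Direct I/O, you could enable the option
by remount and try it with dd(1) from GNU coreutils, whose oflag=direct
opens the output file with O_DIRECT. The file name is hypothetical.
.nf
# mount -o remount,dio /aufs
$ dd if=/dev/zero of=/aufs/fileA bs=4096 count=1 oflag=direct
.fi
With the default `nodio', the dd(1) above is expected to fail at
open(2).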
.\" ----------------------------------------------------------------------
.SH Possible problem of the inode number in TMPFS
Although it rarely happens, TMPFS has a problem with its inode
number management. Actually TMPFS does not maintain the inode number at
all. Linux kernel has a global 32bit number for general use of inode
number, and TMPFS uses it while most (real) filesystems maintain their
inode numbers by themselves. The global number can wrap around regardless of whether the
inode number is still in use. This MAY cause a problem.
For instance,
when /your/tmpfs/fileA has 10 as its inode number, the same value (10)
may be assigned to a newly created file /your/tmpfs/fileB. Some
applications do not care about the duplicated inode numbers, but others,
including AUFS, will be really confused by this situation.
If your writable branch FS is TMPFS and the inode number wraps
around, aufs will not work correctly. It is recommended to use an FS
on HDD, ramdisk+ext2 or tmpfs+FSimage+loopback mount, as your writable
branch FS.
Or apply a patch in aufs4\-standalone.git. It addresses this
tmpfs-inum-assignment problem by modifying a source file outside aufs.
.\" ----------------------------------------------------------------------
.SH Dentry and Inode Caches
If you want to clear caches on your system, there are several tricks
for that. If your system ram is low,
try `find /large/dir \-ls > /dev/null'.
It will read many inodes and dentries and cache them. Then old caches will be
discarded.
But when you have a large ram or you do not have such a large
directory, it is not effective.
If you want to discard the cache within a certain filesystem,
try `mount \-o remount /your/mntpnt'. Some filesystems may return an error such as
EINVAL, but VFS discards the unused dentry/inode caches on the
specified filesystem.
.\" ----------------------------------------------------------------------
.SH Compatible/Incompatible with Unionfs Version 1.x Series
.\" If you compile aufs with \-DCONFIG_AUFS_COMPAT, dirs= option and =nfsro
.\" branch permission flag are available. They are interpreted as
.\" br: option and =ro flags respectively.
.\" `debug', `delete', `imap' options are ignored silently. When you
.\" compile aufs without \-DCONFIG_AUFS_COMPAT, these three options are
.\" also ignored, but a warning message is issued.
Ignoring `delete' option, and to keep filesystem consistency, aufs tries
writing something to only one branch in a single systemcall. It means
aufs may copyup even if the copyup\-src branch is specified as writable.
For example, you have two writable branches and a large regular file
on the lower writable branch. When you issue rename(2) to the file on aufs,
aufs may copyup it to the upper writable branch.
If this behavior is not what you want, then you should rename(2) it
on the lower branch directly.
And there is a simple shell
script `unionctl' under the sample subdirectory, which is compatible with
unionctl(8) in
Unionfs Version 1.x series, except for the \-\-query action.
This script executes mount(8) with the `remount' option and uses the
add/del/mod aufs mount options.
If you are familiar with Unionfs Version 1.x series and want to use unionctl(8), you can
try this script instead of using mount \-o remount,... directly.
Aufs does not support the ioctl(2) interface.
This script depends heavily upon mount(8) in the
util\-linux\-2.12p package, and you need to mount /proc to use this script.
If your mount(8) version differs, you can try modifying this
script. It is very easy.
The unionctl script is just a sample usage of the aufs remount
interface.
Aufs uses the external inode number bitmap and translation table by
default.
The default branch permission for the first branch is `rw', and the
rest is `ro.'
The whiteout is for hiding files on lower branches. It is also applied
to stop readdir from going to lower branches.
The latter case is called `opaque directory.' Any
whiteout is an empty file, which means a whiteout is just a mark.
In the case of hiding lower files, the name of the whiteout is
`\*[AUFS_WH_PFX]<filename>.'
And in the case of stopping readdir, the name is
`\*[AUFS_WH_PFX]\*[AUFS_WH_PFX].opq'.
.\" or
.\" `\*[AUFS_WH_PFX]__dir_opaque.' The name depends upon your compile
.\" configuration
.\" CONFIG_AUFS_COMPAT.
.\" All of newly created or renamed directory will be opaque.
All whiteouts are hardlinked,
including `<writable branch top dir>/\*[AUFS_WH_BASE].'
A hardlink on an ordinary (disk based) filesystem does not
consume a new inode. But in linux tmpfs, the number of free
inodes will be decremented by link(2). It is recommended to specify the
nr_inodes option to your tmpfs if you meet ENOSPC. Use this option
after checking by `df \-i.'
When you rmdir or rename\-to a dir which has a number of whiteouts,
aufs renames the dir to a temporary whiteout-ed name like
`\*[AUFS_WH_PFX]\*[AUFS_WH_PFX]<dir>.<\*[AUFS_WH_TMP_LEN]\-digits hex>.'
Then aufs removes it after the actual operation.
cf. mount option `dirwh.'
.\" ----------------------------------------------------------------------
.SH Incompatible with an Ordinary Filesystem
stat(2) returns the inode info from the first existing inode among
the branches, except the directory link count.
Aufs usually computes a directory link count larger than the exact value, in
order to keep UNIX filesystem semantics, or in order to keep find(1) quiet.
The size of a directory may be wrong too, but it should do no harm.
The timestamp of a directory will not be updated when a file is
created or removed under it, if this was done on a lower branch.
The test for permission bits has two cases. One is for a directory,
and the other is for a non\-directory. In the case of a directory, aufs
checks the permission bits of all existing directories. It means you
need the correct privilege for the directories including those on the lower
branches.
The test for a non\-directory is simpler. It checks only the
topmost inode.
statfs(2) returns the information of the first branch except
namelen when `nosum' is specified (the default). The namelen is
decreased by the whiteout prefix length.
Although the whiteout prefix is essentially `\*[AUFS_WH_PFX]', to
support rmdir(2) and
rename(2) (when the target directory already existed), the namelen is
decreased more since the name will be renamed to
`\*[AUFS_WH_PFX]\*[AUFS_WH_PFX]<dir>.<\*[AUFS_WH_TMP_LEN]\-digits hex>'
as previously described.
And the block size may differ
from st_blksize which is obtained by stat(2).
The whiteout prefix (\*[AUFS_WH_PFX]) is reserved on all branches. Users should
not handle a filename which begins with this prefix.
In order to leave room for a future whiteout, the maximum filename length is limited to
the longest value \- \*[AUFS_WH_PFX_LEN] * 2 \- 1 \-
\*[AUFS_WH_TMP_LEN] = \*[AUFS_MAX_NAMELEN].
It means you cannot handle a longer name in
aufs, even if it surely exists on the underlying branch fs. The
readdir(3)/getdents(2) call shows you such a name, but the d_type is set to
DT_UNKNOWN.
It may be a violation of POSIX.
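As a worked example of the limit on the filename length, assume the
longest value is the ordinary 255, the whiteout prefix is 4 bytes long
and the temporary suffix has 4 hex digits. These numbers are assumptions
for illustration; check the values your aufs was built with. Then the
shell gives
.nf
$ echo $((255 - 4 * 2 - 1 - 4))
242
.fi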
Remember, seekdir(3) and telldir(3) are XSI extensions rather than part
of the POSIX base specification. They may
not work as you expect. Try rewinddir(3) or re-open the dir.
If you dislike the difference between the aufs entries in /etc/mtab
and /proc/mounts, and if you are using mount(8) in the util\-linux package,
then try the ./mount.aufs utility. Copy the script to /sbin/mount.aufs.
This simple utility tries updating
/etc/mtab. If you do not care about /etc/mtab, you can ignore this
utility.
Remember this utility depends heavily upon mount(8) in the
util\-linux\-2.12p package, and you need to mount /proc.
Since aufs uses its own inodes and dentries, your system may cache a huge
number of inodes and dentries. It can be as many as twice the number of files
in your union.
It means that unmounting or remounting readonly at shutdown time may
take a long time, since mount(2) in VFS tries freeing all of the cache
on the target filesystem.
When you open a directory, aufs will open several directories
internally.
It means you may reach the limit of the number of file descriptors.
And when the lower directory cannot be opened, aufs will close all the
opened upper directories and return an error.
The sub\-mount under the branch
of local filesystem
is ignored.
For example, if you have mounted another filesystem on
/branch/another/mntpnt, the files under `mntpnt' will be ignored by aufs.
It is recommended to mount the sub\-mount under the mounted aufs.
For example,
.nf
# sudo mount /dev/sdaXX /ro_branch
# d=another/mntpnt
# sudo mount /dev/sdbXX /ro_branch/$d
# mkdir -p /rw_branch/$d
# sudo mount -t aufs -o br:/rw_branch:/ro_branch none /aufs
# sudo mount -t aufs -o br:/rw_branch/${d}:/ro_branch/${d} none /aufs/$d
.fi
There are several characters which are not allowed to use in a branch
directory path and xino filename. See detail in Branch Syntax and Mount
Option.
The file\-lock which means fcntl(2) with F_SETLK, F_SETLKW or F_GETLK, flock(2)
and lockf(3), is applied to virtual aufs file only, not to the file on a
branch. It means you can break the lock by accessing a branch directly.
TODO: check `security' to hook locks, as inotify does.
Aufs respects all "security" hooks in the kernel, so you can configure LSM
for both the virtual aufs files and the real branch\-fs files. But there is
one exception, the kernel function "security_mmap_file()." The
function called inside aufs for a branch\-fs file may cause a deadlock,
so aufs stops calling it.
LSM settings for the virtual aufs files work as usual.
The I/O to a named pipe or local socket is not handled by aufs, even
if it exists in aufs. If the pipe/socket is copied\-up after the reader
and the writer established their
connection, they keep using the old one
instead of the copied\-up one.
The fsync(2) and fdatasync(2) systemcalls return 0 which means success, even
if the given file descriptor is not opened for writing.
I am afraid this behavior may violate some standards. Checking the
behavior of fsync(2) on ext2, aufs decided to return success.
If you want to use disk quota, you should set it up on your writable
branch since aufs does not have its own block device.
When your aufs is the root directory of your system, and your system
tells you some of the filesystems were not unmounted cleanly, try this
procedure when you shut down your system.
.nf
# mount -no remount,ro /
# for i in $writable_branches
# do mount -no remount,ro $i
# done
.fi
If your xino file is on a hard drive, you also need to specify the
`noxino' option or `xino=/your/tmpfs/xino' when remounting the root
directory.
rename(2) on a directory may return EXDEV even if both src and tgt
are on the same aufs, when `dirren' is not specified. When the rename\-src dir exists on multiple
branches and the lower dir has child/children, aufs has to copyup all its
children. The copyup can be recursive. Current aufs does not support
such a huge copyup operation at one time in kernel space; instead it
produces a warning and returns EXDEV.
Generally, mv(1) detects this error and tries mkdir(2) and
rename(2) or copy/unlink recursively, so the result is harmless.
If your application issues rename(2) for a directory and does not
handle EXDEV, it will not work on aufs.
This also applies to the case where the src directory
exists on the lower readonly branch and has child/children.
While it is rare, users can open a removed file with a little help
from procfs.
.RS
.Bu
open a file and get its descriptor
.Bu
remove the file
.Bu
generate a string `/proc/PID/fd/N'
.Bu
open the same file using the generated string
.RE
This operation is a little difficult for aufs since aufs allows
direct access to branches (by\-passing aufs), and it is hard to
distinguish this case from the following one.
.RS
.Bu
remove a file on a branch directly (by\-passing aufs)
.Bu
open the file via aufs
.RE
For the latter case, aufs detects the mismatch between the aufs
cached info and the real info from the branch, and tries refreshing by
re\-lookup. Finally aufs finds that
the file is removed and lets open(2) return an error.
For the former case, currently (linux\-3.13\-rc7), aufs simply follows the
behavior of ext2, which supports opening a non\-directory but returns
an error for a directory.
Other than open(2), users may chmod(2) and chown(2) similarly (remove the
file and then operate on it via procfs). Ext2 supports them too, but aufs
doesn't. I don't think it is a big disadvantage since users can fchmod(2)
and fchown(2) instead.
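The procfs sequence listed above can be tried on any ordinary filesystem
with a few lines of shell. This is only a sketch: the function name
`reopen_removed' and the temporary file are made up for illustration, and
/proc/self/fd/3 stands in for `/proc/PID/fd/N' because cat(1) inherits
descriptor 3. On aufs the result for a regular file is expected to match the
ext2 behavior described above, but that is not verified here.
.nf
```shell
# Hypothetical helper: reopen a removed regular file through procfs.
reopen_removed() {
    f=$(mktemp) || return 1
    echo hello > "$f"
    exec 3< "$f"            # open the file and keep descriptor 3
    rm -f "$f"              # remove the file while it is still open
    cat /proc/self/fd/3     # open the same file again via procfs
    rc=$?
    exec 3<&-               # close the descriptor
    return $rc
}
```
.fi
Calling `reopen_removed' prints the content that was written before the
file was removed.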
If a sudden accident such as a power failure happens while aufs is
running, and the regular fsck for branch filesystems is completed after
the disaster, you need an extra fsck for aufs writable branches. It is
necessary to check whether any whiteout remains incorrectly,
eg. the real filename and the whiteout for it under the same parent
directory. If such a whiteout remains, aufs cannot handle the file
correctly.
To check the consistency from the aufs point of view, you can use a
simple shell script called /sbin/auchk. It serves as a fsck tool for
aufs, and it checks for illegal whiteouts, leftover
pseudo\-links and leftover aufs\-temp files. If they are found, the
utility reports them and asks whether to delete them or not.
It is recommended to execute /sbin/auchk for every writable branch
filesystem before mounting aufs if the system experienced a crash.
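The whiteout part of such a check can be sketched in shell. This is not
how auchk itself is implemented; it is a minimal illustration that only
reports and never deletes. It assumes the usual aufs naming where the
whiteout for NAME is `.wh.NAME' in the same directory, it skips the
`.wh..wh.*' files which aufs keeps for its own use, and it does not cope
with newlines in filenames. The function name is hypothetical.
.nf
```shell
# Hypothetical helper: list whiteouts that coexist with the real entry.
find_stale_whiteouts() {
    branch=$1
    find "$branch" -name '.wh.*' ! -name '.wh..wh.*' |
    while IFS= read -r wh; do
        dir=${wh%/*}            # parent directory of the whiteout
        name=${wh##*/.wh.}      # entry name hidden by the whiteout
        if [ -e "$dir/$name" ] || [ -L "$dir/$name" ]; then
            echo "$wh"
        fi
    done
}
```
.fi
For example, `find_stale_whiteouts /rw_branch' prints one suspicious
whiteout per line, with /rw_branch being a writable branch that is not
mounted in aufs at that moment.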
In linux\-v4.5, copy_file_range(2) is introduced and aufs supports it.
The systemcall works only when the given two files exist on the same
filesystem. In the aufs world, the two files must exist on the same physical
filesystem, not merely on the same logical aufs. The case of two files existing on
the logically same aufs but on physically different filesystems is not
supported.
For example, fileA and fileB are given, and fileA exists on the lower
readonly branch in aufs, and fileB exists on the upper writable branch.
When these two branches exist on the same filesystem, then aufs
copy_file_range(2) should work. Otherwise it will return an error.
In other words, aufs copy_file_range(2) doesn't incur the internal
copyup since such behaviour doesn't fit the original purpose of
copy_file_range(2).
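Whether two files qualify can be estimated beforehand by comparing the
device numbers of their paths on the branches (not the paths under aufs).
The sketch below assumes GNU stat(1) with `-c %d'; the function name is
hypothetical, and an equal device number is only an approximation of
`the same physical filesystem.'
.nf
```shell
# Hypothetical helper: succeed when both paths report the same device number.
same_fs() {
    dev1=$(stat -c %d "$1") || return 2
    dev2=$(stat -c %d "$2") || return 2
    [ "$dev1" = "$dev2" ]
}
```
.fi
For example, `same_fs /ro_branch/fileA /rw_branch/fileB && echo same'
with the branch paths replaced by your own.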
.\" ----------------------------------------------------------------------
.SH EXAMPLES
The mount options are interpreted from left to right at remount\-time.
These examples
show how the options are handled (assuming /sbin/mount.aufs is
installed).
.nf
# mount -v -t aufs -o br:/day0:/base none /u
none on /u type aufs (rw,xino=/day0/.aufs.xino,br:/day0=rw:/base=ro)
# mount -v -o remount,\\
prepend:/day1,\\
xino=/day1/xino,\\
mod:/day0=ro,\\
del:/day0 \\
/u
none on /u type aufs (rw,xino=/day1/xino,br:/day1=rw:/base=ro)
.fi
.nf
# mount -t aufs -o br:/rw none /u
# mount -o remount,append:/ro /u
different uid/gid/permission, /ro
# mount -o remount,del:/ro /u
# mount -o remount,nowarn_perm,append:/ro /u
#
(there is no warning)
.fi
.\" If you want to expand your filesystem size, aufs may help you by
.\" adding an writable branch. Since aufs supports multiple writable
.\" branches, the old writable branch can be being writable, if you want.
.\" In this example, any modifications to the files under /ro branch will
.\" be copied-up to /new, but modifications to the files under /rw branch
.\" will not.
.\" And the next example shows the modifications to the files under /rw branch
.\" will be copied-up to /new/a.
.\"
.\" TODO: test multiple writable branches policy. cpup=nearest, cpup=exist_parent.
.\"
.\" .nf
.\" # mount -v -t aufs br:/rw:/ro none /u
.\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/ro=ro)
.\" # mkfs /new
.\" # mount -v -o remount,add:1:/new=rw /u
.\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/new=rw:/ro=ro)
.\" .fi
.\"
.\" .nf
.\" # mount -v -t aufs br:/rw:/ro none /u
.\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/rw=rw:/ro=ro)
.\" # mkfs /new
.\" # mkdir /new/a new/b
.\" # mount -v -o remount,add:1:/new/b=rw,prepend:/new/a,mod:/rw=ro /u
.\" none on /u type aufs (rw,xino=/rw/.aufs.xino,br:/new/a=rw:/rw=ro:/new/b=rw:/ro=ro)
.\" .fi
When you use aufs as the root filesystem, consider
excluding some directories. For example, /tmp and /var/log do not need
to be stacked in many cases. They do not usually need copyup or whiteout.
Also a swapfile on aufs (a regular file, not a block device) is not
supported.
In order to exclude a specific dir from aufs, try bind mounting.
There is also a good sample for network booted diskless machines. See
sample/ for details.
.\" ----------------------------------------------------------------------
.SH DIAGNOSTICS
When you add a branch to your union, aufs may warn you about the
privilege or security of the branch, that is, the permission bits,
owner and group of the top directory of the branch.
For example, when your upper writable branch has a world writable top
directory,
a malicious user can create any files on the writable branch directly,
like copyup and modify manually. I am afraid it can be a security
issue.
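Before adding a writable branch, its top directory can be inspected by
hand. The following sketch assumes GNU stat(1) with `-c %a'; the function
name and the branch path /rw_branch are only examples, and the check is
not the one aufs performs internally.
.nf
```shell
# Hypothetical helper: succeed when the directory is world writable.
is_world_writable() {
    mode=$(stat -c %a "$1") || return 2
    other=${mode#"${mode%?}"}   # last octal digit, the bits for others
    case $other in
        2|3|6|7) return 0 ;;
        *) return 1 ;;
    esac
}
```
.fi
For example, `is_world_writable /rw_branch && echo warning', and
`stat -c "%a %U %G" /rw_branch' shows the permission bits, owner and group
together.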
When you mount or remount your union without the common mount option \-o ro
and without a writable branch, aufs will warn you that the first branch
should be writable.
.\" It is discouraged to set both of `udba' and `noxino' mount options. In
.\" this case the inode number under aufs will always be changed and may
.\" reach the end of inode number which is a maximum of unsigned long. If
.\" the inode number reaches the end, aufs will return EIO repeatedly.
When you set udba other than notify and change something on your
branch filesystem directly, later aufs may detect some mismatches to
its cache. If it is a critical mismatch, aufs returns EIO.
When an error occurs in aufs, aufs prints a kernel message with
`errno.' The priority of the message (log level) is ERR or WARNING,
depending upon the message itself.
You can convert the `errno' into an error message with perror(3),
strerror(3) or similar.
For example, the `errno' in the message `I/O Error, write failed (\-28)'
is 28 which means ENOSPC or `No space left on device.'
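For scripts that scan the kernel log, the number can be extracted and
mapped in shell. This is a small sketch: the function names are
hypothetical, the table lists only a few Linux errno values, and perror(3)
or strerror(3) remain the complete source.
.nf
```shell
# Hypothetical helper: print the errno found in an aufs kernel message.
aufs_errno() {
    num=${1##*"(-"}     # drop everything up to the last "(-"
    num=${num%%")"*}    # drop the closing parenthesis and the rest
    echo "$num"
}

# Hypothetical helper: map a few common Linux errno values to names.
errno_name() {
    case $1 in
        5)  echo EIO ;;
        18) echo EXDEV ;;
        28) echo ENOSPC ;;
        30) echo EROFS ;;
        *)  echo "unknown ($1)" ;;
    esac
}
```
.fi
The two helpers are meant to be chained, passing the output of
`aufs_errno' for one log line to `errno_name'.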
When CONFIG_AUFS_BR_RAMFS is enabled, you can specify ramfs as an aufs
branch. Since ramfs is simple, it does not set a maximum link count
by itself. In aufs, this is very dangerous, particularly for
whiteouts. Therefore aufs sets the maximum link count for ramfs. The
value is 32000, which is borrowed from ext2.
After you prepend a branch which already has some entries, aufs may
report an I/O Error with a message ending in "should be negative" or similar. For
instance,
suppose you open(2) a regular file in aufs and then write(2) something
to it. If
you prepend a branch between open(2) and write(2), and the added branch
already has an entry of the same name that is not a regular file, then you get a
conflict.
.RS
.Bu
a regular file FOO exists in aufs.
.Bu
open the file FOO.
.Bu
add a branch which has FOO but it is a directory, and change the
permission of the old branch to RO.
.Bu
write to the file FOO.
.Bu
aufs tries copying\-up FOO to the upper writable branch which was
recently added.
.Bu
aufs finds a directory FOO on the upper branch, and returns an error.
.RE
In this situation, aufs keeps returning an error while FOO is cached in
memory because it remembers that FOO is a regular file instead of a directory.
When the system discards the cache about FOO, you will see the
directory FOO.
In other words, you will not be able to see the directory FOO on the
newly added branch while the file FOO on the lower branch is in use.
This situation may invite a more complicated issue. If you unlink(2) the
opened file FOO, then aufs will create a whiteout on the upper writable
branch. You then get another conflict, a whiteout coexisting with
a real entry on the same branch. In this case, aufs also keeps returning
an error when you try using FOO.
.\" .SH Current Limitation
.
.\" ----------------------------------------------------------------------
.\" SYNOPSIS
.\" briefly describes the command or function's interface. For commands, this
.\" shows the syntax of the command and its arguments (including options); bold-
.\" face is used for as-is text and italics are used to indicate replaceable
.\" arguments. Brackets ([]) surround optional arguments, vertical bars (|)
.\" separate choices, and ellipses (...) can be repeated. For functions, it shows
.\" any required data declarations or #include directives, followed by the
.\" function declaration.
.
.\" DESCRIPTION
.\" gives an explanation of what the command, function, or format does. Discuss
.\" how it interacts with files and standard input, and what it produces on
.\" standard output or standard error. Omit internals and implementation
.\" details unless they're critical for understanding the interface. Describe
.\" the usual case; for information on options use the OPTIONS section. If
.\" there is some kind of input grammar or complex set of subcommands, consider
.\" describing them in a separate USAGE section (and just place an overview in
.\" the DESCRIPTION section).
.
.\" RETURN VALUE
.\" gives a list of the values the library routine will return to the caller and
.\" the conditions that cause these values to be returned.
.
.\" EXIT STATUS
.\" lists the possible exit status values or a program and the conditions that
.\" cause these values to be returned.
.
.\" USAGE
.\" describes the grammar of any sublanguage this implements.
.
.\" FILES
.\" lists the files the program or function uses, such as configuration files,
.\" startup files, and files the program directly operates on. Give the full
.\" pathname of these files, and use the installation process to modify the
.\" directory part to match user preferences. For many programs, the default
.\" installation location is in /usr/local, so your base manual page should use
.\" /usr/local as the base.
.
.\" ENVIRONMENT
.\" lists all environment variables that affect your program or function and how
.\" they affect it.
.
.\" SECURITY
.\" discusses security issues and implications. Warn about configurations or
.\" environments that should be avoided, commands that may have security
.\" implications, and so on, especially if they aren't obvious. Discussing security
.\" in a separate section isn't necessary; if it's easier to understand, place
.\" security information in the other sections (such as the DESCRIPTION or USAGE
.\" section). However, please include security information somewhere!
.
.\" CONFORMING TO
.\" describes any standards or conventions this implements.
.
.\" NOTES
.\" provides miscellaneous notes.
.
.\" BUGS
.\" lists limitations, known defects or inconveniences, and other questionable
.\" activities.
.SH COPYRIGHT
Copyright \(co 2005\-2019 Junjiro R. Okajima
.SH AUTHOR
Junjiro R. Okajima
.\" SEE ALSO
.\" lists related man pages in alphabetical order, possibly followed by other
.\" related pages or documents. Conventionally this is the last section.