% SystemTap Language Reference
\documentclass[twoside,english]{article}
\usepackage{geometry}
\geometry{verbose,letterpaper,tmargin=1.5in,bmargin=1.5in,lmargin=1in,rmargin=1in}
\usepackage{fancyhdr}
\pagestyle{fancy}
\usepackage{array}
\usepackage{varioref}
\usepackage{float}
\usepackage{makeidx}
\usepackage{verbatim}
\usepackage{url}
\makeindex
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\newcommand{\noun}[1]{\textsc{#1}}
%% Bold symbol macro for standard LaTeX users
%\providecommand{\boldsymbol}[1]{\mbox{\boldmath $#1$}}
%% Because html converters don't know tabularnewline
\providecommand{\tabularnewline}{\\}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\setlength{\parindent}{0pt}
%\setlength{\parskip}{3pt plus 2pt minus 1pt}
\setlength{\parskip}{5pt}
%
% this makes list spacing much better.
%
\newenvironment{my_itemize}{
\begin{itemize}
\setlength{\itemsep}{1pt}
\setlength{\parskip}{0pt}
\setlength{\parsep}{0pt}}{\end{itemize}
}
\newenvironment{vindent}
{\begin{list}{}{\setlength{\listparindent}{6pt}}
\item[]}
{\end{list}}
\usepackage[english]{babel}
\makeatother
\begin{document}
\title{SystemTap Language Reference}
\maketitle
\newpage{}
This document was derived from other documents contributed to the SystemTap project by employees of Red Hat, IBM and Intel.\newline
Copyright \copyright\space 2007-2013 Red Hat Inc.\newline
Copyright \copyright\space 2007-2009 IBM Corp.\newline
Copyright \copyright\space 2007 Intel Corporation.\newline
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.\newline
The GNU Free Documentation License is available from
\url{http://www.gnu.org/licenses/fdl.html} or by writing to
the Free Software Foundation, Inc., 51 Franklin Street,
Fifth Floor, Boston, MA 02110-1301, USA.
\newpage{}
\tableofcontents{}
% \listoftables
\newpage{}
\section{SystemTap overview\label{sec:SystemTap-Overview}}
\subsection{About this guide}
This guide is a comprehensive reference of SystemTap's language constructs
and syntax. The contents borrow heavily from existing SystemTap documentation
found in manual pages and the tutorial. The presentation of information here
provides the reader with a single place to find language syntax and recommended
usage. In order to successfully use this guide, you should be familiar with
the general theory and operation of SystemTap. If you are new to SystemTap,
you will find the tutorial to be an excellent place to start learning. For
detailed information about tapsets, see the manual pages provided with the
distribution. For information about the entire collection of SystemTap reference
material, see Section~\ref{sec:For-Further-Reference}.
\subsection{Reasons to use SystemTap}
SystemTap provides infrastructure to simplify the gathering of information
about a running Linux kernel so that it may be further analyzed. This analysis
assists in identifying the underlying cause of a performance or functional
problem. SystemTap was designed to eliminate the need for a developer to
go through the tedious instrument, recompile, install, and reboot sequence
normally required to collect this kind of data. To do this, it provides a
simple command-line interface and scripting language for writing
instrumentation for both kernel and user space.
With SystemTap, developers, system administrators, and users can easily write
scripts that gather and manipulate system data that is otherwise unavailable
from standard Linux tools. Users of SystemTap will find it to be a significant
improvement over older methods.
\subsection{Event-action language}
\index{language}
SystemTap's language is strictly typed, declaration free, procedural, and
inspired by dtrace and awk. Source code points or events in the kernel are
associated with handlers, which are subroutines that are executed synchronously.
These probes are conceptually similar to \char`\"{}breakpoint command lists\char`\"{}
in the GDB debugger.
There are two main outermost constructs: probes and functions. Within these,
statements and expressions use C-like operator syntax and precedence.
\subsection{Sample SystemTap scripts}
\index{example scripts}
Following are some example scripts that illustrate the basic operation of
SystemTap. For more examples, see the examples/small\_demos/ directory in
the source directory, the SystemTap wiki at \url{http://sourceware.org/systemtap/wiki/HomePage},
or the SystemTap War Stories page at \url{http://sourceware.org/systemtap/wiki/WarStories}.
\subsubsection{Basic SystemTap syntax and control structures}
The following code examples demonstrate SystemTap syntax and control structures.
\begin{vindent}
\begin{verbatim}
global odds, evens
probe begin {
    # "no" and "ne" are local integers
    for (i = 0; i < 10; i++) {
        if (i % 2) odds [no++] = i
        else evens [ne++] = i
    }
    delete odds[2]
    delete evens[3]
    exit()
}
probe end {
    foreach (x+ in odds)
        printf ("odds[%d] = %d\n", x, odds[x])
    foreach (x in evens-)
        printf ("evens[%d] = %d\n", x, evens[x])
}
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
odds[0] = 1
odds[1] = 3
odds[3] = 7
odds[4] = 9
evens[4] = 8
evens[2] = 4
evens[1] = 2
evens[0] = 0
\end{verbatim}
\end{vindent}
Note that all variable types are inferred, and that all locals and
globals are initialized. Integers are set to 0 and strings are set to
the empty string.
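
The default initialization can be seen in a minimal sketch such as the
following. The variable names \texttt{hits} and \texttt{name} are chosen
only for illustration.
\begin{vindent}
\begin{verbatim}
global hits, name
probe begin {
    hits++           # hits starts at 0, so it becomes 1
    name .= "abc"    # name starts as "", so it becomes "abc"
    printf ("%d %s\n", hits, name)    # prints: 1 abc
    exit()
}
\end{verbatim}
\end{vindent}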
\subsubsection{Primes between 0 and 49}
\begin{vindent}
\begin{verbatim}
function isprime (x) {
    if (x < 2) return 0
    for (i = 2; i < x; i++) {
        if (x % i == 0) return 0
        if (i * i > x) break
    }
    return 1
}
probe begin {
    for (i = 0; i < 50; i++)
        if (isprime (i)) printf("%d\n", i)
    exit()
}
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
\end{verbatim}
\end{vindent}
\subsubsection{Recursive functions}
\index{recursion}
\begin{vindent}
\begin{verbatim}
function fibonacci(i) {
    if (i < 1) error ("bad number")
    if (i == 1) return 1
    if (i == 2) return 2
    return fibonacci (i-1) + fibonacci (i-2)
}
probe begin {
    printf ("11th fibonacci number: %d", fibonacci (11))
    exit ()
}
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
11th fibonacci number: 144
\end{verbatim}
\end{vindent}
Any larger number input to the function may exceed the MAXACTION or MAXNESTING
limits, which will be caught at run time and result in an error. For more
about limits see Section~\ref{sub:SystemTap-safety}.
\newpage{}
\subsection{The stap command}
\index{stap}
The stap program is the front-end to the SystemTap tool. It accepts probing
instructions written in its scripting language, translates those instructions
into C code, compiles this C code, and loads the resulting kernel module
into a running Linux kernel to perform the requested system trace or probe
functions. You can supply the script in a named file, from standard input,
or from the command line. The SystemTap script runs until one of the following
conditions occurs:
\begin{itemize}
\item The user interrupts the script with a CTRL-C.
\item The script executes the exit() function.
\item The script encounters a sufficient number of soft errors.
\item The monitored command started with the stap program's
\texttt{\textbf{-c}} option exits.
\end{itemize}
The stap command does the following:
\begin{itemize}
\item Translates the script.
\item Generates and compiles a kernel module.
\item Inserts the module and relays its output to stap's stdout.
\item On CTRL-C, unloads the module and terminates.
\end{itemize}
For a full list of options to the stap command, see the stap(1) manual page.
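
As a rough illustration of the three ways to supply a script described
above, the following invocations are equivalent. The file name
\texttt{hello.stp} is hypothetical and is assumed to contain the same
one-line script that the other two commands pass directly.
\begin{vindent}
\begin{verbatim}
$ stap hello.stp
$ echo 'probe begin { log("hello") exit() }' | stap -
$ stap -e 'probe begin { log("hello") exit() }'
\end{verbatim}
\end{vindent}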
\subsection{Safety and security\label{sub:SystemTap-safety}}
\index{limits}
SystemTap is an administrative tool. It exposes kernel internal data structures
and potentially private user information. Running the kernel objects it builds
requires root privileges, typically obtained by applying the \textbf{sudo}
command to the \textbf{staprun} program.
staprun is a part of the SystemTap package, dedicated to module loading and
unloading and kernel-to-user data transfer. Since staprun does not perform
any additional security checks on the kernel objects it is given, do not
give elevated privileges via sudo to untrusted users.
The translator asserts certain safety constraints. \index{constraints}It
ensures that no handler routine can run for too long, allocate memory, perform
unsafe operations, or unintentionally interfere with the kernel. Use of script
global variables is locked to protect against manipulation by concurrent
probe handlers. Use of \emph{guru mode} constructs such as embedded C (see
Section~\ref{sub:Embedded-C}) can violate these constraints, leading to
a kernel crash or data corruption.
The resource use limits are set by macros in the generated C code. These
may be overridden with the -D flag. The following list describes a selection
of these macros:
\textbf{MAXNESTING} -- The maximum number of recursive function call levels. The default is 10.
\textbf{MAXSTRINGLEN} -- The maximum length of strings. The default is 256 bytes
for 32-bit machines and 512 bytes for all other machines.
\textbf{MAXTRYLOCK} -- The maximum number of iterations to wait for locks on global variables before
declaring possible deadlock and skipping the probe. The default is 1000.
\textbf{MAXACTION} -- The maximum number of statements to execute during any single probe hit. The default is 1000.
\textbf{MAXMAPENTRIES} -- The maximum number of rows in an array if the array size is not specified
explicitly when declared. The default is 2048.
\textbf{MAXERRORS} -- The maximum number of soft errors before an exit is triggered. The default is 0.
\textbf{MAXSKIPPED} -- The maximum number of skipped reentrant probes before an exit is triggered. The default is 100.
\textbf{MINSTACKSPACE} -- The minimum number of free kernel stack bytes required in order to run a
probe handler. This number should be large enough for the probe handler's
own needs, plus a safety margin. The default is 1024.
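
For example, a script that recurses more deeply or executes more
statements than the defaults allow might be run with raised limits, as
sketched below. The script name \texttt{fib.stp} and the chosen values
are illustrative assumptions, not recommended settings.
\begin{vindent}
\begin{verbatim}
$ stap -DMAXNESTING=20 -DMAXACTION=10000 fib.stp
\end{verbatim}
\end{vindent}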
If something goes wrong with stap or staprun after a probe has started running,
you may safely kill both user processes, and remove the active probe kernel
module with the rmmod command. Any pending trace messages may be lost.
\section{Types of SystemTap scripts\label{sec:Types-of-SystemTap}}
\subsection{Probe scripts}
Probe scripts are analogous to programs; these scripts identify probe points
and associated handlers.
\subsection{Tapset scripts}
Tapset scripts are libraries of probe aliases and auxiliary functions.
The /usr/share/systemtap/tapset directory contains tapset scripts. While
these scripts look like regular SystemTap scripts, they cannot be run directly.
\section{Components of a SystemTap script}
The main construct in the scripting language identifies probes. Probes associate
abstract events with a statement block, or probe handler, that is to be executed
when any of those events occur.
The following example shows how to trace entry and exit from a function using
two probes.
\begin{vindent}
\begin{verbatim}
probe kernel.function("sys_mkdir").call { log ("enter") }
probe kernel.function("sys_mkdir").return { log ("exit") }
\end{verbatim}
\end{vindent}
To list the probe-able functions in the kernel, use the listing option
(\texttt{\textbf{-l}}). For example:
\begin{vindent}
\begin{verbatim}
$ stap -l 'kernel.function("*")' | sort
\end{verbatim}
\end{vindent}
\subsection{Probe definitions}
The general syntax is as follows.
\begin{vindent}
\begin{verbatim}
probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }
\end{verbatim}
\end{vindent}
Events are specified in a special syntax called \emph{probe points}. There
are several varieties of probe points defined by the translator, and tapset
scripts may define others using aliases. The provided probe points are listed
in the \texttt{stapprobes(3)}, \texttt{tapset::*(3stap)}, and
\texttt{probe::*(3stap)} man pages. The STMT statement block is executed
whenever \emph{any} of the named PROBEPOINT events occurs.
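
As a small sketch of a probe with more than one probe point, the first
handler below runs once when the session starts and once when it ends.
The second probe is included only so that the example terminates on its
own.
\begin{vindent}
\begin{verbatim}
probe begin, end {
    log ("session boundary")
}
probe begin {
    exit ()
}
\end{verbatim}
\end{vindent}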
The probe handler is interpreted relative to the context of each event. For
events associated with kernel code, this context may include variables defined
in the source code at that location. These \emph{target variables}\index{target variables} (or ``context variables'')
are presented to the script as variables whose names are prefixed with a
dollar sign (\$). They may be accessed only if the compiler used to compile
the kernel preserved them, despite optimization. This is the same constraint
imposed by a debugger when working with optimized code. Other events may
have very little context.
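
The following sketch prints the parameters of a probed function through
the \verb+$$parms+ target variable, which expands to a string describing
the function's parameters. It assumes that the kernel exports a function
named \verb+sys_mkdir+ and that debugging information for it is
available. Neither is guaranteed on every kernel version.
\begin{vindent}
\begin{verbatim}
probe kernel.function("sys_mkdir").call {
    printf ("%s\n", $$parms)
}
\end{verbatim}
\end{vindent}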
\subsection{Probe aliases\label{sub:Probe-aliases}}
\index{probe aliases}
The general syntax is as follows.
\begin{vindent}
\begin{verbatim}
probe <alias> = <probepoint> { <prologue_stmts> }
probe <alias> += <probepoint> { <epilogue_stmts> }
\end{verbatim}
\end{vindent}
New probe points may be defined using \emph{aliases}. A probe point alias
looks similar to probe definitions, but instead of activating a probe at
the given point, it defines a new probe point name as an alias to an existing
one. New probe aliases may refer to one or more existing probe aliases.
Multiple aliases may share the same underlying probe points.
The following is an example.
\begin{vindent}
\begin{verbatim}
probe socket.sendmsg = kernel.function ("sock_sendmsg") { ... }
probe socket.do_write = kernel.function ("do_sock_write") { ... }
probe socket.send = socket.sendmsg, socket.do_write { ... }
\end{verbatim}
\end{vindent}
There are two types of aliases, the prologue style and the epilogue style,
which are identified by the equal sign (\texttt{\textbf{=}}) and \char`\"{}\texttt{\textbf{+=}}\char`\"{}
respectively.
A probe that uses a probe point alias creates an actual probe, with
the handler of the alias \emph{pre-pended}.
This pre-pending behavior allows the alias definition to pre-process the
context of the probe before passing control to the handler specified by
the user. This has several possible uses, demonstrated as follows.
\begin{vindent}
\begin{verbatim}
# Skip probe unless given condition is met:
if ($flag1 != $flag2) next
# Supply values describing probes:
name = "foo"
# Extract the target variable to a plain local variable:
var = $var
\end{verbatim}
\end{vindent}
\subsubsection{Prologue-style aliases (=)}
\index{prologue-style aliases}
\index{=}
For a prologue style alias, the statement block that follows an alias definition
is implicitly added as a prologue to any probe that refers to the alias.
The following is an example.
\begin{vindent}
\begin{verbatim}
# Defines a new probe point syscall.read, which expands to
# kernel.function("sys_read"), with the given statement as
# a prologue.
#
probe syscall.read = kernel.function("sys_read") {
    fildes = $fd
}
\end{verbatim}
\end{vindent}
\subsubsection{Epilogue-style aliases (+=)}
\index{epilogue-style aliases}
\index{+=}
The statement block that follows an alias definition is implicitly added
as an epilogue to any probe that refers to the alias. It is not useful
to define new variables there (since no subsequent code will see them), but
rather the code can take action based upon variables set by the
prologue or by the user code. The following is an example:
\begin{vindent}
\begin{verbatim}
# Defines a new probe point with the given statement as an
# epilogue.
#
probe syscall.read += kernel.function("sys_read") {
    if (traceme) println ("tracing me")
}
\end{verbatim}
\end{vindent}
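To illustrate how an epilogue can act on a variable set by the user
code, the following sketch defines an epilogue-style alias and a probe
that uses it. The alias name \texttt{myread} is hypothetical, and the
example assumes that the kernel exports \verb+sys_read+.
\begin{vindent}
\begin{verbatim}
# Epilogue: runs after the user's handler.
probe myread += kernel.function("sys_read") {
    if (traceme) println ("tracing me")
}
# User handler: sets the variable that the epilogue tests.
probe myread {
    traceme = 1
}
\end{verbatim}
\end{vindent}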
\subsubsection{Probe alias usage}
A probe alias is used the same way as any built-in probe type, by
naming it:
\begin{vindent}
\begin{verbatim}
probe syscall.read {
    printf("reading fd=%d\n", fildes)
}
\end{verbatim}
\end{vindent}
\subsubsection{Alias suffixes}
It is possible to include a suffix with a probe alias invocation. If
only the initial part of a probe point matches an alias, the remainder
is treated as a suffix and attached to the underlying probe point(s) when
the alias is expanded. For example:
\begin{vindent}
\begin{verbatim}
/* Define an alias: */
probe sendrecv = tcp.sendmsg, tcp.recvmsg { ... }
/* Use the alias in its basic form: */
probe sendrecv { ... }
/* Use the alias with an additional suffix: */
probe sendrecv.return { ... }
\end{verbatim}
\end{vindent}
Here, the second use of the probe alias is equivalent to writing \verb+probe tcp.sendmsg.return, tcp.recvmsg.return+.
As another example, the probe points \verb+tcp.sendmsg.return+ and \verb+tcp.recvmsg.return+ are actually defined as aliases in the tapset \verb+tcp.stp+. They expand to a probe point of the form \verb+kernel.function("...").return+, so they can also be suffixed:
\begin{vindent}
\begin{verbatim}
probe tcp.sendmsg.return.maxactive(10) {
    printf("returning from sending %d bytes\n", size)
}
\end{verbatim}
\end{vindent}
Here, the probe point expands to
\verb+kernel.function("tcp_sendmsg").return.maxactive(10)+.
\subsubsection{Alias suffixes and wildcards}
When expanding wildcards, SystemTap generally avoids considering alias
suffixes in the expansion. The exception is when a wildcard element is
encountered that does not have any ordinary expansions. Consider the
following example:
\begin{vindent}
\begin{verbatim}
probe some_unrelated_probe = ... { ... }
probe myprobe = syscall.read { ... }
probe myprobe.test = some_unrelated_probe { ... }
probe myprobe.* { ... }
probe myprobe.ret* { ... }
\end{verbatim}
\end{vindent}
Here, \verb+return+ would be a valid suffix for \verb+myprobe+. The
wildcard \verb+myprobe.*+ matches the ordinary alias
\verb+myprobe.test+, and hence the suffix expansion
\verb+myprobe.return+ is not included. Conversely, \verb+myprobe.ret*+
does not match any ordinary aliases, so the suffix
\verb+myprobe.return+ is included as an expansion.
\subsection{Variables\label{sub:Variables}}
\index{variables}
Identifiers for variables and functions are alphanumeric sequences, and may
include the underscore (\_) and the dollar sign (\$) characters. They may
not start with a plain digit. Each variable is by default local to the probe
or function statement block where it is mentioned, and therefore its scope
and lifetime are limited to a particular probe or function invocation. Scalar
variables are implicitly typed as either string or integer. Associative arrays
also have a string or integer value, and a tuple of strings or integers serves
as a key. Arrays must be declared as global. Local arrays\index{local arrays}
are not allowed.
The translator performs \emph{type inference} on all identifiers, including
array indexes and function parameters. Inconsistent type-related use of identifiers
results in an error.
Variables may be declared global. Global variables are shared among all probes
and remain instantiated for as long as the SystemTap session runs. There is one namespace
for all global variables, regardless of the script file in which they are
found. Because multiple probe handlers may run concurrently,
each global variable used by a probe is automatically read- or write-locked
while the handler is running. A global declaration may be written at the
outermost level anywhere in a script file, not just within a block of code.
Global variables which are written but never read will be displayed
automatically at session shutdown. The following declaration marks
\texttt{var1} and \texttt{var2} as global.
The translator will infer a value type for each, and if the variable is used
as an array, its key types.
\begin{vindent}
\begin{verbatim}
global var1[=<value>], var2[=<value>]
\end{verbatim}
\end{vindent}
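As a minimal sketch of how such declarations are typically used, the following
script counts read system calls in total and per executable name. The variable
names \texttt{reads} and \texttt{by\_exec} are illustrative only. The translator
should infer an integer value type for both, and a string key type for the array.
\begin{vindent}
\begin{verbatim}
global reads      # scalar; inferred as integer from reads++
global by_exec    # array; string key, integer value

probe syscall.read {
  reads++
  by_exec[execname()]++
}

probe end {
  printf("total reads: %d\n", reads)
  foreach (exe in by_exec)
    printf("%s: %d\n", exe, by_exec[exe])
}
\end{verbatim}
\end{vindent}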
The scope of a global variable may be limited to a tapset or
user script file using the \texttt{private} keyword. The \texttt{global} keyword is optional when
defining a private global variable. The following declarations mark \texttt{var1} and \texttt{var2}
as private globals.
\begin{vindent}
\begin{verbatim}
private global var1[=<value>]
private var2[=<value>]
\end{verbatim}
\end{vindent}
\subsubsection{Unused variables}
\index{unused variables}
The SystemTap translator removes unused variables. Global variables
that are never written or read are discarded. Local variables
that are only written but never read are also
discarded. This optimization prunes variables that are defined
in probe aliases but never used in the probe handler.
If desired, this optimization can be disabled with the \texttt{-u} option.
\subsection{Auxiliary functions\label{sub:Auxiliary-functions}}
\index{auxiliary functions}
General syntax:
\begin{vindent}
\begin{verbatim}
function <name>[:<type>] ( <arg1>[:<type>], ... )[:<priority>] { <stmts> }
\end{verbatim}
\end{vindent}
SystemTap scripts may define subroutines to factor out common work. Functions
may take any number of scalar arguments, and must return a single scalar
value. Scalars in this context are integers or strings. For more information
on scalars, see Section~\ref{sub:Variables} and Section~\ref{sub:Data-types}.
The following is an example function declaration.
\begin{vindent}
\begin{verbatim}
function thisfn (arg1, arg2) {
return arg1 + arg2
}
\end{verbatim}
\end{vindent}
Note the general absence of type declarations, which are inferred by the
translator. If desired, a function definition may include explicit type declarations
for its return value, its arguments, or both. This is helpful for embedded-C
functions. In the following example, the type inference engine need only
infer the type of arg2, a string.
\begin{vindent}
\begin{verbatim}
function thatfn:string(arg1:long, arg2) {
return sprintf("%d%s", arg1, arg2)
}
\end{verbatim}
\end{vindent}
Functions may call others or themselves recursively, up to a fixed nesting
limit. See Section~\ref{sub:SystemTap-safety}.
Functions may be marked private using the private keyword to limit their scope
to the tapset or user script file they are defined in. An example definition of
a private function follows:
\begin{vindent}
\begin{verbatim}
private function three:long () { return 3 }
\end{verbatim}
\end{vindent}
Functions terminating without reaching an explicit return statement will
return an implicit 0 or \verb+""+, determined by type inference.
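The implicit return value can be illustrated with a short sketch. The function
name \texttt{sign\_name} is hypothetical. Because its explicit return statements
yield strings, the translator infers a string return type, so falling off the
end of the function yields the empty string.
\begin{vindent}
\begin{verbatim}
function sign_name (n) {
  if (n < 0) return "negative"
  if (n > 0) return "positive"
  # n == 0: no explicit return, so "" is returned implicitly
}

probe begin {
  printf("[%s] [%s]\n", sign_name(-5), sign_name(0))
  exit()
}
\end{verbatim}
\end{vindent}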
Functions may be overloaded both at run time and at compile time.
Runtime overloading allows the executed function to be selected while the
module is running, based on runtime conditions. It is achieved using the
\texttt{next} statement in script functions and the \texttt{STAP\_NEXT} macro in embedded-C
functions. For example,
\begin{vindent}
\begin{verbatim}
function f() { if (condition) next; print("first function") }
function f() %{ STAP_NEXT; print("second function") %}
function f() { print("third function") }
\end{verbatim}
\end{vindent}
During a function call \texttt{f()}, if \texttt{condition} evaluates to true,
execution transfers to the third function, which prints ``third function''. Note that the second
function unconditionally executes \texttt{STAP\_NEXT}.
Parameter overloading allows the function to be executed to be selected
at compile time, based on the number of arguments provided in the
function call. For example,
\begin{vindent}
\begin{verbatim}
function g() { print("first function") }
function g(x) { print("second function") }
g() -> "first function"
g(1) -> "second function"
\end{verbatim}
\end{vindent}
Note that runtime overloading does not occur in the above example, as exactly
one function will be resolved for each function call. The use of a \texttt{next} statement
inside a function while no more overloads remain will trigger a runtime exception.
Runtime overloading will only occur if the functions have the same arity;
functions with the same name but a different number of parameters are completely
unrelated.
Execution order is determined by a priority value which may be specified.
If no explicit priority is specified, user script functions are given a
higher priority than library functions. User script functions and library
functions are assigned a default priority value of 0 and 1 respectively.
Functions with the same priority are executed in declaration order. For example,
\begin{vindent}
\begin{verbatim}
function f():3 { if (condition) next; print("first function") }
function f():1 { if (condition) next; print("second function") }
function f():2 { print("third function") }
\end{verbatim}
\end{vindent}
Because a numerically lower value means a higher priority, the second function
is tried first, followed by the third. The first function is never executed,
because the third function contains no \texttt{next} statement to pass
execution on.
\subsection{Embedded C\label{sub:Embedded-C}}
\index{embedded C}
SystemTap supports a \emph{guru\index{guru mode} mode} where script
safety features such as code and data memory reference protection are
removed. Guru mode is set by passing the \textbf{-g} option to the
stap command. When in guru mode, the translator accepts C code
enclosed between {}``\%\{'' and {}``\%\}'' markers in the top level of
the script file. The embedded C code is transcribed verbatim, without
analysis, in sequence, into the top level of the generated C
code. Thus, guru mode may be useful for adding \#include instructions
at the top level of the generated module, or providing auxiliary
definitions for use by other embedded code.
When in guru mode, embedded C code blocks are also allowed as the body
of a SystemTap function (as described in
Section~\ref{sub:Embedded-C-Functions}), and in place of any SystemTap
expression. In the latter case, the code block must contain a valid
expression according to C syntax.
Here is an example of the various permitted methods of embedded C code inclusion:
\begin{vindent}
\begin{verbatim}
%{
#include <linux/in.h>
#include <linux/ip.h>
%} /* <-- top level */
/* Reads the char value stored at a given address: */
function __read_char:long(addr:long) %{ /* pure */
STAP_RETURN(kderef(sizeof(char), STAP_ARG_addr));
CATCH_DEREF_FAULT ();
%} /* <-- function body */
/* Determines whether an IP packet is TCP, based on the iphdr: */
function is_tcp_packet:long(iphdr) {
protocol = @cast(iphdr, "iphdr")->protocol
return (protocol == %{ IPPROTO_TCP %}) /* <-- expression */
}
\end{verbatim}
\end{vindent}
\subsection{Embedded C functions\label{sub:Embedded-C-Functions}}
General syntax:
\begin{vindent}
\begin{verbatim}
function <name>:<type> ( <arg1>:<type>, ... )[:<priority>] %{ <C_stmts> %}
\end{verbatim}
\end{vindent}
Embedded C code is permitted in a function body.
In that case, the script language
body is replaced entirely by a piece of C code enclosed between
{}``\%\{'' and {}``\%\}'' markers.
The enclosed code may do anything reasonable and safe as allowed
by the C parser.
There are a number of undocumented but complex safety constraints on concurrency,
resource consumption and runtime limits that are applied to code written
in the SystemTap language. These constraints are not applied to embedded
C code, so use embedded C code with extreme caution. Be especially
careful when dereferencing pointers. Use the kread() macro to dereference
any pointers that could potentially be invalid or dangerous. If you are unsure,
err on the side of caution and use kread(). The kread() macro is one of the
safety mechanisms used in code generated by embedded C. It protects against
pointer accesses that could crash the system.
For example, to access the pointer chain \texttt{name = skb->dev->name} in
embedded C, use the following code.
\begin{vindent}
\begin{verbatim}
struct net_device *dev;
char *name;
dev = kread(&(skb->dev));
name = kread(&(dev->name));
\end{verbatim}
\end{vindent}
The memory locations reserved for input and output values are provided
to a function using macros named
\texttt{STAP\_ARG\_foo}\index{STAP_ARG_} (for arguments named
\texttt{foo}) and \texttt{STAP\_RETVALUE}\index{STAP_RETVALUE}.
Errors may be signalled with \texttt{STAP\_ERROR}. Output may be written
with \texttt{STAP\_PRINTF}. The function may return early with \texttt{STAP\_RETURN}. Here are some examples:
\begin{vindent}
\begin{verbatim}
function integer_ops:long (val) %{
STAP_PRINTF("%d\n", STAP_ARG_val);
STAP_RETVALUE = STAP_ARG_val + 1;
if (STAP_RETVALUE == 4)
STAP_ERROR("wrong guess: %d", (int) STAP_RETVALUE);
if (STAP_RETVALUE == 3)
STAP_RETURN(0);
STAP_RETVALUE ++;
%}
function string_ops:string (val) %{
strlcpy (STAP_RETVALUE, STAP_ARG_val, MAXSTRINGLEN);
strlcat (STAP_RETVALUE, "one", MAXSTRINGLEN);
if (strcmp (STAP_RETVALUE, "three-two-one"))
STAP_RETURN("parameter should be three-two-");
%}
function no_ops () %{
STAP_RETURN(); /* function inferred with no return value */
%}
\end{verbatim}
\end{vindent}
The function argument and return value types should be stated if the
translator cannot infer them from usage. The translator does not
analyze the embedded C code within the function.
You should examine C code generated for ordinary script language
functions to write compatible embedded-C. Usually, all SystemTap
functions and probes run with interrupts disabled, thus you cannot
call functions that might sleep within the embedded C.
\subsection{Embedded C pragma comments}
Embedded C blocks may contain various markers to assert optimization
and safety properties.
\begin{itemize}
\item \verb+/* pure */+ means that the C code has no side effects and
may be elided entirely if its value is not used by script code.
\item \verb+/* stable */+ means that the C code always has the same value
(in any given probe handler invocation), so repeated calls may be
automatically replaced by memoized values. Such functions must take
no parameters, and also be \verb+/* pure */+.
\item \verb+/* unprivileged */+ means that the C code is so safe that
even unprivileged users are permitted to use it. (This is useful, in
particular, to define an embedded-C function inside a tapset that
may be used by unprivileged code.)
\item \verb+/* myproc-unprivileged */+ means that the C code is so
safe that even unprivileged users are permitted to use it, provided
that the target of the current probe is within the user's own
process.
\item \verb+/* guru */+ means that the C code is so unsafe that a
systemtap user must specify \verb+-g+ (guru mode) to use this, even
if the C code is being exported from a tapset.
\item \verb+/* unmangled */+, used in an embedded-C function, means
that the legacy (pre-1.8) argument access syntax should be made
available inside the function. Hence, in addition to
\verb+STAP_ARG_foo+ and \verb+STAP_RETVALUE+ one can use
\verb+THIS->foo+ and \verb+THIS->__retvalue+ respectively inside the
function. This is useful for quickly migrating code written for
SystemTap version 1.7 and earlier.
\item \verb+/* unmodified-fnargs */+ in an embedded-C function, means
that the function arguments are not modified inside the function body.
\item \verb+/* string */+ in embedded-C expressions only, means that
the expression has \verb+const char *+ type and should be treated as
a string value, instead of the default long numeric.
\end{itemize}
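The following sketch shows how some of these markers might be combined. The
function names \texttt{answer} and \texttt{greeting} are hypothetical and not
part of any tapset. Outside a tapset, embedded C of this kind still requires
guru mode.
\begin{vindent}
\begin{verbatim}
/* No side effects and no access to kernel data, so it is
   marked both pure and unprivileged: */
function answer:long () %{ /* pure */ /* unprivileged */
  STAP_RETURN(42);
%}

/* An embedded-C expression treated as a string
   instead of the default long: */
function greeting:string () {
  return %{ /* string */ /* pure */ "hello" %}
}
\end{verbatim}
\end{vindent}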
\subsection{Accessing script level global variables}
Script level global variables may be accessed in embedded-C functions and
blocks. To read or write the global variable \textbf{var}, the
\verb+/* pragma:read:var */+ or \verb+/* pragma:write:var */+
marker must first be placed in the embedded-C function or block. This provides
the \verb+STAP_GLOBAL_GET_*+ and \verb+STAP_GLOBAL_SET_*+
macros to allow reading and writing, respectively. For example:
\begin{vindent}
\begin{verbatim}
global var
global var2[100]
function increment() %{
/* pragma:read:var */ /* pragma:write:var */
/* pragma:read:var2 */ /* pragma:write:var2 */
STAP_GLOBAL_SET_var(STAP_GLOBAL_GET_var()+1); //var++
STAP_GLOBAL_SET_var2(1, 1, STAP_GLOBAL_GET_var2(1, 1)+1); //var2[1,1]++
%}
\end{verbatim}
\end{vindent}
Variables may be read and set in both embedded-C functions and expressions.
Strings returned from embedded-C code are decayed to pointers. Variables must
also be assigned at script level to allow for type inference. Map assignment
does not return the value written, so chaining does not work.
\section{Probe points\label{sec:Probe-Points}}
\index{probe points}
\subsection{General syntax}
\index{probe syntax}
The general probe point syntax is a dotted-symbol sequence. This divides
the event namespace into parts, analogous to the style of the Domain Name
System. Each component identifier is parameterized by a string or number
literal, with a syntax analogous to a function call.
The following are all syntactically valid probe points.
\begin{vindent}
\begin{verbatim}
kernel.function("foo")
kernel.function("foo").return
module("ext3").function("ext3_*")
kernel.function("no_such_function") ?
syscall.*
end
timer.ms(5000)
\end{verbatim}
\end{vindent}
Probes may be broadly classified into \emph{synchronous}\index{synchronous}
or \emph{asynchronous}.\index{asynchronous} A synchronous event occurs when
any processor executes an instruction matched by the specification. This
gives these probes a reference point (instruction address) from which more
contextual data may be available. Other families of probe points refer to
asynchronous events such as timers, where no fixed reference point is related.
Each probe point specification may match multiple locations, such as by using
wildcards or aliases, and all are probed. A probe declaration may contain
several specifications separated by commas, which are all probed.
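As a small illustration of a declaration with several comma-separated
specifications, the sketch below attaches one handler to two kernel functions.
The choice of \texttt{vfs\_read} and \texttt{vfs\_write} is an assumption;
whether they can be probed depends on the kernel and its debugging information.
\begin{vindent}
\begin{verbatim}
probe kernel.function("vfs_read"),
      kernel.function("vfs_write") {
  # The same handler runs whenever either function is entered.
  printf("%s called by %s\n", probefunc(), execname())
}
\end{verbatim}
\end{vindent}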
\subsubsection{Prefixes}
\index{prefixes}
Prefixes specify the probe target, such as \textbf{kernel}, \textbf{module},
\textbf{timer}, and so on.
\subsubsection{Suffixes}
\index{suffixes}
Suffixes further qualify the point to probe, such as \textbf{.return} for the
exit point of a probed function. The absence of a suffix implies the function
entry point.
\subsubsection{Wildcarded file names, function names}
\index{wildcards}
A component may include an asterisk ({*}) character, which expands to other
matching probe points. An example follows.
\begin{vindent}
\begin{verbatim}
kernel.syscall.*
kernel.function("sys_*")
\end{verbatim}
\end{vindent}
\subsubsection{Optional probe points\label{sub:Optional-probe-points}}
\index{?}
A probe point may be followed by a question mark (?) character, to indicate
that it is optional, and that no error should result if it fails to expand.
This effect passes down through all levels of alias or wildcard expansion.
The following is the general syntax.
\begin{vindent}
\begin{verbatim}
kernel.function("no_such_function") ?
\end{verbatim}
\end{vindent}
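An optional probe point is typically combined with alternatives, so that the
script still translates when one of them cannot be resolved. In this sketch
\texttt{no\_such\_function} stands for a function that may be missing, and
\texttt{vfs\_read} is an assumed fallback that exists on most kernels.
\begin{vindent}
\begin{verbatim}
probe kernel.function("no_such_function") ?,
      kernel.function("vfs_read") {
  # If the first probe point fails to expand, it is silently
  # dropped and only the second one is probed.
  printf("hit %s\n", probefunc())
}
\end{verbatim}
\end{vindent}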
\subsubsection{Brace expansion}
\index{braceexpansion}
Brace expansion is a mechanism which allows a list of probe points to be
generated. It is very similar to shell expansion. A component may be surrounded
by a pair of curly braces to indicate that the comma-separated sequence of
one or more subcomponents will each constitute a new probe point. The braces
may be arbitrarily nested. The ordering of expanded results is based on
product order.
The question mark (?), exclamation mark (!) indicators and probe point conditions
may not be placed in any expansions that are before the last component.
The following is an example of brace expansion.
\begin{vindent}
\begin{verbatim}
syscall.{write,read}
# Expands to
syscall.write, syscall.read
{kernel,module("nfs")}.function("nfs*")!
# Expands to
kernel.function("nfs*")!, module("nfs").function("nfs*")!
\end{verbatim}
\end{vindent}
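In a complete probe declaration, a brace-expanded probe point is used like any
other. The sketch below assumes the \texttt{name} and \texttt{argstr} variables
that the syscall tapset aliases normally define.
\begin{vindent}
\begin{verbatim}
# Equivalent to: probe syscall.read, syscall.write { ... }
probe syscall.{read,write} {
  printf("%s(%s) by %s\n", name, argstr, execname())
}
\end{verbatim}
\end{vindent}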
\subsection{Built-in probe point types (DWARF probes)}
\index{built-in probes}
\index{dwarf probes}
\label{dwarfprobes}
This family of probe points uses symbolic debugging information for the target
kernel or module, as may be found in executables that have not
been stripped, or in the separate \textbf{debuginfo} packages. They allow
logical placement of probes into the execution path of the target
by specifying a set of points in the source or object code. When a matching
statement executes on any processor, the probe handler is run in that context.
Points in a kernel are identified by module, source file, line number, function
name or some combination of these.
Here is a list of probe point specifications currently supported:
\begin{vindent}
\begin{verbatim}
kernel.function(PATTERN)
kernel.function(PATTERN).call
kernel.function(PATTERN).return
kernel.function(PATTERN).return.maxactive(VALUE)
kernel.function(PATTERN).inline
kernel.function(PATTERN).label(LPATTERN)
module(MPATTERN).function(PATTERN)
module(MPATTERN).function(PATTERN).call
module(MPATTERN).function(PATTERN).return.maxactive(VALUE)
module(MPATTERN).function(PATTERN).inline
kernel.statement(PATTERN)
kernel.statement(ADDRESS).absolute
module(MPATTERN).statement(PATTERN)
\end{verbatim}
\end{vindent}
The \textbf{.function} variant places a probe near the beginning of the named
function, so that parameters are available as context variables.
The \textbf{.return} variant places a probe at the moment of return from the named
function, so the return value is available as the \$return context variable.
The entry parameters are also available, though the function may have changed
their values. Return probes may be further qualified with \textbf{.maxactive},
which specifies how many instances of the specified function can be probed simultaneously.
You can leave off \textbf{.maxactive} in most cases, as the default
(\textbf{KRETACTIVE}) should be sufficient.
However, if you notice an excessive number of skipped probes, try setting \textbf{.maxactive}
to incrementally higher values to see if the number of skipped probes decreases.
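
For instance, a return probe with an explicit \textbf{.maxactive} value might be
written as in the following sketch. The function \texttt{vfs\_read} and the value
64 are illustrative assumptions, not recommended settings.
\begin{vindent}
\begin{verbatim}
probe kernel.function("vfs_read").return.maxactive(64) {
  # $return holds the value returned by the probed function.
  printf("vfs_read returned %d\n", $return)
}
\end{verbatim}
\end{vindent}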
The \textbf{.inline} modifier for \textbf{.function} filters the results to include only
instances of inlined functions. The \textbf{.call} modifier selects the opposite subset.
The \textbf{.exported} modifier filters the results to include only exported functions.
Inline functions do not have an identifiable return point, so \textbf{.return}
is not supported on \textbf{.inline} probes.
The \textbf{.statement} variant places a probe at the exact spot, exposing those local
variables that are visible there.
In the above probe descriptions, MPATTERN stands for a string literal
that identifies the loaded kernel module of interest and LPATTERN
stands for a source program label. Both MPATTERN and LPATTERN may
include asterisk ({*}), square brackets \char`\"{}{[}]\char`\"{}, and
question mark (?) wildcards.
PATTERN stands for a string literal that identifies a point in the program.
It is composed of three parts:
\begin{enumerate}
\item The first part is the name of a function, as would appear in the nm program's
output. This part may use the asterisk and question mark wildcard operators
to match multiple names.
\item The second part is optional, and begins with the at sign (@) character.
It is followed by the path to the source file containing the function,
which may include a wildcard pattern, such as mm/slab{*}.
In most cases, the path should be relative to the top of the
linux source directory, although an absolute path may be necessary for some kernels.
If a relative pathname doesn't work, try absolute.
\item The third part is optional if the file name part was given. It identifies
the line number in the source file, preceded by a ``:'' or ``+''.
The line number is assumed to be an
absolute line number if preceded by a ``:'', or relative to the entry of
the function if preceded by a ``+''.
All the lines in the function can be matched with ``:*''.
A range of lines x through y can be matched with ``:x-y''.
\end{enumerate}
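The three parts can be combined as in the following sketches. The function
\texttt{vfs\_read}, the file \texttt{fs/read\_write.c}, and the line numbers are
assumptions chosen for illustration; they vary between kernel versions.
\begin{vindent}
\begin{verbatim}
# Function name only:
kernel.function("vfs_read")
# Any function defined in the given source file:
kernel.function("*@fs/read_write.c")
# Every line of one function:
kernel.statement("vfs_read@fs/read_write.c:*")
# Two lines after the entry of the function:
kernel.statement("vfs_read@fs/read_write.c+2")
# An absolute range of lines in the file:
kernel.statement("*@fs/read_write.c:400-410")
\end{verbatim}
\end{vindent}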
Alternatively, specify PATTERN as a numeric constant to indicate a relative
module address or an absolute kernel address.
Some of the source-level variables, such as function parameters, locals,
or globals visible in the compilation unit, are visible to probe handlers.
Refer to these variables by prefixing their name with a dollar sign within
the scripts. In addition, a special syntax allows limited traversal of
structures, pointers, arrays, taking the address of a variable or pretty
printing a whole structure.
\texttt{\$var} refers to an in-scope variable var. If it is a type similar
to an integer, it will be cast to a 64-bit integer for script use. Pointers
similar to a string (char {*}) are copied to SystemTap string values by the
\texttt{kernel\_string()} or \texttt{user\_string()} functions.
\texttt{@var("varname")} is an alternative syntax for \texttt{\$varname}.
It can also be used to access global variables in a particular compile
unit (CU). \texttt{@var("varname@src/file.c")} refers to the global
(either file local or external) variable varname defined when the file
src/file.c was compiled. The CU in which the variable is resolved is
the first CU in the module of the probe point which matches the given
file name at the end and has the shortest file name path (e.g. given
\texttt{@var("foo@bar/baz.c")} and CUs with file name paths
\texttt{src/sub/module/bar/baz.c} and \texttt{src/bar/baz.c} the second
CU will be chosen to resolve \texttt{foo}).
The notation \texttt{@var("varname", "/path/to/exe-or-so")} is also supported
to explicitly specify an executable or library file path in which the global or
top-level static variable resides.
\texttt{\$var->field} or \texttt{@var("var@file.c")->field} traverses a
structure's field. The indirection operator may be repeated to follow
additional levels of pointers.
\texttt{\$var{[}N]} or \texttt{@var("var@file.c"){[}N]} indexes into an
array. The index is given with a literal number.
\texttt{\&\$var} or \texttt{\&@var("var@file.c")} provides the address of
a variable as a long. It can also be used in combination with field access
or array indexing to provide the address of a particular field or an
element in an array with \texttt{\&\$var->field},
\texttt{\&@var("var@file.c"){[}N]} or a combination of those accessors.
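
The accessors above might be used together as in this sketch. It assumes that
the probed function \texttt{vfs\_read} has parameters named \texttt{file} and
\texttt{count}, and that \texttt{struct file} has an \texttt{f\_flags} field,
which depends on the kernel version.
\begin{vindent}
\begin{verbatim}
probe kernel.function("vfs_read") {
  # Plain context variables: an integer and a pointer.
  printf("count=%d file=%p\n", $count, $file)
  # Field traversal through a structure pointer.
  printf("f_flags=0x%x\n", $file->f_flags)
  # Address of a field, provided as a long.
  printf("&f_flags=%p\n", &$file->f_flags)
}
\end{verbatim}
\end{vindent}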
Using a single \texttt{\$} or a double \texttt{\$\$} suffix provides a
shallow or deep string representation of the variable data type. Using
a single \texttt{\$}, as in \texttt{\$var\$}, will provide a string that
only includes the values of all basic type values of fields of the variable
structure type but not any nested complex type values (which will be
represented with \texttt{\{...\}}). Using a double \texttt{\$\$},
as in \texttt{@var("var")\$\$} will provide a string that also includes
all values of nested data types.
\texttt{\$\$vars} expands to a character string that is equivalent to
\texttt{sprintf("parm1=\%x ... parmN=\%x var1=\%x ... varN=\%x", \$parm1, ..., \$parmN,
\$var1, ..., \$varN)}
\texttt{\$\$locals} expands to a character string that is equivalent to
\texttt{sprintf("var1=\%x ... varN=\%x", \$var1, ..., \$varN)}
\texttt{\$\$parms} expands to a character string that is equivalent to
\texttt{sprintf("parm1=\%x ... parmN=\%x", \$parm1, ..., \$parmN)}
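
A brief sketch of these string forms follows. As before, the function
\texttt{vfs\_read} and its \texttt{file} parameter are assumptions about the
target kernel.
\begin{vindent}
\begin{verbatim}
probe kernel.function("vfs_read") {
  # All parameters, formatted as name=value pairs.
  printf("parms:   %s\n", $$parms)
  # Basic-type fields only; nested structures appear as {...}.
  printf("shallow: %s\n", $file$)
  # Nested structures are included as well.
  printf("deep:    %s\n", $file$$)
}
\end{verbatim}
\end{vindent}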
\subsubsection{kernel.function, module().function}
\index{kernel.function}
\index{module().function}
The \textbf{.function} variant places a probe near the beginning of the named function,
so that parameters are available as context variables.
General syntax:
\begin{vindent}
\begin{verbatim}
kernel.function("func[@file]")
module("modname").function("func[@file]")
\end{verbatim}
\end{vindent}
Examples:
\begin{vindent}
\begin{verbatim}
# Refers to all kernel functions with "init" or "exit"
# in the name:
kernel.function("*init*"), kernel.function("*exit*")
# Refers to any functions within the "kernel/time.c"
# file that span line 240:
kernel.function("*@kernel/time.c:240")
# Refers to all functions in the ext3 module:
module("ext3").function("*")
\end{verbatim}
\end{vindent}
\subsubsection{kernel.statement, module().statement}
\index{kernel.statement}
\index{module().statement}
The \textbf{.statement} variant places a probe at the exact spot, exposing those local
variables that are visible there.
General syntax:
\begin{vindent}
\begin{verbatim}
kernel.statement("func@file:linenumber")
module("modname").statement("func@file:linenumber")
\end{verbatim}
\end{vindent}
Example:
\begin{vindent}
\begin{verbatim}
# Refers to the statement at line 296 within the
# kernel/time.c file:
kernel.statement("*@kernel/time.c:296")
# Refers to the statement at line bio_init+3 within the fs/bio.c file:
kernel.statement("bio_init@fs/bio.c+3")
\end{verbatim}
\end{vindent}
\subsection{Function return probes}
\index{return probes}
The \texttt{.return} variant places a probe at the moment of return from
the named function, so that the return value is available as the \$return
context variable. The entry parameters are also accessible in the context
of the return probe, though their values may have been changed by the function.
Inline functions do not have an identifiable return point, so \texttt{.return}
is not supported on \texttt{.inline} probes.
\subsection{DWARF-less probing}
\index{DWARF-less probing}
In the absence of debugging information, you can still use the
\emph{kprobe} family of probes to examine the entry and exit points of
kernel and module functions. You cannot look up the arguments or local
variables of a function using these probes. However, you can access
the parameters by following this procedure:
When you're stopped at the entry to a function, you can refer to the
function's arguments by number. For example, when probing the function
declared:
\begin{vindent}
\begin{verbatim}
asmlinkage ssize_t sys_read(unsigned int fd, char __user * buf, size_t
count)
\end{verbatim}
\end{vindent}
You can obtain the values of \texttt{fd}, \texttt{buf}, and
\texttt{count}, respectively, as \texttt{uint\_arg(1)},
\texttt{pointer\_arg(2)}, and \texttt{ulong\_arg(3)}. In this case, your
probe code must first call \texttt{asmlinkage()}, because on some
architectures the asmlinkage attribute affects how the function's
arguments are passed.
When you're in a return probe, \texttt{\$return} isn't supported
without DWARF, but you can call \texttt{returnval()} to get the value
of the register in which the function value is typically returned, or
call \texttt{returnstr()} to get a string version of that value.
At any code probe point, you can call
\texttt{register("regname")} to get the value of the specified CPU
register when the probe point was hit.
\texttt{u\_register("regname")} is like \texttt{register("regname")},
but interprets the value as an unsigned integer.
SystemTap supports the following constructs:
\begin{vindent}
\begin{verbatim}
kprobe.function(FUNCTION)
kprobe.function(FUNCTION).return
kprobe.module(NAME).function(FUNCTION)
kprobe.module(NAME).function(FUNCTION).return
kprobe.statement(ADDRESS).absolute
\end{verbatim}
\end{vindent}
Use \textbf{.function} probes for kernel functions and
\textbf{.module} probes for probing functions of a specified module.
If you know the absolute address of a kernel or module
function, use \textbf{.statement} probes. Do not use wildcards in
\textit{FUNCTION} and module \textit{NAME} values; wildcards cause the probe
to fail to register. Statement probes are available only in guru mode.
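Putting the pieces of this section together, a DWARF-less script for the
\texttt{sys\_read} function declared above might look like the following
sketch. It assumes that the kernel exposes a symbol with exactly that name,
which is not the case on all kernel versions.
\begin{vindent}
\begin{verbatim}
probe kprobe.function("sys_read") {
  # Must be called first, because the function is asmlinkage.
  asmlinkage()
  printf("fd=%d buf=%p count=%d\n",
         uint_arg(1), pointer_arg(2), ulong_arg(3))
}

probe kprobe.function("sys_read").return {
  # $return is unavailable without DWARF; use returnval().
  printf("returned %d\n", returnval())
}
\end{verbatim}
\end{vindent}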
\subsection{Userspace probing}
\index{userspace probing}
\index{process}
Userspace probing is supported on kernels that are
configured to include the utrace or uprobes extensions.
\subsubsection{Begin/end variants}
\label{sec:beginendvariants}
Constructs:
\begin{vindent}
\begin{verbatim}
process.begin
process("PATH").begin
process(PID).begin
process.thread.begin
process("PATH").thread.begin
process(PID).thread.begin
process.end
process("PATH").end
process(PID).end
process.thread.end
process("PATH").thread.end
process(PID).thread.end
\end{verbatim}
\end{vindent}
The \texttt{.begin} variant is called when a new process described by
\texttt{PID} or \texttt{PATH} is created. If no \texttt{PID} or
\texttt{PATH} argument is specified (for example
\texttt{process.begin}), the probe flags any new process being
spawned.
The \texttt{.thread.begin} variant is called when a new thread
described by \texttt{PID} or \texttt{PATH} is created.
The \texttt{.end} variant is called when a process described by
\texttt{PID} or \texttt{PATH} dies.
The \texttt{.thread.end} variant is called when a thread described by
\texttt{PID} or \texttt{PATH} dies.
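As a minimal sketch, the following script reports every process that starts or
exits while the session is running. No \texttt{PID} or \texttt{PATH} is given,
so all processes are considered.
\begin{vindent}
\begin{verbatim}
probe process.begin {
  printf("%s[%d] started\n", execname(), pid())
}

probe process.end {
  printf("%s[%d] exited\n", execname(), pid())
}
\end{verbatim}
\end{vindent}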
\subsubsection{Syscall variants}
\label{sec:syscallvariants}
Constructs:
\begin{vindent}
\begin{verbatim}
process.syscall
process("PATH").syscall
process(PID).syscall
process.syscall.return
process("PATH").syscall.return
process(PID).syscall.return
\end{verbatim}
\end{vindent}
The \texttt{.syscall} variant is called when a thread described by
\texttt{PID} or \texttt{PATH} makes a system call. The system call
number is available in the \texttt{\$syscall} context variable. The
first six arguments of the system call are available in the
\texttt{\$argN} parameter, for example \texttt{\$arg1},
\texttt{\$arg2}, and so on.
The \texttt{.syscall.return} variant is called when a thread described
by \texttt{PID} or \texttt{PATH} returns from a system call. The
system call number is available in the \texttt{\$syscall} context
variable. The return value of the system call is available in the
\texttt{\$return} context variable.
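The context variables described above might be used as in this sketch. The path
\texttt{/bin/ls} is only an example target.
\begin{vindent}
\begin{verbatim}
probe process("/bin/ls").syscall {
  printf("syscall %d arg1=0x%x\n", $syscall, $arg1)
}

probe process("/bin/ls").syscall.return {
  printf("syscall %d returned %d\n", $syscall, $return)
}
\end{verbatim}
\end{vindent}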
\subsubsection{Function/statement variants}
\label{sec:function-statement}
Constructs:
\begin{vindent}
\begin{verbatim}
process("PATH").function("NAME")
process("PATH").statement("*@FILE.c:123")
process("PATH").function("*").return
process("PATH").function("myfun").label("foo")
\end{verbatim}
\end{vindent}
Full symbolic source-level probes in userspace programs and shared
libraries are supported. These are exactly analogous to the symbolic
DWARF-based kernel or module probes described previously and expose
similar contextual \texttt{\$-variables}. See
Section~\ref{dwarfprobes} for more information.
Here is an example of prototype symbolic userspace probing support:
\begin{vindent}
\begin{verbatim}
# stap -e 'probe process("ls").function("*").call {
log (probefunc()." ".$$parms)
}' \
-c 'ls -l'
\end{verbatim}
\end{vindent}
To run, this script requires debugging information for the named
program and utrace support in the kernel. If you see a ``pass 4a-time''
build failure, check that your kernel supports utrace.
\subsubsection{Absolute variant}
\label{sec:absolutevariant}
A non-symbolic probe point such as
\texttt{process(PID).statement(ADDRESS).absolute} is analogous to
\newline\texttt{kernel.statement(ADDRESS).absolute} in that both use
raw, unverified virtual addresses and provide no \texttt{\$variables}.
The target \texttt{PID} parameter must identify a running process and
\texttt{ADDRESS} must identify a valid instruction address. All
threads of the listed process will be probed. This is a guru mode
probe.
\subsubsection{Process probe paths}
\label{sec:paths}
For all process probes, \texttt{PATH} names refer to executables that
are located the same way that shells locate them: a name that contains
a slash (/) character is used as given, relative to the current working
directory if it is not absolute; otherwise \texttt{\$PATH} is searched.
For example, the following probe syntax:
\begin{vindent}
\begin{verbatim}
probe process("ls").syscall {}
probe process("./a.out").syscall {}
\end{verbatim}
\end{vindent}
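works the same as the following, assuming that \texttt{ls} is found in
\texttt{/bin} and that the current working directory is
\texttt{/my/directory}: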
\begin{vindent}
\begin{verbatim}
probe process("/bin/ls").syscall {}
probe process("/my/directory/a.out").syscall {}
\end{verbatim}
\end{vindent}
If a process probe is specified without a \texttt{PID} or
\texttt{PATH} parameter, all user threads are probed. However, if
systemtap is invoked in target process mode, process probes are
restricted to the process hierarchy associated with the target
process. If stap is running in \texttt{--unprivileged} mode, only
processes owned by the current user are selected.
\subsubsection{Target process mode}
\label{sec:targetprocessmode}
Target process mode (invoked with \texttt{stap -c CMD} or \texttt{-x
PID}) implicitly restricts all \texttt{process.*} probes to the
given child process. It does not affect \texttt{kernel.*} or other
probe types. The \texttt{CMD} string is normally run directly, rather
than from a ``\texttt{/bin/sh -c}'' sub-shell, so that utrace and uprobe
probes receive a fairly ``clean'' event stream. If meta-characters such
as redirection operators are present in \texttt{CMD}, ``\texttt{/bin/sh
-c CMD}'' is still used, and utrace and uprobe probes will receive
events from the shell. For example:
\begin{vindent}
\begin{verbatim}
% stap -e 'probe process.syscall, process.end {
printf("%s %d %s\n", execname(), pid(), pp())}' \
-c ls
\end{verbatim}
\end{vindent}
Here is the output from this command:
\begin{vindent}
\begin{verbatim}
ls 2323 process.syscall
ls 2323 process.syscall
ls 2323 process.end
\end{verbatim}
\end{vindent}
If \texttt{PATH} names a shared library, all processes that map that
shared library can be probed. If DWARF debugging information is
installed, try using a probe with this syntax:
\begin{vindent}
\begin{verbatim}
probe process("/lib64/libc-2.8.so").function("....") { ... }
\end{verbatim}
\end{vindent}
This probe fires in all threads that call into that library. Typing
``\texttt{stap -c CMD}'' or ``\texttt{stap -x PID}'' restricts this to
the target command and descendants only. Context variables such as
\texttt{\$\$vars} are available in the handler. You can provide the
location of debug information to the stap command with the
\texttt{-d DIRECTORY} option.
To restrict a probe point to a location in a library required by a
particular process, use this syntax:
\begin{vindent}
\begin{verbatim}
probe process("...").library("...").function("....") { ... }
\end{verbatim}
\end{vindent}
The library name may use wildcards.
The first syntax in the following probes the functions in the procedure
linkage table (PLT) of a particular process. The second syntax also adds the
procedure linkage tables of libraries required by that process. Use
\texttt{.plt("...")} to match particular PLT entries.
\begin{vindent}
\begin{verbatim}
probe process("...").plt { ... }
probe process("...").plt, process("...").library("...").plt { ... }
\end{verbatim}
\end{vindent}
\subsubsection{Static userspace probing}
\label{sec:staticuserspace}
You can probe symbolic static instrumentation compiled into programs
and shared libraries with the following syntax:
\begin{vindent}
\begin{verbatim}
process("PATH").mark("LABEL")
\end{verbatim}
\end{vindent}
The \texttt{.mark} variant is called from a static probe defined in
the application by
\texttt{STAP\_PROBE1(handle,LABEL,arg1)}. \texttt{STAP\_PROBE1} is
defined in the sdt.h file. The parameters are:
\begin{tabular}{|l|l|}
\hline
Parameter & Definition \\ \hline
\texttt{handle} & the application handle \\ \hline
\texttt{LABEL} & corresponds to the \texttt{.mark} argument \\ \hline
\texttt{arg1} & the argument \\ \hline
\end{tabular}
Use \texttt{STAP\_PROBE1} for probes with one argument. Use
\texttt{STAP\_PROBE2} for probes with 2 arguments, and so on. The
arguments of the probe are available in the context variables
\texttt{\$arg1}, \texttt{\$arg2}, and so on.
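For illustration, a script that attaches to such a marker might look
like the following sketch. The program path \texttt{/usr/bin/myapp} and
the label \texttt{request\_start} are hypothetical, and the single
marker argument is assumed to be an integer.
\begin{vindent}
\begin{verbatim}
# Hypothetical program path and marker label.
probe process("/usr/bin/myapp").mark("request_start") {
  printf("request_start: arg1=%d\n", $arg1)
}
\end{verbatim}
\end{vindent}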
As an alternative to the \texttt{STAP\_PROBE} macros, you can use the
dtrace script to create custom macros. The sdt.h file also provides
dtrace compatible markers through \texttt{DTRACE\_PROBE} and an
associated python \texttt{dtrace} script. You can use these in builds
based on dtrace that need dtrace -h or -G functionality.
\subsection{Java probes}
\index{Java probes}
Support for probing Java methods is available using Byteman as a
backend. Byteman is an instrumentation tool from the JBoss project
which systemtap can use to monitor invocations for a specific method
or line in a Java program.
Systemtap does so by generating a Byteman script listing the probes to
instrument and then invoking the Byteman \texttt{bminstall} utility. A
custom option \texttt{-D OPTION} (see the Byteman documentation for
more details) can be passed to bminstall by invoking systemtap with
option \texttt{-J OPTION}. The systemtap option \texttt{-j} is also
provided as a shorthand for \texttt{-J
org.jboss.byteman.compile.to.bytecode}.
This Java instrumentation support is currently a prototype feature
with major limitations: Java probes attach only to one Java process at
a time; other Java processes beyond the first one to be observed are
ignored. Moreover, Java probing currently does not work across users;
the stap script must run (with appropriate permissions) under the same
user as the Java process being probed. (Thus a stap script under
root currently cannot probe Java methods in a non-root-user Java process.)
There are four probe point variants supported by the translator:
\begin{vindent}
\begin{verbatim}
java("PNAME").class("CLASSNAME").method("PATTERN")
java("PNAME").class("CLASSNAME").method("PATTERN").return
java(PID).class("CLASSNAME").method("PATTERN")
java(PID).class("CLASSNAME").method("PATTERN").return
\end{verbatim}
\end{vindent}
The first two probe points refer to Java processes by the name of the
Java process. The PATTERN parameter specifies the signature of the
Java method to probe. The signature must consist of the exact name of
the method, followed by a bracketed list of the types of the
arguments, for instance \texttt{myMethod(int,double,Foo)}. Wildcards
are not supported.
The probe can be set to trigger at a specific line within the method
by appending a line number with a colon, just as in other types of
probes: \texttt{myMethod(int,double,Foo):245}.
The CLASSNAME parameter identifies the Java class the method belongs
to, either with or without the package qualification. By default, the
probe only triggers on descendants of the class that do not override
the method definition of the original class. However, CLASSNAME can
take an optional caret prefix, as in
\verb+class("^org.my.MyClass")+, which specifies that the probe
should also trigger on all descendants of MyClass that override the
original method. For instance, every method with signature foo(int) in
program org.my.MyApp can be probed at once using
\begin{vindent}
\begin{verbatim}
java("org.my.MyApp").class("^java.lang.Object").method("foo(int)")
\end{verbatim}
\end{vindent}
The last two probe points work analogously, but refer to Java
processes by PID. (PIDs for already running processes can be obtained
using the \texttt{jps} utility.)
Context variables defined within java probes include \verb+$provider+
(which identifies the class providing the definition of the triggered
method) and \verb+$name+ (which gives the signature of the method).
Arguments to the method can be accessed using context variables
\verb+$arg1+ through \verb+$arg10+, for up to the first 10 arguments
of a method.
\subsection{PROCFS probes}
\index{PROCFS probes}
These probe points allow procfs pseudo-files in
\texttt{/proc/systemtap/\textit{MODNAME}} to be created, read and
written. Specify the name of the systemtap module as
\texttt{\textit{MODNAME}}. There are four probe point variants
supported by the translator:
\begin{vindent}
\begin{verbatim}
procfs("PATH").read
procfs("PATH").write
procfs.read
procfs.write
\end{verbatim}
\end{vindent}
\texttt{PATH} is the file name to be created, relative to
\texttt{/proc/systemtap/MODNAME}. If no \texttt{PATH} is specified
(as in the last two variants in the previous list), \texttt{PATH}
defaults to \texttt{command}.
When a user reads \texttt{/proc/systemtap/MODNAME/PATH}, the
corresponding procfs read probe is triggered. Assign the string data
to be read to a variable named \texttt{\$value}, as follows:
\begin{vindent}
\begin{verbatim}
procfs("PATH").read { $value = "100\n" }
\end{verbatim}
\end{vindent}
When a user writes into \texttt{/proc/systemtap/MODNAME/PATH}, the
corresponding procfs write probe is triggered. The data the user
wrote is available in the string variable named \texttt{\$value}, as
follows:
\begin{vindent}
\begin{verbatim}
procfs("PATH").write { printf("User wrote: %s", $value) }
\end{verbatim}
\end{vindent}
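Combining both variants, the following sketch exposes a script variable
as a tunable pseudo-file. The file name \texttt{threshold} and the
global variable are hypothetical; \texttt{strtol} is the tapset
function that converts a string to an integer in the given base.
\begin{vindent}
\begin{verbatim}
global threshold = 100

# Triggered by: cat /proc/systemtap/MODNAME/threshold
probe procfs("threshold").read {
  $value = sprintf("%d\n", threshold)
}

# Triggered by: echo 250 > /proc/systemtap/MODNAME/threshold
probe procfs("threshold").write {
  threshold = strtol($value, 10)
}
\end{verbatim}
\end{vindent}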
\subsection{Marker probes}
\index{marker probes}
This family of probe points connects to static probe markers inserted
into the kernel or a module. These markers are special macro calls in
the kernel that make probing faster and more reliable than with
DWARF-based probes. DWARF debugging information is not required to
use probe markers.
Marker probe points begin with a \texttt{kernel} prefix which
identifies the source of the symbol table used for finding
markers. The suffix names the marker itself:
\texttt{mark("MARK")}. The marker name string, which can contain
wildcard characters, is matched against the names given to the marker
macros when the kernel or module is compiled. Optionally, you can
specify \texttt{format("FORMAT")}. Specifying the marker format
string allows differentiation between two markers with the same name
but different marker format strings.
The handler associated with a marker probe can read any optional
parameters specified at the macro call site. These are named
\texttt{\$arg1} through \texttt{\$argNN}, where \texttt{NN} is the
number of parameters supplied by the macro. Number and string
parameters are passed in a type-safe manner.
The marker format string associated with a marker is available in
\texttt{\$format}. The marker name string is available in
\texttt{\$name}.
Here are the marker probe constructs:
\begin{vindent}
\begin{verbatim}
kernel.mark("MARK")
kernel.mark("MARK").format("FORMAT")
\end{verbatim}
\end{vindent}
For more information about marker probes, see
\url{http://sourceware.org/systemtap/wiki/UsingMarkers}.
\subsection{Tracepoints}
\label{sec:tracepoints}
\index{tracepoints}
This family of probe points hooks to static probing tracepoints
inserted into the kernel or kernel modules. As with marker probes,
these tracepoints are special macro calls inserted by kernel
developers to make probing faster and more reliable than with
DWARF-based probes. DWARF debugging information is not required to
probe tracepoints. Tracepoints have more strongly-typed parameters
than marker probes.
Tracepoint probes begin with \texttt{kernel}. The next part names the
tracepoint itself: \texttt{trace("name")}. The tracepoint
\texttt{name} string, which can contain wildcard characters, is
matched against the names defined by the kernel developers in the
tracepoint header files.
The handler associated with a tracepoint-based probe can read the
optional parameters specified at the macro call site. These
parameters are named according to the declaration by the tracepoint
author. For example, the tracepoint probe
\texttt{kernel.trace("sched\_switch")} provides the parameters
\texttt{\$rq}, \texttt{\$prev}, and \texttt{\$next}. If the parameter
is a complex type such as a struct pointer, then a script can access
fields with the same syntax as DWARF \texttt{\$target} variables.
Tracepoint parameters cannot be modified; however, in guru mode a
script can modify fields of parameters.
The name of the tracepoint is available in \texttt{\$\$name}, and a
string of \texttt{name=value} pairs for all parameters of the
tracepoint is available in \texttt{\$\$vars} or \texttt{\$\$parms}.
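As a short sketch, the following probe prints the tracepoint name and
all of its parameters each time the scheduler switches tasks. It
assumes that the target kernel provides the \texttt{sched\_switch}
tracepoint mentioned above; the output format is illustrative only.
\begin{vindent}
\begin{verbatim}
probe kernel.trace("sched_switch") {
  # $$name is the tracepoint name; $$parms lists name=value pairs.
  printf("%s: %s\n", $$name, $$parms)
}
\end{verbatim}
\end{vindent}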
\subsection{Syscall probes}
\label{sec:syscall}
\index{syscall probes}
The \texttt{syscall.*} aliases define several hundred probes. They
use the following syntax:
\begin{vindent}
\begin{verbatim}
syscall.NAME
syscall.NAME.return
\end{verbatim}
\end{vindent}
Generally, two probes are defined for each normal system call as
listed in the syscalls(2) manual page: one for entry and one for
return. System calls that never return do not have a
corresponding \texttt{.return} probe.
Each probe alias defines a variety of variables. Look at the tapset
source code to find the most reliable source of variable definitions.
Generally, each variable listed in the standard manual page is
available as a script-level variable. For example,
\texttt{syscall.open} exposes file name, flags, and mode. In addition,
a standard suite of variables is available at most aliases, as follows:
\begin{itemize}
\item \texttt{argstr}: A pretty-printed form of the entire argument
list, without parentheses.
\item \texttt{name}: The name of the system call.
\item \texttt{retstr}: For return probes, a pretty-printed form of the
system call result.
\end{itemize}
Not all probe aliases obey all of these general guidelines. Please
report exceptions that you encounter as a bug.
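The standard variables can be used as in the following sketch, which
traces entry to and return from the \texttt{open} system call. Whether
a given program issues \texttt{open} or a related call depends on the
system, so treat the choice of alias as an assumption.
\begin{vindent}
\begin{verbatim}
probe syscall.open {
  printf("%s: %s(%s)\n", execname(), name, argstr)
}
probe syscall.open.return {
  printf("%s: %s returned %s\n", execname(), name, retstr)
}
\end{verbatim}
\end{vindent}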
\subsection{Timer probes}
\index{timer probes}
You can use intervals defined by the standard kernel jiffies\index{jiffies}
timer to trigger probe handlers asynchronously. A \emph{jiffy} is a kernel-defined
unit of time typically between 1 and 60 msec. Two probe point variants are
supported by the translator:
\begin{vindent}
\begin{verbatim}
timer.jiffies(N)
timer.jiffies(N).randomize(M)
\end{verbatim}
\end{vindent}
The probe handler runs every N jiffies. If the \texttt{randomize}\index{randomize}
component is given, a linearly distributed random value in the range {[}-M
\ldots{} +M] is added to N every time the handler executes. N is restricted
to a reasonable range (1 to approximately 1,000,000), and M is restricted
to be less than N. There are no target variables provided in either context.
Probes can be run concurrently on multiple processors.
Intervals may be specified in units of time. There are two probe point variants
similar to the jiffies timer:
\begin{vindent}
\begin{verbatim}
timer.ms(N)
timer.ms(N).randomize(M)
\end{verbatim}
\end{vindent}
Here, N and M are specified in milliseconds\index{milliseconds}, but the
full options for units are seconds (s or sec), milliseconds (ms or msec),
microseconds (us or usec), nanoseconds (ns or nsec), and hertz (hz). Randomization
is not supported for hertz timers.
The resolution of the timers depends on the target kernel. For kernels prior
to 2.6.17, timers are limited to jiffies resolution, so intervals are rounded
up to the nearest jiffies interval. After 2.6.17, the implementation uses
hrtimers for greater precision, though the resulting resolution will be dependent
upon architecture. In either case, if the randomize component is given, then
the random value will be added to the interval before any rounding occurs.
Profiling timers are available to provide probes that execute on all CPUs
at each system tick. This probe takes no parameters, as follows.
\begin{vindent}
\begin{verbatim}
timer.profile.tick
\end{verbatim}
\end{vindent}
Full context information of the interrupted process is available, making
this probe suitable for implementing a time-based sampling profiler.
It is recommended to use the tapset probe \verb+timer.profile+ rather
than \verb+timer.profile.tick+. This probe point behaves identically
to \verb+timer.profile.tick+ when the underlying functionality is
available, and falls back to using \verb+perf.sw.cpu_clock+ on some
recent kernels which lack the corresponding profile timer facility.
The following is an example of timer usage.
\begin{vindent}
\begin{verbatim}
# Refers to a periodic interrupt, every 1000 jiffies:
timer.jiffies(1000)
# Fires every 5 seconds:
timer.sec(5)
# Refers to a periodic interrupt, every 1000 +/- 200 jiffies:
timer.jiffies(1000).randomize(200)
\end{verbatim}
\end{vindent}
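The probe points above can be combined into a complete script. The
following sketch counts how often a randomized millisecond timer fires
and reports the total after five seconds; the variable name and the
intervals are arbitrary choices for illustration.
\begin{vindent}
\begin{verbatim}
global ticks

# Fires every 100 +/- 20 milliseconds.
probe timer.ms(100).randomize(20) { ticks++ }

# Reports the count after 5 seconds and ends the session.
probe timer.sec(5) {
  printf("%d randomized ticks in 5 seconds\n", ticks)
  exit()
}
\end{verbatim}
\end{vindent}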
\subsection{Special probe points}
The probe points \texttt{begin} and \texttt{end} are defined by the translator
to refer to the time of session startup and shutdown. There are no target
variables available in either context.
\subsubsection{begin}
\index{begin}
The \texttt{begin} probe is the start of the SystemTap session.
All \texttt{begin}
probe handlers are run during the startup of the session.
\subsubsection{end}
\index{end}
The \texttt{end} probe is the end of the SystemTap session. All \texttt{end}
probes are run during the normal shutdown of a session, such as in the aftermath
of a SystemTap \texttt{exit} function call, or an interruption from the user.
In the case of a shutdown triggered by error, \texttt{end} probes are not run.
\subsubsection{error}
\index{error}
The \emph{error} probe point is similar to the end
probe, except the probe handler runs when the session ends if an error
occurred. In this case, an \texttt{end} probe is skipped, but each
\texttt{error} probe is still attempted. You can use an
\texttt{error} probe to clean up or perform a final action on script
termination.
Here is a simple example:
\begin{vindent}
\begin{verbatim}
probe error {
  println ("Oops, errors occurred. Here's a report anyway.")
  foreach (coin in mint) { println (coin) }
}
\end{verbatim}
\end{vindent}
\subsubsection{begin, end, and error probe sequence}
\index{probe sequence}
\texttt{begin}, \texttt{end}, and \texttt{error} probes can be
specified with an optional sequence number that controls the order in
which they are run. If no sequence number is provided, the sequence
number defaults to zero and probes are run in the order that they
occur in the script file. Sequence numbers may be either positive or
negative, and are especially useful for tapset writers who want to do
initialization in a \texttt{begin} probe. The following are examples.
\begin{vindent}
\begin{verbatim}
# In a tapset file:
probe begin(-1000) { ... }
# In a user script:
probe begin { ... }
\end{verbatim}
\end{vindent}
The user script \texttt{begin} probe defaults to sequence number zero, so
the tapset \texttt{begin} probe will run first.
\subsubsection{never}
\index{never}
The \texttt{never} probe point is defined by the translator to mean \emph{never}.
Its statements are analyzed for symbol and type correctness, but its probe
handler is never run. This probe point may be useful in conjunction with
optional probes. See Section~\ref{sub:Optional-probe-points}.
\section{Language elements\label{sec:Language-Elements}}
\subsection{Identifiers}
\index{identifiers}
\emph{Identifiers} are used to name variables and functions. They are an
alphanumeric sequence that may include the underscore (\_) and dollar sign
(\$) characters. They have the same syntax as C identifiers, except that
the dollar sign is also a legal character. Identifiers that begin with a
dollar sign are interpreted as references to variables in the target software,
rather than to SystemTap script variables. Identifiers may not start with
a plain digit.
\subsection{Data types\label{sub:Data-types}}
\index{data types}
The SystemTap language includes a small number of data types, but no type
declarations. A variable's type is inferred\index{inference} from its use.
To support this inference, the translator enforces consistent typing of function
arguments and return values, array indices and values. There are no implicit
type conversions between strings and numbers. Inconsistent type-related use
of an identifier signals an error.
\subsubsection{Literals}
\index{literals}
Literals are either strings or integers.
Literal integers can be expressed as decimal,
octal, or hexadecimal, using C notation. Type suffixes (e.g., \emph{L} or
\emph{U}) are not used.
\subsubsection{Integers\label{sub:Integers}}
\index{integers} \index{numbers}
Integers are decimal, hexadecimal, or octal, and use the same notation as
in C. Integers are 64-bit signed quantities, although the parser also accepts
(and wraps around) values above positive $2^{63}$ but below $2^{64}$.
\subsubsection{Strings\label{sub:Strings}}
\index{strings}
Strings are enclosed in double quotation marks (\texttt{"string"}), and pass
through standard C escape codes with backslashes. A string literal may
be split into several pieces, which are glued together, as follows.
\begin{vindent}
\begin{verbatim}
str1 = "foo" "bar"
/* --> becomes "foobar" */
str2 = "a good way to do a multi-line\n"
"string literal"
/* --> becomes "a good way to do a multi-line\nstring literal" */
str3 = "also a good way to " @1 " splice command line args"
/* --> becomes "also a good way to foo splice command line args",
assuming @1 is given as foo on the command line */
\end{verbatim}
\end{vindent}
Observe that script arguments can also be glued into a string literal.
Strings are limited in length to MAXSTRINGLEN. For more information
about this and other limits, see Section~\ref{sub:SystemTap-safety}.
\subsubsection{Associative arrays}
See Section~\ref{sec:Associative-Arrays}
\subsubsection{Statistics}
See Section~\ref{sec:Statistics}
\subsection{Semicolons}
\index{;}
The semicolon is the null statement, or do nothing statement. It is optional,
and useful as a separator between statements to improve detection of syntax
errors and to reduce ambiguities in grammar.
\subsection{Comments}
\index{comments}
Three forms of comments are supported, as follows.
\begin{vindent}
\begin{verbatim}
# ... shell style, to the end of line
// ... C++ style, to the end of line
/* ... C style ... */
\end{verbatim}
\end{vindent}
\subsection{Whitespace}
\index{whitespace}
As in C, spaces, tabs, returns, newlines, and comments are treated as whitespace.
Whitespace is ignored by the parser.
\subsection{Expressions}
\index{expressions}
SystemTap supports a number of operators that use the same general syntax,
semantics, and precedence as in C and awk. Arithmetic is performed per C
rules for signed integers. If the parser detects division by zero or an overflow,
it generates an error. The following subsections list these operators.
\subsubsection{Binary numeric operators}
\index{binary}
\texttt{{*} / \% + - >\,{}> >\,{}>\,{}> <\,{}< \& \textasciicircum{}
| \&\& ||}
\subsubsection{Binary string operators}
\index{binary}
\texttt{\textbf{.}} (string concatenation)
\subsubsection{Numeric assignment operators}
\index{numeric}
\texttt{= {*}= /= \%= += -= >\,{}>= <\,{}<=
\&= \textasciicircum{}= |=}
\subsubsection{String assignment operators}
\texttt{= .=}
\subsubsection{Unary numeric operators}
\index{unary}
\texttt{+ - ! \textasciitilde{} ++ -{}-}
\subsubsection{Numeric \& string comparison, regular expression matching operators}
\index{comparison}
\texttt{< > <= >= == !=} \verb+=~+ \verb+!~+
The \verb+=~+ and \verb+!~+ operators perform regular expression
matching. The second operand must be a string literal
containing a syntactically valid regular expression. The \verb+=~+
operator returns \texttt{1} on a successful match and \texttt{0} on a
failed match. The \verb+!~+ operator returns \texttt{1} on a failed
match and \texttt{0} on a successful match.
The regular expression syntax supports most of the features of POSIX Extended
Regular Expressions, except for subexpression reuse (\verb+\1+)
functionality. After a successful match, the matched substring and subexpressions can be extracted using the \texttt{matched} tapset
function. The \texttt{ngroups} tapset function returns the number of
subexpressions in the last successfully matched regular expression.
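The following sketch shows a match followed by extraction of two
subexpressions. The input string and the pattern are made up for
illustration, and the sketch assumes that \texttt{matched(1)} and
\texttt{matched(2)} return the first and second parenthesized
subexpressions.
\begin{vindent}
\begin{verbatim}
probe begin {
  line = "eth0: link up"
  if (line =~ "^(eth[0-9]+): (.*)$") {
    printf("interface=%s status=%s\n", matched(1), matched(2))
  }
  exit()
}
\end{verbatim}
\end{vindent}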
\subsubsection{Ternary operator\label{sub:Ternary-operator}}
\index{?}
\texttt{cond ? exp1 : exp2}
\subsubsection{Grouping operator}
\index{grouping}
\texttt{( exp )}
\subsubsection{Function call}
\index{fn}
General syntax:
\texttt{fn ({[} arg1, arg2, ... ])}
\subsubsection{\$ptr-\textgreater member}
\index{pointer}
\texttt{ptr} is a kernel pointer available in a probed context.
\subsubsection{Pointer typecasting}
\index{Pointer typecasting}
\emph{Typecasting} is supported using the \texttt{@cast()} operator. A
script can define a pointer type for a \emph{long} value, then access
type members using the same syntax as with \texttt{\$target}
variables. After a pointer is saved into a script integer variable,
the translator loses the necessary type information to access members
from that pointer. The \texttt{@cast()} operator tells the translator
how to read a pointer.
The following statement interprets \texttt{p} as a pointer to a struct
or union named \texttt{type\_name} and dereferences the
\texttt{member} value:
\begin{vindent}
\begin{verbatim}
@cast(p, "type_name"[, "module"])->member
\end{verbatim}
\end{vindent}
The optional \texttt{module} parameter tells the translator where to
look for information about that type. You can specify multiple modules
as a list with colon (\texttt{:}) separators. If you do not specify
the module parameter, the translator defaults to either the probe
module for dwarf probes or to \textit{kernel} for functions and all
other probe types.
The following statement retrieves the parent PID from a kernel
task\_struct:
\begin{vindent}
\begin{verbatim}
@cast(pointer, "task_struct", "kernel")->parent->tgid
\end{verbatim}
\end{vindent}
The translator can create its own module with type information from a
header surrounded by angle brackets (\texttt{< >}) if normal debugging
information is not available. For kernel headers, prefix it with
\texttt{kernel} to use the appropriate build system. All other
headers are built with default GCC parameters into a user module. The
following statements are examples.
\begin{vindent}
\begin{verbatim}
@cast(tv, "timeval", "<sys/time.h>")->tv_sec
@cast(task, "task_struct", "kernel<linux/sched.h>")->tgid
\end{verbatim}
\end{vindent}
In guru mode, the translator allows scripts to assign new values to
members of typecasted pointers.
Typecasting is also useful in the case of \texttt{void*} members whose
type might be determinable at run time.
\begin{vindent}
\begin{verbatim}
probe foo {
if ($var->type == 1) {
value = @cast($var->data, "type1")->bar
} else {
value = @cast($var->data, "type2")->baz
}
print(value)
}
\end{verbatim}
\end{vindent}
\subsubsection{\textless value\textgreater\ in \textless array\_name\textgreater}
\index{index}
This expression evaluates to true if the array contains an element with the
specified index.
\subsubsection{{[} \textless value\textgreater, ... ] in \textless array\_name\textgreater}
The number of index values must match the number of indexes previously specified.
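As a brief sketch of a membership test on an array with two indexes,
consider the following script. The array name \texttt{seen} and its
contents are hypothetical.
\begin{vindent}
\begin{verbatim}
global seen

probe begin {
  seen["ls", 0] = 1
  # Both index values must be supplied, in declaration order.
  if (["ls", 0] in seen)
    println("(ls, 0) is present")
  if (!(["cat", 0] in seen))
    println("(cat, 0) is absent")
  exit()
}
\end{verbatim}
\end{vindent}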
\subsection{Literals passed in from the stap command line\label{sub:Literals-passed-in}}
\index{literals}
\emph{Literals} are either strings enclosed in double quotes (\texttt{"..."}) or
integers. For information about integers, see Section~\ref{sub:Integers}.
For information about strings, see Section~\ref{sub:Strings}.
Script arguments at the end of a command line are expanded as literals. You
can use these in all contexts where literals are accepted. A reference to
a nonexistent argument number is an error.
\subsubsection{\$1 \ldots{} \$\textless NN\textgreater\ for literal pasting}
\index{\$}
Use \texttt{\$1 \ldots{} \$<NN>} for pasting the entire argument string
into the input stream, which will be further lexically tokenized.
\subsubsection{@1 \ldots{} @\textless NN\textgreater\ for strings}
Use \texttt{@1 \ldots{} @<NN>} for casting an entire argument
as a string literal.
\subsubsection{Examples}
For example, if the following script named example.stp
\begin{vindent}
\begin{verbatim}
probe begin { printf("%d, %s\n", $1, @2) }
\end{verbatim}
\end{vindent}
is invoked as follows
\begin{vindent}
\begin{verbatim}
# stap example.stp '5+5' mystring
\end{verbatim}
\end{vindent}
then 5+5 is substituted for \$1 and \char`\"{}mystring\char`\"{} for @2. The
output will be
\begin{vindent}
\begin{verbatim}
10, mystring
\end{verbatim}
\end{vindent}
\subsection{Conditional compilation}
\subsubsection{Conditions}
\index{conditions}
One of the steps of parsing is a simple preprocessing stage. The
preprocessor supports conditionals with a general form similar to the
ternary operator (Section~\ref{sub:Ternary-operator}).
\begin{vindent}
\begin{verbatim}
%( CONDITION %? TRUE-TOKENS %)
%( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
\end{verbatim}
\end{vindent}
The CONDITION is a limited expression whose format is determined by its first
keyword. The following is the general syntax.
\begin{vindent}
\begin{verbatim}
%( <condition> %? <code> [ %: <code> ] %)
\end{verbatim}
\end{vindent}
\subsubsection{Conditions based on available target variables}
\index{defined target variable}
The predicate \texttt{@defined()} is available for testing whether a
particular \$variable/expression is resolvable at translation time. The
following is an example of its use:
\begin{vindent}
\begin{verbatim}
probe foo { if (@defined($bar)) log ("$bar is available here") }
\end{verbatim}
\end{vindent}
\subsubsection{Conditions based on kernel version: kernel\_v, kernel\_vr}
\index{kernel version}
\index{kernel\_vr}
\index{kernel\_v}
If the first part of a conditional expression is the identifier \texttt{kernel\_v}
or \texttt{kernel\_vr}, the second part must be one of six standard numeric
comparison operators {}``\textless'', {}``\textless ='', {}``=='', {}``!='', {}``\textgreater'',
or {}``\textgreater ='',
and the third part must be a string literal that contains an RPM-style version-release
value. The condition returns true if the version of the target kernel (as
optionally overridden by the \textbf{-r} option) compares to the given
version string as the operator specifies. The comparison is performed by
the glibc function strverscmp.
\texttt{kernel\_v} refers to the kernel version number only, such as {}``2.6.13''.
\texttt{kernel\_vr} refers to the kernel version number including the release
code suffix, such as {}``2.6.13-1.322FC3smp''.
\subsubsection{Conditions based on architecture: arch}
\index{arch}
If the first part of the conditional expression is the identifier \texttt{arch}
which refers to the processor architecture, then the second part is a string
comparison operator {}``=='' or {}``!='', and the third part is a string
literal for matching it. This comparison is a simple string equality or inequality.
The currently supported architecture strings are i386, i686, x86\_64, ia64,
s390, and powerpc.
\subsubsection{Conditions based on privilege level: systemtap\_privilege}
\index{systemtap\_privilege}
If the first part of the conditional expression is the identifier
\texttt{systemtap\_privilege} which refers to the privilege level the
systemtap script is being compiled with, then the second part is a
string comparison operator {}``=='' or {}``!='', and the third part is a
string literal for matching it. This comparison is a simple string
equality or inequality. The possible privilege strings to consider
are \verb+"stapusr"+ for unprivileged scripts, and \verb+"stapsys"+ or
\verb+"stapdev"+ for privileged scripts. (In general, to test for a
privileged script it is best to use \verb+!= "stapusr"+.)
This condition can be used to write scripts that can be run in both
privileged and unprivileged modes, with additional functionality made
available in the privileged case.
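A minimal sketch of this pattern follows. The messages are placeholders for
whatever additional functionality a real script would enable.
\begin{vindent}
\begin{verbatim}
probe begin {
%( systemtap_privilege != "stapusr" %?
   println("privileged mode: extra functionality enabled")
%:
   println("unprivileged mode: basic functionality only")
%)
   exit()
}
\end{verbatim}
\end{vindent}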
\subsubsection{True and False Tokens}
\index{tokens}
TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens, possibly
including nested preprocessor conditionals, that are pasted into the input
stream if the condition is true or false. For example, the following code
induces a parse error unless the target kernel version is newer than 2.6.5.
\begin{vindent}
\begin{verbatim}
%( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
\end{verbatim}
\end{vindent}
The following code adapts to hypothetical kernel version drift.
\begin{vindent}
\begin{verbatim}
probe kernel.function (
%( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
%( kernel_vr == "2.6.13-1.8273FC3smp" %? "do_page_fault" %: UNSUPPORTED %)
%)) { /* ... */ }
%( arch == "ia64" %?
probe syscall.vliw = kernel.function("vliw_widget") {}
%)
\end{verbatim}
\end{vindent}
The following code adapts to the presence of a kernel CONFIG option.
\begin{vindent}
\begin{verbatim}
%( CONFIG_UPROBE == "y" %?
probe process.syscall {}
%)
\end{verbatim}
\end{vindent}
\subsection{Preprocessor macros}
This feature lets scripts eliminate some types of repetition.
\subsubsection{Local macros}
The preprocessor also supports a simple macro facility.
Macros taking zero or more arguments are defined using the following
construct:
\begin{vindent}
\begin{verbatim}
@define NAME %( BODY %)
@define NAME(PARAM_1, PARAM_2, ...) %( BODY %)
\end{verbatim}
\end{vindent}
Macro arguments are referred to in the body by prefixing the argument
name with an \texttt{@} symbol. Likewise, once defined, macros are
invoked by prefixing the macro name with an \texttt{@} symbol:
\begin{vindent}
\begin{verbatim}
@define foo %( x %)
@define add(a,b) %( ((@a)+(@b)) %)
@foo = @add(2,2)
\end{verbatim}
\end{vindent}
Macro expansion is currently performed in a separate pass before
conditional compilation. Therefore, both TRUE- and FALSE-tokens in
conditional expressions will be macroexpanded regardless of how the
condition is evaluated. This can sometimes lead to errors:
\begin{vindent}
\begin{verbatim}
// The following results in a conflict:
%( CONFIG_UPROBE == "y" %?
@define foo %( process.syscall %)
%:
@define foo %( **ERROR** %)
%)
// The following works properly as expected:
@define foo %(
%( CONFIG_UPROBE == "y" %? process.syscall %: **ERROR** %)
%)
\end{verbatim}
\end{vindent}
The first example is incorrect because both \texttt{@define}s are
evaluated in a pass prior to the conditional being evaluated.
\subsubsection{Library macros}
Normally, a macro definition is local to the file it occurs in. Thus,
defining a macro in a tapset does not make it available to the user of
the tapset.
Publicly available library macros can be defined by including
\texttt{.stpm} files on the tapset search path. These files may only
contain \texttt{@define} constructs, which become visible across all
tapsets and user scripts.
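The following sketch shows how such a library macro might be laid out. The
file name \texttt{mymacros.stpm} and the macro name \texttt{percent} are
hypothetical, chosen only for illustration.
\begin{vindent}
\begin{verbatim}
// Hypothetical file mymacros.stpm, placed on the tapset search path:
@define percent(part, total) %( ((@part) * 100 / (@total)) %)

// A separate user script can then invoke the macro:
probe begin {
   println(@percent(25, 200))   // integer arithmetic: 2500 / 200
   exit()
}
\end{verbatim}
\end{vindent}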
\section{Statement types\label{sec:Statement-Types}}
Statements enable procedural control flow within functions and probe handlers.
The total number of statements executed in response to any single probe event
is limited to MAXACTION, which defaults to 1000. See Section~\ref{sub:SystemTap-safety}.
\subsection{break and continue}
\index{break}
\index{continue}
Use \texttt{break} to exit the innermost enclosing loop statement
(\texttt{while}, \texttt{for}, or \texttt{foreach}), or \texttt{continue} to skip
to its next iteration. The syntax and semantics are the same as those used in C.
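For instance, the following small sketch (the loop bounds are arbitrary)
prints the even numbers 0, 2, 4, and 6, one per line:
\begin{vindent}
\begin{verbatim}
probe begin {
   for (i = 0; i < 10; i++) {
      if (i % 2) continue   # skip odd values
      if (i > 6) break      # leave the loop once i exceeds 6
      println(i)
   }
   exit()
}
\end{verbatim}
\end{vindent}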
\subsection{try/catch}
\index{try}
\index{catch}
Use \texttt{try}/\texttt{catch} to handle most kinds of run-time errors within the script
instead of aborting the probe handler in progress. The semantics are similar
to C++ in that try/catch blocks may be nested. The error string may be captured
by optionally naming a variable which is to receive it.
\begin{vindent}
\begin{verbatim}
try {
   /* do something */
   /* trigger error like kread(0), or divide by zero, or error("foo") */
} catch (msg) { /* omit (msg) entirely if not interested */
   /* println("caught error ", msg) */
   /* handle error */
}
/* execution continues */
\end{verbatim}
\end{vindent}
\subsection{delete\label{sub:delete}}
\index{delete}
\texttt{delete} removes an element.
The following statement removes from ARRAY the element specified by the index
tuple. The value will no longer be available, and subsequent iterations will
not report the element. It is not an error to delete an element that does
not exist.
\begin{vindent}
\begin{verbatim}
delete ARRAY[INDEX1, INDEX2, ...]
\end{verbatim}
\end{vindent}
The following syntax removes all elements from ARRAY:
\begin{vindent}
\begin{verbatim}
delete ARRAY
\end{verbatim}
\end{vindent}
The following statement removes the value of SCALAR. Integers and strings
are cleared to zero and null (\char`\"{}\char`\"{}) respectively, while statistics
are reset to their initial empty state.
\begin{vindent}
\begin{verbatim}
delete SCALAR
\end{verbatim}
\end{vindent}
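The three forms can be combined in one script. The following is a sketch
only; the array name \texttt{hits} and its keys are made up for illustration.
\begin{vindent}
\begin{verbatim}
global hits
probe begin {
   hits["a"] = 1
   hits["b"] = 2
   delete hits["a"]     # remove a single element
   delete hits["zzz"]   # no such element: not an error
   foreach (k in hits)
      println(k)        # only "b" remains
   delete hits          # remove all remaining elements
   exit()
}
\end{verbatim}
\end{vindent}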
\subsection{EXP (expression)}
\index{expression}
An expression statement evaluates a string- or integer-valued expression and
discards the value.
\subsection{for}
\index{for}
General syntax:
\begin{vindent}
\begin{verbatim}
for (EXP1; EXP2; EXP3) STMT
\end{verbatim}
\end{vindent}
The \texttt{for} statement is similar to the \texttt{for} statement in C.
It evaluates EXP1 once as initialization. While EXP2 is
non-zero, it executes STMT, then evaluates the iteration expression EXP3.
\subsection{foreach\label{sub:foreach}}
\index{foreach}
General syntax:
\begin{vindent}
\begin{verbatim}
foreach (VAR in ARRAY) STMT
\end{verbatim}
\end{vindent}
The \texttt{foreach} statement loops over each element of a named global array, assigning
the current key to VAR. The array must not be modified within the statement.
If you add a single plus (+) or minus (-) operator after the VAR or the ARRAY
identifier, the iteration order will be sorted by the ascending or descending
index or value.
The following statement behaves the same as the first example, except it
is used when an array is indexed with a tuple of keys. Use a sorting suffix
on at most one VAR or ARRAY identifier.
\begin{vindent}
\begin{verbatim}
foreach ([VAR1, VAR2, ...] in ARRAY) STMT
\end{verbatim}
\end{vindent}
You can combine the first and second syntax to capture both the element value
(in VAR) and the keys at the same time as follows.
\begin{vindent}
\begin{verbatim}
foreach (VAR = [VAR1, VAR2, ...] in ARRAY) STMT
\end{verbatim}
\end{vindent}
The following statement is the same as the first example, except that the
\texttt{limit} keyword limits the number of loop iterations to EXP times.
EXP is evaluated once at the beginning of the loop.
\begin{vindent}
\begin{verbatim}
foreach (VAR in ARRAY limit EXP) STMT
\end{verbatim}
\end{vindent}
\subsection{if}
\index{if}
General syntax:
\begin{vindent}
\begin{verbatim}
if (EXP) STMT1 [ else STMT2 ]
\end{verbatim}
\end{vindent}
The \texttt{if} statement compares an integer-valued EXP to zero. It executes
the first STMT if non-zero, or the second STMT if zero.
The \texttt{if} statement has the same syntax and semantics as used in C.
\subsection{next}
\index{next}
The \texttt{next} statement returns immediately from the enclosing probe
handler. When used in functions, the execution will be immediately transferred
to the next overloaded function.
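A common use in probe handlers is to filter out uninteresting events early.
The sketch below assumes the \texttt{syscall.read} probe alias and its
\texttt{count} variable from the standard tapset, and that the script is run
with a target process (for example via the \textbf{-x} or \textbf{-c} option).
\begin{vindent}
\begin{verbatim}
probe syscall.read {
   if (pid() != target()) next   # ignore all other processes
   printf("%s requested %d bytes\n", execname(), count)
}
\end{verbatim}
\end{vindent}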
\subsection{; (null statement)}
\index{;}
\index{null statement}
General syntax:
\begin{vindent}
\begin{verbatim}
statement1
;
statement2
\end{verbatim}
\end{vindent}
The semicolon represents the null statement, or do nothing. It is useful
as an optional separator between statements to improve syntax error detection
and to handle certain grammar ambiguities.
\subsection{return}
\index{return}
General syntax:
\begin{vindent}
\begin{verbatim}
return EXP
\end{verbatim}
\end{vindent}
The \texttt{return} statement returns the EXP value from the enclosing function.
If the value of the function is not returned, then a return statement is
not needed, and the function will have a special \emph{unknown} type with
no return value.
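As a small illustration, the following hypothetical helper function
(\texttt{clamp\_value} is not part of any tapset) returns from three
different places:
\begin{vindent}
\begin{verbatim}
function clamp_value:long (val:long, lo:long, hi:long) {
   if (val < lo) return lo
   if (val > hi) return hi
   return val
}
probe begin {
   println(clamp_value(42, 0, 10))   # prints 10
   exit()
}
\end{verbatim}
\end{vindent}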
\subsection{\{ \} (statement block)}
\index{\{ \}}
\index{statement block}
This is the statement block with zero or more statements enclosed within
braces. The following is the general syntax:
\begin{vindent}
\begin{verbatim}
{ STMT1 STMT2 ... }
\end{verbatim}
\end{vindent}
The statement block executes each statement in sequence in the block. Separators
or terminators are generally not necessary between statements. The statement
block uses the same syntax and semantics as in C.
\subsection{while}
\index{while}
General syntax:
\begin{vindent}
\begin{verbatim}
while (EXP) STMT
\end{verbatim}
\end{vindent}
The \texttt{while} statement uses the same syntax and semantics as in C.
While the integer-valued EXP evaluates to non-zero,
STMT is executed repeatedly.
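The following sketch, with an arbitrary starting value, counts how many
integer halvings reduce 1000 to 1:
\begin{vindent}
\begin{verbatim}
probe begin {
   n = 1000
   halvings = 0
   while (n > 1) {
      n = n / 2      # integer division
      halvings++
   }
   printf("%d halvings\n", halvings)   # prints "9 halvings"
   exit()
}
\end{verbatim}
\end{vindent}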
\section{Associative arrays\label{sec:Associative-Arrays}}
\index{associative arrays}
Associative arrays are implemented as hash tables with a maximum size set
at startup. Associative arrays are too large to be created dynamically for
individual probe handler runs, so they must be declared as global. The basic
operations for arrays are setting and looking up elements. These operations
are expressed in awk syntax: the array name followed by an opening bracket
({[}), a comma-separated list of up to nine index expressions, and
a closing bracket (]). Each index expression may be a string or a number,
as long as it is consistently typed throughout the script.
\subsection{Examples}
\begin{vindent}
\begin{verbatim}
# Increment the named array slot:
foo [4,"hello"] ++
# Update a statistic:
processusage [uid(),execname()] ++
# Set a timestamp reference point:
times [tid()] = get_cycles()
# Compute a timestamp delta:
delta = get_cycles() - times [tid()]
\end{verbatim}
\end{vindent}
\subsection{Types of values}
Array elements may be set to a number, a string, or an aggregate.
The type must be consistent
throughout the use of the array. The first assignment to the array defines
the type of the elements. Unset array elements may be fetched and return
a null value (zero or empty string) as appropriate, but they are not seen
by a membership test.
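The difference between fetching an unset element and testing for membership
can be sketched as follows (the array name \texttt{seen} and its keys are
illustrative only):
\begin{vindent}
\begin{verbatim}
global seen
probe begin {
   seen["x"] = 1
   printf("%d\n", seen["y"])         # unset element reads as 0
   if ("y" in seen)
      println("y is a member")
   else
      println("y is not a member")   # this branch is taken
   exit()
}
\end{verbatim}
\end{vindent}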
\subsection{Array capacity}
Array sizes can be specified explicitly or allowed to default to the maximum
size as defined by MAXMAPENTRIES. See Section~\ref{sub:SystemTap-safety}
for details on changing MAXMAPENTRIES.
You can explicitly specify the size of an array as follows:
\begin{vindent}
\begin{verbatim}
global ARRAY[<size>]
\end{verbatim}
\end{vindent}
If you do not specify the size parameter, then the array is created to hold
MAXMAPENTRIES number of elements.
\subsection{Array wrapping\label{sub:Array-Wrapping}}
Arrays may be wrapped using the percentage symbol (\%) causing previously entered
elements to be overwritten if more elements are inserted than the array can
hold. This works for both regular and statistics typed arrays.
You can mark arrays for wrapping as follows:
\begin{vindent}
\begin{verbatim}
global ARRAY1%[<size>], ARRAY2%
\end{verbatim}
\end{vindent}
\subsection{Iteration, foreach}
\index{foreach}
Like awk, SystemTap's \texttt{foreach} creates a loop that iterates over the key tuples
of an array, not only its values. The iteration may be sorted by any single key
or by value by adding a plus symbol (+) or minus symbol (-) after the
corresponding identifier, and may be limited to only a few elements with the \texttt{limit} keyword.
The following are examples.
\begin{vindent}
\begin{verbatim}
# Simple loop in arbitrary sequence:
foreach ([a,b] in foo)
fuss_with(foo[a,b])
# Loop in increasing sequence of value:
foreach ([a,b] in foo+) { ... }
# Loop in decreasing sequence of first key:
foreach ([a-,b] in foo) { ... }
# Print the first 10 tuples and values in the array in decreasing sequence
foreach (v = [i,j] in foo- limit 10)
printf("foo[%d,%s] = %d\n", i, j, v)
\end{verbatim}
\end{vindent}
The \texttt{break} and \texttt{continue} statements also work inside foreach
loops. Since arrays can be large but probe handlers must execute quickly,
you should write scripts that exit iteration early, if possible. For simplicity,
SystemTap forbids any modification of an array during iteration with a foreach.
For a full description of \texttt{foreach} see subsection \ref{sub:foreach}.
\subsection{Deletion}
\index{delete}
The \texttt{delete} statement can either remove a single element by index from
an array or clear an entire array at once. See subsection \ref{sub:delete} for
details and examples.
\section{Statistics (aggregates)\label{sec:Statistics}}
\index{aggregates}
Aggregate instances are used to collect statistics on numerical values, when
it is important to accumulate new data quickly and in large volume. These
instances operate without exclusive locks, and store only aggregated stream
statistics. Aggregates make sense only for global variables. They are stored
individually or as elements of an associative array. For information about
wrapping associative arrays with statistics elements, see section~\ref{sub:Array-Wrapping}.
\subsection{The aggregation (\textless\hspace{1 sp}\textless\hspace{1 sp}\textless) operator}
\index{\textless\hspace{1 sp}\textless\hspace{1 sp}\textless}
The aggregation operator is {}``\textless\hspace{1 sp}\textless\hspace{1 sp}\textless'',
and its effect is similar to an assignment or a C++ output streaming operation.
The left operand specifies a scalar or array-index \emph{l-value}, which
must be declared global. The right operand is a numeric expression. The meaning
is intuitive: add the given number as a sample to the set of numbers to compute their
statistics. The specific list of statistics to gather is given separately
by the extraction functions. The following is an example.
\begin{vindent}
\begin{verbatim}
a <<< delta_timestamp
writes[execname()] <<< count
\end{verbatim}
\end{vindent}
\subsection{Extraction functions}
\index{extraction}
For each instance of a distinct extraction function operating on a given
identifier, the translator computes a set of statistics. With each execution
of an extraction function, the aggregation is computed for that moment across
all processors. The first argument of each function is the same style of
l-value as used on the left side of the aggregation operation.
\subsection{Integer extractors}
The following functions extract information about an aggregate.
\subsubsection{@count(s)}
\index{count}
This function returns the number of samples accumulated in aggregate s.
\subsubsection{@sum(s)}
\index{sum}
This function returns the total sum of all samples in aggregate s.
\subsubsection{@min(s)}
\index{min}
This function returns the minimum of all samples in aggregate s.
\subsubsection{@max(s)}
\index{max}
This function returns the maximum of all samples in aggregate s.
\subsubsection{@avg(s)}
\index{avg}
This function returns the average value of all samples in aggregate s.
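The integer extractors can be exercised together in a short sketch. The
sample values below are arbitrary.
\begin{vindent}
\begin{verbatim}
global sizes
probe begin {
   sizes <<< 10
   sizes <<< 20
   sizes <<< 60
   # prints: count=3 sum=90 min=10 max=60 avg=30
   printf("count=%d sum=%d min=%d max=%d avg=%d\n",
          @count(sizes), @sum(sizes), @min(sizes),
          @max(sizes), @avg(sizes))
   exit()
}
\end{verbatim}
\end{vindent}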
\subsection{Histogram extractors}
\index{histograms}
The following functions provide methods to extract histogram information.
Printing a histogram with the print family of functions renders a histogram
object as a tabular "ASCII art" bar chart.
\subsubsection{@hist\_linear}
\index{hist\_linear}
The expression \texttt{@hist\_linear(v,L,H,W)} represents a linear histogram
of aggregate \texttt{v},
where \emph{L} and \emph{H} represent the lower and upper end of
a range of values and \emph{W} represents the width (or size) of each bucket
within the range. The low and high values can be negative, but the overall
difference (high minus low) must be positive. The width parameter must also
be positive.
In the output, a range of consecutive empty buckets may be replaced with a tilde
(\textasciitilde{}) character. This can be controlled on the command line
with -DHIST\_ELISION=\textless\hspace{1 sp}num\textgreater\hspace{1 sp},
where \textless\hspace{1 sp}num\textgreater\hspace{1 sp} specifies how many
empty buckets at the top and bottom of the range to print.
The default is 2. A \textless\hspace{1 sp}num\textgreater\hspace{1 sp} of 0
removes all empty buckets. A negative \textless\hspace{1 sp}num\textgreater\hspace{1 sp}
disables removal.
For example, if you specify -DHIST\_ELISION=3 and the histogram has 10
consecutive empty buckets, the first 3 and last 3 empty buckets will
be printed and the middle 4 empty buckets will be represented by a
tilde (\textasciitilde{}).
The following is an example.
\begin{vindent}
\begin{verbatim}
global reads
probe netdev.receive {
reads <<< length
}
probe end {
print(@hist_linear(reads, 0, 10240, 200))
}
\end{verbatim}
\end{vindent}
This generates the following output.
\begin{samepage}
\begin{vindent}
\begin{verbatim}
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1650
200 | 8
400 | 0
600 | 0
~
1000 | 0
1200 | 0
1400 | 1
1600 | 0
1800 | 0
\end{verbatim}
\end{vindent}
\end{samepage}
This shows that 1650 network reads were of a size between 0 and 199 bytes,
8 reads were between 200 and 399 bytes, and 1 read was between
1400 and 1599 bytes. The tilde (\textasciitilde{}) character indicates
that the empty bucket for 800 to 999 bytes was elided.
Empty buckets for 2000 bytes and larger were also removed.
\subsubsection{@hist\_log}
\index{hist\_log}
The expression \texttt{@hist\_log(v)} represents a base-2 logarithmic
histogram. Empty buckets are replaced with a tilde (\textasciitilde{})
character in the same way as \texttt{@hist\_linear()} (see above).
The following is an example.
\begin{vindent}
\begin{verbatim}
global reads
probe netdev.receive {
reads <<< length
}
probe end {
print(@hist_log(reads))
}
\end{verbatim}
\end{vindent}
This generates the following output.
\begin{samepage}
\begin{vindent}
\begin{verbatim}
value |-------------------------------------------------- count
8 | 0
16 | 0
32 | 254
64 | 3
128 | 2
256 | 2
512 | 4
1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 16689
2048 | 0
4096 | 0
\end{verbatim}
\end{vindent}
\end{samepage}
\subsection{Deletion}
\index{delete}
The \texttt{delete} statement (subsection \ref{sub:delete}) applied to an
aggregate variable will reset it to the initial empty state.
\section{Formatted output}
\subsection{print}
\index{print}
General syntax:
\begin{vindent}
\begin{verbatim}
print ()
\end{verbatim}
\end{vindent}
This function prints a single value of any type.
\subsection{printf}
\index{printf}
General syntax:
\begin{vindent}
\begin{verbatim}
printf (fmt:string, ...)
\end{verbatim}
\end{vindent}
The printf function takes a formatting string as an argument, and a number
of values of corresponding types, and prints them all. The format must be a
literal string constant. The printf formatting directives are similar to those
of C, except that they are fully checked for type by the translator.
The formatting string can contain tags that are defined as follows:
\begin{vindent}
\begin{verbatim}
%[flags][width][.precision][length]specifier
\end{verbatim}
\end{vindent}
Where \texttt{specifier} is required and defines the type and the interpretation
of the value of the corresponding argument. The following table shows the
details of the specifier parameter:
\begin{table}[H]
\caption{printf specifier values}
\begin{tabular}{|>{\raggedright}p{1in}|>{\raggedright}p{3.5in}|>{\raggedright}p{1.25in}|}
\hline
\textbf{Specifier}&
\textbf{Output}&
\textbf{Example}\tabularnewline
\hline
\hline
d or i&
Signed decimal&
392\tabularnewline
\hline
o&
Unsigned octal&
610\tabularnewline
\hline
s&
String&
sample\tabularnewline
\hline
u&
Unsigned decimal&
7235\tabularnewline
\hline
x&
Unsigned hexadecimal (lowercase letters)&
7fa\tabularnewline
\hline
X&
Unsigned hexadecimal (uppercase letters)&
7FA\tabularnewline
\hline
p&
Pointer address&
0x0000000000bc614e\tabularnewline
\hline
b&
Writes a binary value as text using the computer's native byte order.
The field width specifies the number of bytes
to write. Valid specifications are \%b, \%1b, \%2b, \%4b and \%8b. The default
width is 8 (64 bits).&
See below\tabularnewline
\hline
\%&
A \% followed by another \% character will write \% to stdout.&
\%\tabularnewline
\hline
\end{tabular}
\end{table}
The tag can also contain \texttt{flags}, \texttt{width}, \texttt{.precision}
and \texttt{modifiers} sub-specifiers, which are optional and follow these
specifications:
\begin{table}[H]
\caption{printf flag values}
\begin{tabular}{|>{\raggedright}p{1.5in}|>{\raggedright}p{4.5in}|}
\hline
\textbf{Flags}&
\textbf{Description}\tabularnewline
\hline
\hline
- (minus sign)&
Left-justify within the given field width. Right justification is the default
(see \texttt{width} sub-specifier).\tabularnewline
\hline
+ (plus sign)&
Precede the result with a plus or minus sign even for positive numbers. By
default, only negative numbers are preceded with a minus sign.\tabularnewline
\hline
(space)&
If no sign is going to be written, a blank space is inserted before the value.\tabularnewline
\hline
\#&
When used with the \texttt{o}, \texttt{x} or \texttt{X} specifiers, the value is preceded
with \texttt{0}, \texttt{0x} or \texttt{0X} respectively for non-zero values.\tabularnewline
\hline
0&
Left-pads the number with zeroes instead of spaces, where padding is specified
(see \texttt{width} sub-specifier).\tabularnewline
\hline
\end{tabular}
\end{table}
\begin{table}[H]
\caption{printf width values}
\begin{tabular}{|>{\raggedright}p{1.5in}|>{\raggedright}p{4.5in}|}
\hline
\textbf{Width}&
\textbf{Description}\tabularnewline
\hline
\hline
(number)&
Minimum number of characters to be printed. If the value to be printed is
shorter than this number, the result is padded with blank spaces. The value
is not truncated even if the result is larger.\tabularnewline
\hline
\end{tabular}
\end{table}
%
\begin{table}[H]
\caption{printf precision values}
\begin{tabular}{|>{\raggedright}p{1.5in}|>{\raggedright}p{4.5in}|}
\hline
\textbf{Precision}&
\textbf{Description}\tabularnewline
\hline
\hline
.number&
For integer specifiers (\texttt{d, i, o, u, x, X}): \texttt{precision} specifies
the minimum number of digits to be written. If the value to be written is
shorter than this number, the result is padded with leading zeros. The value
is not truncated even if the result is longer. A precision of 0 means that
no character is written for the value 0. For s: this is the maximum number
of characters to be printed. By default all characters are printed until
the ending null character is encountered. When no \texttt{precision} is specified,
the default is 1. If the period is specified without an explicit value for
\texttt{precision}, 0 is assumed.\tabularnewline
\hline
\end{tabular}
\end{table}
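The sub-specifiers described in the tables above can be combined in a single
format string. The following sketch uses arbitrary values; each bracketed
field exercises one feature.
\begin{vindent}
\begin{verbatim}
probe begin {
   printf("[%5d]\n", 42)            # width: right-justified in 5 columns
   printf("[%-5d]\n", 42)           # - flag: left-justified in 5 columns
   printf("[%05d]\n", 42)           # 0 flag: padded with zeroes
   printf("[%+d]\n", 42)            # + flag: sign always shown
   printf("[%#x]\n", 255)           # # flag: 0x prefix on hexadecimal
   printf("[%.3s]\n", "systemtap")  # precision: at most 3 characters
   exit()
}
\end{verbatim}
\end{vindent}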
\textbf{Binary Write Examples}
The following is an example of using the binary write functions:
\begin{vindent}
\begin{verbatim}
probe begin {
for (i = 97; i < 110; i++)
printf("%3d: %1b%1b%1b\n", i, i, i-32, i-64)
exit()
}
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
 97: aA!
 98: bB"
 99: cC#
100: dD$
101: eE%
102: fF&
103: gG'
104: hH(
105: iI)
106: jJ*
107: kK+
108: lL,
109: mM-
\end{verbatim}
\end{vindent}
Another example:
\begin{vindent}
\begin{verbatim}
stap -e 'probe begin{printf("%b%b", 0xc0dedbad, \
0x12345678);exit()}' | hexdump -C
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
00000000 ad db de c0 00 00 00 00 78 56 34 12 00 00 00 00 |........xV4.....|
00000010
\end{verbatim}
\end{vindent}
Another example:
\begin{vindent}
\begin{verbatim}
probe begin{
printf("%1b%1b%1blo %1b%1brld\n", 72,101,108,87,111)
exit()
}
\end{verbatim}
\end{vindent}
This prints:
\begin{vindent}
\begin{verbatim}
Hello World
\end{verbatim}
\end{vindent}
\subsection{printd}
\index{printd}
General syntax:
\begin{vindent}
\begin{verbatim}
printd (delimiter:string, ...)
\end{verbatim}
\end{vindent}
This function takes a string delimiter and two or more values of any type, then
prints the values with the delimiter interposed. The delimiter must be a
literal string constant.
For example:
\begin{vindent}
\begin{verbatim}
printd("/", "one", "two", "three", 4, 5, 6)
\end{verbatim}
\end{vindent}
prints:
\begin{vindent}
\begin{verbatim}
one/two/three/4/5/6
\end{verbatim}
\end{vindent}
\subsection{printdln}
\index{printdln}
General syntax:
\begin{vindent}
\begin{verbatim}
printdln (delimiter:string, ...)
\end{verbatim}
\end{vindent}
This function operates like \texttt{printd}, but also appends a newline.
\subsection{println}
\index{println}
General syntax:
\begin{vindent}
\begin{verbatim}
println ()
\end{verbatim}
\end{vindent}
This function prints a single value like \texttt{print},
but also appends a newline.
\subsection{sprint}
\index{sprint}
General syntax:
\begin{vindent}
\begin{verbatim}
sprint:string ()
\end{verbatim}
\end{vindent}
This function operates like \texttt{print}, but returns the string rather
than printing it.
\subsection{sprintf}
\index{sprintf}
General syntax:
\begin{vindent}
\begin{verbatim}
sprintf:string (fmt:string, ...)
\end{verbatim}
\end{vindent}
This function operates like \texttt{printf}, but returns the formatted string
rather than printing it.
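For example, the following sketch builds a label once and then prints it.
The label format is an arbitrary choice.
\begin{vindent}
\begin{verbatim}
probe begin {
   label = sprintf("%s[%d]", execname(), pid())
   println(label)
   exit()
}
\end{verbatim}
\end{vindent}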
\section{Tapset-defined functions\label{sec:Predefined-Functions}}
Unlike built-in functions, tapset-defined functions are implemented in tapset scripts.
These are individually documented in the \texttt{tapset::*(3stap)},
\texttt{function::*(3stap)},
and \texttt{probe::*(3stap)} man pages, and implemented under
\texttt{/usr/share/systemtap/tapset}.
\section{For Further Reference\label{sec:For-Further-Reference}}
For more information, see:
\begin{itemize}
\item The SystemTap tutorial at \url{http://sourceware.org/systemtap/tutorial/}
\item The SystemTap wiki at \url{http://sourceware.org/systemtap/wiki}
\item The SystemTap documentation page at \url{http://sourceware.org/systemtap/documentation.html}
\item From an unpacked source tarball or Git directory, the examples in the
src/examples directory, the tapsets in the src/tapset directory, and the
test scripts in the src/testsuite directory.
\item The man pages for tapsets.
For a list, run the command {}``\texttt{man -k tapset::}''.
\item The man pages for individual probe points.
For a list, run the command {}``\texttt{man -k probe::}''.
\item The man pages for individual systemtap functions.
For a list, run the command {}``\texttt{man -k function::}''.
\end {itemize}
\setcounter{secnumdepth}{0}
\newpage{}
\addcontentsline{toc}{section}{Index}
\printindex{}
\end{document}