@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007, 2009,
@c 2010, 2011, 2013 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@node Input and Output
@section Input and Output
@menu
* Ports:: The idea of the port abstraction.
* Reading:: Procedures for reading from a port.
* Writing:: Procedures for writing to a port.
* Closing:: Procedures to close a port.
* Random Access:: Moving around a random access port.
* Line/Delimited:: Read and write lines or delimited text.
* Block Reading and Writing:: Reading and writing blocks of text.
* Default Ports:: Defaults for input, output and errors.
* Port Types:: Types of port and how to make them.
* R6RS I/O Ports:: The R6RS port API.
* I/O Extensions:: Using and extending ports in C.
* BOM Handling:: Handling of Unicode byte order marks.
@end menu
@node Ports
@subsection Ports
@cindex Port
Sequential input/output in Scheme is represented by operations on a
@dfn{port}. This chapter explains the operations that Guile provides
for working with ports.

Ports are created by opening, for instance @code{open-file} for a file
(@pxref{File Ports}). Characters can be read from an input port and
written to an output port, or both on an input/output port. A port
can be closed (@pxref{Closing}) when no longer required, after which
any attempt to read or write is an error.

The formal definition of a port is very generic: an input port is
simply ``an object which can deliver characters on demand,'' and an
output port is ``an object which can accept characters.'' Because
this definition is so loose, it is easy to write functions that
simulate ports in software. @dfn{Soft ports} and @dfn{string ports}
are two interesting and powerful examples of this technique.
(@pxref{Soft Ports}, and @ref{String Ports}.)

Ports are garbage collected in the usual way (@pxref{Memory
Management}), and will be closed at that time if not already closed.
In this case any errors occurring in the close will not be reported.
Usually a program will want to explicitly close so as to be sure all
its operations have been successful. Of course if a program has
abandoned something due to an error or other condition then closing
problems are probably not of interest.

It is strongly recommended that file ports be closed explicitly when
no longer required. Most systems have limits on how many files can be
open, both on a per-process and a system-wide basis. A program that
uses many files should take care not to hit those limits. The same
applies to similar system resources such as pipes and sockets.

Note that automatic garbage collection is triggered only by memory
consumption, not by file or other resource usage, so a program cannot
rely on that to keep it away from system limits. An explicit call to
@code{gc} can of course be relied on to pick up unreferenced ports.
If program flow makes it hard to be certain when to close then this
may be an acceptable way to control resource usage.

All file access uses the ``LFS'' large file support functions when
available, so files bigger than 2 Gbytes (@math{2^31} bytes) can be
read and written on a 32-bit system.

Each port has an associated character encoding that controls how bytes
read from the port are converted to characters and strings, and how
characters and strings written to the port are converted to bytes.
When ports are created, they inherit their character encoding from the
current locale, but that can be modified after the port is created.

Currently, ports only work with @emph{non-modal} encodings. Most
encodings are non-modal, meaning that the conversion of bytes to a
string doesn't depend on its context: the same byte sequence will always
return the same string. A couple of modal encodings are in common use,
like ISO-2022-JP and ISO-2022-KR, and they are not yet supported.

Each port also has an associated conversion strategy: what to do when
a Guile character can't be converted to the port's encoded character
representation for output. There are three possible strategies: to
raise an error, to replace the character with a hex escape, or to
replace the character with a substitute character.
@rnindex input-port?
@deffn {Scheme Procedure} input-port? x
@deffnx {C Function} scm_input_port_p (x)
Return @code{#t} if @var{x} is an input port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn
@rnindex output-port?
@deffn {Scheme Procedure} output-port? x
@deffnx {C Function} scm_output_port_p (x)
Return @code{#t} if @var{x} is an output port, otherwise return
@code{#f}. Any object satisfying this predicate also satisfies
@code{port?}.
@end deffn
@deffn {Scheme Procedure} port? x
@deffnx {C Function} scm_port_p (x)
Return a boolean indicating whether @var{x} is a port.
Equivalent to @code{(or (input-port? @var{x}) (output-port?
@var{x}))}.
@end deffn
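As a brief illustration of these predicates, the following sketch
checks a few objects. It assumes only the standard
@code{current-input-port} and @code{current-output-port} procedures
(@pxref{Default Ports}) and is meant to show typical results, not to
be exhaustive.
@lisp
(input-port? (current-input-port))    ; #t
(output-port? (current-output-port))  ; #t
(port? "not a port")                  ; #f, a string is not a port
@end lisp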
@deffn {Scheme Procedure} set-port-encoding! port enc
@deffnx {C Function} scm_set_port_encoding_x (port, enc)
Sets the character encoding that will be used to interpret all port I/O.
@var{enc} is a string containing the name of an encoding. Valid
encoding names are those
@url{http://www.iana.org/assignments/character-sets, defined by IANA}.
@end deffn
@defvr {Scheme Variable} %default-port-encoding
A fluid containing @code{#f} or the name of the encoding to
be used by default for newly created ports (@pxref{Fluids and Dynamic
States}). The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
New ports are created with the encoding appropriate for the current
locale if @code{setlocale} has been called or the value specified by
this fluid otherwise.
@end defvr
@deffn {Scheme Procedure} port-encoding port
@deffnx {C Function} scm_port_encoding (port)
Returns, as a string, the character encoding that @var{port} uses to interpret
its input and output. The value @code{#f} is equivalent to @code{"ISO-8859-1"}.
@end deffn
@deffn {Scheme Procedure} set-port-conversion-strategy! port sym
@deffnx {C Function} scm_set_port_conversion_strategy_x (port, sym)
Sets the behavior of the interpreter when outputting a character that
is not representable in the port's current encoding. @var{sym} can be
one of @code{'error}, @code{'substitute}, or @code{'escape}. If it is
@code{'error}, an error will be thrown when a nonconvertible character
is encountered. If it is @code{'substitute}, then nonconvertible
characters will be replaced with approximate characters, or with
question marks if no approximately correct character is available. If
it is @code{'escape}, nonconvertible characters will appear as hex
escapes when output.

If @var{port} is an open port, the conversion error behavior
is set for that port. If it is @code{#f}, it is set as the
default behavior for any future ports that get created in
this thread.
@end deffn
@deffn {Scheme Procedure} port-conversion-strategy port
@deffnx {C Function} scm_port_conversion_strategy (port)
Returns the behavior of the port when outputting a character that is
not representable in the port's current encoding. It returns the
symbol @code{error} if unrepresentable characters should cause
exceptions, @code{substitute} if the port should try to replace
unrepresentable characters with question marks or approximate
characters, or @code{escape} if unrepresentable characters should be
converted to string escapes.
If @var{port} is @code{#f}, then the current default behavior will be
returned. New ports will have this default behavior when they are
created.
@end deffn
@deffn {Scheme Variable} %default-port-conversion-strategy
The fluid that defines the conversion strategy for newly created ports,
and for other conversion routines such as @code{scm_to_stringn},
@code{scm_from_stringn}, @code{string->pointer}, and
@code{pointer->string}.
Its value must be one of the symbols described above, with the same
semantics: @code{'error}, @code{'substitute}, or @code{'escape}.
When Guile starts, its value is @code{'substitute}.
Note that @code{(set-port-conversion-strategy! #f @var{sym})} is
equivalent to @code{(fluid-set! %default-port-conversion-strategy
@var{sym})}.
@end deffn
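The encoding and conversion-strategy procedures of this section can be
combined as in the following minimal sketch. It uses a string port
from @code{open-output-string} (@pxref{String Ports}) purely as a
convenient stand-in that needs no external file; the variable name
@code{port} is an illustrative choice.
@lisp
(define port (open-output-string))

;; Choose the encoding used to convert characters to bytes.
(set-port-encoding! port "UTF-8")

;; Ask for hex escapes when a character cannot be encoded.
(set-port-conversion-strategy! port 'escape)
(port-conversion-strategy port)         ; the symbol escape

;; Temporarily change the default strategy within this extent.
(with-fluids ((%default-port-conversion-strategy 'error))
  (port-conversion-strategy #f))        ; the symbol error
@end lisp
Binding the fluid with @code{with-fluids} keeps the change local,
whereas @code{(set-port-conversion-strategy! #f @var{sym})} alters the
default until it is set again.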
@node Reading
@subsection Reading
@cindex Reading
[Generic procedures for reading from ports.]
These procedures pertain to reading characters and strings from
ports. @xref{Scheme Read}, for reading general S-expressions from ports.
@rnindex eof-object?
@cindex End of file object
@deffn {Scheme Procedure} eof-object? x
@deffnx {C Function} scm_eof_object_p (x)
Return @code{#t} if @var{x} is an end-of-file object; otherwise
return @code{#f}.
@end deffn
@rnindex char-ready?
@deffn {Scheme Procedure} char-ready? [port]
@deffnx {C Function} scm_char_ready_p (port)
Return @code{#t} if a character is ready on input @var{port}
and return @code{#f} otherwise. If @code{char-ready?} returns
@code{#t} then the next @code{read-char} operation on
@var{port} is guaranteed not to hang. If @var{port} is a file
port at end of file then @code{char-ready?} returns @code{#t}.
@code{char-ready?} exists to make it possible for a
program to accept characters from interactive ports without
getting stuck waiting for input. Any input editors associated
with such ports must make sure that characters whose existence
has been asserted by @code{char-ready?} cannot be rubbed out.
If @code{char-ready?} were to return @code{#f} at end of file,
a port at end of file would be indistinguishable from an
interactive port that has no ready characters.
@end deffn
@rnindex read-char
@deffn {Scheme Procedure} read-char [port]
@deffnx {C Function} scm_read_char (port)
Return the next character available from @var{port}, updating
@var{port} to point to the following character. If no more
characters are available, the end-of-file object is returned.
When @var{port}'s data cannot be decoded according to its
character encoding, a @code{decoding-error} is raised and
@var{port} points past the erroneous byte sequence.
@end deffn
@deftypefn {C Function} size_t scm_c_read (SCM port, void *buffer, size_t size)
Read up to @var{size} bytes from @var{port} and store them in
@var{buffer}. The return value is the number of bytes actually read,
which can be less than @var{size} if end-of-file has been reached.
Note that this function does not update @code{port-line} and
@code{port-column} below.
@end deftypefn
@rnindex peek-char
@deffn {Scheme Procedure} peek-char [port]
@deffnx {C Function} scm_peek_char (port)
Return the next character available from @var{port},
@emph{without} updating @var{port} to point to the following
character. If no more characters are available, the
end-of-file object is returned.
The value returned by
a call to @code{peek-char} is the same as the value that would
have been returned by a call to @code{read-char} on the same
port. The only difference is that the very next call to
@code{read-char} or @code{peek-char} on that @var{port} will
return the value returned by the preceding call to
@code{peek-char}. In particular, a call to @code{peek-char} on
an interactive port will hang waiting for input whenever a call
to @code{read-char} would have hung.
As for @code{read-char}, a @code{decoding-error} may be raised
if such a situation occurs. However, unlike with @code{read-char},
@var{port} still points at the beginning of the erroneous byte
sequence when the error is raised.
@end deffn
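The difference between @code{peek-char} and @code{read-char} can be
seen in this small sketch. It reads from a string port created with
@code{open-input-string} (@pxref{String Ports}) so that it needs no
external file.
@lisp
(define p (open-input-string "ab"))

(peek-char p)                 ; #\a, port position unchanged
(read-char p)                 ; #\a again, now consumed
(read-char p)                 ; #\b
(eof-object? (peek-char p))   ; #t, no characters left
(eof-object? (read-char p))   ; #t
@end lisp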
@deffn {Scheme Procedure} unread-char cobj [port]
@deffnx {C Function} scm_unread_char (cobj, port)
Place character @var{cobj} in @var{port} so that it will be read by the
next read operation. If called multiple times, the unread characters
will be read again in last-in first-out order. If @var{port} is
not supplied, the current input port is used.
@end deffn
@deffn {Scheme Procedure} unread-string str port
@deffnx {C Function} scm_unread_string (str, port)
Place the string @var{str} in @var{port} so that its characters will
be read from left-to-right as the next characters from @var{port}
during subsequent read operations. If called multiple times, the
unread characters will be read again in last-in first-out order. If
@var{port} is not supplied, the @code{current-input-port} is used.
@end deffn
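The last-in first-out behavior of the two unread procedures is sketched
below, again on a string port (@pxref{String Ports}). The helper
@code{read-all} is a hypothetical name defined here for illustration;
it is not part of Guile.
@lisp
(define p (open-input-string "cd"))

;; Push back a character, then a string.  The most recent
;; pushback is read first.
(unread-char #\b p)
(unread-string "xy" p)

;; Collect every remaining character into a string.
(define (read-all port)
  (let loop ((chars '()))
    (let ((c (read-char port)))
      (if (eof-object? c)
          (list->string (reverse chars))
          (loop (cons c chars))))))

(read-all p)                  ; "xybcd"
@end lisp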
@deffn {Scheme Procedure} drain-input port
@deffnx {C Function} scm_drain_input (port)
This procedure clears a port's input buffers, similar
to the way that @code{force-output} clears the output buffer. The
contents of the buffers are returned as a single string, e.g.,
@lisp
(define p (open-input-file ...))
(drain-input p) => empty string, nothing buffered yet.
(unread-char (read-char p) p)
(drain-input p) => initial chars from p, up to the buffer size.
@end lisp
Draining the buffers may be useful for cleanly finishing
buffered I/O so that the file descriptor can be used directly
for further input.
@end deffn
@deffn {Scheme Procedure} port-column port
@deffnx {Scheme Procedure} port-line port
@deffnx {C Function} scm_port_column (port)
@deffnx {C Function} scm_port_line (port)
Return the current column number or line number of @var{port}.
If the number is unknown, the result is @code{#f}. Otherwise, the
result is a 0-origin integer, i.e.@: the first character of the first
line is line 0, column 0.
(However, when you display a file position, for example in an error
message, we recommend you add 1 to get 1-origin integers. This is
because lines and column numbers traditionally start with 1, and that is
what non-programmers will find most natural.)
@end deffn
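The 0-origin convention and the recommended 1-origin display can be
sketched as follows. The example reads from a string port
(@pxref{String Ports}); @code{port-position-string} is a hypothetical
helper defined here, not a Guile procedure.
@lisp
(define p (open-input-string "ab\ncd"))

;; Consume "ab" and the newline that follows.
(read-char p)
(read-char p)
(read-char p)

(port-line p)                 ; 1, the second line
(port-column p)               ; 0, its first column

;; Format a position with 1-origin numbers for display.
(define (port-position-string port)
  (simple-format #f "~A:~A"
                 (+ 1 (port-line port))
                 (+ 1 (port-column port))))

(port-position-string p)      ; "2:1"
@end lisp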
@deffn {Scheme Procedure} set-port-column! port column
@deffnx {Scheme Procedure} set-port-line! port line
@deffnx {C Function} scm_set_port_column_x (port, column)
@deffnx {C Function} scm_set_port_line_x (port, line)
Set the current column or line number of @var{port}.
@end deffn
@node Writing
@subsection Writing
@cindex Writing
[Generic procedures for writing to ports.]
These procedures are for writing characters and strings to
ports. @xref{Scheme Write}, for more information on writing arbitrary
Scheme objects to ports.
@deffn {Scheme Procedure} get-print-state port
@deffnx {C Function} scm_get_print_state (port)
Return the print state of the port @var{port}. If @var{port}
has no associated print state, @code{#f} is returned.
@end deffn
@rnindex newline
@deffn {Scheme Procedure} newline [port]
@deffnx {C Function} scm_newline (port)
Send a newline to @var{port}.
If @var{port} is omitted, send to the current output port.
@end deffn
@deffn {Scheme Procedure} port-with-print-state port [pstate]
@deffnx {C Function} scm_port_with_print_state (port, pstate)
Create a new port which behaves like @var{port}, but with an
included print state @var{pstate}. @var{pstate} is optional.
If @var{pstate} isn't supplied and @var{port} already has
a print state, the old print state is reused.
@end deffn
@deffn {Scheme Procedure} simple-format destination message . args
@deffnx {C Function} scm_simple_format (destination, message, args)
Write @var{message} to @var{destination}, defaulting to
the current output port.
@var{message} can contain @code{~A} (was @code{%s}) and
@code{~S} (was @code{%S}) escapes. When printed,
the escapes are replaced with corresponding members of
@var{args}:
@code{~A} formats using @code{display} and @code{~S} formats
using @code{write}.
If @var{destination} is @code{#t}, then use the current output
port; if @var{destination} is @code{#f}, then return a string
containing the formatted text. Does not add a trailing newline.
@end deffn
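A short sketch of the two escapes and the two special destinations
follows. The message texts and the file name @code{"data.bin"} are
made-up values used only for illustration.
@lisp
;; With #f as destination the formatted text is returned as a
;; string.  ~A prints like display, ~S like write.
(simple-format #f "~A and ~S" "x" "y")
;; returns the string: x and "y"

;; With #t the text goes to the current output port.  No newline
;; is added, so one is written explicitly.
(simple-format #t "read ~A bytes from ~S" 42 "data.bin")
(newline)
@end lisp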
@rnindex write-char
@deffn {Scheme Procedure} write-char chr [port]
@deffnx {C Function} scm_write_char (chr, port)
Send character @var{chr} to @var{port}.
@end deffn
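As a small sketch, @code{write-char} and @code{newline} can be
exercised on a string port, which makes the result easy to inspect.
The example assumes @code{call-with-output-string}
(@pxref{String Ports}), which returns everything written to the port
it supplies.
@lisp
(call-with-output-string
  (lambda (port)
    (write-char #\a port)
    (write-char #\b port)
    (newline port)))
;; returns "ab\n"
@end lisp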
@deftypefn {C Function} void scm_c_write (SCM port, const void *buffer, size_t size)
Write @var{size} bytes at @var{buffer} to @var{port}.
Note that this function does not update @code{port-line} and
@code{port-column} (@pxref{Reading}).
@end deftypefn
@findex fflush
@deffn {Scheme Procedure} force-output [port]
@deffnx {C Function} scm_force_output (port)
Flush the specified output port, or the current output port if @var{port}
is omitted. The current output buffer contents are passed to the
underlying port implementation (e.g., in the case of fports, the
data will be written to the file and the output buffer will be cleared).
It has no effect on an unbuffered port.
The return value is unspecified.
@end deffn
@deffn {Scheme Procedure} flush-all-ports
@deffnx {C Function} scm_flush_all_ports ()
Equivalent to calling @code{force-output} on
all open output ports. The return value is unspecified.
@end deffn
@node Closing
@subsection Closing
@cindex Closing ports
@cindex Port, close
@deffn {Scheme Procedure} close-port port
@deffnx {C Function} scm_close_port (port)
Close the specified port object. Return @code{#t} if it
successfully closes a port or @code{#f} if it was already
closed. An exception may be raised if an error occurs, for
example when flushing buffered output. See also @ref{Ports and
File Descriptors, close}, for a procedure which can close file
descriptors.
@end deffn
@deffn {Scheme Procedure} close-input-port port
@deffnx {Scheme Procedure} close-output-port port
@deffnx {C Function} scm_close_input_port (port)
@deffnx {C Function} scm_close_output_port (port)
@rnindex close-input-port
@rnindex close-output-port
Close the specified input or output @var{port}. An exception may be
raised if an error occurs while closing. If @var{port} is already
closed, nothing is done. The return value is unspecified.
See also @ref{Ports and File Descriptors, close}, for a procedure
which can close file descriptors.
@end deffn
@deffn {Scheme Procedure} port-closed? port
@deffnx {C Function} scm_port_closed_p (port)
Return @code{#t} if @var{port} is closed or @code{#f} if it is
open.
@end deffn
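The closing procedures can be tried out on a string port
(@pxref{String Ports}), as in this minimal sketch. The return values
noted in the comments follow the descriptions above.
@lisp
(define p (open-input-string "data"))

(port-closed? p)              ; #f
(close-port p)
(port-closed? p)              ; #t

;; Closing again is harmless: close-port reports that the port
;; was already closed instead of signalling an error.
(close-port p)                ; #f
@end lisp
For file ports, an explicit @code{close-port} as soon as the port is
no longer needed releases the underlying file descriptor without
waiting for garbage collection.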
@node Random Access
@subsection Random Access
@cindex Random access, ports
@cindex Port, random access
@deffn {Scheme Procedure} seek fd_port offset whence
@deffnx {C Function} scm_seek (fd_port, offset, whence)
Sets the current position of @var{fd_port} to the integer
@var{offset}. For a file port, @var{offset} is expressed
as a number of bytes; for other types of ports, such as string
ports, @var{offset} is an abstract representation of the
position within the port's data, not necessarily expressed
as a number of bytes. @var{offset} is interpreted according to
the value of @var{whence}.
One of the following variables should be supplied for
@var{whence}:
@defvar SEEK_SET
Seek from the beginning of the file.
@end defvar
@defvar SEEK_CUR
Seek from the current position.
@end defvar
@defvar SEEK_END
Seek from the end of the file.
@end defvar
If @var{fd_port} is a file descriptor, the underlying system
call is @code{lseek}. @var{fd_port} may also be a string port.
The value returned is the new position in @var{fd_port}. This means
that the current position of a port can be obtained using:
@lisp
(seek port 0 SEEK_CUR)
@end lisp
@end deffn
@deffn {Scheme Procedure} ftell fd_port
@deffnx {C Function} scm_ftell (fd_port)
Return an integer representing the current position of
@var{fd_port}, measured from the beginning. Equivalent to:
@lisp
(seek fd_port 0 SEEK_CUR)
@end lisp
@end deffn
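The following sketch shows @code{seek} and @code{ftell} on a string
port (@pxref{String Ports}). As noted above, offsets on such ports are
an abstract position; the example assumes plain ASCII data, for which
each character occupies one position.
@lisp
(define p (open-input-string "abcdef"))

(seek p 2 SEEK_SET)           ; move to offset 2 from the start
(read-char p)                 ; #\c
(ftell p)                     ; 3 for this ASCII data
(seek p 0 SEEK_CUR)           ; same value as ftell
@end lisp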
@findex truncate
@findex ftruncate
@deffn {Scheme Procedure} truncate-file file [length]
@deffnx {C Function} scm_truncate_file (file, length)
Truncate @var{file} to @var{length} bytes. @var{file} can be a
filename string, a port object, or an integer file descriptor. The
return value is unspecified.
For a port or file descriptor @var{length} can be omitted, in which
case the file is truncated at the current position (per @code{ftell}
above).
On most systems a file can be extended by giving a length greater than
the current size, but this is not mandatory in the POSIX standard.
@end deffn
@node Line/Delimited
@subsection Line Oriented and Delimited Text
@cindex Line input/output
@cindex Port, line input/output
The delimited-I/O module can be accessed with:
@lisp
(use-modules (ice-9 rdelim))
@end lisp
It can be used to read or write lines of text, or read text delimited by
a specified set of characters. It's similar to the @code{(scsh rdelim)}
module from guile-scsh, but does not use multiple values or character
sets and has an extra procedure @code{write-line}.
@c begin (scm-doc-string "rdelim.scm" "read-line")
@deffn {Scheme Procedure} read-line [port] [handle-delim]
Return a line of text from @var{port} if specified, otherwise from the
value returned by @code{(current-input-port)}. Under Unix, a line of text
is terminated by the first end-of-line character or by end-of-file.
If @var{handle-delim} is specified, it should be one of the following
symbols:
@table @code
@item trim
Discard the terminating delimiter. This is the default, but it will
be impossible to tell whether the read terminated with a delimiter or
end-of-file.
@item concat
Append the terminating delimiter (if any) to the returned string.
@item peek
Push the terminating delimiter (if any) back on to the port.
@item split
Return a pair containing the string read from the port and the
terminating delimiter or end-of-file object.
@end table
Like @code{read-char}, this procedure can throw to @code{decoding-error}
(@pxref{Reading, @code{read-char}}).
@end deffn
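The effect of @var{handle-delim} can be seen in a short session on a
string port. The sketch below also shows one common way to collect all
lines of a port; the helper name @code{port->lines} is invented for
this example and is not part of @code{(ice-9 rdelim)}.
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "first\nsecond"))
(read-line p)           ; @result{} "first"
(read-line p 'split)    ; @result{} ("second" . #<eof>)
(read-line p)           ; @result{} #<eof>

;; Collect every remaining line of PORT into a list.
;; `port->lines' is a name chosen for this example only.
(define (port->lines port)
  (let loop ((line (read-line port)) (acc '()))
    (if (eof-object? line)
        (reverse acc)
        (loop (read-line port) (cons line acc)))))

(port->lines (open-input-string "a\nb\nc\n"))   ; @result{} ("a" "b" "c")
@end lisp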
@c begin (scm-doc-string "rdelim.scm" "read-line!")
@deffn {Scheme Procedure} read-line! buf [port]
Read a line of text into the supplied string @var{buf} and return the
number of characters added to @var{buf}. If @var{buf} is filled, then
@code{#f} is returned.
Read from @var{port} if
specified, otherwise from the value returned by @code{(current-input-port)}.
@end deffn
@c begin (scm-doc-string "rdelim.scm" "read-delimited")
@deffn {Scheme Procedure} read-delimited delims [port] [handle-delim]
Read text until one of the characters in the string @var{delims} is found
or end-of-file is reached. Read from @var{port} if supplied, otherwise
from the value returned by @code{(current-input-port)}.
@var{handle-delim} takes the same values as described for @code{read-line}.
@end deffn
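As a small sketch, the following reads fields separated by either a
comma or a semicolon from a string port (the input text is made up for
the example):
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "alpha,beta;gamma"))
(read-delimited ",;" p)            ; @result{} "alpha"
(read-delimited ",;" p 'split)     ; @result{} ("beta" . #\;)
(read-delimited ",;" p 'concat)    ; @result{} "gamma" (ended by end-of-file)
@end lisp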
@c begin (scm-doc-string "rdelim.scm" "read-delimited!")
@deffn {Scheme Procedure} read-delimited! delims buf [port] [handle-delim] [start] [end]
Read text into the supplied string @var{buf}.
If a delimiter was found, return the number of characters written,
except if @var{handle-delim} is @code{split}, in which case the return
value is a pair, as noted above.
As a special case, if @var{port} was already at end-of-stream, the EOF
object is returned. Also, if no characters were written because the
buffer was full, @code{#f} is returned.
It's something of a wacky interface, to be honest.
@end deffn
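A minimal sketch of the common case, where a delimiter is found before
the buffer fills up, might look as follows. The buffer size and input
are arbitrary choices for the example.
@lisp
(use-modules (ice-9 rdelim))

;; An 8-character buffer, pre-filled with dashes so that the
;; untouched part remains visible.
(define buf (make-string 8 #\-))
(define n (read-delimited! "," buf (open-input-string "abc,def")))

n                      ; @result{} 3
(substring buf 0 n)    ; @result{} "abc"
@end lisp
Callers generally need to check for the end-of-file object and for
@code{#f} before treating the return value as a count.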
@deffn {Scheme Procedure} write-line obj [port]
@deffnx {C Function} scm_write_line (obj, port)
Display @var{obj} and a newline character to @var{port}. If
@var{port} is not specified, @code{(current-output-port)} is
used. This function is equivalent to:
@lisp
(display obj [port])
(newline [port])
@end lisp
@end deffn
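For example, writing two objects to a string port shows that each one
is followed by a newline (a sketch, using
@code{call-with-output-string} only to make the output visible):
@lisp
(use-modules (ice-9 rdelim))

(call-with-output-string
  (lambda (port)
    (write-line "hello" port)
    (write-line 42 port)))
;; @result{} "hello\n42\n"
@end lisp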
In the past, Guile did not have a procedure that would just read out all
of the characters from a port. As a workaround, many people just called
@code{read-delimited} with no delimiters, knowing that would produce the
behavior they wanted. This prompted Guile developers to add some
routines that would read all characters from a port. So it is that
@code{(ice-9 rdelim)} is also the home for procedures that read
undelimited text:
@deffn {Scheme Procedure} read-string [port] [count]
Read all of the characters out of @var{port} and return them as a
string. If the @var{count} is present, treat it as a limit to the
number of characters to read.
By default, read from the current input port, with no size limit on the
result. This procedure always returns a string, even if no characters
were read.
@end deffn
@deffn {Scheme Procedure} read-string! buf [port] [start] [end]
Fill @var{buf} with characters read from @var{port}, defaulting to the
current input port. Return the number of characters read.
If @var{start} or @var{end} are specified, store data only into the
substring of @var{buf} bounded by @var{start} and @var{end} (which
default to the beginning and end of the string, respectively).
@end deffn
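The following sketch exercises both procedures on string ports. The
input text and buffer size are arbitrary, and the argument order shown
is the one documented above for @code{(ice-9 rdelim)}.
@lisp
(use-modules (ice-9 rdelim))

(define p (open-input-string "hello world"))
(read-string p 5)    ; @result{} "hello"
(read-string p)      ; @result{} " world"
(read-string p)      ; @result{} "" (a string, even at end-of-file)

;; Store characters into BUF starting at index 2.
(define buf (make-string 8 #\-))
(read-string! buf (open-input-string "abc") 2)    ; @result{} 3
buf                                               ; @result{} "--abc---"
@end lisp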
Some of the aforementioned I/O functions rely on the following C
primitives. These will mainly be of interest to people hacking Guile
internals.
@deffn {Scheme Procedure} %read-delimited! delims str gobble [port [start [end]]]
@deffnx {C Function} scm_read_delimited_x (delims, str, gobble, port, start, end)
Read characters from @var{port} into @var{str} until one of the
characters in the @var{delims} string is encountered. If
@var{gobble} is true, discard the delimiter character;
otherwise, leave it in the input stream for the next read. If
@var{port} is not specified, use the value of
@code{(current-input-port)}. If @var{start} or @var{end} are
specified, store data only into the substring of @var{str}
bounded by @var{start} and @var{end} (which default to the
beginning and end of the string, respectively).
Return a pair consisting of the delimiter that terminated the
string and the number of characters read. If reading stopped
at the end of file, the delimiter returned is the
@var{eof-object}; if the string was filled without encountering
a delimiter, this value is @code{#f}.
@end deffn
@deffn {Scheme Procedure} %read-line [port]
@deffnx {C Function} scm_read_line (port)
Read a newline-terminated line from @var{port}, allocating storage as
necessary. The newline terminator (if any) is removed from the string,
and a pair consisting of the line and its delimiter is returned. The
delimiter may be either a newline or the @var{eof-object}; if
@code{%read-line} is called at the end of file, it returns the pair
@code{(#<eof> . #<eof>)}.
@end deffn
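To make the return conventions of these two primitives concrete, here
is a short sketch on string ports (the inputs are invented for the
example):
@lisp
(use-modules (ice-9 rdelim))

(%read-line (open-input-string "abc\ndef"))
;; @result{} ("abc" . #\newline)

(%read-line (open-input-string ""))
;; @result{} (#<eof> . #<eof>)

;; Read up to the first comma into BUF, discarding the comma.
(define buf (make-string 8 #\space))
(%read-delimited! "," buf #t (open-input-string "ab,cd"))
;; @result{} (#\, . 2)
@end lisp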
@node Block Reading and Writing
@subsection Block reading and writing
@cindex Block read/write
@cindex Port, block read/write
The Block-string-I/O module can be accessed with:
@lisp
(use-modules (ice-9 rw))
@end lisp
It currently contains procedures that help to implement the
@code{(scsh rw)} module in guile-scsh.
@deffn {Scheme Procedure} read-string!/partial str [port_or_fdes [start [end]]]
@deffnx {C Function} scm_read_string_x_partial (str, port_or_fdes, start, end)
Read characters from a port or file descriptor into a
string @var{str}. A port must have an underlying file
descriptor --- a so-called fport. This procedure is
scsh-compatible and can efficiently read large strings.
It will:
@itemize
@item
attempt to fill the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
@item
use the current input port if @var{port_or_fdes} is not
supplied.
@item
return fewer than the requested number of characters in some
cases, e.g., on end of file, if interrupted by a signal, or if
not all the characters are immediately available.
@item
wait indefinitely for some input if no characters are
currently available,
unless the port is in non-blocking mode.
@item
read characters from the port's input buffers if available,
instead of from the underlying file descriptor.
@item
return @code{#f} if end-of-file is encountered before reading
any characters, otherwise return the number of characters
read.
@item
return 0 if the port is in non-blocking mode and no characters
are immediately available.
@item
return 0 if the request is for 0 bytes, with no
end-of-file check.
@end itemize
@end deffn
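The return conventions listed above can be handled as in the following
sketch. The helper @code{read-chunk} and the 4096-character buffer
size are assumptions made for this example only.
@lisp
(use-modules (ice-9 rw))

;; Read at most one buffer-full of characters from FILENAME.
;; `read-chunk' is a helper invented for this example.
(define (read-chunk filename)
  (call-with-input-file filename
    (lambda (port)
      (let* ((buf (make-string 4096))
             (n   (read-string!/partial buf port)))
        (if n
            (substring buf 0 n)   ; N characters were read
            "")))))               ; #f means immediate end-of-file
@end lisp
Because fewer characters than requested may be returned, a caller that
needs the whole file would have to call the procedure repeatedly until
it returns @code{#f}.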
@deffn {Scheme Procedure} write-string/partial str [port_or_fdes [start [end]]]
@deffnx {C Function} scm_write_string_partial (str, port_or_fdes, start, end)
Write characters from a string @var{str} to a port or file
descriptor. A port must have an underlying file descriptor
--- a so-called fport. This procedure is
scsh-compatible and can efficiently write large strings.
It will:
@itemize
@item
attempt to write the entire string, unless the @var{start}
and/or @var{end} arguments are supplied; @var{start}
defaults to 0 and @var{end} defaults to
@code{(string-length str)}.
@item
use the current output port if @var{port_or_fdes} is not
supplied.
@item
in the case of a buffered port, store the characters in the
port's output buffer, if all will fit. If they will not fit
then any existing buffered characters will be flushed
before attempting
to write the new characters directly to the underlying file
descriptor. If the port is in non-blocking mode and
buffered characters cannot be flushed immediately, then an
@code{EAGAIN} system-error exception will be raised. (Note:
scsh does not support the use of non-blocking buffered ports.)
@item
write fewer than the requested number of
characters in some cases, e.g., if interrupted by a signal or
if not all of the output can be accepted immediately.
@item
wait indefinitely for at least one character
from @var{str} to be accepted by the port, unless the port is
in non-blocking mode.
@item
return the number of characters accepted by the port.
@item
return 0 if the port is in non-blocking mode and cannot accept
at least one character from @var{str} immediately.
@item
return 0 immediately if the request size is 0 bytes.
@end itemize
@end deffn
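Since a partial write is possible, code that must emit the whole string
typically loops on the returned count. The following is a minimal
sketch, assuming a port in blocking mode (in non-blocking mode a return
value of 0 would make this loop spin); the name
@code{write-string/full} is invented for the example.
@lisp
(use-modules (ice-9 rw))

;; Keep calling write-string/partial until all of STR is accepted.
;; Assumes PORT is a file port in blocking mode.
(define (write-string/full str port)
  (let loop ((start 0))
    (when (< start (string-length str))
      (loop (+ start (write-string/partial str port start))))))
@end lisp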
@node Default Ports
@subsection Default Ports for Input, Output and Errors
@cindex Default ports
@cindex Port, default
@rnindex current-input-port
@deffn {Scheme Procedure} current-input-port
@deffnx {C Function} scm_current_input_port ()
@cindex standard input
Return the current input port. This is the default port used
by many input procedures.
Initially this is the @dfn{standard input} in Unix and C terminology.
When the standard input is a tty the port is unbuffered, otherwise
it's fully buffered.
Unbuffered input is good if an application runs an interactive
subprocess, since any type-ahead input won't go into Guile's buffer
and be unavailable to the subprocess.
Note that Guile buffering is completely separate from the tty ``line
discipline''. In the usual cooked mode on a tty Guile only sees a
line of input once the user presses @key{Return}.
@end deffn
@rnindex current-output-port
@deffn {Scheme Procedure} current-output-port
@deffnx {C Function} scm_current_output_port ()
@cindex standard output
Return the current output port. This is the default port used
by many output procedures.
Initially this is the @dfn{standard output} in Unix and C terminology.
When the standard output is a tty this port is unbuffered, otherwise
it's fully buffered.
Unbuffered output to a tty is good for ensuring progress output or a
prompt is seen. But an application which always prints whole lines
could change to line buffered, or an application with a lot of output
could go fully buffered and perhaps make explicit @code{force-output}
calls (@pxref{Writing}) at selected points.
@end deffn
@deffn {Scheme Procedure} current-error-port
@deffnx {C Function} scm_current_error_port ()
@cindex standard error output
Return the port to which errors and warnings should be sent.
Initially this is the @dfn{standard error} in Unix and C terminology.
When the standard error is a tty this port is unbuffered, otherwise
it's fully buffered.
@end deffn
@deffn {Scheme Procedure} set-current-input-port port
@deffnx {Scheme Procedure} set-current-output-port port
@deffnx {Scheme Procedure} set-current-error-port port
@deffnx {C Function} scm_set_current_input_port (port)
@deffnx {C Function} scm_set_current_output_port (port)
@deffnx {C Function} scm_set_current_error_port (port)
Change the ports returned by @code{current-input-port},
@code{current-output-port} and @code{current-error-port}, respectively,
so that they use the supplied @var{port} for input or output.
@end deffn
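As a sketch, output can be captured by temporarily installing a string
port as the current output port (the variable names are illustrative):
@lisp
(define saved   (current-output-port))
(define capture (open-output-string))

(set-current-output-port capture)
(display "captured")              ; goes to the string port
(set-current-output-port saved)   ; restore the original port

(get-output-string capture)       ; @result{} "captured"
@end lisp
Note that this manual save-and-restore is not protected against
non-local exits; procedures such as @code{with-output-to-string}
handle that case.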
@deftypefn {C Function} void scm_dynwind_current_input_port (SCM port)
@deftypefnx {C Function} void scm_dynwind_current_output_port (SCM port)
@deftypefnx {C Function} void scm_dynwind_current_error_port (SCM port)
These functions must be used inside a pair of calls to
@code{scm_dynwind_begin} and @code{scm_dynwind_end} (@pxref{Dynamic
Wind}). During the dynwind context, the indicated port is set to
@var{port}.
More precisely, the current port is swapped with a `backup' value
whenever the dynwind context is entered or left. The backup value is
initialized with the @var{port} argument.
@end deftypefn
@node Port Types
@subsection Types of Port
@cindex Types of ports
@cindex Port, types
[Types of port; how to make them.]
@menu
* File Ports:: Ports on an operating system file.
* String Ports:: Ports on a Scheme string.
* Soft Ports:: Ports on arbitrary Scheme procedures.
* Void Ports:: Ports on nothing at all.
@end menu
@node File Ports
@subsubsection File Ports
@cindex File port
@cindex Port, file
The following procedures are used to open file ports.
See also @ref{Ports and File Descriptors, open}, for an interface
to the Unix @code{open} system call.
Most systems have limits on how many files can be open, so it's
strongly recommended that file ports be closed explicitly when no
longer required (@pxref{Ports}).
@deffn {Scheme Procedure} open-file filename mode @
[#:guess-encoding=#f] [#:encoding=#f]
@deffnx {C Function} scm_open_file_with_encoding @
(filename, mode, guess_encoding, encoding)
@deffnx {C Function} scm_open_file (filename, mode)
Open the file whose name is @var{filename}, and return a port
representing that file. The attributes of the port are
determined by the @var{mode} string. The way in which this is
interpreted is similar to C stdio. The first character must be
one of the following:
@table @samp
@item r
Open an existing file for input.
@item w
Open a file for output, creating it if it doesn't already exist
or removing its contents if it does.
@item a
Open a file for output, creating it if it doesn't already
exist. All writes to the port will go to the end of the file.
The ``append mode'' can be turned off while the port is in use
(@pxref{Ports and File Descriptors, fcntl}).
@end table
The following additional characters can be appended:
@table @samp
@item +
Open the port for both input and output. E.g., @code{r+}: open
an existing file for both input and output.
@item 0
Create an ``unbuffered'' port. In this case input and output
operations are passed directly to the underlying port
implementation without additional buffering. This is likely to
slow down I/O operations. The buffering mode can be changed
while a port is in use (@pxref{Ports and File Descriptors,
setvbuf}).
@item l
Add line-buffering to the port. The port output buffer will be
automatically flushed whenever a newline character is written.
@item b
Use binary mode, ensuring that each byte in the file will be read as one
Scheme character.
To provide this property, the file will be opened with the 8-bit
character encoding "ISO-8859-1", ignoring the default port encoding.
@xref{Ports}, for more information on port encodings.
Note that while it is possible to read and write binary data as
characters or strings, it is usually better to treat bytes as octets,
and byte sequences as bytevectors. @xref{R6RS Binary Input}, and
@ref{R6RS Binary Output}, for more.
This option had another historical meaning, for DOS compatibility: in
the default (textual) mode, DOS reads a CR-LF sequence as one LF byte.
The @code{b} flag prevents this from happening, adding @code{O_BINARY}
to the underlying @code{open} call. Still, the flag is generally useful
because of its port encoding ramifications.
@end table
Unless binary mode is requested, the character encoding of the new port
is determined as follows: First, if @var{guess-encoding} is true, the
@code{file-encoding} procedure is used to guess the encoding of the file
(@pxref{Character Encoding of Source Files}). If @var{guess-encoding}
is false or if @code{file-encoding} fails, @var{encoding} is used unless
it is also false. As a last resort, the default port encoding is used.
@xref{Ports}, for more information on port encodings. It is an error to
pass a non-false @var{guess-encoding} or @var{encoding} if binary mode
is requested.
If a file cannot be opened with the access requested, @code{open-file}
throws an exception.
When the file is opened, its encoding is set to the current
@code{%default-port-encoding}, unless the @code{b} flag was supplied.
Sometimes it is desirable to honor Emacs-style coding declarations in
files@footnote{Guile 2.0.0 to 2.0.7 would do this by default. This
behavior was deemed inappropriate and disabled starting from Guile
2.0.8.}. When that is the case, the @code{file-encoding} procedure can
be used as follows (@pxref{Character Encoding of Source Files,
@code{file-encoding}}):
@example
(let* ((port     (open-input-file file))
       (encoding (file-encoding port)))
  (set-port-encoding! port (or encoding (port-encoding port))))
@end example
In theory we could create read/write ports which were buffered
in one direction only. However this isn't included in the
current interfaces.
@end deffn
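The mode strings can be illustrated with a short sketch that writes,
appends to, and then reads a file. The path
@file{/tmp/open-file-demo.txt} is an assumption made for the example.
@lisp
(use-modules (ice-9 rdelim))

;; Hypothetical scratch file used only for this example.
(define demo-file "/tmp/open-file-demo.txt")

;; "w": create the file, or empty it if it already exists.
(let ((out (open-file demo-file "w")))
  (display "first line\n" out)
  (close-port out))

;; "a": every write goes to the end of the file.
(let ((out (open-file demo-file "a")))
  (display "second line\n" out)
  (close-port out))

;; "r": read the file back with an explicit encoding.
(let* ((in   (open-file demo-file "r" #:encoding "UTF-8"))
       (line (read-line in)))
  (close-port in)
  line)                           ; @result{} "first line"
@end lisp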
@rnindex open-input-file
@deffn {Scheme Procedure} open-input-file filename @
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
Open @var{filename} for input. If @var{binary} is true, open the port
in binary mode, otherwise use text mode. @var{encoding} and
@var{guess-encoding} determine the character encoding as described above
for @code{open-file}. Equivalent to
@lisp
(open-file @var{filename}
           (if @var{binary} "rb" "r")
           #:guess-encoding @var{guess-encoding}
           #:encoding @var{encoding})
@end lisp
@end deffn
@rnindex open-output-file
@deffn {Scheme Procedure} open-output-file filename @
[#:encoding=#f] [#:binary=#f]
Open @var{filename} for output. If @var{binary} is true, open the port
in binary mode, otherwise use text mode. @var{encoding} specifies the
character encoding as described above for @code{open-file}. Equivalent
to
@lisp
(open-file @var{filename}
           (if @var{binary} "wb" "w")
           #:encoding @var{encoding})
@end lisp
@end deffn
@deffn {Scheme Procedure} call-with-input-file filename proc @
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} call-with-output-file filename proc @
[#:encoding=#f] [#:binary=#f]
@rnindex call-with-input-file
@rnindex call-with-output-file
Open @var{filename} for input or output, and call @code{(@var{proc}
port)} with the resulting port. Return the value returned by
@var{proc}. @var{filename} is opened as per @code{open-input-file} or
@code{open-output-file} respectively, and an error is signaled if it
cannot be opened.
When @var{proc} returns, the port is closed. If @var{proc} does not
return (e.g.@: if it throws an error), then the port might not be
closed automatically, though it will be garbage collected in the usual
way if not otherwise referenced.
@end deffn
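When a port must be closed even if @var{proc} exits non-locally, one
possible approach is to wrap the call in @code{dynamic-wind}. The
following is only a sketch: the name @code{call-with-input-file/close}
is invented here, and a continuation that re-enters @var{proc} after
such an exit would find the port already closed.
@lisp
;; Like call-with-input-file, but always close the port on exit.
;; Not suitable if PROC is re-entered through a captured continuation.
(define (call-with-input-file/close filename proc)
  (let ((port (open-input-file filename)))
    (dynamic-wind
      (lambda () #t)
      (lambda () (proc port))
      (lambda () (close-port port)))))
@end lisp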
@deffn {Scheme Procedure} with-input-from-file filename thunk @
[#:guess-encoding=#f] [#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} with-output-to-file filename thunk @
[#:encoding=#f] [#:binary=#f]
@deffnx {Scheme Procedure} with-error-to-file filename thunk @
[#:encoding=#f] [#:binary=#f]
@rnindex with-input-from-file
@rnindex with-output-to-file
Open @var{filename} and call @code{(@var{thunk})} with the new port
set up as, respectively, the @code{current-input-port},
@code{current-output-port}, or @code{current-error-port}. Return the
value returned by @var{thunk}. @var{filename} is opened as per
@code{open-input-file} or @code{open-output-file} respectively, and an
error is signaled if it cannot be opened.
When @var{thunk} returns, the port is closed and the previous setting
of the respective current port is restored.
The current port setting is managed with @code{dynamic-wind}, so the
previous value is restored no matter how @var{thunk} exits (e.g.@: via an
exception), and if @var{thunk} is re-entered (via a captured
continuation) then it's set again to the @var{filename} port.
The port is closed when @var{thunk} returns normally, but not when
exited via an exception or new continuation. This ensures it's still
ready for use if @var{thunk} is re-entered by a captured continuation.
Of course the port is always garbage collected and closed in the usual
way when no longer referenced anywhere.
@end deffn
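For example, the sketch below redirects the current output port to a
file and then reads the file back through the current input port. The
path is an assumption made for the example.
@lisp
(use-modules (ice-9 rdelim))

;; Hypothetical scratch file used only for this example.
(define demo-file "/tmp/with-output-demo.txt")

(with-output-to-file demo-file
  (lambda ()
    (display "written via current-output-port")
    (newline)))

;; read-line with no arguments reads from the current input port.
(with-input-from-file demo-file read-line)
;; @result{} "written via current-output-port"
@end lisp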
@deffn {Scheme Procedure} port-mode port
@deffnx {C Function} scm_port_mode (port)
Return the port modes associated with the open port @var{port}.
These will not necessarily be identical to the modes used when
the port was opened, since modes such as "append" which are
used only during port creation are not retained.
@end deffn
@deffn {Scheme Procedure} port-filename port
@deffnx {C Function} scm_port_filename (port)
Return the filename associated with @var{port}, or @code{#f} if no
filename is associated with the port.
@var{port} must be open; @code{port-filename} cannot be used once the
port is closed.
@end deffn
@deffn {Scheme Procedure} set-port-filename! port filename
@deffnx {C Function} scm_set_port_filename_x (port, filename)
Change the filename associated with @var{port}. Note that this does not
change the port's
source of data, but only the value that is returned by
@code{port-filename} and reported in diagnostic output.
@end deffn
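As a brief sketch, a string port starts out with no associated
filename, and one can be attached for diagnostic purposes (the name
@file{demo.scm} is arbitrary):
@lisp
(define p (open-input-string "(+ 1 2)"))
(port-filename p)                   ; @result{} #f
(set-port-filename! p "demo.scm")
(port-filename p)                   ; @result{} "demo.scm"
@end lisp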
@deffn {Scheme Procedure} file-port? obj
@deffnx {C Function} scm_file_port_p (obj)
Determine whether @var{obj} is a port that is related to a file.
@end deffn
@node String Ports
@subsubsection String Ports
@cindex String port
@cindex Port, string
The following allow string ports to be opened by analogy to R4RS
file port facilities:
With string ports, the port-encoding is treated differently than other
types of ports. When string ports are created, they do not inherit a
character encoding from the current locale. They are given a
default encoding that allows them to handle all valid string characters.
Typically one should not modify a string port's character encoding
away from its default.
@deffn {Scheme Procedure} call-with-output-string proc
@deffnx {C Function} scm_call_with_output_string (proc)
Calls the one-argument procedure @var{proc} with a newly created output
port. When the function returns, the string composed of the characters
written into the port is returned. @var{proc} should not close the port.
Note that which characters can be written to a string port depends on the port's
encoding. The default encoding of string ports is specified by the
@code{%default-port-encoding} fluid (@pxref{Ports,
@code{%default-port-encoding}}). For instance, it is an error to write Greek
letter alpha to an ISO-8859-1-encoded string port since this character cannot be
represented with ISO-8859-1:
@example
(define alpha (integer->char #x03b1)) ; GREEK SMALL LETTER ALPHA
(with-fluids ((%default-port-encoding "ISO-8859-1"))
  (call-with-output-string
    (lambda (p)
      (display alpha p))))
@result{}
Throw to key `encoding-error'
@end example
Changing the string port's encoding to a Unicode-capable encoding such as UTF-8
solves the problem.
@end deffn
@deffn {Scheme Procedure} call-with-input-string string proc
@deffnx {C Function} scm_call_with_input_string (string, proc)
Calls the one-argument procedure @var{proc} with a newly
created input port from which @var{string}'s contents may be
read. The value yielded by the @var{proc} is returned.
@end deffn
@deffn {Scheme Procedure} with-output-to-string thunk
Calls the zero-argument procedure @var{thunk} with the current output
port set temporarily to a new string port. It returns a string
composed of the characters written to the current output.
See @code{call-with-output-string} above for character encoding considerations.
@end deffn
@deffn {Scheme Procedure} with-input-from-string string thunk
Calls the zero-argument procedure @var{thunk} with the current input
port set temporarily to a string port opened on the specified
@var{string}. The value yielded by @var{thunk} is returned.
@end deffn
@deffn {Scheme Procedure} open-input-string str
@deffnx {C Function} scm_open_input_string (str)
Take a string and return an input port that delivers characters
from the string. The port can be closed by
@code{close-input-port}, though its storage will be reclaimed
by the garbage collector if it becomes inaccessible.
@end deffn
@deffn {Scheme Procedure} open-output-string
@deffnx {C Function} scm_open_output_string ()
Return an output port that will accumulate characters for
retrieval by @code{get-output-string}. The port can be closed
by the procedure @code{close-output-port}, though its storage
will be reclaimed by the garbage collector if it becomes
inaccessible.
@end deffn
@deffn {Scheme Procedure} get-output-string port
@deffnx {C Function} scm_get_output_string (port)
Given an output port created by @code{open-output-string},
return a string consisting of the characters that have been
output to the port so far.
@code{get-output-string} must be used before closing @var{port}; once
closed the string cannot be obtained.
@end deffn
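The string port procedures above can be combined as in the following
sketch, which uses only ASCII data so that encoding questions do not
arise:
@lisp
;; Accumulate output explicitly.
(define out (open-output-string))
(write 'symbol out)
(display " and " out)
(write "string" out)
(get-output-string out)        ; @result{} "symbol and \"string\""

;; Capture whatever a thunk prints to the current output port.
(with-output-to-string
  (lambda () (display (* 6 7))))          ; @result{} "42"

;; Parse a datum from a string.
(call-with-input-string "(1 2 3)" read)   ; @result{} (1 2 3)
@end lisp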
A string port can be used in many procedures which accept a port
but which are not dependent on implementation details of fports.
E.g., seeking and truncating will work on a string port,
but trying to extract the file descriptor number will fail.
@node Soft Ports
@subsubsection Soft Ports
@cindex Soft port
@cindex Port, soft
A @dfn{soft-port} is a port based on a vector of procedures capable of
accepting or delivering characters. It allows emulation of I/O ports.
@deffn {Scheme Procedure} make-soft-port pv modes
@deffnx {C Function} scm_make_soft_port (pv, modes)
Return a port capable of receiving or delivering characters as
specified by the @var{modes} string (@pxref{File Ports,
open-file}). @var{pv} must be a vector of length 5 or 6. Its
components are as follows:
@enumerate 0
@item
procedure accepting one character for output
@item
procedure accepting a string for output
@item
thunk for flushing output
@item
thunk for getting one character
@item
thunk for closing port (not by garbage collection)
@item
(if present and not @code{#f}) thunk for computing the number of
characters that can be read from the port without blocking.
@end enumerate
For an output-only port only elements 0, 1, 2, and 4 need be
procedures. For an input-only port only elements 3 and 4 need
be procedures. Thunks 2 and 4 can instead be @code{#f} if
there is no useful operation for them to perform.
If thunk 3 returns @code{#f} or an @code{eof-object}
(@pxref{Input, eof-object?, ,r5rs, The Revised^5 Report on
Scheme}) it indicates that the port has reached end-of-file.
For example:
@lisp
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@@" stdout)))
           "rw"))
(write p p) @result{} #<input-output: soft 8081e20>
@end lisp
@end deffn
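An output-only soft port needs fewer procedures. The sketch below
collects everything written to the port in a list; the variable names
are illustrative, and @code{force-output} is called on the assumption
that the port may buffer its output.
@lisp
(define chunks '())

(define collector
  (make-soft-port
   (vector
    (lambda (c) (set! chunks (cons (string c) chunks)))  ; one character
    (lambda (s) (set! chunks (cons s chunks)))           ; a string
    #f      ; nothing useful to do when flushing
    #f      ; no input thunk: the port is output-only
    #f)     ; nothing useful to do when closing
   "w"))

(display "abc" collector)
(force-output collector)
(apply string-append (reverse chunks))    ; @result{} "abc"
@end lisp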
@node Void Ports
@subsubsection Void Ports
@cindex Void port
@cindex Port, void
This kind of port causes any data to be discarded when written to, and
always returns the end-of-file object when read from.
@deffn {Scheme Procedure} %make-void-port mode
@deffnx {C Function} scm_sys_make_void_port (mode)
Create and return a new void port. A void port acts like
@file{/dev/null}. The @var{mode} argument
specifies the input/output modes for this port: see the
documentation for @code{open-file} in @ref{File Ports}.
@end deffn
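For example, a void port can serve as a sink for unwanted output, as in
this sketch (the variable name @code{sink} is illustrative):
@lisp
;; Discard anything the thunk prints.
(define sink (%make-void-port "w"))
(with-output-to-port sink
  (lambda () (display "never seen")))

;; Reading from a void port yields the end-of-file object at once.
(eof-object? (read-char (%make-void-port "r")))    ; @result{} #t
@end lisp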
@node R6RS I/O Ports
@subsection R6RS I/O Ports
@cindex R6RS
@cindex R6RS ports
The I/O port API of the @uref{http://www.r6rs.org/, Revised^6 Report on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
io ports)} module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
presented above (@pxref{Input and Output}). Note that R6RS ports are not
disjoint from Guile's native ports, so Guile-specific procedures will
work on ports created using the R6RS API, and vice versa.
The text in this section is taken from the R6RS standard libraries
document, with only minor adaptations for inclusion in this manual. The
Guile developers offer their thanks to the R6RS editors for having
provided the report's text under permissive conditions making this
possible.
@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet.
@menu
* R6RS File Names:: File names.
* R6RS File Options:: Options for opening files.
* R6RS Buffer Modes:: Influencing buffering behavior.
* R6RS Transcoders:: Influencing port encoding.
* R6RS End-of-File:: The end-of-file object.
* R6RS Port Manipulation:: Manipulating R6RS ports.
* R6RS Input Ports:: Input Ports.
* R6RS Binary Input:: Binary input.
* R6RS Textual Input:: Textual input.
* R6RS Output Ports:: Output Ports.
* R6RS Binary Output:: Binary output.
* R6RS Textual Output:: Textual output.
@end menu
A subset of the @code{(rnrs io ports)} module, plus one non-standard
procedure @code{unget-bytevector} (@pxref{R6RS Binary Input}), is
provided by the @code{(ice-9 binary-ports)} module. It contains binary
input/output procedures and does not rely on R6RS support.
@node R6RS File Names
@subsubsection File Names
Some of the procedures described in this chapter accept a file name as an
argument. Valid values for such a file name include strings that name a file
using the native notation of file system paths on an implementation's
underlying operating system, and may include implementation-dependent
values as well.
A @var{filename} parameter name means that the
corresponding argument must be a file name.
@node R6RS File Options
@subsubsection File Options
@cindex file options
When opening a file, the various procedures in this library accept a
@code{file-options} object that encapsulates flags to specify how the
file is to be opened. A @code{file-options} object is an enum-set
(@pxref{rnrs enums}) over the symbols constituting valid file options.
A @var{file-options} parameter name means that the corresponding
argument must be a file-options object.
@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
Each @var{file-options-symbol} must be a symbol.
The @code{file-options} syntax returns a file-options object that
encapsulates the specified options.
When supplied to an operation that opens a file for output, the
file-options object returned by @code{(file-options)} specifies that the
file is created if it does not exist and an exception with condition
type @code{&i/o-file-already-exists} is raised if it does exist. The
following standard options can be included to modify the default
behavior.
@table @code
@item no-create
If the file does not already exist, it is not created;
instead, an exception with condition type @code{&i/o-file-does-not-exist}
is raised.
If the file already exists, the exception with condition type
@code{&i/o-file-already-exists} is not raised
and the file is truncated to zero length.
@item no-fail
If the file already exists, the exception with condition type
@code{&i/o-file-already-exists} is not raised,
even if @code{no-create} is not included,
and the file is truncated to zero length.
@item no-truncate
If the file already exists and the exception with condition type
@code{&i/o-file-already-exists} has been inhibited by inclusion of
@code{no-create} or @code{no-fail}, the file is not truncated, but
the port's current position is still set to the beginning of the
file.
@end table
These options have no effect when a file is opened only for input.
Symbols other than those listed above may be used as
@var{file-options-symbol}s; they have implementation-specific meaning,
if any.
@quotation Note
Only the name of @var{file-options-symbol} is significant.
@end quotation
@end deffn
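As a sketch of how a file-options object is typically used, the
following opens a binary output port with @code{no-fail}, so that an
existing file is truncated rather than causing an exception. The path
is an assumption made for the example, and
@code{open-file-output-port}, @code{put-bytevector} and
@code{get-bytevector-all} are the @code{(rnrs io ports)} procedures
described elsewhere in this manual.
@lisp
(use-modules (rnrs io ports))

;; Hypothetical scratch file used only for this example.
(define demo-file "/tmp/file-options-demo.bin")

(let ((port (open-file-output-port demo-file (file-options no-fail))))
  (put-bytevector port #vu8(1 2 3))
  (close-port port))

(call-with-port (open-file-input-port demo-file)
  get-bytevector-all)             ; @result{} #vu8(1 2 3)
@end lisp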
@node R6RS Buffer Modes
@subsubsection Buffer Modes
Each port has an associated buffer mode. For an output port, the
buffer mode defines when an output operation flushes the buffer
associated with the output port. For an input port, the buffer mode
defines how much data will be read to satisfy read operations. The
possible buffer modes are the symbols @code{none} for no buffering,
@code{line} for flushing upon line endings and reading up to line
endings, or other implementation-dependent behavior,
and @code{block} for arbitrary buffering. This section uses
the parameter name @var{buffer-mode} for arguments that must be
buffer-mode symbols.
If two ports are connected to the same mutable source, both ports
are unbuffered, and reading a byte or character from that shared
source via one of the two ports would change the bytes or characters
seen via the other port, a lookahead operation on one port will
render the peeked byte or character inaccessible via the other port,
while a subsequent read operation on the peeked port will see the
peeked byte or character even though the port is otherwise unbuffered.
In other words, the semantics of buffering is defined in terms of side
effects on shared mutable sources, and a lookahead operation has the
same side effect on the shared source as a read operation.
@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
@var{buffer-mode-symbol} must be a symbol whose name is one of
@code{none}, @code{line}, and @code{block}. The result is the
corresponding symbol, and specifies the associated buffer mode.
@quotation Note
Only the name of @var{buffer-mode-symbol} is significant.
@end quotation
@end deffn
@deffn {Scheme Procedure} buffer-mode? obj
Returns @code{#t} if the argument is a valid buffer-mode symbol, and
returns @code{#f} otherwise.
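As a brief illustration, assuming the @code{(rnrs io ports)} module has
been imported, the @code{buffer-mode} syntax and the @code{buffer-mode?}
predicate might be used as follows:

@lisp
(buffer-mode line)
@result{} line

(buffer-mode? (buffer-mode block))
@result{} #t

(buffer-mode? 'unbuffered)   ; not one of none, line, or block
@result{} #f
@end lisp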
@end deffn
@node R6RS Transcoders
@subsubsection Transcoders
@cindex codec
@cindex end-of-line style
@cindex transcoder
@cindex binary port
@cindex textual port
Several different Unicode encoding schemes describe standard ways to
encode characters and strings as byte sequences and to decode those
sequences. Within this document, a @dfn{codec} is an immutable Scheme
object that represents a Unicode or similar encoding scheme.
An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
describes how a textual port transcodes representations of line endings.
A @dfn{transcoder} is an immutable Scheme object that combines a codec
with an end-of-line style and a method for handling decoding errors.
Each transcoder represents some specific bidirectional (but not
necessarily lossless), possibly stateful translation between byte
sequences and Unicode characters and strings. Every transcoder can
operate in the input direction (bytes to characters) or in the output
direction (characters to bytes). A @var{transcoder} parameter name
means that the corresponding argument must be a transcoder.
A @dfn{binary port} is a port that supports binary I/O, does not have an
associated transcoder and does not support textual I/O. A @dfn{textual
port} is a port that supports textual I/O, and does not support binary
I/O. A textual port may or may not have an associated transcoder.
@deffn {Scheme Procedure} latin-1-codec
@deffnx {Scheme Procedure} utf-8-codec
@deffnx {Scheme Procedure} utf-16-codec
These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
encoding schemes.
A call to any of these procedures returns a value that is equal in the
sense of @code{eqv?} to the result of any other call to the same
procedure.
@end deffn
@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
@var{eol-style-symbol} should be a symbol whose name is one of
@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
and @code{none}.
The form evaluates to the corresponding symbol. If the name of
@var{eol-style-symbol} is not one of these symbols, the effect and
result are implementation-dependent; in particular, the result may be an
eol-style symbol acceptable as an @var{eol-style} argument to
@code{make-transcoder}. Otherwise, an exception is raised.
All eol-style symbols except @code{none} describe a specific
line-ending encoding:
@table @code
@item lf
linefeed
@item cr
carriage return
@item crlf
carriage return, linefeed
@item nel
next line
@item crnel
carriage return, next line
@item ls
line separator
@end table
For a textual port with a transcoder, and whose transcoder has an
eol-style symbol @code{none}, no conversion occurs. For a textual input
port, any eol-style symbol other than @code{none} means that all of the
above line-ending encodings are recognized and are translated into a
single linefeed. For a textual output port, @code{none} and @code{lf}
are equivalent. Linefeed characters are encoded according to the
specified eol-style symbol, and all other characters that participate in
possible line endings are encoded as is.
@quotation Note
Only the name of @var{eol-style-symbol} is significant.
@end quotation
@end deffn
@deffn {Scheme Procedure} native-eol-style
Returns the default end-of-line style of the underlying platform, e.g.,
@code{lf} on Unix and @code{crlf} on Windows.
@end deffn
@deffn {Condition Type} &i/o-decoding
@deffnx {Scheme Procedure} make-i/o-decoding-error port
@deffnx {Scheme Procedure} i/o-decoding-error? obj
This condition type could be defined by
@lisp
(define-condition-type &i/o-decoding &i/o-port
  make-i/o-decoding-error i/o-decoding-error?)
@end lisp
An exception with this type is raised when one of the operations for
textual input from a port encounters a sequence of bytes that cannot be
translated into a character or string by the input direction of the
port's transcoder.
When such an exception is raised, the port's position is past the
invalid encoding.
@end deffn
@deffn {Condition Type} &i/o-encoding
@deffnx {Scheme Procedure} make-i/o-encoding-error port char
@deffnx {Scheme Procedure} i/o-encoding-error? obj
@deffnx {Scheme Procedure} i/o-encoding-error-char condition
This condition type could be defined by
@lisp
(define-condition-type &i/o-encoding &i/o-port
  make-i/o-encoding-error i/o-encoding-error?
  (char i/o-encoding-error-char))
@end lisp
An exception with this type is raised when one of the operations for
textual output to a port encounters a character that cannot be
translated into bytes by the output direction of the port's transcoder.
@var{char} is the character that could not be encoded.
@end deffn
@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
@var{error-handling-mode-symbol} should be a symbol whose name is one of
@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
the corresponding symbol. If @var{error-handling-mode-symbol} is not
one of these identifiers, effect and result are
implementation-dependent: The result may be an error-handling-mode
symbol acceptable as a @var{handling-mode} argument to
@code{make-transcoder}. If it is not acceptable as a
@var{handling-mode} argument to @code{make-transcoder}, an exception is
raised.
@quotation Note
Only the name of @var{error-handling-mode-symbol} is significant.
@end quotation
The error-handling mode of a transcoder specifies the behavior
of textual I/O operations in the presence of encoding or decoding
errors.
If a textual input operation encounters an invalid or incomplete
character encoding, and the error-handling mode is @code{ignore}, an
appropriate number of bytes of the invalid encoding are ignored and
decoding continues with the following bytes.
If the error-handling mode is @code{replace}, the replacement
character U+FFFD is injected into the data stream, an appropriate
number of bytes are ignored, and decoding
continues with the following bytes.
If the error-handling mode is @code{raise}, an exception with condition
type @code{&i/o-decoding} is raised.
If a textual output operation encounters a character it cannot encode,
and the error-handling mode is @code{ignore}, the character is ignored
and encoding continues with the next character. If the error-handling
mode is @code{replace}, a codec-specific replacement character is
emitted by the transcoder, and encoding continues with the next
character. The replacement character is U+FFFD for transcoders whose
codec is one of the Unicode encodings, but is the @code{?} character
for the Latin-1 encoding. If the error-handling mode is @code{raise},
an exception with condition type @code{&i/o-encoding} is raised.
@end deffn
@deffn {Scheme Procedure} make-transcoder codec
@deffnx {Scheme Procedure} make-transcoder codec eol-style
@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
symbol; and @var{handling-mode}, if present, an error-handling-mode
symbol.
@var{eol-style} may be omitted, in which case it defaults to the native
end-of-line style of the underlying platform. @var{handling-mode} may
be omitted, in which case it defaults to @code{replace}. The result is
a transcoder with the behavior specified by its arguments.
@end deffn
@deffn {Scheme Procedure} native-transcoder
Returns an implementation-dependent transcoder that represents a
possibly locale-dependent ``native'' transcoding.
@end deffn
@deffn {Scheme Procedure} transcoder-codec transcoder
@deffnx {Scheme Procedure} transcoder-eol-style transcoder
@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
These are accessors for transcoder objects; when applied to a
transcoder returned by @code{make-transcoder}, they return the
@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
respectively.
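The following sketch, which assumes that @code{(rnrs io ports)} has been
imported and uses the illustrative variable name @code{tc}, shows how a
transcoder might be built and then inspected with these accessors:

@lisp
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode raise)))

(eqv? (transcoder-codec tc) (utf-8-codec))
@result{} #t

(transcoder-eol-style tc)
@result{} lf

(transcoder-error-handling-mode tc)
@result{} raise
@end lisp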
@end deffn
@deffn {Scheme Procedure} bytevector->string bytevector transcoder
Returns the string that results from transcoding the
@var{bytevector} according to the input direction of the transcoder.
@end deffn
@deffn {Scheme Procedure} string->bytevector string transcoder
Returns the bytevector that results from transcoding the
@var{string} according to the output direction of the transcoder.
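For instance, with a UTF-8 transcoder (bound here to the illustrative
name @code{utf8}), a string of ASCII characters and its encoding could
be converted back and forth roughly like this:

@lisp
(define utf8 (make-transcoder (utf-8-codec)))

(string->bytevector "hello" utf8)
@result{} #vu8(104 101 108 108 111)

(bytevector->string #vu8(104 101 108 108 111) utf8)
@result{} "hello"
@end lisp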
@end deffn
@node R6RS End-of-File
@subsubsection The End-of-File Object
@cindex EOF
@cindex end-of-file
R5RS' @code{eof-object?} procedure is provided by the @code{(rnrs io
ports)} module:
@deffn {Scheme Procedure} eof-object? obj
@deffnx {C Function} scm_eof_object_p (obj)
Return true if @var{obj} is the end-of-file (EOF) object.
@end deffn
In addition, the following procedure is provided:
@deffn {Scheme Procedure} eof-object
@deffnx {C Function} scm_eof_object ()
Return the end-of-file (EOF) object.
@lisp
(eof-object? (eof-object))
@result{} #t
@end lisp
@end deffn
@node R6RS Port Manipulation
@subsubsection Port Manipulation
The procedures listed below operate on any kind of R6RS I/O port.
@deffn {Scheme Procedure} port? obj
Returns @code{#t} if the argument is a port, and returns @code{#f}
otherwise.
@end deffn
@deffn {Scheme Procedure} port-transcoder port
Returns the transcoder associated with @var{port} if @var{port} is
textual and has an associated transcoder, and returns @code{#f} if
@var{port} is binary or does not have an associated transcoder.
@end deffn
@deffn {Scheme Procedure} binary-port? port
Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
binary data input/output.
Note that internally Guile does not differentiate between binary and
textual ports, unlike the R6RS. Thus, this procedure returns true when
@var{port} does not have an associated encoding---i.e., when
@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
port-encoding}). This is the case for ports returned by R6RS procedures
such as @code{open-bytevector-input-port} and
@code{make-custom-binary-output-port}.
However, Guile currently does not prevent use of textual I/O procedures
such as @code{display} or @code{read-char} with binary ports. Doing so
``upgrades'' the port from binary to textual, under the ISO-8859-1
encoding. Likewise, Guile does not prevent use of
@code{set-port-encoding!} on a binary port, which also turns it into a
``textual'' port.
@end deffn
@deffn {Scheme Procedure} textual-port? port
Always return @code{#t}, as all ports can be used for textual I/O in
Guile.
@end deffn
@deffn {Scheme Procedure} transcoded-port binary-port transcoder
The @code{transcoded-port} procedure
returns a new textual port with the specified @var{transcoder}.
Otherwise the new textual port's state is largely the same as
that of @var{binary-port}.
If @var{binary-port} is an input port, the new textual
port will be an input port and
will transcode the bytes that have not yet been read from
@var{binary-port}.
If @var{binary-port} is an output port, the new textual
port will be an output port and
will transcode output characters into bytes that are
written to the byte sink represented by @var{binary-port}.
As a side effect, however, @code{transcoded-port}
closes @var{binary-port} in
a special way that allows the new textual port to continue to
use the byte source or sink represented by @var{binary-port},
even though @var{binary-port} itself is closed and cannot
be used by the input and output operations described in this
chapter.
@end deffn
@deffn {Scheme Procedure} port-position port
If @var{port} supports it (see below), return the offset (an integer)
indicating where the next octet will be read from/written to in
@var{port}. If @var{port} does not support this operation, an error
condition is raised.
This is similar to Guile's @code{seek} procedure with the
@code{SEEK_CUR} argument (@pxref{Random Access}).
@end deffn
@deffn {Scheme Procedure} port-has-port-position? port
Return @code{#t} if @var{port} supports @code{port-position}.
@end deffn
@deffn {Scheme Procedure} set-port-position! port offset
If @var{port} supports it (see below), set the position where the next
octet will be read from/written to @var{port} to @var{offset} (an
integer). If @var{port} does not support this operation, an error
condition is raised.
This is similar to Guile's @code{seek} procedure with the
@code{SEEK_SET} argument (@pxref{Random Access}).
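The sketch below illustrates both positioning procedures. It assumes
that a bytevector input port supports them, which
@code{port-has-port-position?} can confirm; the port name @code{p} is
only illustrative.

@lisp
(define p (open-bytevector-input-port #vu8(10 20 30 40)))

(port-has-port-position? p)
@result{} #t

(get-u8 p)
@result{} 10

(port-position p)
@result{} 1

(set-port-position! p 3)
(get-u8 p)
@result{} 40
@end lisp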
@end deffn
@deffn {Scheme Procedure} port-has-set-port-position!? port
Return @code{#t} if @var{port} supports @code{set-port-position!}.
@end deffn
@deffn {Scheme Procedure} call-with-port port proc
Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
of @var{proc}. Return the values returned by @var{proc}.
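A minimal usage sketch, in which the port is closed as soon as the
reading procedure returns:

@lisp
(call-with-port (open-bytevector-input-port #vu8(1 2 3))
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp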
@end deffn
@node R6RS Input Ports
@subsubsection Input Ports
@deffn {Scheme Procedure} input-port? obj
Returns @code{#t} if the argument is an input port (or a combined input
and output port), and returns @code{#f} otherwise.
@end deffn
@deffn {Scheme Procedure} port-eof? input-port
Returns @code{#t}
if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
would return
the end-of-file object, and @code{#f} otherwise.
The operation may block indefinitely if no data is available
but the port cannot be determined to be at end of file.
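For example, on a one-byte bytevector port (named @code{p} here purely
for illustration), the result would be expected to change once the byte
has been consumed:

@lisp
(define p (open-bytevector-input-port #vu8(7)))

(port-eof? p)
@result{} #f

(get-u8 p)
@result{} 7

(port-eof? p)
@result{} #t
@end lisp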
@end deffn
@deffn {Scheme Procedure} open-file-input-port filename
@deffnx {Scheme Procedure} open-file-input-port filename file-options
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
The @code{open-file-input-port} procedure returns an
input port for the named file. The @var{file-options},
@var{buffer-mode}, and @var{maybe-transcoder} arguments are optional.
The @var{file-options} argument, which may determine
various aspects of the returned port (@pxref{R6RS File Options}),
defaults to the value of @code{(file-options)}.
The @var{buffer-mode} argument, if supplied,
must be one of the symbols that name a buffer mode.
The @var{buffer-mode} argument defaults to @code{block}.
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
with the returned port.
If @var{maybe-transcoder} is @code{#f} or absent,
the port will be a binary port and will support the
@code{port-position} and @code{set-port-position!} operations.
Otherwise the port will be a textual port, and whether it supports
the @code{port-position} and @code{set-port-position!} operations
is implementation-dependent (and possibly transcoder-dependent).
@end deffn
@deffn {Scheme Procedure} standard-input-port
Returns a fresh binary input port connected to standard input. Whether
the port supports the @code{port-position} and @code{set-port-position!}
operations is implementation-dependent.
@end deffn
@deffn {Scheme Procedure} current-input-port
This returns a default textual port for input. Normally, this default
port is associated with standard input, but can be dynamically
re-assigned using the @code{with-input-from-file} procedure from the
@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
may not have an associated transcoder; if it does, the transcoder is
implementation-dependent.
@end deffn
@node R6RS Binary Input
@subsubsection Binary Input
@cindex binary input
R6RS binary input ports can be created with the procedures described
below.
@deffn {Scheme Procedure} open-bytevector-input-port bv [transcoder]
@deffnx {C Function} scm_open_bytevector_input_port (bv, transcoder)
Return an input port whose contents are drawn from bytevector @var{bv}
(@pxref{Bytevectors}).
@c FIXME: Update description when implemented.
The @var{transcoder} argument is currently not supported.
@end deffn
@cindex custom binary input ports
@deffn {Scheme Procedure} make-custom-binary-input-port id read! get-position set-position! close
@deffnx {C Function} scm_make_custom_binary_input_port (id, read!, get-position, set-position!, close)
Return a new custom binary input port@footnote{This is similar in spirit
to Guile's @dfn{soft ports} (@pxref{Soft Ports}).} named @var{id} (a
string) whose input is drained by invoking @var{read!} and passing it a
bytevector, an index where bytes should be written, and the number of
bytes to read. The @code{read!} procedure must return an integer
indicating the number of bytes read, or @code{0} to indicate the
end-of-file.
Optionally, if @var{get-position} is not @code{#f}, it must be a thunk
that will be called when @code{port-position} is invoked on the custom
binary port and should return an integer indicating the position within
the underlying data stream; if @var{get-position} is @code{#f}, the
returned port does not support @code{port-position}.
Likewise, if @var{set-position!} is not @code{#f}, it should be a
one-argument procedure. When @code{set-port-position!} is invoked on the
custom binary input port, @var{set-position!} is passed an integer
indicating the position of the next byte to read.
Finally, if @var{close} is not @code{#f}, it must be a thunk. It is
invoked when the custom binary input port is closed.
The returned port is fully buffered by default, but its buffering mode
can be changed using @code{setvbuf} (@pxref{Ports and File Descriptors,
@code{setvbuf}}).
Using a custom binary input port, the @code{open-bytevector-input-port}
procedure could be implemented as follows:
@lisp
(define (open-bytevector-input-port source)
  (define position 0)
  (define length (bytevector-length source))

  ;; Copy up to COUNT bytes from SOURCE into BV at START; return
  ;; the number of bytes copied, 0 meaning end-of-file.
  (define (read! bv start count)
    (let ((count (min count (- length position))))
      (bytevector-copy! source position
                        bv start count)
      (set! position (+ position count))
      count))

  (define (get-position) position)

  (define (set-position! new-position)
    (set! position new-position))

  ;; No close procedure is needed, so pass #f for CLOSE.
  (make-custom-binary-input-port "the port" read!
                                 get-position set-position!
                                 #f))

(read (open-bytevector-input-port (string->utf8 "hello")))

@result{} hello
@end lisp
@end deffn
@cindex binary input
Binary input is achieved using the procedures below:
@deffn {Scheme Procedure} get-u8 port
@deffnx {C Function} scm_get_u8 (port)
Return an octet read from @var{port}, a binary input port, blocking as
necessary, or the end-of-file object.
@end deffn
@deffn {Scheme Procedure} lookahead-u8 port
@deffnx {C Function} scm_lookahead_u8 (port)
Like @code{get-u8} but does not update @var{port}'s position to point
past the octet.
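The difference between the two procedures can be seen in a short
sketch such as the following, where @code{p} is an illustrative name:

@lisp
(define p (open-bytevector-input-port #vu8(1 2)))

(lookahead-u8 p)   ; peek without consuming
@result{} 1

(get-u8 p)         ; the same octet is still there
@result{} 1

(get-u8 p)
@result{} 2

(eof-object? (get-u8 p))
@result{} #t
@end lisp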
@end deffn
@deffn {Scheme Procedure} get-bytevector-n port count
@deffnx {C Function} scm_get_bytevector_n (port, count)
Read @var{count} octets from @var{port}, blocking as necessary and
return a bytevector containing the octets read. If fewer bytes are
available, a bytevector smaller than @var{count} is returned.
@end deffn
@deffn {Scheme Procedure} get-bytevector-n! port bv start count
@deffnx {C Function} scm_get_bytevector_n_x (port, bv, start, count)
Read @var{count} bytes from @var{port} and store them in @var{bv}
starting at index @var{start}. Return either the number of bytes
actually read or the end-of-file object.
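As a sketch of how @code{get-bytevector-n} and @code{get-bytevector-n!}
behave near the end of the input (the names @code{p} and @code{buf} are
illustrative only):

@lisp
(define p (open-bytevector-input-port #vu8(1 2 3 4 5)))
(define buf (make-bytevector 4 0))

(get-bytevector-n p 2)
@result{} #vu8(1 2)

(get-bytevector-n! p buf 1 2)   ; store 2 bytes at index 1
@result{} 2

buf
@result{} #vu8(0 3 4 0)

(get-bytevector-n p 10)         ; only one byte is left
@result{} #vu8(5)

(eof-object? (get-bytevector-n p 1))
@result{} #t
@end lisp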
@end deffn
@deffn {Scheme Procedure} get-bytevector-some port
@deffnx {C Function} scm_get_bytevector_some (port)
Read from @var{port}, blocking as necessary, until bytes are available
or an end-of-file is reached. Return either the end-of-file object or a
new bytevector containing some of the available bytes (at least one),
and update the port position to point just past these bytes.
@end deffn
@deffn {Scheme Procedure} get-bytevector-all port
@deffnx {C Function} scm_get_bytevector_all (port)
Read from @var{port}, blocking as necessary, until the end-of-file is
reached. Return either a new bytevector containing the data read or the
end-of-file object (if no data were available).
@end deffn
The @code{(ice-9 binary-ports)} module provides the following procedure
as an extension to @code{(rnrs io ports)}:
@deffn {Scheme Procedure} unget-bytevector port bv [start [count]]
@deffnx {C Function} scm_unget_bytevector (port, bv, start, count)
Place the contents of @var{bv} in @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets, so that its bytes
will be read from left-to-right as the next bytes from @var{port} during
subsequent read operations. If called multiple times, the unread bytes
will be read again in last-in first-out order.
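For example, pushing two octets back onto a port before reading
everything from it might look like this:

@lisp
(use-modules (ice-9 binary-ports))

(define p (open-bytevector-input-port #vu8(3 4)))

(unget-bytevector p #vu8(1 2))
(get-bytevector-all p)
@result{} #vu8(1 2 3 4)
@end lisp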
@end deffn
@node R6RS Textual Input
@subsubsection Textual Input
@deffn {Scheme Procedure} get-char textual-input-port
Reads from @var{textual-input-port}, blocking as necessary, until a
complete character is available from @var{textual-input-port},
or until an end of file is reached.
If a complete character is available before the next end of file,
@code{get-char} returns that character and updates the input port to
point past the character. If an end of file is reached before any
character is read, @code{get-char} returns the end-of-file object.
@end deffn
@deffn {Scheme Procedure} lookahead-char textual-input-port
The @code{lookahead-char} procedure is like @code{get-char}, but it does
not update @var{textual-input-port} to point past the character.
@end deffn
@deffn {Scheme Procedure} get-string-n textual-input-port count
@var{count} must be an exact, non-negative integer object, representing
the number of characters to be read.
The @code{get-string-n} procedure reads from @var{textual-input-port},
blocking as necessary, until @var{count} characters are available, or
until an end of file is reached.
If @var{count} characters are available before end of file,
@code{get-string-n} returns a string consisting of those @var{count}
characters. If fewer characters are available before an end of file, but
one or more characters can be read, @code{get-string-n} returns a string
containing those characters. In either case, the input port is updated
to point just past the characters read. If no characters can be read
before an end of file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-string-n! textual-input-port string start count
@var{start} and @var{count} must be exact, non-negative integer objects,
with @var{count} representing the number of characters to be read.
@var{string} must be a string with at least @math{@var{start} + @var{count}}
characters.
The @code{get-string-n!} procedure reads from @var{textual-input-port}
in the same manner as @code{get-string-n}. If @var{count} characters
are available before an end of file, they are written into @var{string}
starting at index @var{start}, and @var{count} is returned. If fewer
characters are available before an end of file, but one or more can be
read, those characters are written into @var{string} starting at index
@var{start} and the number of characters actually read is returned as an
exact integer object. If no characters can be read before an end of
file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-string-all textual-input-port
Reads from @var{textual-input-port} until an end of file, decoding
characters in the same manner as @code{get-string-n} and
@code{get-string-n!}.
If characters are available before the end of file, a string containing
all the characters decoded from that data is returned. If no character
precedes the end of file, the end-of-file object is returned.
@end deffn
@deffn {Scheme Procedure} get-line textual-input-port
Reads from @var{textual-input-port} up to and including the linefeed
character or end of file, decoding characters in the same manner as
@code{get-string-n} and @code{get-string-n!}.
If a linefeed character is read, a string containing all of the text up
to (but not including) the linefeed character is returned, and the port
is updated to point just past the linefeed character. If an end of file
is encountered before any linefeed character is read, but some
characters have been read and decoded as characters, a string containing
those characters is returned. If an end of file is encountered before
any characters are read, the end-of-file object is returned.
@quotation Note
The end-of-line style, if not @code{none}, will cause all line endings
to be read as linefeed characters. @xref{R6RS Transcoders}.
@end quotation
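The textual input procedures above can be combined as in the following
sketch. It assumes the R6RS procedure @code{open-string-input-port},
which is not described in this section, to obtain a textual port from a
string.

@lisp
(define p (open-string-input-port "one\ntwo"))

(get-line p)
@result{} "one"

(lookahead-char p)
@result{} #\t

(get-string-n p 2)
@result{} "tw"

(get-string-all p)
@result{} "o"

(eof-object? (get-line p))
@result{} #t
@end lisp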
@end deffn
@deffn {Scheme Procedure} get-datum textual-input-port
Reads an external representation from @var{textual-input-port} and returns the
datum it represents. The @code{get-datum} procedure returns the next
datum that can be parsed from the given @var{textual-input-port}, updating
@var{textual-input-port} to point exactly past the end of the external
representation of the object.
Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
Syntax}) in the input is first skipped. If an end of file occurs after
the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
is returned.
If a character inconsistent with an external representation is
encountered in the input, an exception with condition types
@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
file is encountered after the beginning of an external representation,
but the external representation is incomplete and therefore cannot be
parsed, an exception with condition types @code{&lexical} and
@code{&i/o-read} is raised.
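As an illustration, again assuming @code{open-string-input-port} is
available to build a textual port, successive calls return successive
data and skip the trailing comment:

@lisp
(define p (open-string-input-port "(a b) 42 ; trailing comment"))

(get-datum p)
@result{} (a b)

(get-datum p)
@result{} 42

(eof-object? (get-datum p))
@result{} #t
@end lisp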
@end deffn
@node R6RS Output Ports
@subsubsection Output Ports
@deffn {Scheme Procedure} output-port? obj
Returns @code{#t} if the argument is an output port (or a
combined input and output port), @code{#f} otherwise.
@end deffn
@deffn {Scheme Procedure} flush-output-port port
Flushes any buffered output from the buffer of @var{port} to the
underlying file, device, or object. The @code{flush-output-port}
procedure returns an unspecified value.
@end deffn
@deffn {Scheme Procedure} open-file-output-port filename
@deffnx {Scheme Procedure} open-file-output-port filename file-options
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
@var{maybe-transcoder} must be either a transcoder or @code{#f}.
The @code{open-file-output-port} procedure returns an output port for the named file.
The @var{file-options} argument, which may determine various aspects of
the returned port (@pxref{R6RS File Options}), defaults to the value of
@code{(file-options)}.
The @var{buffer-mode} argument, if supplied,
must be one of the symbols that name a buffer mode.
The @var{buffer-mode} argument defaults to @code{block}.
If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
associated with the port.
If @var{maybe-transcoder} is @code{#f} or absent,
the port will be a binary port and will support the
@code{port-position} and @code{set-port-position!} operations.
Otherwise the port will be a textual port, and whether it supports
the @code{port-position} and @code{set-port-position!} operations
is implementation-dependent (and possibly transcoder-dependent).
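The following sketch writes a few bytes to a file and reads them back
through binary ports. The file name is hypothetical, and
@code{no-fail} is used so that an existing file is truncated rather
than causing an exception.

@lisp
(define file "/tmp/r6rs-example.bin")   ; hypothetical path

(call-with-port (open-file-output-port file (file-options no-fail))
  (lambda (port)
    (put-bytevector port #vu8(1 2 3))))

(call-with-port (open-file-input-port file)
  get-bytevector-all)
@result{} #vu8(1 2 3)
@end lisp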
@end deffn
@deffn {Scheme Procedure} standard-output-port
@deffnx {Scheme Procedure} standard-error-port
Returns a fresh binary output port connected to the standard output or
standard error respectively. Whether the port supports the
@code{port-position} and @code{set-port-position!} operations is
implementation-dependent.
@end deffn
@deffn {Scheme Procedure} current-output-port
@deffnx {Scheme Procedure} current-error-port
These return default textual ports for regular output and error output.
Normally, these default ports are associated with standard output and
standard error, respectively. The return value of
@code{current-output-port} can be dynamically re-assigned using the
@code{with-output-to-file} procedure from the @code{io simple (6)}
library (@pxref{rnrs io simple}). A port returned by one of these
procedures may or may not have an associated transcoder; if it does, the
transcoder is implementation-dependent.
@end deffn
@node R6RS Binary Output
@subsubsection Binary Output
Binary output ports can be created with the procedures below.
@deffn {Scheme Procedure} open-bytevector-output-port [transcoder]
@deffnx {C Function} scm_open_bytevector_output_port (transcoder)
Return two values: a binary output port and a procedure. The latter
should be called with zero arguments to obtain a bytevector containing
the data accumulated by the port, as illustrated below.
@lisp
(call-with-values
  (lambda ()
    (open-bytevector-output-port))
  (lambda (port get-bytevector)
    (display "hello" port)
    (get-bytevector)))

@result{} #vu8(104 101 108 108 111)
@end lisp
@c FIXME: Update description when implemented.
The @var{transcoder} argument is currently not supported.
@end deffn
@cindex custom binary output ports
@deffn {Scheme Procedure} make-custom-binary-output-port id write! get-position set-position! close
@deffnx {C Function} scm_make_custom_binary_output_port (id, write!, get-position, set-position!, close)
Return a new custom binary output port named @var{id} (a string) whose
output is sunk by invoking @var{write!} and passing it a bytevector, an
index where bytes should be read from this bytevector, and the number of
bytes to be ``written''. The @code{write!} procedure must return an
integer indicating the number of bytes actually written; when it is
passed @code{0} as the number of bytes to write, it should behave as
though an end-of-file was sent to the byte sink.
The other arguments are as for @code{make-custom-binary-input-port}
(@pxref{R6RS Binary Input, @code{make-custom-binary-input-port}}).
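As a rough sketch, a custom binary output port that merely collects the
octets it receives in a list could be written as follows. The names
@code{collected} and @code{port} are illustrative, and the port is
flushed explicitly in case its output is buffered.

@lisp
(define collected '())   ; octets seen so far, most recent first

(define (write! bv start count)
  (let loop ((i 0))
    (when (< i count)
      (set! collected
            (cons (bytevector-u8-ref bv (+ start i)) collected))
      (loop (+ i 1))))
  count)                 ; report that every byte was consumed

(define port
  (make-custom-binary-output-port "collector" write! #f #f #f))

(put-bytevector port #vu8(1 2 3))
(flush-output-port port)

(reverse collected)
@result{} (1 2 3)
@end lisp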
@end deffn
@cindex binary output
Writing to a binary output port can be done using the following
procedures:
@deffn {Scheme Procedure} put-u8 port octet
@deffnx {C Function} scm_put_u8 (port, octet)
Write @var{octet}, an integer in the 0--255 range, to @var{port}, a
binary output port.
@end deffn
@deffn {Scheme Procedure} put-bytevector port bv [start [count]]
@deffnx {C Function} scm_put_bytevector (port, bv, start, count)
Write the contents of @var{bv} to @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets.
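For example, writing one octet followed by a two-octet slice of a
bytevector to a bytevector output port could look like this:

@lisp
(call-with-values open-bytevector-output-port
  (lambda (port get-bytevector)
    (put-u8 port 1)
    (put-bytevector port #vu8(10 20 30 40) 1 2)   ; octets 20 and 30
    (get-bytevector)))
@result{} #vu8(1 20 30)
@end lisp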
@end deffn
@node R6RS Textual Output
@subsubsection Textual Output
@deffn {Scheme Procedure} put-char port char
Writes @var{char} to the port. The @code{put-char} procedure returns
an unspecified value.
@end deffn
@deffn {Scheme Procedure} put-string port string
@deffnx {Scheme Procedure} put-string port string start
@deffnx {Scheme Procedure} put-string port string start count
@var{start} and @var{count} must be non-negative exact integer objects.
@var{string} must have a length of at least @math{@var{start} +
@var{count}}. @var{start} defaults to 0. @var{count} defaults to
@math{@code{(string-length @var{string})} - @var{start}}. The
@code{put-string} procedure writes the @var{count} characters of
@var{string} starting at index @var{start} to the port. The
@code{put-string} procedure returns an unspecified value.
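The effect of the optional @var{start} and @var{count} arguments can be
sketched as follows, assuming the R6RS procedure
@code{call-with-string-output-port} (not described in this section) is
used to collect the output:

@lisp
(call-with-string-output-port
  (lambda (port)
    (put-string port "hello, world" 7)      ; from index 7 to the end
    (put-char port #\!)
    (put-string port "hello, world" 0 5)))  ; 5 characters from index 0
@result{} "world!hello"
@end lisp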
@end deffn
@deffn {Scheme Procedure} put-datum textual-output-port datum
@var{datum} should be a datum value. The @code{put-datum} procedure
writes an external representation of @var{datum} to
@var{textual-output-port}. The specific external representation is
implementation-dependent. However, whenever possible, an implementation
should produce a representation for which @code{get-datum}, when reading
the representation, will return an object equal (in the sense of
@code{equal?}) to @var{datum}.
@quotation Note
Not all datums may allow producing an external representation for which
@code{get-datum} will produce an object that is equal to the
original. Specifically, NaNs contained in @var{datum} may make
this impossible.
@end quotation
@quotation Note
The @code{put-datum} procedure merely writes the external
representation, but no trailing delimiter. If @code{put-datum} is
used to write several subsequent external representations to an
output port, care should be taken to delimit them properly so they can
be read back in by subsequent calls to @code{get-datum}.
@end quotation
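
The delimiter problem described in the note is not specific to Scheme.
As a loose analogy only, the C sketch below writes decimal numbers
back to back into a buffer; the helper @code{toy_put_number} is
hypothetical and has nothing to do with how @code{put-datum} is
implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: append the decimal representation of VALUE to
   BUF (capacity CAP) at offset *LEN.  If DELIMITER is not '\0' it is
   appended after the digits.  BUF stays NUL-terminated.  */
void
toy_put_number (char *buf, size_t cap, size_t *len, long value,
                char delimiter)
{
  int n = snprintf (buf + *len, cap - *len, "%ld", value);
  assert (n > 0 && (size_t) n < cap - *len);
  *len += (size_t) n;

  if (delimiter != '\0')
    {
      assert (*len + 1 < cap);
      buf[(*len)++] = delimiter;
      buf[*len] = '\0';
    }
}
```

Writing 12 and then 34 with no delimiter leaves @samp{1234} in the
buffer, which @code{strtol} reads back as a single number. Writing a
space after each number lets two successive @code{strtol} calls
recover 12 and 34.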
@end deffn
@node I/O Extensions
@subsection Using and Extending Ports in C
@menu
* C Port Interface:: Using ports from C.
* Port Implementation:: How to implement a new port type in C.
@end menu
@node C Port Interface
@subsubsection C Port Interface
@cindex C port interface
@cindex Port, C interface
This section describes how to use Scheme ports from C.
@subsubheading Port basics
@cindex ptob
@tindex scm_ptob_descriptor
@tindex scm_port
@findex SCM_PTAB_ENTRY
@findex SCM_PTOBNUM
@vindex scm_ptobs
There are two main data structures. A port type object (ptob) is of
type @code{scm_ptob_descriptor}. A port instance is of type
@code{scm_port}. Given an @code{SCM} variable which points to a port,
the corresponding C port object can be obtained using the
@code{SCM_PTAB_ENTRY} macro. The ptob can be obtained by using
@code{SCM_PTOBNUM} to give an index into the @code{scm_ptobs}
global array.
@subsubheading Port buffers
An input port always has a read buffer and an output port always has a
write buffer. However, the size of these buffers is not guaranteed to be
more than one byte (e.g., the @code{shortbuf} field in @code{scm_port},
which is used when no other buffer is allocated). The way in which the
buffers are allocated depends on the implementation of the ptob. For
example, in the case of an fport, buffers may be allocated with
@code{malloc} when the port is created, but in the case of an strport
the underlying string is used as the buffer.
@subsubheading The @code{rw_random} flag
Special treatment is required for ports that support random-access seeking.
Before various operations, such as seeking the port or changing from
input to output on a bidirectional port or vice versa, the port
implementation must be given a chance to update its state. The write
buffer is updated by calling the @code{flush} ptob procedure and the
input buffer is updated by calling the @code{end_input} ptob procedure.
In the case of an fport, @code{flush} causes buffered output to be
written to the file descriptor, while @code{end_input} causes the
descriptor position to be adjusted to account for buffered input which
was never read.
The special treatment must be performed if the @code{rw_random} flag in
the port is non-zero.
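
To make the position adjustment concrete, here is a minimal sketch of
what an @code{end_input}-style procedure might do for a file
descriptor, assuming a POSIX system. The structure @code{toy_input}
and the functions @code{toy_fill} and @code{toy_end_input} are
hypothetical; they do not reflect the layout of @code{scm_port} or
the actual fport code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Toy read buffer in front of a file descriptor.  */
struct toy_input
{
  int fd;
  char buf[16];
  size_t pos;   /* index of the next unread byte in buf */
  size_t end;   /* number of valid bytes in buf */
};

/* Read ahead into the buffer, as a buffered port would.
   Returns 0 on success, -1 on error.  */
int
toy_fill (struct toy_input *in)
{
  ssize_t n = read (in->fd, in->buf, sizeof in->buf);
  if (n < 0)
    return -1;
  in->pos = 0;
  in->end = (size_t) n;
  return 0;
}

/* Discard the buffered input and move the descriptor back over the
   bytes that were read ahead but never consumed.  Returns the
   resulting descriptor position, or -1 on error.  */
off_t
toy_end_input (struct toy_input *in)
{
  off_t unread = (off_t) (in->end - in->pos);
  in->pos = in->end = 0;
  return lseek (in->fd, -unread, SEEK_CUR);
}
```

If ten bytes were read ahead and the caller consumed four of them,
the descriptor sits at offset 10 while the logical position is 4;
@code{toy_end_input} seeks back by the six unread bytes so that a
following write lands at offset 4.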
@subsubheading The @code{rw_active} variable
The @code{rw_active} variable in the port is only used if
@code{rw_random} is set. It's defined as an enum with the following
values:
@table @code
@item SCM_PORT_READ
the read buffer may have unread data.
@item SCM_PORT_WRITE
the write buffer may have unwritten data.
@item SCM_PORT_NEITHER
neither the write nor the read buffer has data.
@end table
@subsubheading Reading from a port
To read from a port, it's possible to either call existing libguile
procedures such as @code{scm_getc} and @code{scm_read_line} or to read
data from the read buffer directly. Reading from the buffer involves
the following steps:
@enumerate
@item
Flush output on the port, if @code{rw_active} is @code{SCM_PORT_WRITE}.
@item
Fill the read buffer, if it's empty, using @code{scm_fill_input}.
@item
Read the data from the buffer and update the read position in
the buffer. Steps 2 and 3 may be repeated as many times as required.
@item
Set @code{rw_active} to @code{SCM_PORT_READ} if @code{rw_random} is set.
@item
Update the port's line and column counts.
@end enumerate
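
The five steps above can be sketched with a self-contained toy model.
Everything in it (@code{struct toy_port}, @code{toy_flush},
@code{toy_fill_input}, @code{toy_read_byte}, and the @code{TOY_}
constants) is hypothetical and only mirrors the roles of the
corresponding libguile names; an in-memory string stands in for the
underlying device.

```c
#include <assert.h>
#include <stddef.h>

enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port
{
  const char *source;       /* stands in for the underlying device */
  size_t source_pos;
  size_t source_len;
  char read_buf[4];
  size_t read_pos, read_end;
  int rw_random;
  enum toy_rw rw_active;
  int flushes;              /* counts calls to toy_flush */
  long line, column;
};

/* Pretend to write out buffered output.  */
void
toy_flush (struct toy_port *pt)
{
  pt->flushes++;
  pt->rw_active = TOY_NEITHER;
}

/* Refill the empty read buffer; return the number of bytes available.  */
size_t
toy_fill_input (struct toy_port *pt)
{
  size_t n = 0;
  while (n < sizeof pt->read_buf && pt->source_pos < pt->source_len)
    pt->read_buf[n++] = pt->source[pt->source_pos++];
  pt->read_pos = 0;
  pt->read_end = n;
  return n;
}

/* Read one byte following the five steps; return -1 at end of input.  */
int
toy_read_byte (struct toy_port *pt)
{
  /* Step 1: flush pending output.  */
  if (pt->rw_active == TOY_WRITE)
    toy_flush (pt);

  /* Step 2: fill the read buffer if it is empty.  */
  if (pt->read_pos == pt->read_end && toy_fill_input (pt) == 0)
    return -1;

  /* Step 3: take the data and advance the read position.  */
  unsigned char c = (unsigned char) pt->read_buf[pt->read_pos++];

  /* Step 4: record that the read buffer may hold unread data.  */
  if (pt->rw_random)
    pt->rw_active = TOY_READ;

  /* Step 5: update the line and column counts.  */
  if (c == '\n')
    {
      pt->line++;
      pt->column = 0;
    }
  else
    pt->column++;

  return c;
}
```

A real caller would normally repeat steps 2 and 3 over a whole block
of data before doing steps 4 and 5 once; the per-byte loop here just
keeps the model short.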
@subsubheading Writing to a port
To write data to a port, calling @code{scm_lfwrite} should be sufficient for
most purposes. This takes care of the following steps:
@enumerate
@item
End input on the port, if @code{rw_active} is @code{SCM_PORT_READ}.
@item
Pass the data to the ptob implementation using the @code{write} ptob
procedure. The advantage of using the ptob @code{write} instead of
manipulating the write buffer directly is that it allows the data to be
written in one operation even if the port is using the single-byte
@code{shortbuf}.
@item
Set @code{rw_active} to @code{SCM_PORT_WRITE} if @code{rw_random}
is set.
@end enumerate
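
The three steps above can be modelled in the same spirit. The sketch
below is a toy, not the body of @code{scm_lfwrite}: the types
@code{toy_port} and @code{toy_port_type}, the in-memory sink, and the
function @code{toy_write_bytes} are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

struct toy_port;

/* Hypothetical port type holding only the two procedures used here.  */
struct toy_port_type
{
  void (*write) (struct toy_port *pt, const void *data, size_t size);
  void (*end_input) (struct toy_port *pt);
};

struct toy_port
{
  const struct toy_port_type *type;
  int rw_random;
  enum toy_rw rw_active;
  char sink[64];        /* stands in for the underlying device */
  size_t sink_len;
  int end_inputs;       /* counts calls to end_input */
};

/* Write procedure of the toy type: copy the data into the sink.  */
void
toy_mem_write (struct toy_port *pt, const void *data, size_t size)
{
  size_t room = sizeof pt->sink - pt->sink_len;
  if (size > room)
    size = room;        /* the toy sink silently truncates */
  memcpy (pt->sink + pt->sink_len, data, size);
  pt->sink_len += size;
}

/* End-input procedure of the toy type.  */
void
toy_mem_end_input (struct toy_port *pt)
{
  pt->end_inputs++;
  pt->rw_active = TOY_NEITHER;
}

const struct toy_port_type toy_mem_type =
  { toy_mem_write, toy_mem_end_input };

/* Write SIZE bytes following the three steps.  */
void
toy_write_bytes (struct toy_port *pt, const void *data, size_t size)
{
  /* Step 1: end input if the read buffer may hold unread data.  */
  if (pt->rw_active == TOY_READ)
    pt->type->end_input (pt);

  /* Step 2: hand the whole block to the port type in one call.  */
  pt->type->write (pt, data, size);

  /* Step 3: record that the write buffer may hold unwritten data.  */
  if (pt->rw_random)
    pt->rw_active = TOY_WRITE;
}
```

Passing the whole block through the type's write procedure, rather
than poking bytes into a buffer one at a time, is what lets a port
with a one-byte buffer still accept the data in a single operation.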
@node Port Implementation
@subsubsection Port Implementation
@cindex Port implementation
This section describes how to implement a new port type in C.
As described in the previous section, a port type object (ptob) is
a structure of type @code{scm_ptob_descriptor}. A ptob is created by
calling @code{scm_make_port_type}.
@deftypefun scm_t_bits scm_make_port_type (char *name, int (*fill_input) (SCM port), void (*write) (SCM port, const void *data, size_t size))
Return a new port type object. The @var{name}, @var{fill_input} and
@var{write} parameters are initial values for those port type fields,
as described below. The other fields are initialized with default
values and can be changed later.
@end deftypefun
All of the elements of the ptob, apart from @code{name}, are procedures
which collectively implement the port behaviour. Creating a new port
type mostly involves writing these procedures.
@table @code
@item name
A pointer to a NUL terminated string: the name of the port type. This
is the only element of @code{scm_ptob_descriptor} which is not
a procedure. Set via the first argument to @code{scm_make_port_type}.
@item mark
Called during garbage collection to mark any SCM objects that a port
object may contain. It doesn't need to be set unless the port has
@code{SCM} components. Set using
@deftypefun void scm_set_port_mark (scm_t_bits tc, SCM (*mark) (SCM port))
@end deftypefun
@item free
Called when the port is collected during gc. It
should free any resources used by the port.
Set using
@deftypefun void scm_set_port_free (scm_t_bits tc, size_t (*free) (SCM port))
@end deftypefun
@item print
Called when @code{write} is called on the port object, to print a
port description. E.g., for an fport it may produce something like:
@code{#<input: /etc/passwd 3>}. Set using
@deftypefun void scm_set_port_print (scm_t_bits tc, int (*print) (SCM port, SCM dest_port, scm_print_state *pstate))
The first argument @var{port} is the object being printed, the second
argument @var{dest_port} is where its description should go.
@end deftypefun
@item equalp
Not used at present. Set using
@deftypefun void scm_set_port_equalp (scm_t_bits tc, SCM (*equalp) (SCM, SCM))
@end deftypefun
@item close
Called when the port is closed, unless it was collected during gc. It
should free any resources used by the port.
Set using
@deftypefun void scm_set_port_close (scm_t_bits tc, int (*close) (SCM port))
@end deftypefun
@item write
Accept data which is to be written using the port. The port implementation
may choose to buffer the data instead of processing it directly.
Set via the third argument to @code{scm_make_port_type}.
@item flush
Complete the processing of buffered output data. Reset the value of
@code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using
@deftypefun void scm_set_port_flush (scm_t_bits tc, void (*flush) (SCM port))
@end deftypefun
@item end_input
Perform any synchronization required when switching from input to output
on the port. Reset the value of @code{rw_active} to @code{SCM_PORT_NEITHER}.
Set using
@deftypefun void scm_set_port_end_input (scm_t_bits tc, void (*end_input) (SCM port, int offset))
@end deftypefun
@item fill_input
Read new data into the read buffer and return the first character. It
can be assumed that the read buffer is empty when this procedure is called.
Set via the second argument to @code{scm_make_port_type}.
@item input_waiting
Return a lower bound on the number of bytes that could be read from the
port without blocking. It can be assumed that the current state of
@code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using
@deftypefun void scm_set_port_input_waiting (scm_t_bits tc, int (*input_waiting) (SCM port))
@end deftypefun
@item seek
Set the current position of the port. The procedure cannot make
any assumptions about the value of @code{rw_active} when it's
called. It can reset the buffers first if desired by using something
like:
@example
if (pt->rw_active == SCM_PORT_READ)
  scm_end_input (port);
else if (pt->rw_active == SCM_PORT_WRITE)
  ptob->flush (port);
@end example
However note that this will have the side effect of discarding any data
in the unread-char buffer, in addition to any side effects from the
@code{end_input} and @code{flush} ptob procedures. This is undesirable
when seek is called to measure the current position of the port, i.e.,
@code{(seek p 0 SEEK_CUR)}. The libguile fport and string port
implementations take care to avoid this problem.
The procedure is set using
@deftypefun void scm_set_port_seek (scm_t_bits tc, scm_t_off (*seek) (SCM port, scm_t_off offset, int whence))
@end deftypefun
@item truncate
Truncate the port data to the specified length. It can be assumed that the
current state of @code{rw_active} is @code{SCM_PORT_NEITHER}.
Set using
@deftypefun void scm_set_port_truncate (scm_t_bits tc, void (*truncate) (SCM port, scm_t_off length))
@end deftypefun
@end table
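
As noted under @code{seek}, resetting the buffers is undesirable when
the caller only wants to know the current position. One way a seek
procedure could answer such a query without touching the buffers is
to correct the device position by the amount of buffered data. The
function @code{toy_tell} below is a hypothetical sketch of that
arithmetic, not code taken from libguile.

```c
#include <assert.h>

enum toy_rw { TOY_NEITHER, TOY_READ, TOY_WRITE };

/* Logical position of a buffered port, computed without flushing or
   ending input.  DEVICE_POS is the position of the underlying
   device, UNREAD_BYTES the data read ahead but not yet consumed, and
   UNWRITTEN_BYTES the data buffered but not yet written out.  */
long
toy_tell (long device_pos, enum toy_rw rw_active,
          long unread_bytes, long unwritten_bytes)
{
  switch (rw_active)
    {
    case TOY_READ:
      /* The device is ahead of the reader.  */
      return device_pos - unread_bytes;
    case TOY_WRITE:
      /* The device is behind the writer.  */
      return device_pos + unwritten_bytes;
    default:
      return device_pos;
    }
}
```

For example, with the device at offset 16 and 12 bytes still unread,
the logical position is 4; with 3 bytes waiting to be written it is
19.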
@node BOM Handling
@subsection Handling of Unicode Byte Order Marks
@cindex BOM
@cindex byte order mark
This section documents the finer points of Guile's handling of Unicode
byte order marks (BOMs). A byte order mark (U+FEFF) is typically found
at the start of a UTF-16 or UTF-32 stream, to allow readers to reliably
determine the byte order. Occasionally, a BOM is found at the start of
a UTF-8 stream, but this is much less common and not generally
recommended.
Guile attempts to handle BOMs automatically, and in accordance with the
recommendations of the Unicode Standard, when the port encoding is set
to @code{UTF-8}, @code{UTF-16}, or @code{UTF-32}. In brief, Guile
automatically writes a BOM at the start of a UTF-16 or UTF-32 stream,
and automatically consumes one from the start of a UTF-8, UTF-16, or
UTF-32 stream.
As specified in the Unicode Standard, a BOM is only handled specially at
the start of a stream, and only if the port encoding is set to
@code{UTF-8}, @code{UTF-16} or @code{UTF-32}. If the port encoding is
set to @code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or
@code{UTF-32LE}, then BOMs are @emph{not} handled specially, and none of
the special handling described in this section applies.
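
The byte sequences involved are fixed by the Unicode encoding forms,
so the check at the start of an input stream can be sketched
independently of Guile. The function @code{toy_detect_bom} and the
@code{toy_order} enumeration below are hypothetical names; the sketch
compares encoding names case-sensitively, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_order { TOY_ORDER_UNKNOWN, TOY_BIG_ENDIAN, TOY_LITTLE_ENDIAN };

/* Look for a BOM at the start of BUF, given the port ENCODING.
   Returns the number of bytes the BOM occupies, or 0 if there is
   none or if ENCODING is not one that gets special handling.
   *ORDER receives the byte order the BOM implies, if any.  */
size_t
toy_detect_bom (const char *encoding, const unsigned char *buf,
                size_t len, enum toy_order *order)
{
  *order = TOY_ORDER_UNKNOWN;

  if (strcmp (encoding, "UTF-8") == 0)
    {
      /* U+FEFF in UTF-8; it carries no byte order information.  */
      if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return 3;
    }
  else if (strcmp (encoding, "UTF-16") == 0)
    {
      if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        {
          *order = TOY_BIG_ENDIAN;
          return 2;
        }
      if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        {
          *order = TOY_LITTLE_ENDIAN;
          return 2;
        }
    }
  else if (strcmp (encoding, "UTF-32") == 0)
    {
      if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00
          && buf[2] == 0xFE && buf[3] == 0xFF)
        {
          *order = TOY_BIG_ENDIAN;
          return 4;
        }
      if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE
          && buf[2] == 0x00 && buf[3] == 0x00)
        {
          *order = TOY_LITTLE_ENDIAN;
          return 4;
        }
    }

  /* UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE and everything else:
     no special handling.  */
  return 0;
}
```

With the encoding name @code{UTF-16} the bytes @code{FE FF} are
consumed and select big endian; with @code{UTF-16BE} the same bytes
are left in the stream as an ordinary character.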
@itemize @bullet
@item
To ensure that Guile will properly detect the byte order of a UTF-16 or
UTF-32 stream, you must perform a textual read before any writes, seeks,
or binary I/O. Guile will not attempt to read a BOM unless a read is
explicitly requested at the start of the stream.
@item
If a textual write is performed before the first read, then an arbitrary
byte order will be chosen. Currently, big endian is the default on all
platforms, but that may change in the future. If you wish to explicitly
control the byte order of an output stream, set the port encoding to
@code{UTF-16BE}, @code{UTF-16LE}, @code{UTF-32BE}, or @code{UTF-32LE},
and explicitly write a BOM (@code{#\xFEFF}) if desired.
@item
If @code{set-port-encoding!} is called in the middle of a stream, Guile
treats this as a new logical ``start of stream'' for purposes of BOM
handling, and will forget about any BOMs that had previously been seen.
Therefore, it may choose a different byte order than had been used
previously. This is intended to support multiple logical text streams
embedded within a larger binary stream.
@item
Binary I/O operations are not guaranteed to update Guile's notion of
whether the port is at the ``start of the stream'', nor are they
guaranteed to produce or consume BOMs.
@item
For ports that support seeking (e.g. normal files), the input and output
streams are considered linked: if the user reads first, then a BOM will
be consumed (if appropriate), but later writes will @emph{not} produce a
BOM. Similarly, if the user writes first, then later reads will
@emph{not} consume a BOM.
@item
For ports that do not support seeking (e.g. pipes, sockets, and
terminals), the input and output streams are considered
@emph{independent} for purposes of BOM handling: the first read will
consume a BOM (if appropriate), and the first write will @emph{also}
produce a BOM (if appropriate). However, the input and output streams
will always use the same byte order.
@item
Seeks to the beginning of a file will set the ``start of stream'' flags.
Therefore, a subsequent textual read or write will consume or produce a
BOM. However, unlike @code{set-port-encoding!}, if a byte order had
already been chosen for the port, it will remain in effect after a seek,
and cannot be changed by the presence of a BOM. Seeks anywhere other
than the beginning of a file clear the ``start of stream'' flags.
@end itemize
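
The ``start of stream'' rules in the list above can be summarized as
a small state machine. The sketch below is a toy model derived only
from those rules, with hypothetical names throughout; it does not
track the chosen byte order and is not how Guile stores this state.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_bom_state
{
  bool seekable;
  bool at_start_read;    /* next textual read may consume a BOM */
  bool at_start_write;   /* next textual write may produce a BOM */
};

/* New logical start of stream, as after set-port-encoding!.  */
void
toy_bom_reset (struct toy_bom_state *st)
{
  st->at_start_read = true;
  st->at_start_write = true;
}

/* Returns whether this textual read may consume a BOM.  */
bool
toy_textual_read (struct toy_bom_state *st)
{
  bool may_consume = st->at_start_read;
  st->at_start_read = false;
  if (st->seekable)               /* linked input and output streams */
    st->at_start_write = false;
  return may_consume;
}

/* Returns whether this textual write may produce a BOM.  */
bool
toy_textual_write (struct toy_bom_state *st)
{
  bool may_produce = st->at_start_write;
  st->at_start_write = false;
  if (st->seekable)               /* linked input and output streams */
    st->at_start_read = false;
  return may_produce;
}

/* Seek on a seekable port: only offset 0 counts as start of stream.  */
void
toy_seek (struct toy_bom_state *st, long position)
{
  bool at_start = (position == 0);
  st->at_start_read = at_start;
  st->at_start_write = at_start;
}
```

In this model a file that is read first never produces a BOM on a
later write unless it is rewound to offset 0, whereas a pipe or
socket may both consume a BOM on its first read and produce one on
its first write.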
@c Local Variables:
@c TeX-master: "guile.texi"
@c End: