File: edbdoc.html

<HTML>
<FONT face=Arial,Helvetica,sans-serif size=4>
<HEAD>
<META name=description content="edbrowse documentation, a text based editor, browser, and mail client.">
<meta name=keywords content="
text based, command line, interactive,
editor, browser, mail client,
portable, perl,
blind, script, accessible">
<TITLE> edbrowse documentation </TITLE>
<LINK REL="SHORTCUT ICON" href="pc.ico">
</HEAD>
<BODY bgcolor=white text=black link=red vlink=red alink=navy>

<H2 align=center> edbrowse Documentation </H2>

<H4 align=center> Author </H4>

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>
Karl Dahlke
<A HREF=mailto:karl@eklhad.net>karl@eklhad.net</A>
248-524-1004 (during regular business hours)
</font></PRE>

<H4 align=center> Copyright Notice </H4>

This program is copyright (C) Karl Dahlke, 2000-2003.
It is made available, by the author, under the terms of the General Public License (GPL),
as articulated by the Free Software Foundation.
It may be used for any purpose, and redistributed, provided this copyright notice is included.

<H4 align=center> Overview </H4>

This program is, at first glance, a reimplementation of /bin/ed.
In fact you might issue a few ed commands and not realize that you are
actually running my perl script.
But as you proceed
you will eventually discover some discrepancies,
areas where my program differs from ed.
These are discussed in the next section.

<P>
At first, reinventing ed seems a complete waste of time,
until you realize that this program also acts as a browser --
a browser embedded inside ed.
You can edit a URL as easily as a local file,
and activate browse mode to render the html tags
in a manner that is appropriate for a command-response program such as this.
In other words, we discard most of the formatting information
and retain the links and fill-out forms.
This allows blind users to access the Internet
via an application that is entirely compatible with the linear nature of speech or braille.

<P>
I find this approach superior to the "quick fix"
of pasting an adaptor onto a preexisting screen browser (lynx)
or graphical browser (Netscape).
Of course that's just my opinion.
To be fair, many blind users, even totally blind users, are satisfied with their auditory screen scrapers.
I'm glad it works for them, but this approach frustrates the hell out of me.
If you also prefer linear applications,
give this browser a try.

<P>
This documentation assumes you are familiar with ed.
In fact it helps if you are fluent in ed.
Experience with internet browsers and the associated terminology is also valuable.

<H4 align=center> Important Deviations From /bin/ed </H4>

Certain search/substitute commands may behave differently under this editor.
This is because the regular expressions are passed
directly to perl for evaluation, hence they have more features and more power
than the regular expressions employed by /bin/ed.
The syntax is also somewhat different.
For instance, perl uses bare parentheses where ed uses escaped parentheses, \( and \),
to delimit sections of matched text.
And perl uses $1 ... $9 to reference the matched substrings,
whereas ed uses \1 ... \9.
Also, perl supports the i suffix, for case insensitive search,
along with the traditional g (global) suffix.
There is no reason to describe all the nuances here.
Please read the perlre man page `man perlre' for a full description
of regular expressions under perl.
Once you are accustomed to their power and flexibility
you'll never go back to ed.

<P>
Great!  You've read the perlre man page, and you're back.
Here are a few changes that I've made to perl regular expressions.
I have found that ( and ) are almost always meant to be literal,
as in searching for myFunction(),
so I reverse the sense of escaped parentheses in perl.
That is, ( and ) now match the literal characters,
and \( and \) are used to demark substrings of the matched text.
These substrings are then referenced, in the replacement string, by $1 through $9.
Similarly, | means a literal |, and \| is alternation.
I also change the sense of &amp;, on the right hand side,
to mean what it means in ed.
I leave ^ $ . [ ] + * ? and {n} alone, to be interpreted by perl,
as described in the perlre man page.
My changes to regexp, to look more like ed, may be confusing
if you are a perl expert.
Sorry about that, but I think these changes make this
editor much easier to use, for everyone,
especially the experienced ed users.
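<P>
The swap described above can be pictured with a small sketch.
This is not the perl code inside edbrowse, just a minimal Python illustration,
and the function name ed_style_to_perl is made up for the example.
It assumes that only ( ) and | change meaning, and it leaves every other escape alone.
The handling of &amp; on the right hand side is not covered.

```python
import re

def ed_style_to_perl(pattern):
    """Swap the escaped and bare forms of ( ) and | in a search pattern."""
    swapped = "()|"
    out = []
    i = 0
    while i != len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 != len(pattern):
            nxt = pattern[i + 1]
            # \( \) \| become the bare grouping and alternation operators;
            # any other escaped pair passes through unchanged.
            out.append(nxt if nxt in swapped else ch + nxt)
            i += 2
        elif ch in swapped:
            # Bare ( ) | become escaped, hence literal, characters.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# A search for a literal function call needs no backslashes on input.
literal = ed_style_to_perl("myFunction()")
# Grouping and alternation are requested with backslashes on input.
grouped = ed_style_to_perl(r"\(foo\|bar\)")
```

<P>
Here literal holds a pattern that matches the text myFunction() exactly,
and grouped holds an ordinary group with two alternatives.

<P>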
Below are some additional differences between this program and /bin/ed.

<UL>
<P><LI>
Lines beginning with # are ignored, making it easier to comment your edbrowse scripts.
The # character has no special significance in the middle of a line.

<P><LI>
Lines beginning with ! implement a shell escape.
The ! character has no special significance in the middle of a line.
The ! alone spawns an interactive subshell - type exit to return to edbrowse.
The word "ok" is printed when the shell command is finished -
thus you can tell when a no-output command is done.

<P><LI>
Type `cd dirname' to change directories.
The new directory is always printed.
Type cd alone to find out where you are.
I don't know what happens under dos
if you type cd f:/this/that, I never tested it.
Type "cd -" to go back to the directory you were in before.

<P>
Unlike bash, edbrowse does not retrace your steps back through symbolic links.
Thus .. is always the physical parent directory.

<P>
Environment variables are expanded before the cd command is applied, including the leading ~.
Thus cd ~/work takes you to the work directory under your home directory.

<P>
This command does not change any filenames that may be active.
You can edit foo, cd .., and write,
and foo will be copied to the parent directory.
That's probably not what you want, so be careful.

<P><LI>
r operates on the current line by default,
rather than the last line.
Use $r to read a file at the end of your working text.

<P><LI>
The w+ command appends to the file.
Some versions of ed use w&gt; for this operation,
but for 40 years &gt; has been the industry standard for write with truncate,
so using &gt; for append is somewhat confusing.
And w&gt;&gt; is just too clunky, so I use w+.

<P><LI>
w/ writes the data into a file whose name is the last component
of the current file name.
This is useful when you've just downloaded this.that.com/foo/bar/package-2.7.7-22.tar.gz,
and you want to write the file locally, but don't want to retype the stuff at the end.
Alternatively, f/ changes the filename, keeping only the last component.

<P><LI>
Whenever a file is read from or written to disk,
$var, in the filename, is replaced with the corresponding environment variable.
Thus you can edit your address book at any time via `e $adbook',
provided $adbook has been set in your environment.
Also, a leading ~/ is replaced with $HOME/,
making it easy to edit files in your home directory
such as ~/.profile.

<P>
Shell metacharacters are also expanded, provided the result is one file name.
You can read or write a file by typing a minimal portion of its name.
Neither $variables nor stars are expanded for files on the command line,
as this expansion is already done by the shell.
Of course, if you're on DOS or Windows, that expansion is not performed,
so this program performs the expansion for you.
This is an attempt to remain portable.
You should be able to edit *.c in any operating system
and get all the C source files in the current directory.

<P><LI>
Many versions of ed place a $ at the end of a listed line,
but this is not one of them, at least not by default.
I use a linear speech adapter, rather than a screen reader,
so the embedded newlines tell me exactly where the line boundaries are.
The extraneous $ character just gets in my way.

<P>
However, I realize most people still use screen readers,
where trailing whitespace is indistinguishable from the blank screen,
and a wrapped fragment is sometimes mistaken for a second line.
Therefore, you can use the command `el' to place end markers around listed lines.
Listed lines begin with ^ and end with $.
Enter `ep' to place end markers around all printed lines.
Use `eo' to turn end markers off.

<P><LI>
q quits without a warning message if the text
has never been associated with a file.

<P><LI>
Capital Q does not quit the editor absolutely.
This is because I often hit caps lock by mistake,
or even shift q by mistake,
and if I've forgotten about some important changes that I've made,
those changes are gone!
I know, this seems contrived, like it would never happen,
but it has happened to me many times,
so I disabled capital Q.
Type qt to quit absolutely.

<P><LI>
Capital J joins lines together with spaces between them.

<P><LI>
x (encryption) is not implemented.

<P><LI>
P (prompt) is not implemented.

<P><LI>
Missing line numbers before or after the comma are assumed to be 1 and $.
This is consistent with ,p -- to print the entire file.

<P><LI>
You cannot enter one command across two physical lines
by putting a backslash at the end of the first line.
And there's no need to in any case, because perl supports \n translation.
To split a line in the middle of the word doghouse, you would type
<br>
s/doghouse/dog-\nhouse/

<P><LI>
Only the first 500 characters of a line are displayed.
The rest of the line is in the buffer, and can even be modified via a substitute command,
but if you want to see it, you will need to split it,
as in the doghouse example above.

<P><LI>
a+ adds text, like a, but also adds the line you last typed,
when you thought you were in append mode, but you weren't.

<P><LI>
This program is less tolerant of whitespace than /bin/ed.
<br>
57 , 63 p      will not fly.

<P><LI>
A single % on the right hand side of a substitution is replaced with the last right hand side.
Some versions of ed do this, some don't; I find it a convenient feature.

<P><LI>
s, is shorthand for s/, +/,\n
This is used to split lines at phrase boundaries.
You can also use s. to split a line after the first period -- at a sentence boundary.
s; s: s) and s" can also be used.
s,3 splits the line after the third comma.
You might need to use s.2 if the sentence begins with Mr. Flintstone.

<P><LI>
The commands sg and sl make the remembered substitution and replacement strings global and local respectively.
If you want to look at all instances of "foo" in all the files in the current directory,
and change some of them to bar at your discretion,
edit *, then enter sg to make substitution strings global to all edit sessions.
In the first session, search for foo, and replace some of them with bar.
Type e2 to move to the next session, whence you can search using slash alone,
because the string "foo" is applied to all sessions.
Similarly, you can use % to refer to "bar".
The sl command returns this editor to its local behavior,
where each file has its own search/replace strings.

<P><LI>
Errors associated with reading or writing files, or switching sessions,
are always printed.  Other errors elicit the usual question mark,
whence you must type h to read the explanation.
Type capital H if you always want to see the error messages.

<P><LI>
In most versions of ed, the command z7 means .,+6p,
making the current line +7.
I think this is inconsistent, having one and only one ed command that leaves dot
somewhere other than the last line printed.
The confusion is compounded when z prints the last lines in the file,
whence dot actually is the last line printed.
So I have changed the z command slightly.
In this program z7 means +,+7p,
and the current line becomes the last line printed, just like the other commands.
Without a number, z prints the previous number of lines.
Thus you can read your file a chunk or screen at a time.

</UL>
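<P>
The s, family of shortcuts in the list above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the code edbrowse runs:
the name split_after is hypothetical,
and the sketch assumes that a trailing number selects the nth punctuation mark
that is followed by at least one space, by analogy with s/, +/,\n.

```python
import re

def split_after(line, mark=",", count=1):
    """Split line after the count-th mark that is followed by spaces."""
    pattern = re.compile(re.escape(mark) + r" +")
    matches = list(pattern.finditer(line))
    if count > len(matches):
        return [line]  # nothing to split on; leave the line alone
    m = matches[count - 1]
    # Keep the mark at the end of the first piece and drop the spaces.
    return [line[:m.start()] + mark, line[m.end():]]

phrases = split_after("First this, then that, then the other")
sentences = split_after("Mr. Flintstone went home. Then he slept.", ".", 2)
```

<P>
The second call plays the role of s.2, skipping the period in "Mr." and splitting at the real sentence boundary.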

<P>
Subsequent sections describe
new and interesting features, completely foreign to ed.
These include the simultaneous edit of multiple files
similar to emacs and vi,
and the ability to browse an html file and "edit" its fill-out form.
That's why I wrote the program in the first place.

<H4 align=center> Balancing Braces </H4>

The capital B command is of interest to programmers,
and will probably not be used by casual home users.
It locates the line with the balancing brace, parenthesis, or bracket.
Consider the following code fragment.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif><code>    if(x == 3 &amp;&amp;
    y == 7) {
        printf("hello\n");
    } else {
        printf("world\n");
        exit(1);
    }
</code></font></PRE>

<P>
The capital B command, on either the second or the last line,
moves to the middle line "} else {",
because that balances the open brace.
On the first line, B moves to the second line,
which balances the open parenthesis.
The second line balances {, rather than ),
because braces have precedence over parentheses,
which have precedence over brackets.
You can force a parenthesis match by typing B),
which moves from line 2 back to line 1.
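<P>
The rules in this paragraph can be modeled with a short Python sketch.
It is an approximation written for this document, not the edbrowse source.
The names PAIRS, unmatched and balance are invented,
lines are numbered from 0 rather than 1,
and the optional force argument stands in for commands like B).
Strings and comments are not recognized, so stray punctuation inside them would mislead it.

```python
PAIRS = [("{", "}"), ("(", ")"), ("[", "]")]  # precedence order from the text

def unmatched(line, open_ch, close_ch):
    """Return (unmatched closers, unmatched openers) for one line."""
    closers = openers = 0
    for ch in line:
        if ch == open_ch:
            openers += 1
        elif ch == close_ch:
            if openers:
                openers -= 1
            else:
                closers += 1
    return closers, openers

def balance(lines, start, force=None):
    """Return the index of the line that balances lines[start]."""
    for open_ch, close_ch in PAIRS:
        if force is not None and force not in (open_ch, close_ch):
            continue
        closers, openers = unmatched(lines[start], open_ch, close_ch)
        if force == open_ch:
            closers = 0   # caller asked to balance the opener only
        elif force == close_ch:
            openers = 0   # caller asked to balance the closer only
        if closers and openers:
            raise ValueError("ambiguous, name the character to balance")
        if openers:       # search forward for the closing partner
            depth = openers
            for i in range(start + 1, len(lines)):
                c, o = unmatched(lines[i], open_ch, close_ch)
                if c >= depth:
                    return i
                depth += o - c
            raise ValueError("no balancing line")
        if closers:       # search backward for the opening partner
            depth = closers
            for i in range(start - 1, -1, -1):
                c, o = unmatched(lines[i], open_ch, close_ch)
                if o >= depth:
                    return i
                depth += c - o
            raise ValueError("no balancing line")
    raise ValueError("nothing to balance on this line")

FRAGMENT = [
    "if(x == 3 &&",
    "y == 7) {",
    '    printf("hello\\n");',
    "} else {",
    '    printf("world\\n");',
    "    exit(1);",
    "}",
]
```

<P>
With this fragment, balance(FRAGMENT, 1) and balance(FRAGMENT, 6) both land on the else line (index 3),
balance(FRAGMENT, 0) lands on index 1,
and balance(FRAGMENT, 1, ")") goes back to index 0.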

<P>
The B command on the else line is ambiguous -
I don't know whether to look backwards or forwards.
You must type B{ or B}.

<P>
You can explicitly balance &lt;&gt;, as in multiline html tags,
or `', used in some preprocessors such as m4.

<P>
Comments or literal strings that contain balancing punctuation marks will
definitely throw edbrowse off the track.
If you are the author of the source,
you might want to avoid braces in comments,
or use comments to keep braces in balance.

<P>
static char openstring[] = "{block"; /* closing } is found elsewhere */

<H4 align=center> Context Switch </H4>

This program allows you to edit multiple files at the same time,
and transfer text between them.
This is similar to the world of virtual terminals (Linux),
where you switch between sessions via alt-f1 through alt-f6.
In this case you switch to a different editing session via the commands
e1 through e6.
Note that `e 2' edits a file whose name is "2",
whereas `e2' switches to session 2.
Similarly, you can read the contents of session 3 into the current buffer
via r3, and you can write the current buffer into session 5 via w5.
The latter command will produce a warning if session five already exists,
and you have made changes to its text, but have not saved those changes.
In other words, you are about to lose your edits in session 5.
Typing h will produce the explanation:
"Expecting `w' on session 5".

<P>
If you quit a session you are moved to the next valid editing session,
wrapping around to session 1 if necessary.
The program exits when the last session quits.
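<P>
As a rough model of the session commands just described, consider the Python sketch below.
It is not taken from edbrowse; the names parse_session_command and next_session are hypothetical.
It assumes that a digit immediately after e, r or w names a session,
that a space introduces a file name,
and that "wrapping around" means moving to the lowest numbered session still open.
Line addresses in front of the command, as in 'a,'bw3, are not handled.

```python
import re

def parse_session_command(command):
    """Classify an e, r or w command as a session or a file operation."""
    m = re.fullmatch(r"([erw])(\d+)", command)
    if m:
        return (m.group(1), "session", int(m.group(2)))
    m = re.fullmatch(r"([erw]) +(\S.*)", command)
    if m:
        return (m.group(1), "file", m.group(2))
    return None

def next_session(active, quitting):
    """Return the session to land in after quitting, or None to exit."""
    remaining = sorted(s for s in active if s != quitting)
    if not remaining:
        return None  # the last session has quit, so the program exits
    for s in remaining:
        if s > quitting:
            return s
    return remaining[0]  # wrap around to the lowest open session
```

<P>
Under these assumptions `e2' parses as a switch to session 2 while `e 2' parses as an edit of the file "2",
and quitting session 2 with sessions 1, 2 and 3 open lands in session 3.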

<P>
Warning, the program contains a bug regarding the undo command.
If you switch to another session, then switch back,
you cannot undo your last edit.
You'd think this would be easy to fix,
but it is trickier than it seems, so I haven't gotten around to it.
I just wanted you to know.
Make sure everything is copasetic before you switch to another session.

<P>
Let's run through a cut&amp;paste example.
You are editing file foo in session 1, and you realize
that a paragraph from file bar would fit perfectly right here.
Here is how it might look.
Lines beginning with &lt; are the user's input,
and lines beginning with &gt; form the program's responses.
The # sign delimits my injected comments.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>
&lt; e2   # switch to session 2
&gt; new session
#  Unlike ed, the r command does not establish a file name, even if the
#  buffer is empty.
#  Thus "r bar" is safer than "e bar".
#  The text is not linked to the file bar,
#  and we cannot accidentally corrupt this file.
#  After all, we don't want to change bar, we just want to steal from it.
&lt; r bar
&gt; 28719
&lt; /start/
&gt; This is the start of the cool paragraph that you want to copy.
&lt; 1,-d  # don't need the stuff before it
&lt; /end/
&gt; This is the end of the cool paragraph that you want to copy.
&lt; +,$d  # don't need the stuff after it
&lt; e1
&gt; foo
&lt; r2
&gt; 3279  # size of text read from session 2
&lt; q2  # clean house, get rid of session 2
&lt; w  # write foo, with the new paragraph included
&gt; 62121
</font></PRE>

<P>
The following moves the data from one file to another.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>
&lt; e2
&gt; new session
&lt; e bar  # this time I'm going to change bar
&gt; 28719
&lt; /start/
&gt; This is the start of the cool paragraph that you want to move.
&lt; ka  # mark the paragraph
&lt; /end/
&gt; This is the end of the cool paragraph that you want to move.
&lt; kb
&lt; 'a,'bw3
&gt; 3279
&lt; 'a,'bd
&lt; w  # write bar, without the cool paragraph
&gt; 25440
&lt; q
&gt; no file  # now in session 3
&lt; e1
&gt; foo  # back to session 1
&lt; r3
&gt; 3279
&lt; q3  # quit session 3 remotely, while still in session 1
&lt; w  # write foo, with the new paragraph included
&gt; 62121
</font></PRE>

<P>
An e command, by itself, tells you the current session, in case you've forgotten.
This is similar to f, by itself, which tells you the current file.

<H4 align=center> Usage </H4>

This perl program has no options.
Well actually there is a -m# option for fetchmail, but we'll get to that later.
And there's a -d# option to set the debug level.
And the -v option prints the version.
But other than that, there are no options.

<P>
The arguments are the files to edit.
Edbrowse reads these files into corresponding sessions
and starts you off in session 1.
If there are no arguments, you start in session 1,
but there is no text and no associated file.

<P>
If you like this program, and you want it to be your primary editor,
you can set the following alias.

<P>
alias e="perl -w /usr/local/bin/edbrowse"

<H4 align=center> Binary Characters </H4>

At all times, even when entering a file name, this program scans its input
for binary codes.
Sorry, but I like hex better than octal.
I know it's not standard, but there it is.
Use the three character sequence ~bd to enter the nonascii character 0xbd,
which is the code for 1/2.
Similarly, if you list a line, with the l command, the 1/2 character
is displayed as ~bd.
All nonascii and most control characters
are entered and displayed in this manner.
Tab and newline must be entered directly from the keyboard.
Tab and backspace are displayed as &gt; and &lt; respectively.
If the following line is entered,

<P>
Hello~07 ~x is ~bd of y

<P>
And then listed, you will see the very same text,
but there is a bell and a 1/2 character inside.
The ~x is not encoded into anything, because x is not a hex digit.
If you want to force a ~, even though there are hex digits following,
use two tildes, ~~.
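<P>
The tilde convention can be illustrated with a small Python sketch.
This is an assumption-laden model rather than the program's own code:
the names decode_tilde and encode_listing are invented,
only lowercase hex digits are accepted because that is the only form shown here,
characters are assumed to fit in one byte,
and a lone ~ is not doubled when listing, since the text says the example line lists back unchanged.

```python
import re

_TILDE = re.compile(r"~(~|[0-9a-f]{2})")

def decode_tilde(text):
    """Turn ~xx into the character with that hex code; ~~ gives one ~."""
    def repl(m):
        token = m.group(1)
        return "~" if token == "~" else chr(int(token, 16))
    return _TILDE.sub(repl, text)

def encode_listing(text):
    """Show a line the way the l command is described as showing it."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\t":
            out.append(">")        # tab
        elif ch == "\b":
            out.append("<")        # backspace
        elif code < 32 or code > 126:
            out.append("~%02x" % code)
        else:
            out.append(ch)
    return "".join(out)

typed = "Hello~07 ~x is ~bd of y"
stored = decode_tilde(typed)
listed = encode_listing(stored)
```

<P>
Here stored contains a bell character and the 1/2 character, the ~x is left as typed,
and listed comes back identical to the line that was entered.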

<P>
When you are entering a regular expression, you have the choice, hex or octal.
My program converts ~xx, as a hex value,
and the perl regexp machinery converts \nnn, as octal.
Thus any of the following will undos a file.
The first is translated via my software, the second and third by perl regexp.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>,s/~0d$//
,s/\15$//
,s/\r$//
</font></PRE>

<P>
Embedded escape characters are always displayed in hex,
whether the line is listed or not.
Most terminals and terminal emulators, including the Linux console
and my speech adapter,
interpret various escape sequences as control commands.
Thus an errant escape sequence from a binary file could send your terminal or your speech adapter into an unexpected state,
making recovery difficult.
It seems prudent to render escapes as visible characters all the time.
If you have no idea where that ~1b came from, it's probably a literal escape character.

<P>
Returns and nulls are also converted into hex all the time.
Thus an embedded return will not make one line look like two lines.
You will usually see this when importing a dos text file.
Every line ends in ~0d.
Issue one of the three commands shown above to undos the file.

<H4 align=center> Binary Files </H4>

Data is considered binary if it is sufficiently large
(more than 50 bytes)
and it contains a significant fraction of non-ascii or null characters (more than 25%).
International text may contain scattered binary codes, for accented letters etc,
but most of the characters should still be ascii.
Therefore binary data is not international text.
In fact you probably won't be able to display or edit binary data effectively,
at least not by this program.
But don't let that stop you.
As an exercise, create an executable program that prints "hello world",
then edit the executable using this editor.
Look for the string "hello world" and replace world with jorld.
Write the file and run the executable.
You should now see "hello jorld".
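<P>
The binary test described at the top of this section amounts to two thresholds.
A minimal Python sketch follows; it is an illustration only, the name looks_binary is made up,
and it assumes "non-ascii" means any byte of 128 or above.

```python
def looks_binary(data):
    """Apply the size and fraction thresholds quoted in the text."""
    if len(data) <= 50:
        return False  # too small to judge
    suspicious = sum(1 for b in data if b == 0 or b >= 128)
    # More than 25% of the bytes must be null or non-ascii.
    return 4 * suspicious > len(data)

short_blob = b"\x00" * 40
accented = ("caf\u00e9 " * 20).encode("latin-1")
mixed = b"a" * 60 + b"\x00" * 40
```

<P>
With these samples, short_blob is too small to count as binary,
accented is international text with 20% non-ascii bytes and stays text,
and mixed, at 40% nulls, is treated as binary.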

<P>
When binary data is first read into the buffer, you will see the words "binary data".
After that the buffer remains "binary", even if you delete all the data and read in ascii text.
You must use the `e' command to get a fresh, ascii buffer.

<P>
For the most part it doesn't really matter if the data is considered binary or ascii.
Either way you can display and edit the data, and write it to a file.

<P>
This program tries to "do the right thing" under DOS/Windows.
That is, it converts crlf to and from newline if it believes the file is text;
otherwise it leaves the file alone.

<H4 align=center> Accessing A URL </H4>

Instead of invoking `e filename', you can invoke `e http://this.that.com/file.html',
and the editor will retrieve the named file using the http protocol.
The source (i.e. raw html) is made available for edit.
You can modify it and save it on your local machine.
Because the text was retrieved from another machine,
it cannot be written back to that machine,
hence the `w' command will not work.
You must specify a local file `w myfile.html',
or another editing session `w3'.

<P>
Note that this is not browsing; we are simply retrieving text from
another machine and editing it locally.
The text need not be html; it could be (for instance) a plain ascii document.
Many people, myself included, put various types of files, even executables,
on their web sites for retrieval.
Of course you wouldn't want to edit a binary file,
but you can still use this editor to retrieve the file and save it locally,
thus implementing an http download.

<P>
While inside the editor, you can type `e URL'
to leave the current buffer and
retrieve text from a remote machine.
Or you can type `r URL' to retrieve remote text and add it to the current buffer.
There is no `w URL' command, because the http protocol
does not allow you to "write" html source onto a remote machine.

<P>
As a convenience, any filename with two or more embedded dots
and a standard suffix (such as .com or .net)
is treated as a URL.
You can usually omit the http:// prefix.
Try invoking `e www.whitehouse.gov'
to view the home page of the White House.
But again, you are looking at html source, which probably isn't what you want.
Browsing will be discussed later.
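
<P>
The filename test described above, two or more embedded dots plus a standard suffix,
can be approximated as follows.
This is a rough Python sketch; the function name and the list of suffixes are assumptions,
and the actual program may recognize more suffixes or apply further checks.

```python
SUFFIXES = ("com", "net", "org", "gov", "edu")   # assumed list

def looks_like_url(name: str) -> bool:
    """Return True if a filename should be treated as a URL."""
    if name.startswith("http://"):
        return True
    host = name.split("/", 1)[0]
    parts = host.split(".")
    # Two or more embedded dots means at least three non-empty pieces.
    if len(parts) < 3 or not all(parts):
        return False
    return parts[-1].lower() in SUFFIXES
```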

<P>
Whenever you retrieve data from a URL, the editor, directed by the http protocol,
might change the filename out from under you.
This is because the resource has moved,
and the original computer was kind enough to give you the new address.
If debugging is set to 1 or higher,
you will see a series of three or four different URLs
as the editor is redirected across the internet.
Finally it retrieves your document,
and the current file name holds the correct URL.
You might want to update your bookmark file accordingly.
Then again, you might not.
Sometimes the initial url is the "public" location of the web page,
and subsequent redirections occur inside the company.
In this case you'll want to retain the public url,
which will always work, even if the company relocates its web server.
Use your best judgment.

<H4 align=center> Browse Mode </H4>

If the editor contains html text, from any source,
you can type `b' to activate browse mode.
The command will be rejected only if the buffer is lacking in common html tags,
or the editor is already in browse mode.
You can force its hand by adding &lt;html&gt; at the top -
it will always try to convert such a file.
Now the transformed text is readable, without any visible html tags.
In other words, &lt;P&gt; has been turned into a paragraph break,
&lt;OL&gt; has become an ordered (numbered) list, and so on.
The filename is also changed; a .browse suffix has been appended.
If you write the transformed data, deliberately or accidentally,
the reformatted text will be saved in a new file,
without disturbing the original html.
This protects you if you are developing your own web pages.
BTW, I believe blind people should write raw html,
rather than wielding a wysiwyg web development tool such as Front Page.
In fact I write all my documents in html, even short business letters.
I can create headings, lists, tables, etc,
without using a wysiwyg editor or a screen reader.
This
<A HREF=http://www.mcli.dist.maricopa.edu/tut/>
excellent tutorial</A>
will get you started.

<P>
When the browse conversion is executed, the system checks for
common syntax errors, such as a numbered list that is never closed.
If the file name is a URL, these syntax errors are not reported.
After all, it's not your web page, and there's nothing you can do about it.
However, if the web page is yours, as indicated by a local filename,
the first syntax error is displayed,
whence you can return to the html source and fix it.
Type `ub' to undo the browse conversion.
This takes you back to the raw html text under its original filename.
Now you can correct the error and try the `b' command again.
For your convenience, the label 'e is set to the line containing the error.
Repeat this process until `b' runs without errors.

<P>
If you try to quit, and the editor says "expecting `w'",
remember that you should be back in raw html before you issue the write command.
You could write the browsable text into file.browse,
and that will satisfy the "write" criteria,
but this isn't really what you want.
You've corrected errors in the html source, and that's what you need to save,
so remember to undo the browse reformatting before you write the file.

<P>
Note that you can issue the unbrowse command even if there were no errors.
If, for instance, you are looking at a well-constructed page
on some other web site,
and you'd like to read or save the raw html, just type ub.
As an exercise, invoke `e www.whitehouse.gov',
and use the `b' and `ub' commands to switch between
the raw html and the browsable text.

<P>
The browse reformatting is relatively simple,
because a blind person doesn't want complexity.
We don't care about fonts and italics etc, and if we do,
the best way to obtain this information is by reading the raw html.
So most tags are discarded, except those related to headers, paragraphs, and lists.
I don't indent subsections or list items.
The visual effect is lost on us,
and sometimes the extra spaces really get in the way.

<P>
Because the physical line is, for us, the unit of thought,
i.e. the atomic construct that is modified or moved or copied,
lines are cut at approximately 80 characters, give or take a few,
usually at a sentence or phrase boundary.
Thus reading line by line often reveals a sequence of sentences,
or at least self-contained phrases within a larger sentence.
I consider this the optimal way to view or edit a document --
any document.
If you read these words raw, without doing the browse on the file,
you'll see what I mean.
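
<P>
One way to cut lines near 80 characters at phrase boundaries is sketched below in Python.
This is not the algorithm edbrowse uses; it is a simple stand-in that assumes a hard limit of 80
and ends a line early when a word finishes with punctuation past the halfway point.
The function name cut_lines and the halfway rule are my own assumptions.

```python
import re

def cut_lines(text, target=80):
    """Split text into lines of at most `target` characters,
    preferring to break after sentence or phrase punctuation."""
    lines, current = [], ""
    for word in text.split():
        candidate = (current + " " + word) if current else word
        if current and len(candidate) > target:
            lines.append(current)         # no room, start a new line
            current = word
            continue
        current = candidate
        # Past the halfway point, end the line at punctuation.
        if len(current) >= target // 2 and re.search(r"[.,;:?!]$", word):
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines
```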

<P>
The layout of a preformatted section, &lt;pre&gt;, is honored,
although sequences of blank lines are compressed down to one blank line,
and whitespace at the end of lines is stripped.
This preserves the structure of street addresses,
and other preformatted blocks.

<P>
Tables are formatted like an ascii unload from a spreadsheet or sql database.
Pipes separate the fields on each row.
There is no whitespace around the pipes,
and the fields of a given row probably won't line up with the fields from the previous row.
It isn't pretty,
but a blind user can't really trace down a column in any case,
especially when using a line editor such as this.
Better to write the table to a local file and use cut, sort, join, etc.
Here is a sample table.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>part number|quantity|price
2635|2|$34.80
1398|1|$67.50
8118|5|$125.00
</font></PRE>

<P>
Empty fields at the end of a row are dropped.
These are almost always images -- sometimes an entire row of images --
sometimes an entire table of images.
The blind user doesn't need to read the zero-content pipes.
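
<P>
The row formatting just described, pipes between fields and no trailing empty fields,
can be sketched in a few lines of Python.
The function name format_row is hypothetical, and the cells are assumed to be already extracted from the html.

```python
def format_row(cells):
    """Join the cells of one table row with pipes, dropping empty trailing fields."""
    cells = [c.strip() for c in cells]
    while cells and cells[-1] == "":
        cells.pop()
    return "|".join(cells)
```

<P>
An empty field in the middle of a row is kept, so the remaining fields stay in their columns.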

<P>
Note that the browsable text is readonly.
After all, it's not the "source" -- why should you edit it?
There are ways to enter and edit the input fields of an on-line form,
but this will be discussed later.
For now, you can think of the text as readonly.
Issue a move or copy or insert or substitute command,
and you'll get an error.

<P>
If you do want to edit the text, as text,
enter the `et' command (edit as text).
You will not be able to return to the html that produced this page.
Nor can you follow a hyperlink or submit a fill-out form.
The browsable text has become plain text, with no internet semantics.

<P>
The command `b file.html' is shorthand for `e file.html', followed by `b'.
Remember that the ub command reverses the browse conversion, and reproduces the original html text,
as though you had entered `e file.html'.

<P>
If a url is opened from the command line,
as in "edbrowse www.google.com", it is automatically browsed.
Type `ub' to revert back to the raw html.

<H4 align=center> Technical, Math </H4>

Most people never read technical web pages, but if you do...

<P>
A subscript, as indicated by html tags, is enclosed in brackets.
Thus x&lt;sub&gt;n&lt;/sub&gt; becomes x[n].
This transformation is not done if the subscript is a one or two digit number.
Thus x subscript 1 is rendered x1, just like your professor would say it.
This is not ambiguous, as you might first think;
only programmers use x1 as a variable name, not mathematicians.
If you see x1 in a formula, it means x subscript 1.
Even 17a3b3 is not ambiguous;
it is a translation of 17 times a[3] times b[3].

<P>
Superscripts are enclosed in parentheses, with a preceding up arrow (^).
The parentheses are omitted if the superscript is a number or letter.
Thus x cubed looks like x^3,
while x to the n-1 power looks like x^(n-1).
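
<P>
The subscript and superscript rules can be written down as two small Python functions.
This is a sketch of the rules as stated here, not the program's code;
the function names are invented, and I am assuming that "a number or letter" means
a string of digits or a single letter.

```python
def render_sub(base, sub):
    """x with subscript n becomes x[n]; a one or two digit subscript is appended."""
    if sub.isdigit() and len(sub) <= 2:
        return base + sub
    return f"{base}[{sub}]"

def render_sup(base, sup):
    """x cubed becomes x^3; longer exponents get parentheses."""
    if sup.isdigit() or (len(sup) == 1 and sup.isalpha()):
        return f"{base}^{sup}"
    return f"{base}^({sup})"
```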

<P>
There are, sad to say, three different ways to encode mathematical symbols
in html.
At present edbrowse only supports one of them,
though it is the most common, and the most portable among all browsers.
This is the symbolic font face,
where the Greek letter theta is specified as
&lt;font face=symbol&gt;q&lt;/font&gt;.
Explorer turns this expression into <font face=symbol>q</font>,
one character on the screen,
while edbrowse turns it into the word theta.
It also puts spaces around the word
if its neighbors are also words.
This is illustrated by the circumference of a circle, which is 2 times pi times r.
These three tokens are usually squashed together,
and there is no confusion in the sighted world,
where pi is a separate Greek letter.
But if pi is spelled out,
and the tokens are left together,
the result is 2pir.
Now pir looks like a three letter word.
To avoid this, edbrowse inserts spaces, giving 2 pi r.
Other symbols, such as degrees, one half, times, etc,
are also expanded into words, with the same whitespace considerations.
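
<P>
The spacing rule can be illustrated with a short Python sketch.
It assumes the page has already been split into pieces, each marked as symbol font or not,
and it treats digits as words, since the example above yields 2 pi r.
The names SYMBOL_WORDS and expand are hypothetical, and the table holds only the two letters mentioned here.

```python
SYMBOL_WORDS = {"q": "theta", "p": "pi"}   # symbol font letters and their names

def expand(tokens):
    """tokens is a list of (text, in_symbol_font) pairs; returns plain text."""
    out = ""
    after_symbol = False
    for text, is_symbol in tokens:
        if is_symbol:
            word = SYMBOL_WORDS.get(text, text)
            if out and out[-1].isalnum():
                out += " "                 # space before the spelled-out word
            out += word
            after_symbol = True
        else:
            if after_symbol and text and text[0].isalnum():
                out += " "                 # space after the spelled-out word
            out += text
            after_symbol = False
    return out
```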

<P>
These translations are designed to work with the pages of the
<A HREF=http://www.mathreference.com> on-line math project</A>,
an archive of advanced mathematics
that attempts to be both sighted and blind friendly at the same time.
This may be impossible,
but I'm giving it a shot.

<H4 align=center> Title, Description, Keywords </H4>

While in browse mode, the commands ft, fd, and fk
produce the title, description, and keywords of the current web page respectively.
These are normally not visible to the user.
The title describes the web page in 80 characters or less.
The description is a more complete explanation,
which is displayed by a search engine such as yahoo or altavista.
The user reads the description via the search engine and decides whether to read that web page.
Finally, the keywords are used by search engines to facilitate keyword searches.
Like the rest of the browsable text,
these three attributes are readonly.
If it is your web page,
you can modify them by returning to the raw html.
Web designers should pay close attention to the description and the keywords,
else your pages will not be accessible via the standard search engines.

<P>
Note that `ft' prints the title of the web page,
whereas `f t' renames the current file to "t".

<H4 align=center> Hyperlinks </H4>

A link to another web page is enclosed in braces, as in:

<P>
{Recent reports} suggest a connection between ADHD and food additives.

<P>
Behind the scenes, "recent reports" is linked to
www.feingold.org/research.shtml,
but you don't see that unless you activate the link
or view the raw html.

<P>
Of course the browsable text might also contain words inside braces,
especially if the web page is technical in nature.
Hence there is some ambiguity.
However, I believe it is clear from context.
{More information} is probably a link,
whereas ${HOME}/.profile is probably not.

<P>
Some web pages present a series of icons
that are actually links to other pages.
That is, you click on an icon, rather than a phrase, to go somewhere else.
These icons are supposed to be intuitive.
Sometimes they are -- sometimes they're not.
In any case, they aren't much use to the blind.
Sometimes the web designer is kind enough to supply
a text phrase that roughly describes the image.
In this case the phrase is used as the link.
It appears in braces, as though there were no image at all.
If there is no alternate phrase,
the filename of the hyperlink reference is used.
This name can be surprisingly helpful,
or it can be utterly useless, as in "index.html".
If this name cannot be determined,
the generic link {image} is used.
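
<P>
The fallback order for an image link, alternate phrase, then filename, then {image},
can be sketched in Python as follows.
The function name link_label and its arguments are assumptions made for this example.

```python
import posixpath
from urllib.parse import urlparse

def link_label(alt, href):
    """Choose the text shown in braces for a link that wraps an image."""
    if alt and alt.strip():
        return "{" + alt.strip() + "}"     # the designer supplied a phrase
    name = posixpath.basename(urlparse(href or "").path)
    if name:
        return "{" + name + "}"            # filename of the hyperlink reference
    return "{image}"                       # nothing better is available
```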

<P>
To follow a link, enter the `g' (go) command.
Yes, `g' also initiates a global command,
such as a global substitute,
but only when it is followed by a regular expression.
By itself, g follows the link on the current line,
4g follows the link on line 4,
and g2 follows the second link on the current line.
If a link spreads across multiple lines, you must be on the first of these lines,
the line containing the left brace.

<P>
The g command can also follow a link that is written in raw text,
as long as it "looks" like a valid url.
If your friend sends you an interesting url via email,
and you save it to a text file,
you can "go" to that link,
even though the file is not html
and you've never issued a browse command.

<H4 align=center> Internal Links </H4>

Although most links lead to other web pages,
some links point to other sections within the current document.
Again, you will be able to tell by context.
Links in the table of contents are usually
shortcuts to chapters in the current document.
The same holds for links that look like:
see {Appendix I},
or, see the section on {Hardware Configuration}.

<P>
The g command follows an internal link or an external link.
Either way you find yourself in a different place.
However, if the link is internal,
you are still browsing the same file.
In fact, the only thing that has changed is the current line number.
The new line is displayed,
and should correspond to the link you activated.
Often the words are the same.
Activate {Appendix I}, and you'll probably see the section heading "Appendix I".
Enter z10 to read the first few lines of the appendix.

<H4 align=center> The Back Key </H4>

If you edit a new file via the `e', `b', or `g' commands,
and you already have text in the buffer,
that text is bundled up and pushed onto an internal stack.
You can pop the stack by issuing the `^' command.
This is supposed to be intuitive --
the up arrow pointing to the previous page that rolled off your screen.

<P>
This feature seems rather silly if you're just editing files,
but it makes sense when surfing the net.
Often we descend through two or three links,
only to find ourselves at a dead end.
"I didn't want to go here."
So we hit the back key again and again, until we reach familiar territory.
We can now proceed in a new direction.
The command ^3 or ^^^ backs up through three pages.
Don't use this iterative feature unless you know exactly how many times you need to back up.

<P>
Note that the entire state of an editing session is saved and reproduced,
including the file name, the last change (for undo),
the last search/replace strings for substitutions, etc.
The only bit that is cleared is the "write pending" bit.
After all, the program asked you if you wanted to save the changes
when you first edited the new file.
Apparently you didn't (having issued the edit command again),
so we may as well clear that bit.
You can still pop the stack and save the prior changes to disk,
but you don't have to.
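
<P>
The push and pop behavior can be modeled with a small Python class.
This is a toy model, not the real data structure;
the class name, the dictionary keys, and the error text are all invented for illustration.
It shows the two points made above: the whole buffer state is saved,
and the write pending bit is cleared when the buffer is pushed.

```python
class Session:
    """One editing session with its own stack of earlier buffers."""

    def __init__(self):
        self.stack = []        # earlier buffers, most recent last
        self.current = None    # the buffer being edited

    def edit(self, filename, lines):
        """Like e, b, or g: push the old buffer, start a new one."""
        if self.current is not None:
            self.current["write_pending"] = False
            self.stack.append(self.current)
        self.current = {"filename": filename, "lines": list(lines),
                        "write_pending": False}

    def back(self, count=1):
        """Like ^ or ^3: pop the stack count times."""
        if count > len(self.stack):
            raise IndexError("no previous text")
        for _ in range(count):
            self.current = self.stack.pop()
        return self.current["filename"]
```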

<P>
Unlike lynx, I don't keep a running history of every web page visited.
I never really saw a need for this feature.
99% of the time I simply want to back up one or two pages,
and that's it.
Unfortunately this high-runner operation requires two somersaults and a back flip under lynx.
It is a one key command in my browser.

<P>
The stack should not be confused with parallel edits,
as described in an earlier section.
In fact each editing session, e1 e2 e3 ..., has its own internal stack.
Parallel sessions are appropriate when you need to move back and forth between two files,
or cut&amp;paste between them.
However, one session, with its internal stack,
is usually sufficient to surf the net.

<P>
If a browse command fails completely,
giving you a rather uninteresting empty buffer,
the stack is popped automatically,
taking you back to the previous web page.
Now you can retry the link by typing `g' again,
or follow a different link on the page.
Note that a browse command can fail, and still give you text explaining why it failed,
if the remote server is well-designed.
In this case you may see the error message "file not found",
yet you will be viewing a new web page, which explains the problem.
After you've read the explanation, type ^ to back up and try again.

<P>
If you are presented with a number, even 0, the stack has been pushed, and you are in a new file or url.
Use the ^ command to get back.
If there is no number, merely an error message,
then edbrowse did not create a new buffer.
(It didn't get that far.)
Typing . will produce the same line you saw before.

<P>
Following an internal link to another section in the current document
does not push anything onto the stack.
In other words, ^ will not take you back to where you were.
In fact, it will take you up to the previous web page, which is not what you want.
If you want to take a glance at Appendix I, and then return,
mark the current position with `kr'.
After you've visited the appendix, use the label 'r to return to your original location in the file.

<P>
If you want to follow several web pages in parallel,
you can save each one to another session using the w command.
The tags and links are transferred along with the rendered text.
For example, suppose a web page presents
<P>
{planes}
<br>
{trains}
<br>
{automobiles}

<P>
If you are curious about all three topics,
issue these commands in this order.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>1g
w2
^
2g
w3
^
3g
w4
^
</font></PRE>

<P>
Now sessions 2, 3, and 4 are the subpages about planes, trains, and automobiles respectively.
You can fill out forms or follow hyperlinks in any of them,
or stay in session 1 and do something else.

<H4 align=center> Input Fields </H4>

The input fields of an on-line form are usually indicated by angle brackets.
For example, a search engine might present the following form.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>Keywords: &lt;&gt;
Advanced parsing: &lt;-&gt;
Language: &lt;en&gt;
Search now: &lt;GO&gt;
Clear form: &lt;RESET&gt;
</font></PRE>

<P>
The first line in this sample form is a simple text field, which is initially empty.
You supply the keywords to search for.
Entering and editing input fields is discussed later.

<P>
The second line is a checkbox.
This field tells the search engine to use advanced boolean features,
such as this keyword and that, or this, but not that, etc.
The feature is disabled, indicated by -.
(Most people don't know how to use advanced search anyways.)
A + means the checkbox is on.

<P>
The third line determines the language of the keywords, English by default.
This isn't a free text field; you can't just type in anything you want.
We'll describe how to view the options later.

<P>
The fourth line is the submit button, which sends the form to the search engine
and retrieves the results.
This "field" cannot be edited; it is merely a button to push.

<P>
The fifth line is also a button to push.
It clears all the data you have entered, so you can start over.
Default values will be restored.
Thus the third line goes back to &lt;en&gt;, rather than &lt;&gt;.

<H4 align=center> Data Entry </H4>

Filling out a form is relatively easy, once you know the `I' command.
The capital letter I indicates input fields (browse mode),
whereas lower case i inserts lines of text in edit mode.
In practice there is no ambiguity,
so you can use either lower case i or upper case I for data entry.

<P>
If there is only one input field on the current line, i?
displays information about that input field.
If the line contains multiple input fields, you will need to use a number,
as in i3? for the third field.
The type of input field is displayed, then its size, then the current value, then the field name.
If the input field is drawn from a set of options,
the option list is displayed as well,
with menu numbers prepended.
When you want to select an option,
you can either type in a substring that determines that option uniquely,
such as mich for Michigan,
or you can type in its menu number.
Needless to say, the latter is often easier.
Recall the sample form in the previous section.
If you type i? at the third field, you might see the following.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>select[7] &lt;english&gt; [language]
1: english
2: french
3: german
4: italian
5: spanish
</font></PRE>

<P>
If a select list contains hundreds of options,
type i?string to see only those options that contain the specified string.
Type I?mi in a state field and get Michigan, Mississippi, Missouri, and Minnesota.
Then select the option you want by name or by number.

<P>
Now let's do some data entry.
Type i=xyz to place xyz in the input field.
Remember, you will need to type i3=xyz
to put information into the third input field on the current line.
If you get an error, it is probably because the field has a fixed set of options,
and you didn't pick one of those options.
You can either type in one of the options or its menu number.
You can also type in a fragment of the option you want,
and edbrowse will fill in the rest.
This is done whenever one and only one option contains a copy
(case insensitive) of the string you entered.
Thus you could enter tali above and get Italian,
as that is the only language with those four letters.
This is useful when you are entering your address,
and they ask for the state.
Type in a few letters of your state name, enough to be unique,
and you'll probably glom onto the correct option in the list.
Note the paradigm here:
blind people don't want to wade through a menu unless they absolutely have to!
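
<P>
The selection rule, a menu number or a fragment that matches exactly one option,
can be sketched in Python like this.
The function name pick_option and the error messages are hypothetical;
the real program reports errors in its own words.

```python
def pick_option(options, entry):
    """Select an option by menu number or by a unique substring."""
    entry = entry.strip()
    if entry.isdigit():
        n = int(entry)
        if 1 <= n <= len(options):
            return options[n - 1]
        raise ValueError("menu number out of range")
    # Case insensitive substring match; exactly one option must qualify.
    matches = [o for o in options if entry.lower() in o.lower()]
    if len(matches) == 1:
        return matches[0]
    raise ValueError("entry does not select exactly one option")
```

<P>
With the five languages listed earlier, tali selects italian,
while an is rejected because it appears in german, italian, and spanish.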

<P>
You can use i&lt;7 to pull the contents of session 7 into the current input field.
Session 7 must have one line of text.
Similarly, i&lt;filename reads the contents of the file into the current input field.
Again, the file should contain one line of text.
The filename is expanded in the usual way.
This includes wildcard expansion, as long as the expansion leads to one and only one file.
Put enough characters around the * to designate a single file.

<P>
Now suppose you are entering your credit card number, all 16 digits, into a free text field.
If you've made a typo, you don't really want to enter the entire string again.
No problem -- use the substitute command.
You can write this as I/x/y/ or i/x/y/ or s/x/y/ -- whatever you prefer.
Remember, you may need to specify a field, as in s3/x/y/.
The usual substitution syntax is honored.

<P>
If the submit button is the third field on the current line, you can press it via i3*.
However, i* is sufficient when there is only one button on the line.
Similarly, you can establish a text field by entering i=kangaroo,
rather than i1=kangaroo, if the second field on the current line is a submit button.
You only need specify a field number
when there are multiple input fields, or multiple buttons, on the current line.

<H4 align=center> Text Areas </H4>

Some internet forms allow you to type freely, as in "Please enter your comments here."
This is done inside a window within the screen,
having a fixed number of rows and columns,
although that is usually an artificial constraint.
The sighted user can type more lines than the window will hold,
and the window scrolls appropriately.
Fortunately the blind user can ignore the artificial window and type freely.
Still, the i? directive tells you how big the window would be,
if you were running a visual browser.
You might see something like "area[7x40]", which indicates a window 7 rows by 40 columns.

<P>
The lynx implementation of the text area is particularly hideous.
This is not surprising, since lynx is not an editor.
You can correct small typos on the current line,
but you can't actually "edit" the text you are working on.
Once you hit return, that line is done, and you're on to the next line.
You can't move lines around or insert lines etc.
Nor can you prepare your comments ahead of time and read them into the text area from a file.

<P>
In this program, the text area is managed from another editing session.
This allows you to use the full power of the editor.
You can move text, make global substitutions,
or read comments in from a prepared file.
The editing session is chosen for you, and appears in the input field.
Consider the following form.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>Enter your email address: &lt;&gt;
Enter your comments: &lt;buffer 2&gt;
</font></PRE>

<P>
In this example session 2 was not active when browsing began.
The browser allocated session 2 specifically for this input field.
Type e2 to move to session 2, prepare your comments,
and type e1 to return to the input form.
On most web pages the text area starts out blank,
whence buffer 2 will be empty,
but this is not always the case.
Be sure to check for pre-existing text before you start typing your thoughts.
A particularly arrogant site might preload the text area with:
"I love your web site because".
When you finally submit the form,
as discussed in the next section,
text buffer 2, associated with the second editing session,
will replace the words "buffer 2" in the input field.
Thus your carefully crafted comments are on their way.
(This doesn't mean anybody is going to listen to them.)

<H4 align=center> Push The Button </H4>

If the third input field on the current line is a reset or submit button,
you can press the button via i3*.
The reset button puts the input fields back to their original values,
as supplied by the web page when it was first loaded.
Text areas are an exception to this rule.
They are simply cleared.
Since they are usually blank at the start,
with no text preloaded,
this "bug" isn't a serious problem.

<P>
The submit button sends the form to the remote server and waits for a response.
This is similar to following an internet link,
but in this case you are sending some data along with the request.
Type "kangaroo" into a search engine and you'll soon be reading a web page about kangaroos.
As with any other link, you can use the ^ key to go back.
In this case you will return to the on-line form.
You can change the data and submit the form again, asking about another animal.

<P>
I have implemented the "get" and "post" methods,
the most common http request methods,
and they seem to work on most sites.

<P>
Once you have submitted your form,
and you are viewing the results,
you may notice some strange characters at the end of the filename.
(This only happens under the "get" method.)
If you have retrieved information on kangaroos, the filename might look like:
www.search-engine.com?keywords=kangaroo.
The text after the question mark is an encoded version of the data
you entered into the form.
It becomes part of the virtual URL.
This is actually a feature, as we shall see in the next section.

<H4 align=center> Web And Email Addresses </H4>

The capital A command shows you the web addresses behind
the links on the current line.
Each web address will be surrounded by &lt;A&gt; and &lt;/A&gt; tags,
ready to be pasted into a bookmark file, if that is what you wish.
These addresses exist in a new editing session; the previous session has been pushed onto the stack.
You can add these to your bookmark file via w+ $bookmarks.
They will be appended at the end;
you can move them to a more appropriate place in the file later on, when you're not "on line".
For many, with dial up connections,
connect time is precious,
and should not be spent rearranging bookmark files.
Finally, use the ^ key to return to the web page you were viewing.
Here is how it might look.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>&lt; b this.that.com/whatever  # browse a web page
&gt; 16834  # size of the raw html
&gt; 7855  # size of the browsable text
&lt; /kangaroo/i  # looking for kangaroo on the page
&gt; Click here for {more information about kangaroos}, or {send us mail}.
&lt; A  # capture the URLs
&gt; 144  # size of the URLs
&lt; ,p  # let's see them
&gt; &lt;A HREF=www.kangaroo-info.com&gt;
&gt; more information about kangaroos
&gt;  &lt;/A&gt;
&gt; send us mail:info@kangaroo.org
&lt; 4d  # don't need the email address
&lt; w+ $bookmarks  # append this url to the bookmark file
&gt; 336
&lt; ^  # back to browsing
&gt; Click here for {more information about kangaroos}, or {send us mail}.
</font></PRE>

<P>
I suppose I could interrogate the environment variable $bookmarks
and append the URL to that file automatically,
but as this example shows, you might not want all the links.
In fact the email link makes no sense in a bookmark file.
Also, you may want to change the description of the link,
though in this example the description is pretty reasonable.

<P>
Alternatively, you might discard the url and retain the email address,
appending it to your address book.
Again, you will want to change the generic phrase "send us mail"
to a brief string that is meaningful to you, such as kangaroo-mail.
This becomes the alias, which you can use to send mail
to that recipient.
Subsequent sections describe the use of this program as a mail client.

<P>
If there are no links on the current line, or you are not in browse mode,
the current filename is used.
If the filename looks like a URL,
it will be enclosed in &lt;A&gt; &lt;/A&gt; tags,
as though it were a link.
This is useful when you want to bookmark the current page,
rather than some other page pointed to by a link.

<P>
If the current page is the result of a form submission, the filename
may include your input fields after the question mark.
If it does, that's a feature, not a bug.
This exact URL, with the data at the end, can be stored as a bookmark
and activated again and again,
as though you had filled out the form each time.
Every week you can call up this virtual URL
to see if there is any new information on kangaroos.
A more practical example might be a canned query that retrieves
the weather for a certain city
or the stock prices for the companies in your portfolio.
You can also write concise shell scripts that "fill in" the virtual
form, simply by modifying the information after the question mark.
This provides a simple command to retrieve the weather from any major city
or the current price of any stock.
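
<P>
A canned query of this kind can be built with the standard Python library.
The sketch below reuses the made-up search engine and the keywords field from the earlier example;
the function name canned_query is an assumption, and a real site will have its own field names.

```python
from urllib.parse import urlencode

def canned_query(base, **fields):
    """Build the virtual URL that a "get" form would produce."""
    return base + "?" + urlencode(fields)
```

<P>
For instance, canned_query("http://www.search-engine.com", keywords="kangaroo")
gives a URL that can be passed to edbrowse on the command line or stored as a bookmark.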

<P>
Not all forms support this type of in-URL encoding.
If the A command produces www.search-engine.com,
with no ?data at the end,
you're out of luck.
You will have to fill out the form interactively every time you want to run this query.
Fortunately most on-line forms allow you to encode the input fields within the URL.
If you are a web designer, do your clients a favor
and use the "get" method in your forms whenever possible.
This will allow others to create and resubmit canned queries with specific input parameters,
or develop shell scripts that query your web site automatically.

<H4 align=center> Cookies </H4>

Some web sites serve "cookies",
which your browser is expected to retain
and pass back during subsequent exchanges.
In fact some web sites simply won't work without cooky support.
Therefore edbrowse supports cookies by default.
You can toggle this feature with the `ac' (accept cookies) command,
but you probably don't want to.

<P>
Note that only Netscape-style cookies are supported.
However, this is the most common flavor of cooky.
It will probably meet your needs.

<P>
Persistent cookies are stored in a file,
$HOME/.cookies, and are thus available for subsequent edbrowse sessions.
These cookies are used to store long-term information about you,
such as your login and password into amazon.com.
Hence your .cookies file should be mode 0600.
In fact the file is created mode 0600, for your own protection.
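<P>
If you want to create or repair such a file yourself,
the following python sketch shows one way to guarantee mode 0600.
It is not the code edbrowse uses, just an illustration of the idea.

```python
import os
import stat

def create_private(path):
    """Create path if needed, force it to mode 0600, and return its mode bits."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    os.close(fd)
    # chmod covers the case where the file already existed with looser permissions.
    os.chmod(path, 0o600)
    return stat.S_IMODE(os.stat(path).st_mode)
```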

<P>
You probably won't need to view your .cookies file,
but it is text based, and can be edited directly if you wish.
The file format is consistent with lynx,
if your version of lynx supports persistent cookies.
Thus lynx and edbrowse are interoperable.
You can receive cookies from a web site using one browser,
switch to the other browser, and pass the appropriate cookies back as expected.

<H4 align=center> PDF Files </H4>

The portable document format, pdf, is growing in popularity;
don't ask me why.
If you want to download a technical manual,
or a scientific paper,
don't be surprised if it is in pdf format,
as indicated by the .pdf suffix on the filename.
Unfortunately this format is completely inaccessible to blind users.

<P>
Some pdf files (not all)
can be converted into html,
though the conversion is rather crude.
Still, it is sufficient to read the text and follow the hyperlinks.
Several third parties provide conversion utilities,
but I believe the best utility, now and in the future,
is at access.adobe.com.
After all, who knows more about pdf than its creator, adobe.com?
I doubt anyone else will do a better job converting to and from pdf,
especially as pdf evolves and grows in complexity.

<P>
If you retrieve a pdf file from the Internet,
edbrowse automatically routes it through access.adobe.com,
thus converting it into html.
Then the text is rendered in the usual way.
Note that the file name is the name of the pdf document,
but the request actually went to another url.
You might be fetching this.that.com/foobar.pdf,
and see the error "cannot connect to access.adobe.com".
This seems incongruous until you remember that a separate web server,
access.adobe.com, is performing the translation.
If you can't get to both web sites,
e.g. because of an internet problem, you won't be able to retrieve the data.

<P>
You can toggle pdf to html conversion
by entering the `ph', pdf to html, command.
When conversion is disabled,
all pdf files will be downloaded as binary files,
and you can do whatever you like with them.

<H4 align=center> Secure Connections </H4>

Edbrowse supports the most common method of encrypting web traffic,
HTTP over SSL/TLS, colloquially known as secure http.
Web sites which allow
secure http have URLs of the form:
https://secure.server.com/yawn.html.

<H6 align=center> Prerequisites for https</H6>

In order to use this new functionality, you will need to obtain some
prerequisite software.
Firstly, you must obtain the
<A HREF=http://www.openssl.org/source/>
OpenSSL toolkit</A>.
Then you must obtain the
<A HREF=http://symlabs.com/Net_SSLeay/>
Net::SSLeay package
</A>
for Perl.
These will need to be installed.  The README, INSTALL, and other files supplied
with these packages
are (somewhat) self-explanatory.

<H6 align=center> Certificate Verification</H6>

When connecting to a server, you will be sent a certificate.  This contains
the server's public key, and has been signed by a certifying authority.
This signature can and should be verified.  This lessens the chance that
you are dealing with a bogus server.  Certifying authorities run checks on
those whom they deal with, and will not sign a certificate for a sham
organization.  We verify against a file of certificates from legitimate certifying
authorities.  This should be placed in your home directory under the name
".ssl-certs".
A valid file should be obtainable from the
<A HREF=ssl-certs>
same place as Edbrowse</A>.
You can append
additional trusted certificates, as needed, in PEM format, to the end of this file.
You'd want to do this, for example, to access machines on your organization's
intranet if your organization uses its own private certifying authority.

<H6 align=center> Disabling Certificate Verification</H6>

Sometimes you may not be concerned about the legitimacy of those with whom
you deal.  Maybe their certificate is invalid, out of date, etc.
You can toggle certificate verification with the `vs' (verify secure connections) command.
Certificate verification is
enabled by default.
Note that disabling certificate verification is a security risk.  I wouldn't
send my credit card information to an unverified server.
As of this writing, lynx does not verify any web servers,
and is less secure than edbrowse, Explorer, or Netscape.
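<P>
The difference between the two settings can be sketched with python's ssl module.
This is only an analogy; edbrowse itself relies on OpenSSL through Net::SSLeay,
and the function name and parameters below are my own invention.
The cafile argument plays the role of the .ssl-certs file.

```python
import ssl

def make_context(verify=True, cafile=None):
    """Build a TLS context; verify=False mirrors switching verification off."""
    if verify:
        # cafile=None falls back to the system's default trusted certificates.
        return ssl.create_default_context(cafile=cafile)
    context = ssl.create_default_context()
    context.check_hostname = False       # must be cleared before verify_mode
    context.verify_mode = ssl.CERT_NONE  # accept any certificate, at your own risk
    return context
```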

<H4 align=center> FTP Retrievals </H4>

This browser supports the retrieval of ftp files and directories.  You can
give it an FTP URL like:
ftp://ftp.random.com/tarball.tar.gz
and the file will be fetched.
You can also visit an FTP URL from an html file.
By default, edbrowse uses the account name "anonymous" and the password
"some-user@edbrowse.net" for ftp connections.

<H6 align=center> Parsing Remote Directories</H6>

Some ftp URLs point at directories, not files.  If you visit one of these,
and it is located on a Unix-like server, you will receive the listing as an
html file with hyperlinks.  You can visit the directory members just as
though you were exploring a web site.
If the server does not run some
flavor of Unix, you will receive the directory listing in plain
text.

<H6 align=center> Accounts Other Than "anonymous"?</H6>

There is another form which ftp URLs may take.
ftp://user:password@host/path/
For example, let's say I want to access the file /etc/passwd on numenor.localdomain.
This file isn't readable by anonymous users.
Within edbrowse, I might use the command:
e ftp://chris:xxx@numenor.localdomain/etc/passwd
to download the file.
The ftp connection will be made as user "chris", with password "xxx".
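<P>
The pieces of such a URL can be pulled apart as follows.
This python sketch is illustrative only; the function name is hypothetical,
and the defaults are the ones quoted in the FTP Retrievals section above.

```python
from urllib.parse import urlsplit

def ftp_login(url):
    """Return (user, password, host, path) for an ftp URL."""
    parts = urlsplit(url)
    user = parts.username or "anonymous"
    password = parts.password or "some-user@edbrowse.net"
    return user, password, parts.hostname, parts.path
```

Given ftp://chris:xxx@numenor.localdomain/etc/passwd,
this returns the user chris, the password xxx, the host numenor.localdomain, and the path /etc/passwd.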

<H6 align=center>Modes of ftp</h6>

When you visit an ftp server to download a file or directory listing, two
types of connections are employed.  The control connection is used for sending
commands to the ftp host, and the data connection is used to download the
raw data.

<P>
While the control connection is made from the client to the server, the
data connection is usually made from the server to the client.  This is
called active mode ftp.
The problem is that this "active mode" doesn't work well if you happen to
be behind a firewall.

<P>
The solution is passive mode ftp.  The data connection
is made from the client to the server.  Edbrowse uses passive mode by default.
But this may not work if the server is behind a firewall.  So you have a choice;
use the `pm' command to toggle between passive mode and active mode.

<H4 align=center> Frames </H4>

Frames are a mechanism whereby a web page can fetch and display several other web pages on the screen.
Each subpage is called a frame, and lives in its own space on the screen.
Sometimes the frames are top middle and bottom;
sometimes they are left middle and right.
Edbrowse fetches these frames and presents them in order.

<P>
Most frames have names - hopefully these names are helpful.
If a page uses frames you might see something like this.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>Frame top:

Stuff about the company, and how to get back to the home page etc.

Frame bottom:

The stuff you actually asked for, your order, etc.
</font></PRE>

<P>
You can disable frame fetching via the `ff' command,
if you know your web sites well, and you want to activate particular frames manually.
When frame fetching is disabled, frames look like hyperlinks:
frame{top}.
Access each one in turn, as if it were just another web page.

<P>
There may be web sites out there that won't work unless all the frames are fetched, in order.
Suppose the first frame sends your browser a cooky,
which it must return when fetching the second frame.
I don't know of any web sites like this,
but I wouldn't be surprised if there are some out there,
so you might want to leave frame fetching enabled.

<H4 align=center> Javascript </H4>

Javascript is everywhere, but in some cases we can work around it.
Remember that web designers rarely <em>program</em> in javascript.
In fact they aren't programmers at all.
They use various wysiwyg web design tools
that crank out canned fragments of javascript.
In some cases we can recognize these fragments and deal with them.

<P>
Sometimes a hyperlink uses javascript only to open a new window,
which brings up the referenced web page.
Obviously the notion of a new window is meaningless in edbrowse;
every web page is already in its own window.
So this program tries to bypass the javascript and replace it with a simple hyperlink.
It isn't openNewWindow&nbsp;("foobar.html") any more,
it's just foobar.html.
It will look and act just like any other hyperlink.

<P>
Keep in mind, I'm using a simple heuristic,
encoded in a regular expression.
I'm sure there are openWindow() calls that I don't recognize as such,
and no doubt there are other javascript calls that I will mistake
for a subwindow hyperlink.
If you find either a false negative or a false positive,
send it along and I'll try to update my heuristics.
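<P>
To give a feel for this kind of heuristic, here is a python sketch.
The pattern is my own guess at what such a regular expression might look like;
it is not the expression edbrowse actually uses,
and it will certainly miss some calls, as described above.

```python
import re

# Hypothetical pattern: a javascript: href whose only job is to open one page.
OPEN_WINDOW = re.compile(
    r"""^javascript:\s*\w*open\w*window\w*\s*\(\s*(['"])([^'"]+)\1""",
    re.IGNORECASE,
)

def plain_href(href):
    """Return the target page if href merely opens a window, else href unchanged."""
    match = OPEN_WINDOW.match(href)
    return match.group(2) if match else href
```

With this pattern, javascript:openNewWindow("foobar.html") becomes foobar.html,
while javascript:window.open('a.html') is left alone, a false negative.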

<P>
Another common javascript function is the validate&amp;submit function.
This function checks your entries -
have you filled in all the required fields -
is there an @ sign in your email address -
etc.
Then, if there are no errors, it submits the form.
I can't emulate the javascript,
but I can <em>assume</em> you have entered the data correctly
and submit the form.
The javascript call is replaced with a simple submit button.
The same text is used, but the letters js are appended,
to remind you that javascript is being bypassed.
Thus the last line on a registration form might look like:
<P>
&lt;Register now js&gt;

<P>
In some cases the javascript function, which I am neatly bypassing,
reformats your data.
If it does, you're screwed.

<H4 align=center> Web Strip (Experimental) </H4>

A web page often contains redundant information,
relative to its parent page.
Plowing through the same navigation links and introductory text,
in search of new information,
can be a huge waste of time.
The sighted user recognizes the same "stuff" at a glance and scrolls down the page,
consuming perhaps 4 seconds,
but I often spend 4 minutes performing the same operation.
How frustrating!

<P>
The web strip feature, `ws', attempts to delete the redundant
text at the start and end of a web page,
leaving only the new information.
After all, that's why you activated the link in the first place.
You can resurrect the deleted text via the unstrip, `us', command.

<P>
This feature is called experimental because it isn't very smart.
It's more like a cmp than a diff.
The slightest change in a line, or insertion of a blank line,
will cause web strip to stop in its tracks.
Often there is more text that <em>should</em> be deleted,
but I don't have time to turn this into a Ph.D. thesis.
If anybody wants to beef up this routine, I'd be grateful.
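<P>
The cmp-like behavior can be sketched in a few lines of python.
This is a simplified model under my own assumptions, not the edbrowse routine:
each page is a list of text lines, and we drop whatever the child shares
with its parent at the top and at the bottom.

```python
def web_strip(parent, child):
    """Drop the leading and trailing lines that child has in common with parent."""
    limit = min(len(parent), len(child))
    start = 0
    while start != limit and parent[start] == child[start]:
        start += 1
    end = 0
    # Do not let the common tail overlap lines already counted in the common head.
    while end != limit - start and parent[-1 - end] == child[-1 - end]:
        end += 1
    return child[start:len(child) - end]
```

Note how a single blank line inserted at the top of the child defeats the comparison;
the first lines no longer match, so nothing is stripped from the start of the page.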

<H4 align=center> Web Express </H4>

<A HREF=http://www.webexpresstech.com/WebXP/WebExpressTutorial.html>
Web express
</A>
is a stand-alone program designed to fetch specific information from the internet
and present it concisely,
without all the headers and footers and commercials etc.
Some of this technology has been folded into edbrowse, with permission.

<P>
The @ command initiates a web express query.
The identifier that follows the @ sign is the name of the query.
It is often cryptic, just a couple of letters.
For instance, @gg is shorthand for a google search.
Type the following to look up information on african elephants.
<P>
@gg african elephant

<P>
When the page is retrieved,
specific post-processing code strips out the extraneous information,
and leaves only the results of your search.
This is a web strip feature that really works!
It works because somebody has tailored it to this particular page.
There are no heuristics - no guesses.
The down side is the constant maintenance.
If google changes its format,
we will have to modify the post-processing commands accordingly.

<P>
As described earlier, `us' (unstrip) is the opposite of web strip.
You can enter the us command to get the entire google page back again.
Then you can jump to other places in google,
click through the sponsored links, etc.

<P>
The web strip feature
supports inheritance.
If the first page doesn't have the information you are looking for,
and you call up the second page by clicking on the {more results} button,
you can use the ws command to strip the second page of results,
using the same google-specific commands
that were applied to the first page.

<P>
Many shortcuts are available - too many to document here.
For instance,
@yf jnj gets the current stock quote for Johnson and Johnson from yahoo financial,
and @mw elephant looks up the word elephant
in the on-line Merriam Webster dictionary.
And so on.
Type @ by itself to get a list of available shortcuts.

<P>
Actually, none of these shortcuts are included in the edbrowse program.
They are all defined in your .ebrc config file for maximum flexibility.
You can modify existing shortcuts or add your own,
reflecting your particular interests.
See the
<A HREF=sample.ebrc>
sample config file</A>
for the definitions of various web express commands.

<H4 align=center> Predefined Edbrowse Command Sets </H4>

You can bundle a set of edbrowse commands together under one name,
similar to a macro.
If the following appears in your .ebrc file,
you can type &lt;ud to undos a file,
i.e. strip the DOS carriage return from the end of each line.
<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>cmdlist = ud
&lt;   ,s/\r$//
</font></PRE>

<P>
The leading &lt; symbol tells edbrowse
that the line contains an edbrowse command.
The commands bundled together under the name "foo"
are invoked by typing &lt;foo.
The same list of commands can be folded into other lists of commands,
or into the post-processing directives associated with web express shortcuts.
The command list named "init" is run at startup.

<P>
If the name of the macro has a + before it in the .ebrc file,
execution stops when an error occurs.
Without the + sign
edbrowse runs all the commands in the set.
Note that all the commands are run, no matter what,
after a web express shortcut.
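<P>
The effect of the + sign can be modeled as follows.
This python sketch is hypothetical: the run argument stands in for whatever
executes a single edbrowse command, and RuntimeError stands in for a command error.

```python
def run_command_set(commands, run, stop_on_error=False):
    """Run each command through run(); return the list of commands that failed."""
    failed = []
    for command in commands:
        try:
            run(command)
        except RuntimeError:
            failed.append(command)
            if stop_on_error:  # corresponds to the + form of a command set
                break
    return failed
```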

<H4 align=center> Directory Scan, File Manager </H4>

If you edit a directory
you will see a list of all the visible files in that directory,
in alphabetical order.
Type g to go to one of these files or sub directories.
Type ^ to return to the parent directory.
This is similar to browse mode;
just pretend there are braces around each file name.
Thus you can traverse an entire directory tree
as though you were inside a file manager.

<P>
Like ls -F, a subdirectory is indicated by a trailing slash.
This slash is not part of the filename.
Similarly, a named pipe is indicated by |,
a symbolic link by @,
a block special file by *, a character special file by &lt;,
and a socket by ^.
If a regular file ends in one of these characters, it may confuse you,
but it won't confuse this program.
Edbrowse knows whether that trailing | is part of the filename
or a pipe indicator.
Since each file is represented by a single line of text,
files with newlines embedded in their names cannot be accessed.
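<P>
A directory listing of this kind can be approximated in python as shown below.
The suffix table follows the list above.
I am assuming that "visible" means the name does not begin with a dot,
and that plain sorted order is close enough to alphabetical.

```python
import os
import stat

# Suffixes as listed in the text; regular files get none.
INDICATORS = [
    (stat.S_ISLNK, "@"),
    (stat.S_ISDIR, "/"),
    (stat.S_ISFIFO, "|"),
    (stat.S_ISBLK, "*"),
    (stat.S_ISCHR, "<"),
    (stat.S_ISSOCK, "^"),
]

def listing(directory):
    """Return the visible entries of directory, sorted, each with its type suffix."""
    lines = []
    for name in sorted(os.listdir(directory)):
        if name.startswith("."):
            continue  # assumption: dot files are the invisible ones
        # lstat looks at the link itself, not at whatever it points to.
        mode = os.lstat(os.path.join(directory, name)).st_mode
        suffix = next((mark for test, mark in INDICATORS if test(mode)), "")
        lines.append(name + suffix)
    return lines
```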

<P>
If you read a directory into a preexisting file it is just text.
You can't visit any of the underlying files, because they are just words.
You must edit a directory in its own session
or read a directory into an empty session
if you want to access the underlying files.
Note that you can write the buffer to another editing session,
and in that session the words are just words.
This distinction is important as we start to edit the text.

<P>
By default, directories are readonly.
If you try to delete a line, and hence the associated file,
it will tell you that you are still in directory read mode.
I'm trying to save you from yourself!
Type dw to enable directory writes,
and dr to make directories readonly again.

<P>
When directory writes are enabled,
you can remove files using the d command.
For instance, g/\.o$/d removes all the object files.
Since these edits have implications outside the scope of this program,
there is no undo capability.
When you make a change it is made.
With this in mind, I borrowed a good idea from Microsoft.
The deleted file isn't actually deleted;
it is moved to your recycle bin,
located in $HOME/.recycle.
So if you accidentally type ,d and remove all your files,
you can recover them from your recycle bin.
You may want to set up a cron job that removes
all the files from your recycle bin once a week.
This directory is created mode 700, so nobody else can look at your deleted files.
If you create this directory yourself, please make it mode 700.
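<P>
The move-instead-of-delete idea looks roughly like this in python.
It is a sketch, not the edbrowse implementation.
In particular, refusing to overwrite an earlier deletion of the same name is my own assumption,
and shutil.move will copy across file systems, which a plain rename cannot do.

```python
import os
import shutil

def recycle(path, bin_dir):
    """Move path into bin_dir, creating bin_dir with mode 700 if it is missing."""
    os.makedirs(bin_dir, mode=0o700, exist_ok=True)
    target = os.path.join(bin_dir, os.path.basename(path))
    if os.path.lexists(target):
        # Assumption: never clobber something that is already in the bin.
        raise FileExistsError(target)
    return shutil.move(path, target)
```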

<P>
Because this operation is a move, rather than a true delete,
there are a few restrictions based on your operating system.
If your OS can move directories,
this program will be able to delete a subdirectory as easily as a file.
The entire subtree is moved to your recycle bin.
Make sure your cleanup cron job is capable of removing directory trees, not just files.

<P>
Depending on your OS, you may not be able to move files across file systems,
for example from /disk2 to /disk1, or from the D drive to the C drive.
In this case you might want to issue the dx command,
which makes directories writable, like dw, but actually deletes the files.
You'll need this if you're trying to free up space on the disk.
Note that symbolic links are always deleted;
there isn't much point in moving a link to the recycle bin.

<P>
"What's the point of all this?" you may ask.
"What's wrong with the shell?"

<P>
Nothing, as long as the file names are small and familiar.
But sometimes the file names are long and cumbersome,
and it is nearly impossible to type those names into the shell,
character for character, upper and lower case, with no mistakes.
Meta characters such as the * can help,
but only when the file you want has a name radically different from the other files in the directory.
This isn't always the case.
Suppose an application generates log files as follows.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>ProgramFooBar.-04-04-1998.06:31:59.log
ProgramFooBar.-04-11-1998.11:37:14.log
ProgramFooBar.-04-18-1998.16:22:51.log
</font></PRE>

<P>
How do you delete the old ones and keep the most recent,
or rename them to something more manageable?
Stars are a bit risky; you can access multiple files without realizing it.
And we're not even talking about those pesky files with spaces or invisible control characters in their names.
Our sighted friend calls up his file manager and simply clicks on the file he wants to view or edit or remove.
Sometimes I want/need that kind of power.

<P>
When the substitute command changes text, it renames the underlying file.
This won't move the file on top of another existing file,
so you can't lose any data this way.

<P>
The search and substitute commands ignore the trailing filetype characters.
If you want to rename a directory from foo/ to foobar/,
you can type s/$/bar/.
The bar will be placed at the end of the word foo, because the trailing / isn't really there.

<P>
Now suppose you want to run an arbitrary program on some of these files.
This could be a print utility, a compiler, whatever.
Sometimes you can rename the files for your convenience, then work in the shell.
But sometimes you don't own the files,
and sometimes they must retain their original names.
This happens when several html documents reference each other through hyperlinks,
using the aforementioned filenames.
So you can't rename the files, yet you still want to run your program on one or two of them.

<P>
You can run any program on any file without retyping that filename via the shell escape.
Use kx to assign the label x to the file you are interested in.
(This is standard ed syntax.)
Then run !program 'x
to invoke your program on the file in the line labeled x.
This sounds involved, but it is merely text substitution, implemented in a few lines of perl.
If 'x is present in a shell escape, and is not next to any letters or digits,
we replace it with the text on the line labeled x.
Thus if your filename contains spaces, you'd better run !program "'x",
to make sure the entire file name is one argument to the running program.

<P>
The token '. is replaced with the text on the current line,
and the token '_ is replaced with the current filename.
If you try to write a file, and remember that you left it readonly,
you can make it writable via !chmod +w '_,
and then write the text to the file.

<P>
You can expand multiple tokens in one shell command.
Use kx and ky to mark two files that you want to compare, then run !diff 'x 'y.
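<P>
For the curious, the token substitution can be modeled in python as follows.
Edbrowse does this in perl; the names here are hypothetical,
and labels is simply a dictionary from label letter to the text on that line.

```python
import re

# 'x, '. or '_ that is not adjacent to a letter or digit.
TOKEN = re.compile(r"(^|[^A-Za-z0-9])'([a-z._])(?![A-Za-z0-9])")

def expand(command, labels, current_line, filename):
    """Replace 'x, '. and '_ in a shell escape with the text they stand for."""
    def substitute(match):
        before, key = match.group(1), match.group(2)
        if key == ".":
            text = current_line
        elif key == "_":
            text = filename
        elif key in labels:
            text = labels[key]
        else:
            return match.group(0)  # unknown label: leave the token alone
        return before + text
    return TOKEN.sub(substitute, command)
```

The apostrophe in a word like don't is next to a letter, so it is never mistaken for a token.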

<P>
This feature is not limited to directory scans.
You may be editing a simple file,
but you can still paste the contents of a line into your shell command.
Off hand I don't know why you'd want to do this,
but you can.

<H4 align=center> The Refresh Command </H4>

Type `rf' to refresh the current file.
This rereads the file or url into the current buffer.
It does not push a new editing session onto the stack.
This is analogous to the refresh button on Netscape and Explorer.

<P>
If a web page is updated every minute, e.g. with the latest stock prices for your favorite companies,
you can type rf to fetch the latest copy of this web page.
This assumes the url is a direct html reference,
or a dynamic fetch using the get method.
This will not work with the post method.
It also assumes the intervening internet servers are not caching the web page
and handing you the same out-of-date copy over and over again.

<P>
On your local machine,
you can use this feature to read the latest version of a dynamic file,
such as a log file.
Or you can reread a directory,
to incorporate any new files that have been placed in that directory.
For example, you might use the shell escape to execute
`cat x y &gt;z',
yet z will not appear in your directory scan, until you type rf.

<H4 align=center> User Agent </H4>

Every time you fetch a web page from the internet,
your browser identifies itself to the host.
This is done automatically.
Edbrowse identifies itself as "edbrowse/1.5.16",
where the number after the slash indicates the current version of edbrowse.

<P>
All well and good, but some websites have no respect for edbrowse,
or lynx for that matter.
They won't even let you in the door unless you look like Explorer or Netscape.
Clickbank.com, a major credit card processor, is one example.
<P>
So what do we do?
We lie!

<P>
You can specify different agents in your .ebrc file,
and activate them with the `ua' (user agent) command.
If the following lines are in your .ebrc file,
you can type ua1 to pretend to be lynx,
and ua2 to pretend to be Mozilla.
Type ua0 to resurrect the standard edbrowse identification.
Let's hope there aren't too many asinine websites out there,
like Clickbank, that force us to lie.

<P>
agent = Lynx/2.8.4rel.1 libwww-FM/2.14
<br>
agent = Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)
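<P>
Under the hood, the lie is nothing more than a different User-Agent header on the http request.
The python sketch below shows the idea, using the agent strings quoted above;
the list and the function name are mine, for illustration only.

```python
from urllib.request import Request

AGENTS = [
    "edbrowse/1.5.16",                 # ua0, the standard identification
    "Lynx/2.8.4rel.1 libwww-FM/2.14",  # ua1
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)",  # ua2
]

def build_request(url, agent=0):
    """Return a request for url that announces itself as AGENTS[agent]."""
    return Request(url, headers={"User-Agent": AGENTS[agent]})
```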

<H4 align=center> Upper/Lower Case </H4>

The `lc' command converts a line to lower case,
and `uc' converts it to upper case.
Perl users will recognize these directives.
As an extension, `mc' converts to mixed case, capitalizing the first letter of each word,
and the d in mcdonald.
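<P>
Mixed case is the only one of the three that takes any thought.
Here is a python approximation.
The exact rule edbrowse applies to names like mcdonald is not spelled out here,
so the "Mc" handling below is an assumption on my part.

```python
import re

def mixed_case(text):
    """Capitalize each word, plus the letter that follows a leading "mc"."""
    def fix(match):
        word = match.group(0).lower().capitalize()
        if word.startswith("Mc") and len(word) != 2:
            word = "Mc" + word[2:].capitalize()
        return word
    # Apostrophes stay inside the word, so don't becomes Don't rather than Don'T.
    return re.sub(r"[A-Za-z']+", fix, text)
```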

<P>
This is especially useful in a directory scan.
The last thing a blind person wants to worry about is whether some of the letters in a file name are upper case.
If directory write mode is enabled,
type ,lc to convert all the file names to lower case.

<P>
If you want to convert a particular word, type s/word/uc/.
This converts the word to upper case.
All the other substitution suffixes apply.
To change foo, Foo, FOo, and FOO to FOO, everywhere,
type ,s/\bfoo\b/uc/ig.

<H4 align=center> Break Line </H4>

The `bl' command breaks the current line into sentences and phrases,
each about 70 characters long.
It also compresses white space and strips white space from the end of the line.
If the line contains return characters,
these are turned into line separators -
places where the line will definitely be cut.
The only white space that is preserved is the tabs or spaces
at the beginning of the line, or after each return character.
This is a modest attempt to keep indented text indented,
if that makes any sense?
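<P>
To make the idea concrete, here is a much simplified python model of bl.
It is not the real algorithm; the real one is smarter about where to cut.
This sketch assumes a phrase ends at any word followed by a punctuation mark,
and it packs whole phrases into lines of at most 70 characters where it can.

```python
import re

PUNCTUATION = (".", ",", ";", ":", "?", "!")

def phrases(words):
    """Group words into phrases, each ending at a word with closing punctuation."""
    phrase = []
    for word in words:
        phrase.append(word)
        if word.endswith(PUNCTUATION):
            yield " ".join(phrase)
            phrase = []
    if phrase:
        yield " ".join(phrase)

def break_line(line, optimal=70):
    """Split line into pieces of roughly `optimal` characters, cut at punctuation."""
    pieces = []
    for segment in line.split("\r"):  # return characters are mandatory breaks
        indent = re.match(r"[ \t]*", segment).group(0)
        current = ""
        # split() with no argument also compresses runs of white space.
        for phrase in phrases(segment.split()):
            if current and len(current) + 1 + len(phrase) > optimal:
                pieces.append(indent + current)
                current = phrase
            else:
                current = (current + " " + phrase) if current else phrase
        pieces.append((indent + current) if current else "")
    return pieces
```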

<P>
I use this feature in two different ways.
If I am familiar with the document,
(I probably wrote it),
I may use the bl command on a line of text that seems rather long.
I typed it in quickly, as an uninterrupted thought, and now I want to break it up.
But I don't want to count punctuation marks and say,
"I think we need a break after the third comma
and the period following that and then at the next comma",
issuing the s punctuation commands along the way.
Oh I like the s commands well enough - they put you in complete control -
but it's easier to type bl - and bl usually does the right thing.
Also, bl compresses accidental double spaces,
a typo that I will never hear if I simply read the line as a whole.

<P>
When the document comes in from the outside,
usually from another word processor such as MS-Word,
bl serves a completely different function.
Paragraphs are often stored on a single physical line.
Sometimes the entire document is on a single line,
with return characters, \r, separating paragraphs.
Wysiwyg word processors don't worry about separating sentences and phrases -
that's what word wrap is for.
Well - bl is our version of word wrap.
It doesn't try to conform to any screen;
it merely cuts the text into manageable chunks,
each piece a separate semantic unit.
When bl is issued,
physical lines will contain sentences or phrases, as delimited by punctuation,
or by the newline/return characters embedded in the original document.

<P>
If one of the original lines, delimited by newline or return,
is long, i.e. more than 120 characters,
it is assumed to be a self-contained paragraph,
and a blank line is added before and after.
Thus a disassembled paragraph containing 20 sentences
does not simply flow into the next disassembled paragraph containing 18 more sentences.
An empty line separates the two paragraphs.
This is only applicable if bl is applied to a range of lines,
or the entire document,
as might occur when making an outside document readable.

<P>
Don't apply the bl command to a preformatted section,
such as a table or ascii art.
If you're not sure what to expect,
i.e. you didn't write the file,
scan through it first,
and apply bl to the range of lines that actually represents text.
Often this is the entire document (,bl).
The following commands do a pretty good job of cleaning up a typical Microsoft Word document.

<P>
<PRE><font size=4 face=Arial,Helvetica,sans-serif><code>e whatever.doc or whatever.wps
f _  # don't accidentally overwrite the microsoft document
,s/[~80-~ff~00-~0c~0e-~1f]//g;  # strip out non ascii control/formatting codes
g/^\s*$/d  # these blank lines used to contain non ascii codes
,bl  # split lines and paragraphs
1,20p  # first couple lines are often garbage, but then the text begins.
</code></font></PRE>
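<P>
If you would rather do the same cleanup outside of edbrowse,
the two middle steps translate to python roughly as follows.
The byte ranges are copied from the substitute command above;
everything else is an illustrative sketch.

```python
import re

# Drop bytes 0x80-0xff, 0x00-0x0c and 0x0e-0x1f.
# The return character, 0x0d, survives, so paragraphs can still be split later.
JUNK = re.compile(rb"[\x80-\xff\x00-\x0c\x0e-\x1f]")

def clean(raw):
    """Return the ASCII lines of raw bytes, with codes and blank lines removed."""
    lines = []
    for line in raw.split(b"\n"):
        text = JUNK.sub(b"", line).decode("ascii")
        if text.strip():  # same effect as g/^\s*$/d
            lines.append(text)
    return lines
```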

<P>
Note that this function uses the same software as the browse command.
Both commands format the text in the same way.
If you change the constant $optimalLine = 80 in the edbrowse source,
that affects both the break line and the browse translations.

<H4 align=center> Send Mail </H4>

You can email the contents of your current editing session to someone else
via an smtp server.
Use the `sm' command to send mail.
Your pop3/smtp server is described in the config file $HOME/.ebrc.
You can
<A HREF=sample.ebrc>
obtain a sample</A> here.
It is well commented.

<P>
The recipients, attachments, and subject must appear at the top of your send file.
The sm command is picky, so observe the following syntax carefully.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>To: fred.flintstone@bedrock.us
To: barney.rubble@bedrock.us
account: 0
attach: hollyrock-brochure.pdf
Subject: Hollyrock Vacation
Come visit Hollyrock.
Brochure attached.
Sincerely,
Rock studios incorporated.
</font></PRE>

<P>
The account line is optional.
It tells edbrowse to use the first mail account specified in your .ebrc config file.
If you don't include an account: line,
edbrowse uses the default account, indicated by * in your .ebrc file.
That's usually the one you want anyway.
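<P>
The layout of a send file can be illustrated with a small python parser.
This is a sketch of the format as described here, not the parser inside edbrowse.
I am assuming that the directives are recognized without regard to case,
and that the body starts at the first line that is not a directive.

```python
DIRECTIVES = ("to", "account", "attach", "alt", "subject")

def parse_send_file(lines):
    """Split lines into a dictionary of directive values and the list of body lines."""
    headers = {name: [] for name in DIRECTIVES}
    for index, line in enumerate(lines):
        name, colon, value = line.partition(":")
        key = name.strip().lower()
        if not colon or key not in DIRECTIVES:
            return headers, lines[index:]  # everything from here on is the message
        headers[key].append(value.strip())
    return headers, []
```

Applied to the Hollyrock example, this yields two recipients, account 0,
one attachment, the subject line, and a four line message body.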

<P>
Typing sm5 causes edbrowse to use account number 5 (i.e. the sixth account)
in your config file.
This overrides the account: line in your file, if there is one.
It is often easier to type sm5 than to insert an account:5 line.

<P>
Use the attach: lines to add attachments to your email.
Each line should specify a file to attach.
If the filename is simply a number,
the corresponding edbrowse session is used instead.
Return to the earlier example,
where we are trying to attach a Hollyrock brochure.
Another way to do this is to switch to session 2 and edit the pdf file.
This is a binary file, but that doesn't matter.
Don't do anything with it, just hold it in session 2.
Then switch back to session 1 and use the line attach:2.

<P>
If you use attach:2, instead of attach:hollyrock-brochure.pdf,
Fred will notice one difference.
The attachment is not prenamed for him.
If he wants to save the attachment,
he'll have to come up with a filename himself.
Other than that, the email looks the same.

<P>
The alt: directive is almost the same as the attach: directive.
If you use alt:, the attachment is not treated as an adjunct file.
Instead, it is an alternate representation of the same email.
The mail client will use the alternate representation if it can.
This is usually used to send multimedia email,
with hyperlinks and pictures etc.
The primary email is in plain text,
but the alternate attachment is in html or rich text.
Unless something is amiss, the user sees the alternate presentation,
complete with graphics and hyperlinks.

<P>
Like attachments,
the alt: line can refer to a file or an edbrowse session.
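<P>
In mime terms, an alternate representation is a multipart/alternative message.
The python sketch below builds one with the standard email package,
just to show the structure; the function and its parameters are hypothetical,
and edbrowse assembles its mail in perl.

```python
from email.message import EmailMessage

def build_mail(sender, recipients, subject, plain, html=None):
    """Return a message with a plain text body and an optional html alternative."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = subject
    message.set_content(plain)
    if html is not None:
        # The mail client picks the last alternative it is able to display.
        message.add_alternative(html, subtype="html")
    return message
```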

<P>
As you may have guessed,
the to: lines establish the recipients.
Please don't specify more than a few recipients.
I paste them together into one line,
and if that line gets too long, the server might complain.
Beyond this, some servers, my mail server included,
set a hard limit on the number of recipients.
If you exceed this number,
usually ten,
the remaining recipients simply don't get their mail.
Best to limit your "to:" lines to 8 or fewer.

<P>
When specifying recipients, you can use aliases instead of full email addresses.
Aliases are checked against your address book,
a text file that is specified in your .ebrc file.
If your address book contains the line
<P>
fred:fred.flintstone@bedrock.us:226 cobblestone way:5553827
<P>
then you can simply write "To:fred" at the top of your file.
Only the first two fields in the address book are significant
as far as edbrowse is concerned.
Other fields might hold phone/fax numbers, street address, etc.
Note that "Reply to fred" is an alternate syntax for "to: fred".
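<P>
Alias lookup amounts to reading the first two colon-separated fields of each line.
A python sketch, with hypothetical function names:

```python
def load_aliases(lines):
    """Map each alias to its email address, ignoring any further fields."""
    book = {}
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) >= 2 and fields[0] and fields[1]:
            book[fields[0]] = fields[1]
    return book

def resolve(recipient, book):
    """Return the address for an alias, or the recipient itself if it is no alias."""
    return book.get(recipient, recipient)
```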

<P>
Some web pages include sendmail links.
They look just like other hyperlinks, but they send email to the appropriate person.
Click here for
<A HREF=http://developer.netscape.com/viewsource/husted_mailto/mailto.html>
more details</A>.

<P>
If you activate a sendmail link,
you will be placed in a new editing session with the "to" and "subject" lines preloaded.
If the url did not specify a subject,
the subject is simply "Comments".
You will probably want to replace this with a better subject line.
Write your mail message and type `sm' to send it on its way.
Then type ^ to return to the web page you were looking at.
Other aspects of sendmail links are not supported by edbrowse,
as they are rarely used.
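
<P>
For the curious, here is roughly what preloading the "to" and "subject" lines involves.
This is a minimal Python sketch, not the edbrowse code;
the function name preload_from_mailto is hypothetical,
and treating the query keys as case insensitive is my assumption.
The fallback subject "Comments" is the one described above.

<P><PRE>
```python
from urllib.parse import urlsplit, parse_qs, unquote


def preload_from_mailto(url):
    """Extract the recipient and subject from a mailto: link."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "mailto":
        raise ValueError("not a mailto link")
    to = unquote(parts.path)
    # Lower-case the keys so Subject= and subject= are treated alike (assumption).
    query = {k.lower(): v for k, v in parse_qs(parts.query).items()}
    # Fall back to "Comments" when the url carries no subject.
    subject = query.get("subject", ["Comments"])[0]
    return {"to": to, "subject": subject}


print(preload_from_mailto("mailto:fred@bedrock.us"))
print(preload_from_mailto("mailto:fred@bedrock.us?subject=Quarry%20schedule"))
```
</PRE>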

<P>
You can include attachments by placing "attach:" lines at the top of the file,
assuming the recipient can handle these attachments.
This might make sense when the sendmail link is asking for {bug reports} -
you might attach a program and/or its output.
Yet this is somewhat unusual.
Most sendmail links expect a few sentences of feedback, and nothing more.

<P>
Some web forms are submitted via email, rather than a direct http transmission.
Edbrowse handles this properly.
It shows you the destination email address,
sends the mail through smtp,
and tells you to watch for a reply.
This reply could be an email response, or even a phone call
if you provided your phone number in the form.
But remember, nothing happens immediately.
You are still on the same web page, still looking at the same submit button.
Don't push the button again!
The mail has been sent,
and you'll be hearing from the company in the next few days.

<H4 align=center> Send Mail Client </H4>

As described in the previous section,
edbrowse incorporates the features of a mail client.
In addition to the `sm' command,
you can send mail in a batch fashion, from the command line.
If fred and barney are in your address book,
and you want to send them mail from the command line, with an attachment,
using your primary email account, do this.
<P>
perl edbrowse -m0 fred barney hollyrock-notice +hollyrock-brochure.pdf

<P>
Files with the leading + are assumed to be attachments.
If they are binary they will be encoded properly,
according to the mime standard.
A leading - indicates an alternate format, like this.

<P>
perl edbrowse -m0 fred barney hollyrock-notice -hollyrock-graphical.html
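
<P>
The way these arguments sort themselves out can be sketched in a few lines of Python.
This is an illustration, not the edbrowse source.
The function name classify_mail_args is hypothetical,
and I am assuming that the last argument without a + or - prefix
is the file holding the message, with the earlier plain arguments as recipients,
which matches the two examples above.
The -m0 option is taken to be consumed before this step.

<P><PRE>
```python
def classify_mail_args(args):
    """Split the arguments that follow -m0 into recipients, body, and attachments."""
    attachments = [a[1:] for a in args if a.startswith("+")]  # +file
    alternates = [a[1:] for a in args if a.startswith("-")]   # -file
    plain = [a for a in args if not a.startswith(("+", "-"))]
    if len(plain) < 2:
        raise ValueError("need at least one recipient and a message file")
    return {
        "recipients": plain[:-1],
        "body": plain[-1],  # assumption: last plain argument is the message file
        "attachments": attachments,
        "alternates": alternates,
    }


print(classify_mail_args(
    ["fred", "barney", "hollyrock-notice", "+hollyrock-brochure.pdf"]
))
```
</PRE>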

<P>
Remember, you can specify several mail accounts in your .ebrc file.
The first account is indicated by index 0, as in -m0.
The account with the * (in the config file) is always used
as the outgoing smtp server.
The -m option does not change this.
However, -m3 will set the from and reply address according to account 3,
whether that has the * or not.
You always use the same smtp server to send your mail,
but you can make it look like it came from other places,
and direct the replies back to those places.
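
<P>
Put another way, two separate choices are being made: which server sends the mail,
and whose name goes on it.
The Python sketch below illustrates the rule.
The record layout, the field names, and the example.com style hosts are all hypothetical;
the real settings live in your .ebrc file in its own syntax.

<P><PRE>
```python
# Hypothetical account records, in config-file order (index 0 first).
accounts = [
    {"smtp": "smtp.home-isp.example", "from": "me@home-isp.example",
     "reply": "me@home-isp.example", "starred": True},
    {"smtp": "smtp.wife.example", "from": "wife@wife.example",
     "reply": "wife@wife.example", "starred": False},
    {"smtp": "smtp.work.example", "from": "me@work.example",
     "reply": "me@work.example", "starred": False},
]


def outgoing_settings(accounts, index):
    """The starred account supplies the smtp server; -m index supplies from and reply."""
    starred = [a for a in accounts if a["starred"]]
    if len(starred) != 1:
        raise ValueError("exactly one account should carry the *")
    chosen = accounts[index]
    return {
        "smtp": starred[0]["smtp"],  # never changed by -m
        "from": chosen["from"],
        "reply": chosen["reply"],
    }


print(outgoing_settings(accounts, 2))
```
</PRE>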

<P>
Suppose you've brought some work home, for instance,
and are on your home computer.
You want to send mail to your colleagues,
but you want it to look like it came from your work account,
and more important, you want their replies to go back to your work account
so you will see them next morning.
Specify the work account via the -m option.
Your local server, associated with your home ISP,
is still used to send the mail,
because that's the only server you can use.
Other servers, not associated with your ISP,
rarely allow outsiders to send mail.
Remember, the smtp protocol has no password.
So you always want to use your trusted local mail server to send,
even when another account is specified.

<P>
All right, that was all very complicated.
Don't worry about it; edbrowse does the right thing.
You can make your life a lot easier with some aliases in your .bashrc file.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>#  My mail, home account
alias mymail="perl /usr/local/bin/edbrowse -m0"
#  My wife's account; sometimes she doesn't check it for a week.
alias wifemail="perl /usr/local/bin/edbrowse -m1"
#  My work account.
alias workmail="perl /usr/local/bin/edbrowse -m2"
#  mail is obsolete
alias mail="echo use mymail, wifemail, or workmail"
</font></PRE>

<H4 align=center> Fetch Mail Client </H4>

If edbrowse is run with the -m0 option, and no other arguments,
it is an interactive fetch mail client,
retrieving mail from your first pop3 account.
The first thing it tells you is how many messages you have.
If there are no messages it says "No mail", and exits.
If there are messages, it retrieves each one in turn.
For each message, it displays some header information (such as subject
and sender) and the first page of text, and then presents a prompt.
A '?' prompt means the message is complete --
a '*' prompt means there is more text to read.
You respond by hitting a key.
Keys have the following meaning.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>?	summary of key commands.
q	quit the program.
x	abort the program, deleted mail is not really deleted.
space	display more text.
n	read the next message.
A	add the sender to your address book.
d	delete this message.
J	junk this subject, and delete any future mail with this subject.
w	write this message to a file and delete it.
k	keep this message in a file, but don't delete it.
u	write this message unformatted to a file and delete it.
</font></PRE>

<P>
The capital A command appends name:address to your address book,
where name is the modified name of the sender and address is his email address.
Name is converted to lower case, and spaces are turned into dots.
Thus it can be typed from the shell, as an email alias,
without worrying about upper case letters or quotes.
In reality, you're probably going to edit the address file,
replace the full name with a simpler alias,
and move the line to the section where it belongs.
Thus fred.flintstone might simply become fred,
as you move the line to the section of cartoon characters.
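
<P>
The name conversion is simple enough to show in a couple of lines.
This is a Python sketch with a made-up function name, address_book_entry;
collapsing runs of spaces into a single dot is my assumption,
since the text above only says that spaces are turned into dots.

<P><PRE>
```python
def address_book_entry(sender_name, sender_address):
    """Build the name:address line that the A command would append."""
    # Lower case, then join the words with dots so the alias needs no quoting.
    alias = ".".join(sender_name.lower().split())
    return alias + ":" + sender_address


print(address_book_entry("Fred Flintstone", "fred.flintstone@bedrock.us"))
```
</PRE>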

<P>
The last three commands, w, k, and u, require a filename, which you enter.
The reserved filename "x" is essentially /dev/null,
hence the mail message, or subordinate attachment, is not saved.
You can save the mail message to x (discard) and still save the attachments.
If the file is anything other than x,
and the program cannot write to the specified file, it dies, rather ungracefully,
and you have to reestablish the email session.
Yes, I'll make it more fault tolerant in the future.


<H4 align=center> Mail Filtering </H4>

Your config file supports a modest level of mail filtering.
You can redirect all messages from a given email address into a file, or the bit bucket,
indicated by the special filename "x".
If subsequent messages are directed to an existing file,
they are appended to the end of the file, so that no data is lost.
Messages with unnamed attachments cannot be auto-redirected,
since the program needs to ask you what to do with those attachments.
By specifying a partial email address in the filter rule,
you can redirect all messages that come from a given domain,
such as "@space.com".
This is a case insensitive fragment match.
There is more documentation on this topic
in the
<A HREF=sample.ebrc>
sample .ebrc file</A>.
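
<P>
A case insensitive fragment match amounts to the following.
This Python sketch is only an illustration:
the function name redirect_target and the list of (fragment, filename) pairs are hypothetical,
and the real rule syntax is the one shown in the sample .ebrc file.
Taking the first matching rule is also an assumption.

<P><PRE>
```python
def redirect_target(sender, rules):
    """Return the file for the first rule whose fragment occurs in the sender address."""
    sender = sender.lower()
    for fragment, filename in rules:
        if fragment.lower() in sender:
            return filename  # "x" means the bit bucket
    return None  # no rule matched; handle the message interactively


rules = [("@space.com", "x"), ("wilma@bedrock.us", "family-mail")]
print(redirect_target("News@Space.COM", rules))
print(redirect_target("fred@bedrock.us", rules))
```
</PRE>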

<H4 align=center> Formatted Mail </H4>

By default, incoming mail is formatted for readability.
If you want to save a copy of the mail, exactly as it was received (unformatted),
type  u  at the interactive prompt.
If you don't want the formatting at all, use the -u option on the command line,
as in `perl edbrowse -um0'.

<P>
Mail headers and mime section headers are detected,
and most of this header information is discarded.
Each header is consolidated down to:
subject, from, reply-path, and send-date.
That's all you really want to read anyways.

<P>
The body of the mail message, or mime section,
is then decoded, if it was encoded using quoted-printable
or base64 mime standard.
Sections encoded via base64 are assumed to be binary attachments.
They are never displayed, but you can save them to files if you wish.
The interactive program prompts you for this.
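
<P>
The decoding step itself looks something like this in Python,
using the standard quopri and base64 modules.
It is a sketch, not what edbrowse runs internally;
the function name decode_body is hypothetical,
and the encoding argument stands for the value of the section's
Content-Transfer-Encoding header.

<P><PRE>
```python
import base64
import quopri


def decode_body(raw, encoding):
    """Decode a mail body or mime section given as bytes."""
    encoding = encoding.strip().lower()
    if encoding == "quoted-printable":
        return quopri.decodestring(raw)
    if encoding == "base64":
        # In the scheme above these sections are treated as binary attachments.
        return base64.b64decode(raw)
    return raw  # 7bit, 8bit, or anything unrecognized is passed through


print(decode_body(b"caf=E9", "quoted-printable"))
print(decode_body(base64.b64encode(b"yabba dabba doo"), "base64"))
```
</PRE>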

<P>
Sometimes a mail message is replicated, in plain text and in html.
You'll see both versions, separated by a dashed line.
The second version has the braces and angle brackets associated with browsable text,
but you can't really activate any of the links.
You're not in the editor after all.
If you plan to browse this email,
and fill out the form or follow some of the links therein,
save it unformatted to a file.
When the mail session is complete,
edit that file and browse it.
You now have the full power of html at your disposal.
This is perhaps not as fully integrated as Netscape,
but it's the best I can do.

<P>
If the mail is plain text,
we can perform some additional processing for readability.
Leading &gt; signs, which indicate nested text, are stripped away,
and the words "indent n" are placed at the top of the
paragraph, to indicate the number of times the paragraph was &gt; indented.
This software is much harder than it first seems,
because email servers sometimes break lines longer than 80 characters,
and the dangling fragments will not have the leading &gt; characters.
The fragment must be inferred from context --
&gt; lines before and after it.
Then, a user might prepend &gt; to the entire mess,
and send the mail again, whence a server might break some of the lines,
which are now longer than 80 characters, thanks to the new &gt; signs.
At other times a user might deliberately inject a short comment
(which looks like a broken fragment)
into a block of someone else's indented text.
As you can see, decisions are made by heuristics,
and the algorithm is not 100% accurate.
It's barely 80% accurate, but it's better than nothing.
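
<P>
To make the easy half of the problem concrete, here is a Python sketch
that strips the leading marks and writes "indent n" whenever the nesting depth changes.
It deliberately ignores the hard part, the broken fragments and injected comments described above,
so it is far simpler than the real heuristics.
The function names are hypothetical.

<P><PRE>
```python
def strip_quote_marks(line):
    """Return (depth, text), where depth counts the leading > signs."""
    depth = 0
    rest = line
    while True:
        stripped = rest.lstrip(" ")
        if stripped.startswith(">"):
            depth += 1
            rest = stripped[1:]
        else:
            return depth, rest.strip()


def mark_indents(lines):
    """Remove > signs and announce each change to a nonzero depth."""
    out = []
    previous = 0
    for line in lines:
        depth, text = strip_quote_marks(line)
        if depth != previous and depth != 0:
            out.append("indent %d" % depth)
        previous = depth
        out.append(text)
    return out


for row in mark_indents(["> > yabba", "> > dabba", "> doo", "bye"]):
    print(row)
```
</PRE>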

<P>
A trailing paragraph that begins "to unsubscribe", and is not too long, is removed,
because this is usually list maintenance that you don't need to read
again and again.
This paragraph may appear at the end of the mail message, or an internal mime section.
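
<P>
As a rough Python sketch of this rule:
the function name drop_unsubscribe_trailer is hypothetical,
the 400 character cutoff is a number I picked to stand in for "not too long",
and ignoring upper and lower case is an assumption.

<P><PRE>
```python
def drop_unsubscribe_trailer(paragraphs, max_chars=400):
    """Remove a short final paragraph that begins with 'to unsubscribe'."""
    if paragraphs:
        last = paragraphs[-1]
        short_enough = max_chars >= len(last)
        if short_enough and last.lower().startswith("to unsubscribe"):
            return paragraphs[:-1]
    return paragraphs


message = [
    "The quarry is closed on Friday.",
    "To unsubscribe, send mail to the list server.",
]
print(drop_unsubscribe_trailer(message))
```
</PRE>

<P>
The same function can be applied to the paragraphs of each mime section in turn.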

<H4 align=center> Annoy File </H4>

The annoy file, as specified in your .ebrc file,
contains lines that annoy you, usually commercials
that you don't want to read again and again.
Here are some examples.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>This is a multi-part message in MIME format.
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com
Do You Yahoo!?
</font></PRE>

<P>
Email lines that match the lines in this file, exactly, are culled,
so you don't have to read them.
You maintain this file yourself, manually.
If a mail message has commercials that you don't want to hear again,
save it somewhere, and edit it, leaving only the commercials.
Then add these commercials to your growing annoy file.
You'll never see them again.

<P>
The lines in the annoy file are hashed, so you can include
hundreds of lines of text (if you wish) without degrading performance.
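
<P>
The effect of hashing can be seen in a small Python sketch, where a set plays the role of the hash table.
Each incoming line costs one lookup, no matter how large the annoy file grows.
The function names are hypothetical, and this is an illustration rather than the edbrowse code.

<P><PRE>
```python
def load_annoy_lines(lines):
    """Build a set of annoying lines; blank lines are skipped (an assumption)."""
    return {line.rstrip("\n") for line in lines if line.strip()}


def cull(message_lines, annoy):
    """Keep only the lines that do not match an annoy line exactly."""
    return [line for line in message_lines if line not in annoy]


annoy = load_annoy_lines([
    "This is a multi-part message in MIME format.\n",
    "Do You Yahoo!?\n",
])
print(cull(["Hello Fred,", "Do You Yahoo!?", "See you at the quarry."], annoy))
```
</PRE>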

<H4 align=center> Two Letter Command Summary </H4>

The two-letter commands are unique to edbrowse -
there's nothing like them in ed.
Let's review them here.

<P><PRE><font size=4 face=Arial,Helvetica,sans-serif>qt: quit the program now, whether you've written your files or not
db: debug level [0-7]
cd: change directory
lc: convert line to lower case
uc: convert line to upper case
mc: convert line to mixed case
bl: break line into sentences and phrases
sg: substitution strings are global across sessions
sl: substitution strings are local to their sessions
ci: searches and substitutions are case insensitive
cs: searches and substitutions are case sensitive
dr: directory is readonly
dw: directory is writable, and d moves files to your recycle bin
dx: directory is writable, and d deletes files
dp: delete-print, print line after each delete (toggle)
eo: end markers off
el: show end markers ^$ when a line is listed
ep: show end markers when a line is listed or printed
ub: unbrowse a file
f/: retain only the last component of the filename
w/: write to the last component of the filename
w+: append to a file
ft: show the title of the current web page
fd: show the description of the current web page
fk: show the keywords of the current web page
ph: auto-convert pdf files to html (toggle)
rh: redirect html (toggle)
ff: fetch frames automatically (toggle)
vs: verify secure connections (toggle)
tn: send dos-style newlines on lines in textareas (toggle)
ac: accept cookies from web servers (toggle)
sr: send referring web page (toggle)
pm: run ftp in passive mode (toggle)
rf: refresh the web page or directory listing
et: edit this web page as pure text
ws: strip this web page relative to its parent
us: unstrip this web page, inverse of ws
sm: send mail [account number]
</font></PRE>

<H4 align=center> Mailing List </H4>

There is a mailing list for users of edbrowse and other command line utilities.
You can join by sending mail to
<A HREF=mailto:commandline-subscribe@yahoogroups.com?subject=Subscribe>
commandline-subscribe@yahoogroups.com</A>.

</BODY></font>