File: CONFIGURE_HOWTO.pod

=head1 CONFIGURE-HOWTO

This document provides information on configuring the Generic Genome
Browser (GBrowse), part of the Generic Model Organism Systems Database
Project (L<http://www.gmod.org/>).

=head2 CONTENTS

=over 4

=item A. CREATING NEW DATABASES FROM SCRATCH

=item B. ADDING A NEW DATABASE TO THE BROWSER

=item C. GENERATING HISTOGRAMS

=item D. INTERNATIONALIZATION

=item E. AUTHENTICATION AND AUTHORIZATION

=item F. DISPLAYING GENETIC AND RH MAPS

=item G. CHANGING THE LOCATION OF THE CONFIGURATION FILES

=item H. USING DAS (DISTRIBUTED ANNOTATION SYSTEM) DATABASES

=item I. THE BioMOBY BROWSER

=item J. BIOMOBY SERVICES

=item K. FILTERING SEARCH RESULTS

=item L. INVOKING GBROWSE URLs

=item M. FURTHER INFORMATION

=back

=head1 A. CREATING NEW DATABASES FROM SCRATCH

This section describes how to create new annotation databases from
scratch.

=head2 A1. The GFF file format

GBrowse is based around the GFF file format, which stands for "Gene
Finding Format" and was invented at the Sanger Centre. The GFF format
is a flat tab-delimited file, each line of which corresponds to an
annotation, or feature.  Each line has nine columns and looks like
this:

 Chr1  curated  CDS 365647  365963  .  +  1  Transcript "R119.7"

The 9 columns are as follows:

=over

=item 1 reference sequence

This is the ID of the sequence that is used to establish the
coordinate system of the annotation.  In the example above, the
reference sequence is "Chr1".

=item 2 source

The source of the annotation.  This field describes how the
annotation was derived.  In the example above, the source is
"curated" to indicate that the feature is the result of human
curation.  The names and versions of software programs are often
used for the source field, as in "tRNAScan-SE/1.2".

=item 3 method

The annotation method.  This field describes the type of the
annotation, such as "CDS".  Together the method and source describe
the annotation type.

=item 4 start position

The start of the annotation relative to the reference sequence.

=item 5 stop position

The stop of the annotation relative to the reference sequence.
Start is always less than or equal to stop.

=item 6 score

For annotations that are associated with a numeric score (for
example, a sequence similarity), this field describes the score.
The score units are completely unspecified, but for sequence
similarities, it is typically percent identity.  Annotations that
don't have a score can use "."

=item 7 strand

For those annotations which are strand-specific, this field is the
strand on which the annotation resides.  It is "+" for the forward
strand, "-" for the reverse strand, or "." for annotations that are
not stranded.

=item 8 phase

For annotations that are linked to proteins, this field describes
the phase of the annotation on the codons.  It is a number from 0 to
2, or "." for features that have no phase.

=item 9 group

GFF provides a simple way of generating annotation hierarchies ("is
composed of" relationships) by providing a group field.  The group
field contains the class and ID of an annotation which is the
logical parent of the current one.  In the example given above, the
group is the Transcript named "R119.7".

The group field is also used to store information about the target
of sequence similarity hits, and miscellaneous notes.  See the next
section for a description of how to describe similarity targets.

=back

The sequences used to establish the coordinate system for annotations
can correspond to sequenced clones, clone fragments, contigs or
super-contigs.

In addition to a group ID, the GFF format allows annotations to have a
group class.  This makes sure that all groups are unique even if they
happen to share the same name.  For example, you can have a GenBank
accession named AP001234 and a clone named AP001234 and distinguish
between them by giving the first one a class of Accession and the
second a class of Clone.

You should use double-quotes around the group name or class if it
contains white space.
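
As an illustration of the nine-column layout described above, here is
a minimal Python sketch that splits one GFF line into named columns
and pulls the class and name out of the group field.  It is not part
of GBrowse or BioPerl; the function names are made up for this
example, and it assumes that columns are separated by runs of spaces
or tabs and that none of the first eight columns contains white space.

```python
import shlex

COLUMNS = ("ref", "source", "method", "start", "stop",
           "score", "strand", "phase", "group")

def parse_gff_line(line):
    """Split one GFF line into its nine named columns."""
    # Split on white space at most 8 times, so that the ninth column
    # (the group field) keeps its internal spaces and quotes.
    fields = line.strip().split(None, 8)
    if len(fields) != 9:
        raise ValueError("expected 9 columns, got %d" % len(fields))
    record = dict(zip(COLUMNS, fields))
    record["start"] = int(record["start"])
    record["stop"] = int(record["stop"])
    # Start is always less than or equal to stop.
    if record["start"] > record["stop"]:
        raise ValueError("start must be <= stop")
    return record

def parse_group(group):
    """Return (class, name) from a group field such as: Transcript "R119.7"."""
    # shlex honours the double quotes that protect embedded white space.
    tokens = shlex.split(group)
    return tokens[0], tokens[1]

record = parse_gff_line(
    'Chr1  curated  CDS 365647  365963  .  +  1  Transcript "R119.7"')
print(record["ref"], record["method"], parse_group(record["group"]))
```

Keeping the group field as a raw string until it is needed means the
first eight columns can be validated without committing to one
interpretation of the ninth.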

=head2 A2. Creating a GFF table

The first 8 fields of the GFF format are easy to understand.  The
group field is a challenge.  It is used in three distinct ways:

=over

=item 1

to group together a single sequence feature that spans a discontinuous range, such as a gapped alignment.

=item 2

to name a feature, allowing it to be retrieved by name.

=item 3

to add one or more notes to the annotation.

=back

1. Using the Group field for simple features

For a simple feature that spans a single continuous range, choose a
name and class for the object and give it a line in the GFF file that
refers to its start and stop positions.

 Chr3	giemsa heterochromatin	4500000 6000000	. . .   Band 3q12.1

2. Using the Group field to group features that belong together

For a group of features that belong together, such as the exons in a
transcript, choose a name and class for the object.  Give each segment
a separate line in the GFF file but use the same name for each line.
For example:

 IV	curated	exon	5506900	5506996	. + .   Transcript B0273.1
 IV	curated	exon	5506026	5506382	. + .   Transcript B0273.1
 IV	curated	exon	5506558	5506660	. + .   Transcript B0273.1
 IV	curated	exon	5506738	5506852	. + .   Transcript B0273.1

These four lines refer to a biological object of class "Transcript"
and name B0273.1.  Each of its parts uses the method "exon", source
"curated".  Once loaded, the user will be able to search the genome
for this object by asking the browser to retrieve
"Transcript:B0273.1".  The browser can also be configured to allow the
Transcript: prefix to be omitted.
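
The grouping rule can be sketched in a few lines of Python.  This is
only an illustration of the idea, not the loader's actual code: the
helper names are hypothetical, and the input is the four exon lines
shown above with their columns separated by single spaces.

```python
from collections import defaultdict

EXONS = """
IV curated exon 5506900 5506996 . + . Transcript B0273.1
IV curated exon 5506026 5506382 . + . Transcript B0273.1
IV curated exon 5506558 5506660 . + . Transcript B0273.1
IV curated exon 5506738 5506852 . + . Transcript B0273.1
"""

def group_features(text):
    """Map each (class, name) group to the sorted (start, stop) of its parts."""
    groups = defaultdict(list)
    for line in text.strip().splitlines():
        fields = line.split(None, 8)
        start, stop = int(fields[3]), int(fields[4])
        # Only the class and name matter here; anything after ';' is ignored.
        cls, name = fields[8].split(";")[0].split()[:2]
        groups[(cls, name)].append((start, stop))
    return {key: sorted(parts) for key, parts in groups.items()}

def extent(parts):
    """Smallest start and largest stop over all parts (absolute coordinates)."""
    return min(s for s, _ in parts), max(e for _, e in parts)

transcripts = group_features(EXONS)
parts = transcripts[("Transcript", "B0273.1")]
print(len(parts), extent(parts))
```

Because the lines share the same class and name, they collapse into
one object whose overall range is derived from its parts, regardless
of the order in which the lines appear in the file.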

You can extend the idiom for objects that have heterogeneous parts,
such as a transcript that has 5' and 3' UTRs:

 IV     curated  mRNA   5506800 5508917 . + .   Transcript B0273.1; Note "Zn-Finger"
 IV	curated  5'UTR	5506800	5508999	. + .   Transcript B0273.1
 IV	curated	 exon	5506900	5506996	. + .   Transcript B0273.1
 IV	curated	 exon	5506026	5506382	. + .   Transcript B0273.1
 IV	curated	 exon	5506558	5506660	. + .   Transcript B0273.1
 IV	curated	 exon	5506738	5506852	. + .   Transcript B0273.1
 IV	curated  3'UTR	5506852 5508917	. + .   Transcript B0273.1

In this example, there is a single feature with method "mRNA" that
spans the entire range.  It is grouped with subparts of type 5'UTR,
3'UTR and exon.  They are all grouped together into a Transcript named
B0273.1. Furthermore the mRNA feature has a note attached to it.

*NOTE* The subparts of a feature are in absolute (chromosomal or
contig) coordinates.  It is not currently possible to define a feature
in absolute coordinates and then to load its subparts using
coordinates that are relative to the start of the feature.

Some annotations do not need to be individually named.  For example,
it is probably not useful to assign a unique name to each ALU repeat
in a vertebrate genome.  For these, just leave the Group field empty.

3. Using the Group field to add a note

The group field can be used to add one or more notes to an annotation.
To do this, place a semicolon after the group name and add a Note
field:

 Chr3 giemsa heterochromatin 4500000 6000000 . . . Band 3q12.1 ; Note "Marfan's syndrome"

You can add multiple Notes.  Just separate them by semicolons:

  Band 3q12.1 ; Note "Marfan's syndrome" ; Note "dystrophic dysplasia"

The Note should come AFTER the group type and name.

4. Using the Group field to add an alternative name

If you want the feature to be quickly searchable by an alternative
name, you can add one or more Alias tags. A feature can have multiple
aliases, and multiple features can share the same alias:

 Chr3 giemsa heterochromatin 4500000 6000000 . . . Band 3q12.1 ; Alias MFX

Searches for aliases will be both faster and more reliable than
searches for keywords in notes, since the latter relies on whole-text
search methods that vary somewhat from DBMS to DBMS.
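
To make the layout of a group field with Note and Alias tags concrete,
the following Python sketch splits such a field into its parts.  It is
a simplified illustration rather than the parser used by Bio::DB::GFF;
the function name is hypothetical, and it assumes that semicolons
never occur inside a quoted note.

```python
import shlex

def parse_group_field(group):
    """Split a group field into its class, name, notes and aliases."""
    parsed = {"class": None, "name": None, "notes": [], "aliases": []}
    for part in group.split(";"):
        # shlex removes the double quotes around multi-word values.
        tokens = shlex.split(part)
        if not tokens:
            continue
        tag, value = tokens[0], " ".join(tokens[1:])
        if tag == "Note":
            parsed["notes"].append(value)
        elif tag == "Alias":
            parsed["aliases"].append(value)
        elif parsed["class"] is None:
            # The group class and name come first, before any Note.
            parsed["class"], parsed["name"] = tag, value
    return parsed

band = parse_group_field(
    'Band 3q12.1 ; Note "Marfan\'s syndrome" ; Note "dystrophic dysplasia"')
print(band["class"], band["name"], band["notes"])

alias = parse_group_field("Band 3q12.1 ; Alias MFX")
print(alias["aliases"])
```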

=head2 A3. Identifying the reference sequence

Each reference sequence in the GFF table must itself have an entry.
This is necessary so that the length of the reference sequence is
known.

For example, if "Chr1" is used as a reference sequence, then the GFF
file should have an entry for it similar to this one:

 Chr1 assembly chromosome 1 14972282 . + . Sequence Chr1

This indicates that the reference sequence named "Chr1" has length
14972282 bp, method "chromosome" and source "assembly".  In addition,
as indicated by the group field, Chr1 has class "Sequence" and name
"Chr1".

It is suggested that you use "Sequence" as the class name for all
reference sequences, since this is the default class used by the
Bio::DB::GFF module when no more specific class is requested.  If you
use a different class name, then be sure to indicate that fact with
the "reference class" option (see below).
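
Before loading, it can be useful to check that every reference
sequence used in column 1 has its own entry.  The Python sketch below
shows one way to do that.  It is not a GBrowse utility: the function
name is hypothetical, and it assumes space- or tab-separated columns
and a reference entry whose group field is the reference class
followed by the sequence name.

```python
def missing_reference_entries(lines, reference_class="Sequence"):
    """Return reference sequence names that lack their own GFF entry."""
    used, declared = set(), set()
    for line in lines:
        fields = line.split(None, 8)
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        ref = fields[0]
        used.add(ref)
        group = fields[8].split() if len(fields) == 9 else []
        group = [token.strip('"') for token in group]
        # A reference entry names itself: class "Sequence" (by default)
        # and a name equal to the value in column 1.
        if group[:2] == [reference_class, ref]:
            declared.add(ref)
    return sorted(used - declared)

gff = [
    "Chr1 assembly chromosome 1 14972282 . + . Sequence Chr1",
    'Chr1 curated CDS 365647 365963 . + 1 Transcript "R119.7"',
    "Chr2 curated exon 100 200 . + . Transcript T1",  # made-up line
]
print(missing_reference_entries(gff))
```

The third input line is invented for the example; it refers to a
sequence "Chr2" that has no entry of its own, so "Chr2" is reported.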

=head2 A4. Sequence alignments

There are several cases in which an annotation indicates the
relationship between two sequences.  One common one is a similarity
hit, where the annotation indicates an alignment.  A second common
case is a map assembly, in which the annotation indicates that a
portion of a larger sequence is built up from one or more smaller
ones.

Both cases are indicated by using the Target tag in the group field.
For example, a typical similarity hit will look like this:

 Chr1 BLASTX similarity 76953 77108 132 + 0 Target Protein:SW:ABL_DROME 493 544

Here, the group field contains the Target tag, followed by an
identifier for the biological object.  The GFF format uses the
notation Class:Name for the biological object, and even though this is
stylistically inconsistent, that's the way it's done.  The object
identifier is followed by two integers indicating the start and stop
of the alignment on the target sequence.

Unlike the main start and stop columns, it is possible for the target
start to be greater than the target end.  The previous example
indicates that the section of Chr1 from 76,953 to 77,108 aligns to
the protein SW:ABL_DROME starting at position 493 and extending to
position 544.
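
The Target notation can be unpacked as in the following Python sketch.
It is an illustration only, with a hypothetical function name, and it
assumes the target identifier contains no white space.

```python
def parse_target(group):
    """Parse 'Target Class:Name start stop' from a group field."""
    tokens = group.split()
    if len(tokens) < 4 or tokens[0] != "Target":
        raise ValueError("not a Target group: %r" % group)
    # Split on the first colon only: the name may itself contain
    # colons, as in Protein:SW:ABL_DROME.
    cls, name = tokens[1].split(":", 1)
    start, stop = int(tokens[2]), int(tokens[3])
    return {
        "class": cls,
        "name": name,
        "start": start,
        "stop": stop,
        # Unlike the main columns, target start may exceed target stop.
        "reversed": start > stop,
    }

hit = parse_target("Target Protein:SW:ABL_DROME 493 544")
print(hit["class"], hit["name"], hit["start"], hit["stop"])
```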

A similar notation is used for sequence assembly information as shown
in this example:

 Chr1        assembly Link   10922906 11177731 . . . Target Sequence:LINK_H06O01 1 254826
 LINK_H06O01 assembly Cosmid 32386    64122    . . . Target Sequence:F49B2       6 31742

This indicates that the region between bases 10922906 and 11177731 of
Chr1 is composed of LINK_H06O01 from bp 1 to bp 254826.  The region
of LINK_H06O01 between 32386 and 64122 is, in turn, composed of
bases 6 to 31742 of cosmid F49B2.

=head2 A5. Loading the GFF file into the database

Use the BioPerl script utilities bulk_load_gff.pl, load_gff.pl or (if
you are brave) fast_load_gff.pl to load the GFF file into the
database.  For example, if you have an empty MySQL database named
"dicty" on the local host, you can load the GFF file into it using
bulk_load_gff.pl like this:

  bulk_load_gff.pl -c -d dicty my_data.gff

To update existing databases, use either load_gff.pl or
fast_load_gff.pl.  The latter is somewhat experimental, so use with
care.

=head2 A6. Aggregators

The Bio::DB::GFF module has a feature known as "aggregators".  These
are small software packages that recognize certain common feature
types and convert them into complex biological objects.  These
aggregators make it possible to develop intelligent graphical
representations of annotations, such as a gene that draws confirmed
exons differently from predicted ones.

An aggregator typically creates a new composite feature with a
different method than any of its components.  For example, the
standard "alignment" aggregator takes multiple alignments of method
"similarity", groups them by their name, and returns a single feature
of method "alignment".
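
The idea behind the "alignment" aggregator can be sketched in Python
as follows.  This is a conceptual illustration only and does not
reproduce the Bio::DB::GFF implementation; the dictionary layout, the
function name and the coordinates are all invented for the example.

```python
from collections import defaultdict

def aggregate_alignments(features):
    """Combine 'similarity' features that share a name into one 'alignment'."""
    by_name = defaultdict(list)
    passthrough = []
    for feat in features:
        if feat["method"] == "similarity":
            by_name[(feat["ref"], feat["name"])].append(feat)
        else:
            passthrough.append(feat)  # other methods are left untouched
    aggregated = []
    for (ref, name), parts in by_name.items():
        parts.sort(key=lambda f: f["start"])
        aggregated.append({
            "ref": ref,
            "method": "alignment",  # new method, distinct from its parts
            "name": name,
            "start": parts[0]["start"],
            "stop": max(f["stop"] for f in parts),
            "parts": parts,
        })
    return passthrough + aggregated

hits = [  # made-up coordinates
    {"ref": "Chr1", "method": "similarity", "name": "SW:ABL_DROME",
     "start": 76953, "stop": 77108},
    {"ref": "Chr1", "method": "similarity", "name": "SW:ABL_DROME",
     "start": 78000, "stop": 78150},
    {"ref": "Chr1", "method": "CDS", "name": "R119.7",
     "start": 365647, "stop": 365963},
]
for feat in aggregate_alignments(hits):
    print(feat["method"], feat["name"], feat["start"], feat["stop"])
```

The composite feature keeps its components in "parts", which is what
lets a display draw the whole alignment as one connected object.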

The various aggregators are described in detail in the Bio::DB::GFF
manual page.  It is easy to write new aggregators, and also possible
to define aggregators on the fly in the gbrowse configuration file.
It is suggested that you use the sample GFF files from the yeast,
drosophila and C. elegans projects to see what methods to use to
achieve the desired results.

In addition to the standard aggregators that are distributed with
BioPerl, GBrowse distributes several experimental and/or
special-purpose aggregators:

=over 4

=item match_gap

This aggregator is used for GFF3 style gapped alignments, in which
there is a single feature of method 'match' with a 'Gap' attribute.
This aggregator was contributed by Dmitri Bichko.

=item orf

This aggregator aggregates raw "ORF" features into "coding"
features. It is basically identical to the "coding" aggregator, except
that it looks for features of type "ORF" rather than "cds".

=item reftranscript

This aggregator was written to make the compound feature
"reftranscript" for use with GBrowse editing software developed
outside of the GMOD development group.  It can be used to aggregate
"reftranscripts" from "refexons", loaded as second copy features.
These features, in contrast to "transcripts", are usually implemented
as features which cannot be edited and serve as starting point
references for annotations added using GBrowse for feature
visualization.

Adding features to the compound feature, "reftranscript", can be done
by adding to the "part_names" call (i.e. "refCDS").

=item waba_alignment

This aggregator handles the type of alignments produced by Jim Kent's
WABA program, and was written to be compatible with the C. elegans GFF
files.  It aggregates the following feature types into an aggregate
type of "waba_alignment":

   nucleotide_match:waba_weak
   nucleotide_match:waba_strong
   nucleotide_match:waba_coding

=item wormbase_gene

This aggregator was written to be compatible with the C. elegans GFF2
files distributed by the Sanger Institute.  It aggregates raw "CDS",
"5'UTR", "3'UTR", "polyA" and "TSS" features into "transcript"
features.  For compatibility with the idiosyncrasies of the Sanger GFF
format, it expects that the full range of the transcript is contained
in a main feature of type "Sequence".

It is strongly recommended that for mirroring C. elegans annotations,
you use the "processed_transcript" aggregator in conjunction with the
GFF3 files found at:

 ftp://ftp.wormbase.org/pub/wormbase/genomes/elegans/genome_feature_tables/GFF3

=back

IT IS NOT NECESSARY TO USE AGGREGATORS WITH THE CHADO, BIOSQL OR
BIO::DB::SEQFEATURE::STORE (GFF3) DATABASES.


=head1 B. ADDING A NEW DATABASE TO THE BROWSER

Each data source has a corresponding configuration file in the
directory gbrowse.conf.  Once you've created and loaded a new
database, you should make a copy of one of the existing configuration
files and modify it to meet your needs.  The name of the new
configuration file must follow the form:

  sourcename.conf

where "sourcename" is a short word that describes the data source.
You can use this name to select the data source when linking to the
browser.  Just construct a URL that uses "sourcename" as a virtual
directory under cgi-bin/gbrowse:

  http://your.site.org/cgi-bin/gbrowse/sourcename/

(Note: If you don't add the slash at the end, gbrowse will
automatically do it for you, since the terminal slash is needed to
work around an apparent bug in MSIE's cookie handling.)

It is suggested that you use the same name as the database, although
this isn't a requirement.  (If no "source=" argument is given, gbrowse
picks the first configuration file that occurs alphabetically; you can
control this by placing numbers in front of the configuration file, as
in "01.yeast.conf".)

The configuration file is divided into a number of sections, each one
introduced by a [SECTION TITLE].  The [GENERAL] section contains
settings that are applicable to the entire application.  Other
sections define tracks to display.

I suggest that you begin with one of the example configuration files
provided with the distribution and modify it to suit your needs.

=head2 B1. The [GENERAL] Section

The [GENERAL] section consists of a series of name=value options.  For
example, the beginning of the yeast.conf sample configuration file
looks like this:

 [GENERAL]
 description = S. cerevisiae (via SGD Nov 2001)
 db_adaptor  = Bio::DB::GFF
 db_args     = -adaptor dbi::mysql
 	       -dsn     dbi:mysql:database=yeast;host=localhost
 aggregators = transcript alignment
 user        =
 passwd      =

Each option is a single word or phrase, usually in lower case.  This
is followed by an equals sign and the value of the option.  You can
add whitespace around the equals sign in order to increase
readability.  If a value is very long, you can continue it on
additional lines provided that you put a tab or other whitespace on
the continuation lines.  For example:

 description = S. cerevisiae annotations via SGD Nov 2001, and
	     converted using the process_sgd.pl script

Any lines that begin with a pound sign (#) are considered comments and
ignored.
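
As a brief illustration of the comment rule, the following fragment
mixes a comment line with an option.  The wording of the comment is
made up for this example; the option itself is taken from the
yeast.conf sample shown earlier:

 # Data source description shown to users
 description = S. cerevisiae (via SGD Nov 2001)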

During this discussion, you might want to follow along with one of the
example configuration files.

The following [GENERAL] options are recognized:

=over

=item * description

The description of the database.  This will appear in the popup
menu that allows users to select the data source and in the 
header of the page.  Don't make it as long as the previous example!
(You will want to change this.)

=item * db_adaptor

Tells GBrowse what database adaptor to use.  By using different adaptors
you can attach gbrowse to a variety of different databases.  Currently
the only stable adaptor you can use is Bio::DB::GFF, which is a standard
set of adaptors contained in Bioperl.

=item * db_args

Arguments to pass to the adaptor for it to use when making a database
connection.  The exact format will depend on the adaptor you're using.
For Bio::DB::GFF running on top of a MySQL database use
a db_args like the following:

    db_args = -adaptor dbi::mysql
              -dsn     dbi:mysql:database=<db_name>;host=<db_host>

replacing <db_name> and <db_host> with the database and database
host of your choice.  For MySQL databases running on the localhost,
you can shorten this to just "db_name".

If the database requires you to log in with a user name and
password, use the following db_args:

    db_args = -adaptor dbi::mysql
              -dsn     dbi:mysql:database=<db_name>;host=<db_host>
              -user    <username>
              -pass    <password>

replacing <username> and <password> with the appropriate values.
In the example configuration files, we use a username of "nobody"
and an empty password.  This is appropriate if the database is
configured to allow "nobody" to log in from the local machine
without using a password.

To use the Oracle version of Bio::DB::GFF, use these arguments:

    db_args = -adaptor dbi::oracle
              -dsn dbi:oracle:database=db_service

where db_service should be replaced with the name of the desired
database service definition.  See the documentation for the Perl
DBD::Oracle database driver for more information about the -dsn
format.

To use the in-memory version of Bio::DB::GFF, use these arguments:

  db_args = -adaptor memory
            -dir   /path/to/directory

The indicated directory should contain one or more GFF and FASTA files,
distinguished by the filename extensions .gff and .fa respectively.

=item * aggregators

This option is only valid when used with Bio::DB::GFF adaptors, and
lists one or more aggregators to use for complex features.  It is
possible to declare your own aggregator here using a special syntax
described in "B7. Declaring New Aggregators."

To disable the default aggregators, leave this setting blank, as in:

     aggregators=

To activate the default aggregators of "transcript," "clone,"
and "alignment," comment this setting out entirely:

    # aggregators =

=item * user

The user name for the gbrowse script to log in under if you are not
using "nobody".  This is exactly the same as providing the -user
option to db_args, and is deprecated.

=item * pass

The password to use if the database is password protected.  This is
the same as providing the -pass option to db_args, and is deprecated.

=item * stylesheet

Location of the stylesheet used to create the GBrowse look and feel.
(You probably will not need to change this.)

=item * plugins

This is a list of plugins that you want to be available from gbrowse.
Plugins are a way for third-party developers to add functionality to gbrowse
without changing its core source code.  Plugins are stored in the gbrowse
configuration directory under a subdirectory named "plugins."

A good standard list of plugins is:

    plugins = SequenceDumper FastaDumper RestrictionAnnotator

See the contents of conf/plugins and contrib/plugins for more plugins
that you can install.

=item * quicklink plugins

This is a list of plugins that you want to appear as links in the link
bar (which includes the [Bookmark this] and [Link to Image] links).
Selecting one of these links is equivalent to choosing the plugin from
the popup menu and pressing the "Go" button.  The plugin will continue
to appear in the popup menu.
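
For instance, assuming the FastaDumper plugin from the standard list
above is installed, settings along these lines should place it in the
link bar:

    plugins           = SequenceDumper FastaDumper RestrictionAnnotator
    quicklink plugins = FastaDumper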

=item * plugin_path

By default gbrowse searches for plugins in its standard location of conf/plugins.
You can store plugins in a non-standard location by providing this option
with a space-delimited list of additional directories to search in.
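
A sketch of what this might look like; both directory names below are
hypothetical and should be replaced with paths that exist on your
system:

    plugin_path = /usr/local/gbrowse/site_plugins /home/www/my_plugins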

=item * buttons

URL in which the various graphical buttons used by GBrowse are located.
(You will probably not need to change this.)

=item * js

URL in which the gbrowse javascript helper function files are located.
(You will probably not need to change this.)

=item * tmpimages

URL of a writable directory in which GBrowse can write its temporary
images. The format is:

  tmpimages = <tmpimages_url> <tmpimages_path>

Where <tmpimages_url> is the directory as it appears as a URL and
<tmpimages_path> is the physical path to the directory as it appears
to the filesystem. Usually the physical path is just the URL with the
DocumentRoot configuration variable prepended to it, in which case
only the URL is needed. However, if the URL is defined using an Alias
directive, then the path argument is mandatory.

The tmpimages option is mandatory.

NOTE: The path argument is ignored if gbrowse is running under
modperl, because modperl allows the URL to be translated into a
physical directory programmatically.
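
As a sketch, the two forms might look like the lines below.  The URL
/gbrowse/tmp and the filesystem path are placeholders rather than
values required by GBrowse, and only one of the two tmpimages lines
would appear in a real configuration file:

  # URL only; the physical path is derived from DocumentRoot
  tmpimages = /gbrowse/tmp
  # URL plus explicit path, for a URL defined with an Alias directive
  tmpimages = /gbrowse/tmp /var/www/aliased/gbrowse/tmp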

=item * cachedir

This is a writable directory that can be used for caching
gbrowse_img-generated images. Defining it will speed up some
operations. If not defined gbrowse_img will still work, but will
regenerate images from scratch even if they've been used before. It is
OK to use the same path as the tmpimages directory.

=item * image widths

The image widths option controls the set of image sizes to offer
the user.  Its value is a space-delimited list of pixel widths.
The default is probably fine.  Note that the height of the image
depends on the number of tracks and features, and cannot be
controlled.

=item * default width

The default width is the image width to start off with when the
user invokes the browser for the first time.  The default is 800.
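
For example, a configuration that offers four sizes and starts at the
documented default might read as follows (the particular widths are
only illustrative):

     image widths  = 450 640 800 1024
     default width = 800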

=item * default features

The default features option is a space-delimited list of tracks
to turn on by default.  You will probably need to change this.
For example:

     default features = Genes ORFs tRNAs Centromeres:overview

The syntax for annotation plugins is slightly different. To activate
an annotation plugin track by default, preface the plugin's name with
"plugin:"

     default features = Genes ORFs Centromeres:overview 
                        plugin:RestrictionAnnotator

=item * reference class

gbrowse needs to know the class of the reference sequences that other
features are placed on.  The default is Sequence.  If you want to use
another class, such as Contig, please indicate the class here (if you
don't, certain features such as the keyword search will fail):

      reference class = contig

=item * initial landmark

This option controls what feature to show when the user first visits a
gbrowse database and has not yet performed a search. If not present,
gbrowse displays a page with the search area and options, but no
overview or panel.

Example:

       initial landmark = Chr1

=item * autocomplete

This turns on the autocomplete functionality, which is very
rudimentary at the moment. The argument to this option is the relative
path of a "nameserver" script that is expected to respond to queries
of the type:

  http://your.server/cgi-bin/nameserver?query=NNNN

where "NNNN" is the partial name that the user has typed into the
search box. The nameserver should return a line-delimited list of
completed names to suggest to the user.

Included in this distribution are two example nameserver scripts,
gbrowse_gff_autocomplete and gbrowse_seqfeature_autocomplete, which
work with the mysql versions of the Bio::DB::GFF and
Bio::DB::SeqFeature::Store adaptors, respectively. To have them
invoked properly, add the name of the database to the end of the
path. For example, if you are using the Bio::DB::GFF adaptor with a
database named "dbi:mysql:volvox", then this is what you should put
into the configuration file:

  autocomplete = /cgi-bin/gbrowse_gff_autocomplete/volvox

Change this as appropriate if gbrowse_gff_autocomplete is not
installed in /cgi-bin.

For javascript security reasons, the nameserver script MUST be running
on the same machine as gbrowse. You can work around this using a
proxy, if need be. Also note that the scripts do not (currently) allow
you to pass a username or password, and do not check that the user is
authorized to connect to the database. Please check and change the
source code if this is of concern to you (the scripts are *VERY*
simple).

These scripts will be made more sophisticated in the future.

=item * truecolor

If this option is present and true, then GBrowse will create 24-bit
(truecolor) images. This is mainly useful when using the "image"
glyph, which allows you to paste arbitrary images onto the genome
map. Do not use this option unless you need it, because it slows down
drawing and makes the images much larger.

=item * units, unit_divider

The units option allows GBrowse to display units on an alternate scale
(for example, (centi)Morgans), and the unit_divider provides the conversion
factor between base pair units (which is what must be specified in the
GFF file) and the specified units.  For example, if it is known that 5010
base pairs is equal to one Morgan, 5010 would be specified for the unit_divider.
Note that if unit_divider is specified, max segment, default segment,
and zoom levels will all be interpreted in terms of the specified units.
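
To make the arithmetic concrete, take the figure of 5010 base pairs
per Morgan used above.  A position at base pair 1,002,000 would then
be shown as 1,002,000 / 5010 = 200 on the alternate scale.  The
corresponding settings might look like the sketch below, in which the
label "Morgans" is an assumption about how the units value is written:

     units        = Morgans
     unit_divider = 5010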

=item * max segment, min segment

These options control the size of segments that will be shown in the
detailed view.

The max segment option sets an upper bound on the maximum size segment
that will be displayed on the detailed view.  Its value is in the
selected units.  Above this limit, the user will be prompted to select
a smaller region on the birds-eye view.  The default is 1,000,000 base
pairs.

If the user tries to view a segment smaller than the min segment
option, then the segment will be resized to be this size. The default
is 20 bp.

=item * default segment

The default segment option sets the width of the segment (bp) that
will be displayed when the user clicks on the birds-eye view
without previously having set a desired magnification.  You may
want to adjust this value.
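
Taken together, the segment-size options might be set as in this
sketch.  The max and min values are the documented defaults; the
default segment value is an arbitrary choice for illustration:

     max segment     = 1000000
     min segment     = 20
     default segment = 50000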

=item * zoom levels

GBrowse allows unlimited zoom levels.  This option selects the
width of each level, in bp.  For example:

      zoom levels = 1000 2000 5000 10000 20000 40000 100000 200000

=item * region segment

If this configuration option is set, a new "region panel" will appear
that is intermediate in size between the overview and the detail
panel. The value of this option becomes the initial size of the region
panel in base pairs.

     region segment = 10000

=item * region sizes

This contains a space-delimited list of region panel sizes to present
to the user in a popup menu:

     region sizes   = 5000 10000 20000

=item * show sources

A 0 (false) or 1 (true) value which controls whether or not to show
the popup menu displaying the defined data sources.  Set this to 0 if
you wish for the names of the data sources to be hidden.  If not
present, this option defaults to 1 (true).  

Note that all data sources will need to have this option defined in
order for it to take effect across all databases.

=item * default varying

The track selection table will be sorted alphabetically, by default;
setting this variable to true will cause the tracks to appear in the
same order as they appear in the configuration file.

=item * keyword search max

By default, gbrowse will limit the number of keyword search results
to 1,000.  The order in which the 1,000 hits are returned depends on
how the database was loaded, and so you may see odd patterns, such as
only hits on a particular chromosome being displayed.  To raise the 
limit on keyword search results, set "keyword search max" to the
desired maximum value.

=item * overview units

This option controls the units that will be used on the scale for
the birds-eye view display.  Possible values are "bp" (base pairs),
"k" (kilobases), "M" (megabases), and "G" (gigabases).  If this
option is omitted, the browser will guess the most appropriate
unit.


=item * overview bgcolor

This is the color for the background of the birds-eye view.

=item * selection rectangle color

This is the color of the rectangle in the overview and region
panels that marks the area currently shown in the detail panel.
The default is red.

=item * cache_overview

This option will cause the overview images to be cached on disk for a
period of time.  This may improve performance if you are placing many
complex tracks into the overview.  The value is the number of hours to
keep the cached copy of the overview image before refreshing it
(default = don't cache).

You can freshen the cache and force cached copies to be ignored by
touching the configuration file or by calling gbrowse with the CGI
option nocache=1.
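
For example, to keep cached overview images for two hours (an
arbitrary figure), and then to bypass the cache for a single request
using the placeholder URL from earlier in this document:

     cache_overview = 2

     http://your.site.org/cgi-bin/gbrowse/sourcename/?nocache=1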

=item * detailed bgcolor

This is the color for the background of the detailed view.
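
A sketch combining the three color options described above.  The
color names are assumed to be ones that the Bio::Graphics library
recognizes; substitute whatever colors suit your site:

     overview bgcolor          = lightgrey
     detailed bgcolor          = lightgoldenrodyellow
     selection rectangle color = red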

=item * request timeout

This is the timeout value for requests.  If a user requests a large
region and the request takes more than the indicated number of
seconds, then the request will timeout and the user will be advised to
choose a smaller region.  The default is 60 seconds (one minute).  You
can make the timeout longer or shorter than this.

=item * head

This is content to insert into the HTML <head></head> section.  It is
the appropriate place to stick JavaScript code, etc.  It can be a code
reference if you wish.
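
For instance, to pull in a site-specific script (the file name
my_site.js and its location are hypothetical):

     head = <script type="text/javascript" src="/gbrowse/js/my_site.js"></script>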

=item * header_template

This gives the name of a Template Toolkit template file to print at the top of
the browser page.  This file should contain any valid Template Toolkit syntax
(including regular HTML).  The header template file needs to be placed in the
gbrowse template folder (ex:
/usr/local/apache/conf/gbrowse.conf/templates/default/)

It is also possible to place an anonymous Perl subroutine here.  The code will
be invoked during preparation of the page and must return a string value of the
header template file name.  See COMPUTED OPTIONS for details.

Example:

    header_template = header.tt2

=item * header (deprecated)

The header option has been replaced by header_template.  It will still work for
the time being but it is encouraged that you check out Template Toolkit and
look at using that instead.

This is a header to print at the top of the browser page.  It is
any valid HTML, and can span multiple lines provided that the
continuation lines begin with white space.

It is also possible to place an anonymous Perl subroutine here.
The code will be invoked during preparation of the page and must
return a string value to use as the header.  See COMPUTED OPTIONS
for details.

Example:

    header = <h1>Welcome to the Volvox Sequence Page</h1>

=item * footer_template

This gives the name of a Template Toolkit template file to print at the bottom of
the browser page.  This file should contain any valid Template Toolkit syntax
(including regular HTML).  The footer template file needs to be placed in the
gbrowse template folder (ex:
/usr/local/apache/conf/gbrowse.conf/templates/default/)

It is also possible to place an anonymous Perl subroutine here.  The code will
be invoked during preparation of the page and must return a string value of the
footer template file name.  See COMPUTED OPTIONS for details.

Example:

    footer_template = footer.tt2

=item * footer (deprecated)

The footer option has been replaced by footer_template.  It will still work for
the time being but it is encouraged that you check out Template Toolkit and
look at using that instead.

This is a footer to print at the bottom of the browser page.  It is
any valid HTML, and can span multiple lines provided that the
continuation lines begin with white space.

It is also possible to place an anonymous Perl subroutine here.
The code will be invoked during preparation of the page and must
return a string value to use as the footer.  See COMPUTED OPTIONS
for details.

Example:

    footer = <hr>
	<table width="100%">
	<TR>
	<TD align="LEFT" class="databody">
	For the source code for this browser, see the <a href="http://www.gmod.org">
	Generic Model Organism Database Project.</a>  For other questions, send
	mail to <a href="mailto:lstein@cshl.org">lstein@cshl.org</a>.
	</TD>
	</TR>
	</table>

=item * examples

You can provide GBrowse with some canned examples of "interesting
regions" for the user to click on.  The examples option, if
present, provides a space-delimited list of interesting regions.
For example:

       examples = II  NPY1 NAB2 Orf:YGL123W

=item * automatic classes

When the user types in a search string that is not qualified by a
class (as in EST:yk1234.5), GBrowse will automatically search for
a matching feature of class "Sequence".  You can have it search
for the name in other classes as well by defining the "automatic
classes" option.

Example:

	automatic classes = Symbol Gene Clone

When the user types in "hb3", the browser will search first for a
feature named hb3 of class Sequence, followed in turn by matching
features in Symbol, Gene and Clone.  The search stops when the
first match is found.  If no match is found, the browser will
proceed to a full text search of all the comment fields.

=item * search attributes (Bio::DB::SeqFeature::Store adaptor only)

When the browser has searched the name and alias of features without
success, it will do a whole database keyword search by calling the
database's search_notes() method. By default this will search the text
of all attributes, including such things as protein sequence. The
Bio::DB::SeqFeature::Store database is a bit smarter about searching,
and will only, by default, search attributes named "Note". You can
expand the search by giving a list of attribute names to the "search
attributes" option.

=item * remote sources

This option allows you to add remote annotation sources to the menu of such
sources at the bottom of the main window.  The format is:

      remote sources = "Menu Label 1" http://url1.host.com/etc/etc
		       "Menu Label 2" http://url2.host.com/etc/etc

=item * instructions, search_instructions, navigation_instructions

You may override the default instructions (as defined in the language-specific
configuration files in conf/lang) by setting these options.  For example:

         instructions = "Type in the name of a contig or clone."

=item * no search

If you don't want the "Landmark or Region" textbox to appear, set this
to true. The user will still be able to search the database by
appending q=<search term> to the URL.

          no search = 1

=item * no autosearch

If this option is set to a true value, then the user's previous search
will not be automatically re-executed the next time he visits
gbrowse. Instead, the previous search will be pasted into the
"Landmark or Region" box and the user will have to press "Search" to
reexecute it.

=item * instructions section

=item * search section

=item * overview section

=item * region section

=item * details section

=item * tracks section

=item * display_settings section

=item * upload_tracks section

These options control which sections are displayed and whether they
are initially open or collapsed. Their values are one of:

 open     Show the section initially open
 closed   Show the section initially collapsed
 off      Do not show the section at all

For example "instructions section = closed" will initially show the
instructions section in collapsed form when the user visits gbrowse
for the first time. "upload_tracks section = off" will disable the
uploads section entirely.

Note that turning off the details section will effectively disable
gbrowse, but you might want to do this if you want to show the
overview section only. Turning off the search section will also
disable the navigation buttons. If you want to disable searching
selectively, you should use the "no search" option instead.
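
Putting this together, a configuration that collapses the
instructions, hides the upload area, and leaves every other section
at its default could contain:

 instructions section  = closed
 upload_tracks section = off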

=item * html1, html2, html3, html4, html5, html6

These options allow you to insert HTML into the GBrowse page at
strategic places.  Eventually this will be replaced with an HTML
template system, but for now, this is the best we have.

 Option   Where it goes
 ------   -------------
 header   between the top and the instructions
 html1    between the instructions and the navigation bar
 html2    between the navigation bar and the overview
 html3    between the overview and the detail view
 html4    between the detail view and the data source panel
 html5    between the data source panel and the track list
 html6    between the track list and the annotation upload
 footer   between the annotation upload and the bottom

These can be code references.  One useful thing to do is to use the
language translator to insert language-specific HTML.  Here's an
example provided by Marc Logghe:

    html2 = sub {
        my $go = $main::CONFIG->tr('Go');
        return
        qq(
        <table width="800" border="0">
        <tr class="searchbody">
        <td align="left" colspan="3">
        <b>Dump:</b><input type="button" value="Assembly" onclick="window.open('gbrowse?plugin=AssemblyDumper;plugin_action=$go');">
        <input type="button" value="Reads" onclick="window.open('gbrowse?plugin=ReadDumper;plugin_action=$go');">
        </td>
        </tr>
        </table>
        );
       }

If you use a coderef for the html options, the subroutine is passed
two arguments.  The first argument is a Bio::Das::SegmentI object (see
the manual page for Bio::DB::GFF::RelSegment for details).  The second
argument is a hashref containing the user's settings for the current page.
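
As a minimal sketch of such a subroutine, the following reports the
coordinates of the current segment.  It assumes that the segment
object answers the usual start() and end() methods, and that the
first argument may be undefined when no region is on display:

    html3 = sub {
        my ($segment, $settings) = @_;
        # Nothing to report until a region has been selected
        return '' unless $segment;
        my $start = $segment->start;
        my $end   = $segment->end;
        return "<p>Current view: $start to $end</p>";
       }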


=item * keystyle, empty_tracks

These two general options control the appearance of the keys
printed on the detailed view.  keystyle takes one of two values
"between" or "beneath".

       keystyle = between
       Print the track labels between the tracks themselves.

       keystyle = beneath
       Print the track labels at the bottom of the detailed view.

The "empty_tracks" option controls what to do when a track has
no features in it.  Possible values are:

       empty_tracks = key
       Print just the key (the track label).

       empty_tracks = suppress
       Suppress the track completely.

       empty_tracks = line
       Draw a solid line across the track.

       empty_tracks = dashed
       Draw a dashed line across the track.

The default value is "key."

=item * background, postgrid

These two options can be used to place custom background images in the
details panel and are useful for advanced operations such as
colorizing the panel to show gaps in the assembly. Either option
accepts either the path to a graphics file to be tiled onto the
background, or a callback subroutine. In the case of the latter, the
callback will be passed a two-argument list consisting of the GD::Image
object and the Bio::Graphics::Panel object. This gives the callback a
chance to draw on top of the background using GD library calls.

The only difference between the two options is the time that they are
applied relative to the grid that shows base pair coordinates. The
background option is invoked before the grid is drawn so that the grid
appears on top of it. The postgrid option is invoked after the grid is
drawn, so that anything the option draws appears on top of the
grid. See
http://sourceforge.net/mailarchive/message.php?msg_id=12116755 for an
example of using this feature to show assembly gaps as vertical gray
regions.
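
The callback form can be sketched as follows.  This example uses only
standard GD calls (getBounds, colorAllocate and rectangle) to outline
the panel after the grid is drawn; it is meant to show the calling
convention rather than to do anything useful, and the gray shade is
an arbitrary choice:

    postgrid = sub {
        my ($gd, $panel) = @_;
        # Size of the image being drawn on, in pixels
        my ($width, $height) = $gd->getBounds;
        my $gray = $gd->colorAllocate(200, 200, 200);
        $gd->rectangle(0, 0, $width - 1, $height - 1, $gray);
       }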

=item * show track categories

If this option is set to a true value, then tracks that have been
assigned to categories (using the "category" option described later),
will have their categories included in their labels. For example, a
track of key "Protein matches" and category "vertebrate" will be
displayed in a track labeled "Protein matches (vertebrate)".

The default is false.

=item * das mapmaster

This option, which should appear somewhere in the [GENERAL] section,
indicates that the database should be made available as a DAS source.
The value of the option corresponds to the URL of the DAS reference
server for this data source, or "SELF" if this database is its own
reference server.  (See http://www.biodas.org/ for an explanation of
what reference servers are.)

Please see L<DAS_HOWTO> for more information on using DAS with
GBrowse.
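
For a database that acts as its own reference server, the setting
would simply be:

    das mapmaster = SELF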

=item * proxy, http proxy, ftp proxy

If your web server is behind a firewall and needs to use a proxy in
order to access remote HTTP or FTP sites, then one or more of these
options needs to be specified in order for the "add remote
annotations" feature to work (both for file-based and DAS-based remote
annotations).  "http proxy" will set the proxy to use for outgoing
HTTP connections, "ftp proxy" will set the proxy to use for outgoing
FTP connections, and "proxy" will set both.  The value is the URL of
the proxy:

   proxy = http://myproxy.myorg.com:9000

=item * session driver

=item * session args

These options fine-tune how gbrowse manages its state-maintaining
sessions. GBrowse uses CGI::Session to store session data on the
server. By default (if neither of these options is present), it uses
CGI::Session's "file" driver and "default" serializer. The session
files are stored in the "sessions" directory underneath the directory
specified by the "tmpimages" option
(e.g. /usr/local/apache/htdocs/gbrowse/tmpimages/sessions).

The "session driver" option will be passed to CGI::Session->new() as
the first argument. It specifies the driver, serializer and ID
generator according to the syntax described in the CGI::Session manual
page. The "session args" option will be passed to CGI::Session->new()
as the third argument. It specifies additional parameters to be passed
to the selected driver.

For example, here is how to create session data that is stored in the
MySQL "test" database under a table named "gbrowse_sessions." The
session data will be stored in binary form by the Storable module:

 session driver = driver:mysql;serializer:storable
 session args   = DataSource test
 		  TableName  gbrowse_sessions

See the CGI::Session documentation for information about setting up
the MySQL table and appropriate permissions. 

You might also want to read about CGI::Session::ID::salted_md5 for
an ID generation algorithm that should be more secure (but slightly
slower) than the default one.

You will not ordinarily need to use these settings, as the defaults
seem to work well.

=item * remember settings time

The length of time to remember page-specific settings in the format
"+NNNu", where NNN is a number and "u" is a unit ("w" = weeks, "d" =
days, "M" = months). For example:

  remember settings time = +3M   # remember settings for 3 months

The user's settings, which include uploaded files, track options and
plugin configuration, will be reset to the default if he or she fails
to visit the site within the time specified.

The default value is 1 month.

See the CGI module's manual page for more information on the time
format.

=item * remember source time

Like "remember settings time" but applies to remembering the user's
preferred data source. "remember source time" should be greater than or
equal to "remember settings time" because the settings will expire
when the source expires. If you have multiple sources, this option
should be the same for each one.

The default value is 3 months.
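
For example, the following pair keeps the source for longer than the
settings, as recommended above (the durations themselves are
arbitrary):

  remember settings time = +3M
  remember source time   = +6M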

=item * msie hack

GBrowse uses HTTP POST to transfer the current page settings to the
web server. Because of the way that Microsoft Internet Explorer caches
pages, when users of this browser press the "Back" button, MSIE will
display an annoying alert that prompts the user to reload the page.

When you set "msie hack" to a true value, GBrowse will use the GET
request when it detects MSIE in use. This will fix the "Back" button
issue, but will put very long URLs in the Location box. It is your
choice which of these is more annoying to your users.

=back

=head2 B2. The [TRACK DEFAULTS] section

The track defaults section specifies default values for each track.
The following common options are recognized:

	     glyph
 	     height
 	     bgcolor
 	     fgcolor
 	     fontcolor
 	     font2color
 	     strand_arrow

These options control the default graphical settings for any
annotation types that are not explicitly specified.  See the
section below on controlling the settings.  Any of the options
allowed in the [track] sections described below are allowed here.

=over

=item * label density

When there are too many annotations on the screen GBrowse automatically
disables the printing of identifying labels next to the feature.
"label density" controls where the cutoff occurs.  The value in the
example files is 25, meaning that labels will be turned off when
there are more than 25 annotations of a particular type on display
at once.

=item * bump density

When there are too many annotations on the screen GBrowse automatically
disables collision control.  The "bump density" option controls
where the cutoff occurs.  The value in the example files is 100,
meaning that when there are more than 100 annotations of the same type
on the display, the browser will stop shifting them vertically to
prevent them from colliding, but will instead allow them to
overlap.
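
In a configuration file, the two density options sit alongside the
other track defaults.  The sketch below uses the cutoffs quoted
above; the glyph, height and color values are illustrative
assumptions:

 [TRACK DEFAULTS]
 glyph         = generic
 height        = 10
 bgcolor       = lightgrey
 fgcolor       = black
 label density = 25
 bump density  = 100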

=item * link

The link option creates a default rule for creating outgoing links
from the GBrowse display.  When the user clicks on a feature of
interest, he will be taken to the corresponding URL.

The link option's value should be a URL containing one or more
variables.  Variables begin with a dollar sign ($), and are
replaced at run time with the information relating to the selected
annotation.  Recognized variables include:

     $name        The feature's name (group name)
     $id          The feature's id (eg, PK from a database)
     $class       The feature's class (group class)
     $method      The feature's method
     $source      The feature's source
     $ref         The name of the sequence segment (chromosome, contig)
                     on which this feature is located
     $description The feature's description (notes)
     $start       The start position of this feature, relative to $ref
     $end         The end position of this feature, relative to $ref
     $segstart    The left end of $ref displayed in the detailed view
     $segend      The right end of $ref displayed in the detailed view

For example, the wormbase.conf file uses this link rule:

     link = http://www.wormbase.org/db/get?name=$name;class=$class

At run time, if the user clicks on an EST named yk1234.5, this
will generate the URL

     http://www.wormbase.org/db/get?name=yk1234.5;class=EST

It is possible to override the global link rule on a
feature-by-feature basis.  See the next section for details on
this.  It is also possible to declare a subroutine to compute the
proper URL dynamically.  See COMPUTED OPTIONS for details.

A special link type of AUTO will cause the feature to link to
the gbrowse_details script, which summarizes information about
the feature.  The default is not to link at all.
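
For example, to send every feature to the details page by default:

     link = AUTO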

=item * link_target

By default links will replace the contents of the current window.
If you wish, you can specify a new window to pop up when the user
clicks on a feature, or designate a named window or frame to
receive the contents of the link.  To do this, add the "link_target"
option to the [TRACK DEFAULTS] section or to a track stanza.  The format
is this:

       link_target = _blank

The value uses the HTML targeting rules to name/create the window
to receive the value of the link.  The first time the link is
accessed, a window with the specified name is created.  The next
time the user clicks on a link with the same target, that window
will receive the content of the link if it is still present, or it
will be created again if it has been closed.  A target named
"_blank" is special and will always create a new window.

The "link_target" option can also be computed dynamically.  See
COMPUTED OPTIONS for details.

=item * title

The title option controls the "tooltips" text that pops up when the
mouse hovers over a glyph in certain browsers.  The rules for
generating titles are the same as the "link" option discussed above.
The "title" option can also be computed dynamically.  See COMPUTED
OPTIONS for details.

Note that HTML characters such as "<", ">" and "&" are not
automatically escaped in the title. This lets you do neat stuff, such
as create popup menus, but also means that you need to be careful. The
function CGI::escapeHTML() is available to properly escape HTML
characters in dynamically-generated titles.

The special value "AUTO" causes a default description to appear
describing the name, type and position of the feature.  This is
also assumed if the title option is missing or blank.
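
For instance, a static title can combine several of the variables,
and a computed title can escape its text before returning it.  This
is only a sketch: the layout of the tooltip text is an arbitrary
choice, and it assumes the feature object provides the display_name()
and primary_tag() methods used elsewhere in this document:

 title = $name ($ref:$start..$end)

 title = sub {
           my $feature = shift;
           my $text = $feature->primary_tag . ' ' . $feature->display_name;
           return CGI::escapeHTML($text);
         }

Only one of the two forms would appear in a given stanza.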

=item * landmark_padding = 1000

The landmark_padding option will add the indicated number of base pairs
to the right and left of all landmarks that are searched for by name.

=item * image_padding = 25

=item * pad_left = 50

=item * pad_right = 30

The image_padding option will add the indicated amount of whitespace
(in pixels) to the right and left of the detail panel.  The default is
25 pixels. You may need to adjust this if you are using the xyplot
glyph and finding that the scale (which is printed outside the graph
area) is being cut off.

You can individually adjust the left and right padding using pad_left
and pad_right, which, if present, will supersede image_padding.

=back

=head2 B3. Track Sections

Any other [Section] in the configuration file is treated as a
declaration of a track.  The order of track sections will become the
default order of tracks on the display (the user can change this
later).  Here is a typical track declaration from yeast.conf:

 [Genes]
 feature      = gene:sgd
 glyph        = generic
 bgcolor      = yellow
 forwardcolor = yellow
 reversecolor = turquoise
 strand_arrow = 1
 height       = 6
 description  = 1
 key          = Named gene

This track is named "Genes". You may use a short mnemonic if you
prefer; this will make the URL shorter when the user bookmarks a view
he or she likes. Track names can contain almost any character,
including whitespace, but cannot contain the "-" or "+" signs because
these are used to separate track names in the URL when
bookmarking. [My Genes] is OK, but [My-Genes] is not.

As in the general configuration section, the track declaration
contains multiple name=value option pairs.

Valid options are as follows:

=over

=item 1 feature

This relates the track to one or more feature types as they appear
in the database.  Recall that each feature has a method and source.
This is represented in the form method:source.  So, for example, a
feature of type "gene:sgd" has the method "gene" and the source
"sgd".  

It is possible to omit the source.  A feature of type "gene" will
include all features whose methods are "gene", regardless of the
source field.  It is not possible to omit the method.

It is possible to have several feature types displayed on a single
track.  Simply provide the feature option with a space-delimited
list of the features you want to include.  For example:

    feature = gene:sgd stRNA:sgd

This will include features of type "gene:sgd" and "stRNA:sgd" in the
same track and display them in a similar fashion.

=item 2 remote feature

This relates the track to a remote feature track somewhere on the
Internet. The value is a http: or ftp: URL, and may correspond to a
static file of features in GFF format, gbrowse upload format, a CGI
script, or a DAS source. When this option is active, the "feature"
option and most of the glyph control options described below are
ignored, but the "citation" and "key" options are honored.

Example:

 remote feature = http://www.wormbase.org/cgi-bin/das/wormbase?type=mRNA

=item 3 glyph

This controls the glyph (graphical icon) that is used to represent
the feature.  The list of glyphs and glyph-specific options are
listed in the section GLYPHS AND GLYPH OPTIONS.  The "generic" glyph
is the default.

=item 4 bgcolor

This controls the background color of the glyph.  The format of
colors is explained in GLYPHS AND GLYPH OPTIONS.

=item 5 fgcolor

This controls the foreground color (outline color) of the glyph.
The format of colors is explained in GLYPHS AND GLYPH OPTIONS.

=item 6 fontcolor

This controls the color of the primary font of text drawn in the
glyph.  This is the font used for the feature labels drawn at the
top of the glyph.

=item 7 font2color

This controls the color of the secondary font of text drawn in
the glyph.  This is the font used for the longish feature descriptions
drawn at the bottom of the glyphs.

=item 8 height

This option sets the height of the glyph.  It is expressed in
pixels.

=item 9 strand_arrow

This is a true or false value, where true is 1 and false is 0.
If this option is set to true, then the glyph will indicate the
strandedness of the feature, usually by drawing an arrow of some
sort.  Some glyphs are inherently stranded, or inherently
non-stranded and simply ignore this option.

=item 10 label

This is a true or false value, where true is 1 and false is 0.  If
the option is set to true, then the name of the feature (i.e. its
group name) is printed above the feature, space allowing.

=item 11 description

This is a true or false value, where true is 1 and false is 0.  If
the option is set to true, then the description of the feature (any
Note fields) is printed below the feature, space allowing.

=item 12 key

This option controls the descriptive key that is drawn in the key
area at the bottom of the image.  It also appears in the checkboxes
that the end user uses to switch tracks on and off. If not specified, it defaults to
the track name.

=item 13 citation

If present, this option creates a human-readable descriptive
paragraph describing the feature and how it was derived.  This is
the text information that is displayed when the user clicks on the
track name in the checkbox group.  The value can either be a URL, in
which case clicking on the track name invokes the corresponding URL,
or a text paragraph, in which case clicking on the track name
generates a page containing the text description.  Long paragraphs
can be continued across multiple lines, provided that continuation
lines begin with whitespace.
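
For example, a citation that spans several lines could be written as
follows (the wording of the paragraph is invented for illustration):

 citation = These gene models were taken from the curated
            annotation set.  Continuation lines such as
            this one start with whitespace.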

=item 14 link, title, link_target

These options are identical to the similarly-named options in 
the [GENERAL] section, but change the rules on a track-by-track basis.  
They can be used to override the global rules.  To force a track not
to contain any links, use a blank value.

=item 15 box_subparts

If this option is greater than zero, then gbrowse will generate
imagemap rectangles for each of the subparts of a feature (e.g. the
exons within a transcript), allowing you to link each subpart
separately. The numeric value will control the number of levels of
subfeatures that the boxes will descend into. For example, if using
the "gene" glyph, set box_subparts to 2 to create boxes for the whole
gene (level 0), the mRNAs (level 1) and the exons (level 2).
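
For example, with a track drawn using the "gene" glyph (the track
name and feature type below are placeholders for this sketch):

 [Gene Models]
 feature      = gene
 glyph        = gene
 box_subparts = 2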

=item 16 feature_low

If this option is present, GBrowse will use the list of feature types
listed here at low resolution views.  (This is one of the ways that
semantic zooming is implemented.)  This allows you, for example,
to switch off detailed exon, UTR, promoters and other within-the-gene
features, and just show the start and stop of the transcription
unit.

=item 17 global feature

If this option is present and set to a true value (e.g. "1"), GBrowse 
will automatically generate a pseudo-feature that starts at the
beginning of the currently displayed region and extends to the end.  
This is often used in conjunction with the "translation" and "dna"
glyphs in order to display global characteristics of the sequence.
If this option is set, then you do not need to specify a "feature"
option.

=item 18 group_pattern

This option lets you connect related features by dotted lines based
on a pattern match in the features' names.  A typical example is
connecting the 5' and 3' read pairs from ESTs or plasmids.  See
GROUPING FEATURES for details.

=item 19 group_on

For Bio::DB::SeqFeature::Store databases I<only>, the group_on field
allows you to group features together by display_name, target or any
other method. This is mostly useful for XY-plot data, where you may
want to dynamically group related data points together so that they
share the same vertical scaling.

Example:

	group_on = display_name

(this feature is under refinement and may change in the future)

=item 20 restrict

This option allows you to restrict who is allowed to view the current
track by host name, IP address or username/password.  See
AUTHENTICATION AND AUTHORIZATION for details.

=item 21 category

This option allows you to group tracks into different groups on the
GBrowse display in addition to the default group called 'General'.
For example, if you wanted several tracks to be in a separate group
called "Genes", you would add this to each of the track definitions:

  category = Genes

Note that it is not possible to make subcategories. If all tracks are
categorized, then the "General" category will not be displayed.

=item 22 das category, das landmark, das subparts, das superparts

All these options pertain to exporting the GBrowse database as a DAS
data source.  Please see L<DAS_HOWTO> for more information.

=back

A large number of glyph-specific options are also recognized.  These
are described in the next section.

=head2 B4. Glyphs and Glyph Options

A large variety of glyphs are available, and more are being added as
the Bio::Graphics module grows.

A list of the common glyphs and their options is provided by
GBrowse itself.  Click on the "[Help]" link in the section labeled
"Upload your own annotations".  This page also lists the valid
foreground and background colors.  Most of the glyphs are found in the
BioPerl distribution, but a few are distributed directly with GBrowse.

The most popular glyph types are:

  Glyph			Description
  -----                 -----------

  generic		a rectangle
  allele_tower          allele found at a SNP position
  arrow			an arrow
  anchored_arrow        a span with vertical bases |---------|.  If one
                        or the other end of the feature is off-screen, the
                        base will be replaced by an arrow.
  box                   another rectangle; doesn't show subparts of features
  cds                   shows the reading frame of spliced transcripts; used
                        in conjunction with the "coding" aggregator.
  diamond		a point-like feature represented as a diamond
  dna                   DNA and GC content
  heterogeneous_segments a multi-segmented feature in which each segment can
                        have a distinctive color.  For Jim Kent's WABA features,
                        this works with the waba_alignment aggregator.
  idiogram              this takes specially-formatted feature data and turns it
			into an idiogram of a Giemsa-stained metaphase chromosome
  image                 this embeds photographic images and/or diagrams on features
  processed_transcript  multi-purpose representation of a spliced mRNA, including
                        positions of UTRs
  segments		a multi-segmented feature such as an alignment
  span                  like anchored_arrow, except that the ends are
                        truncated at the edge of the panel, not turned
                        into an arrow
  trace			reads an SCF trace file and draws a graphic representation
  triangle		a point-like feature represented as a triangle
  transcript		a gene model
  transcript2		a slightly different representation of a gene model
  translation		1-, 3- and 6-frame translations
  wormbase_transcript	yet another gene model that can show UTR segments
			(for features that conform to the WormBase gene
			schema). Used in conjunction with the
			"wormbase_gene" aggregator.
  xyplot                histograms and line plots

A more definitive list of glyph options can be found in the
Bio::Graphics manual pages.  Consult the manual pages for the
following modules:

  Glyph				Manual Page
  -----                         -----------

  (common options for all)      Bio::Graphics::Glyph
  allele_tower                  Bio::Graphics::Glyph::allele_tower
  anchored_arrow                Bio::Graphics::Glyph::anchored_arrow
  arrow				Bio::Graphics::Glyph::arrow
  box                           Bio::Graphics::Glyph::box
  cds				Bio::Graphics::Glyph::cds
  crossbox			Bio::Graphics::Glyph::crossbox
  diamond			Bio::Graphics::Glyph::diamond
  dna                           Bio::Graphics::Glyph::dna
  dot                           Bio::Graphics::Glyph::dot
  ellipse			Bio::Graphics::Glyph::ellipse
  extending_arrow		Bio::Graphics::Glyph::extending_arrow
  generic			Bio::Graphics::Glyph::generic
  graded_segments		Bio::Graphics::Glyph::graded_segments
  heterogeneous_segments	Bio::Graphics::Glyph::heterogeneous_segments
  idiogram			Bio::Graphics::Glyph::idiogram
  image 			Bio::Graphics::Glyph::image
  line				Bio::Graphics::Glyph::line
  primers			Bio::Graphics::Glyph::primers
  processed_transcript          Bio::Graphics::Glyph::processed_transcript
  rndrect			Bio::Graphics::Glyph::rndrect
  ruler_arrow			Bio::Graphics::Glyph::ruler_arrow
  segments			Bio::Graphics::Glyph::segments
  span                          Bio::Graphics::Glyph::span
  toomany			Bio::Graphics::Glyph::toomany
  trace				Bio::Graphics::Glyph::trace
  transcript			Bio::Graphics::Glyph::transcript
  transcript2			Bio::Graphics::Glyph::transcript2
  translation			Bio::Graphics::Glyph::translation
  triangle			Bio::Graphics::Glyph::triangle
  wormbase_transcript		Bio::Graphics::Glyph::wormbase_transcript
  xyplot                        Bio::Graphics::Glyph::xyplot

The "perldoc" command is handy for reading the documentation from the
Unix command line.  For example:

   perldoc Bio::Graphics::Glyph::primers

This will provide you with a summary of the options that apply to the
"primers" glyph.

In the manual pages, the glyph options are presented the way they are
called from Perl.  For example, the documentation will tell you to use
the -connect_color option to set the color to use when drawing the
line that connects the two inward pointing arrows in the primer pair
glyph.  This translates to the configuration file as an option named
"connect_color".  For example:

 [PCR Products]
 glyph = primers
 connect_color = blue

When referring to colors, you can use a variety of color names such as
"blue" and "green".  To get the full list, cut and paste the following
magic incantation into the command line:

 perl -MBio::Graphics::Panel -e 'print join "\n",Bio::Graphics::Panel->color_names'

or see this URL:

  http://www.wormbase.org/db/seq/gbrowse?help=annotation

Alternatively, you can use the #RRGGBB notation to specify the red,
green and blue components of the color.  Refer to any book on HTML for
the details on using the notation.

=head2 B5. Adding features to the overview

You can make any set of tracks appear in the overview by creating a
stanza with a title of the format [<label>:overview], where <label> is
any unique label of your choice.  The format of the stanza is
identical to the others, but the indicated track will appear in the
overview rather than as an option in the detailed view.  For example,
this stanza adds to the overview a set of features of method "gene",
source "framework":

 [framework:overview]
 feature       = gene:framework
 label         = 1
 glyph         = generic
 bgcolor       = lavender
 height        = 5
 key           = Mapped Genes

Similarly, you can make a track appear in the region panel by
appending ":region" to its name:

 [genedensity:region]
 feature       = gene_density
 glyph         = xyplot
 graph_type    = boxes
 scale         = right
 bgcolor       = red
 fgcolor       = red
 height        = 20
 key           = Gene Density

=head2 B6. Semantic Zooming

Sometimes you will want to change the appearance of a track when the
user has zoomed out or zoomed in beyond a certain level.  To indicate
this, create a set of "length qualified" stanzas of format
[<label>:<zoom level>], where all stanzas share the same <label>, and
<zoom level> indicates the minimum size of the region that the stanza
will apply to.  For example:

  [gene]
  feature = transcript:curated
  glyph    = dna
  fgcolor  = blue
  key      = genes
  citation = example semantic zoom track

  [gene:500]
  feature = transcript:curated
  glyph   = transcript2

  [gene:100000]
  feature = transcript:curated
  glyph   = arrow

  [gene:500000]
  feature = transcript:curated
  glyph   = generic

This series of stanzas says to use the "transcript2" glyph when the
segment being displayed is 500 bp or longer, to use the "arrow" glyph
when the segment being displayed is 100,000 bp or longer, and the
"generic" glyph when the region being displayed is 500,000 bp or
longer.  For all other segment lengths (1 to 499 bp), the ordinary
[gene] stanza will be consulted, and the "dna" glyph will be
displayed.  The bare [gene] stanza is used to set all but the
"feature" options for the other stanzas.  This means that the fgcolor,
key and citation options are shared amongst all the [gene:XXXX]
stanzas, but the "feature" option must be repeated.
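
Put another way, the stanzas above divide the range of possible
region lengths as follows:

  Region length (bp)    Stanza used     Glyph
  ------------------    -----------     -----
  1 - 499               [gene]          dna
  500 - 99,999          [gene:500]      transcript2
  100,000 - 499,999     [gene:100000]   arrow
  500,000 and up        [gene:500000]   generic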

You can override any options in the length qualified stanzas.  For
example, if you want to change the color to red when displaying
genes on segments between 500 and 99,999 bp, you can modify the
[gene:500] stanza as follows:

  [gene:500]
  feature = transcript:curated
  glyph   = transcript2
  fgcolor = red

It is also possible to display different features at different zoom
levels, although you should handle this potentially confusing feature
with care.
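
A minimal sketch of this, assuming the database contains both
"exon:curated" and "gene:curated" features (hypothetical type names
chosen for this example):

  [Models]
  feature = exon:curated
  glyph   = generic
  key     = Gene models

  [Models:100000]
  feature = gene:curated
  glyph   = arrow

Here the individual exons are drawn until the displayed region
reaches 100,000 bp, after which only whole genes are shown.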

If you wish to turn off a track entirely, you can use the "hide" flag
to hide the track when the display exceeds a certain size:

  [6_frame_translation:50000]
  hide = 1

=head2 B7. Computed Options

Some options can be computed at run time by using Perl subroutines as
their values. These are known as "callbacks." Currently this works
with the values of the "link", "title", "link_target", "header" and
"footer" options, and any glyph-specific option that appears in a
track section.

You need to know the Perl programming language to take advantage of
this.  The general format of this type of option is:

  option name = sub {
	      some perl code;
	      some more perl code;
	      even more perl code;
	      }

The value must begin with the sequence "sub {" in order to be
recognized as a subroutine declaration.  After this, you can have one
or more lines of Perl code followed by a closing brace.  Continuation
lines must begin with whitespace.

When the browser first encounters an option like this one, it will
attempt to compile it into Perl runtime code.  If successful, the
compiled code will be stored for later use and invoked whenever the
value of the option is needed. (Otherwise, an error message will
appear in your server error log).

For options of type "footer" and "header", the subroutine is passed no
arguments.  It is expected to produce some HTML and return it as a
string value.
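
For example, a footer callback might look like the sketch below.  The
HTML and the wording are placeholders, and localtime is used only to
show that the value is computed when the subroutine is invoked:

  footer = sub {
             my $when = localtime;
             return "<hr><p>Page generated on $when</p>";
           }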

For glyph-specific options, such as "bgcolor", the subroutine will be
called at run time with five arguments consisting of the feature, the
name of the option, the current part number of the feature, the total
number of parts in this feature, and the glyph corresponding to the
feature. Usually you will just look at the first argument.  The return
value is treated as the value of the corresponding option.  For
example, this bgcolor subroutine will call the feature's primary_tag()
method, and return "blue" if it is an exon, "orange" otherwise:

  bgcolor = sub {
	  my $feature = shift;
	  return "blue" if $feature->primary_tag eq 'exon';
	  return "orange";
	  }

See the manual page for Bio::DB::GFF::Feature for information on how
to interrogate the feature object.
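
As a further sketch, the same pattern works with other accessors.
This one assumes the feature object provides the standard BioPerl
strand() method, which returns a negative number for features on the
reverse strand:

  fgcolor = sub {
          my $feature = shift;
          return (($feature->strand || 0) < 0) ? "red" : "blue";
          }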

For special effects, such as coloring the first and last exons
differently, you may need access to all five arguments. Here is an
example that draws the first and last parts of a feature in blue and
the rest in red:

   sub { 
	 my($feature,$option_name,$part_no,$total_parts,$glyph) = @_;
	 return 'blue' if $part_no == 0;                # zero-based indexing!
	 return 'blue' if $part_no == $total_parts-1;   # zero-based indexing!
	 return 'red';
	 }

See the Bio::Graphics::Panel manual page for more details.

Callbacks for the "link", "title", and "link_target" options have a
slightly different call signature. They receive three arguments
consisting of the feature, the Bio::Graphics::Panel object, and the
Bio::Graphics::Glyph object corresponding to the current track within
the panel:

  link = sub {
	     my ($feature, $panel, $track) = @_;
	     # ... compute and return the URL here
	     }

Ordinarily you will only need to use the feature object. The other
arguments are useful to look up panel-specific settings such as the
pixel width of the panel or the state of the "flip" setting:

  title = sub {
	  my ($feature,$panel,$track) = @_;
	  my $name = $feature->display_name;
	  return $panel->flip ? "$name (flipped)" : $name;
       }
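
A fuller link callback might look like the following sketch.  The
example.org address is a placeholder, and the code assumes that
CGI::escape() can be called from the callback in the same way as the
CGI::escapeHTML() function mentioned under the "title" option:

  link = sub {
             my ($feature, $panel, $track) = @_;
             my $name = CGI::escape($feature->display_name);
             return "http://www.example.org/lookup?name=$name";
             }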


=head3 Named Subroutine References

If you use a version of BioPerl after April 15, 2003, you can also use
references to named subroutines as option arguments.  To use named
subroutines, add an init_code section to the [GENERAL] section of the
configuration file.  init_code should contain nothing but subroutine
definitions and other initialization routines.  For example:

  init_code = sub score_color {
	        my $feature = shift;
	        if ($feature->score > 50) { 
		  return 'red';
	        } else {
		  return 'green';
	        }
	      }
	      sub score_height {
	        my $feature = shift;
	        if ($feature->score > 50) { 
		  return 10;
	        } else {
		  return 5;
	        }
	      }

Then simply refer to these subroutines using the \&name syntax:

    [EST_ALIGNMENTS]
    glyph = generic
    bgcolor = \&score_color
    height  = \&score_height

You can declare global variables in the init_code section if you
use "no strict 'vars';" at the top of the section:

    init_code = no strict 'vars';
                $HEIGHT = 10;
	        sub score_height {
	          my $feature = shift;
		  $HEIGHT++;
	          if ($feature->score > 50) { 
		    return $HEIGHT*2;
	          } else {
		    return $HEIGHT;
	          }
	        }

Due to the way the configuration file is parsed, there must be no
empty lines in the init_code section.  Either use comments to
introduce white space, or "use" a .pm file to do anything fancy.
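
As a sketch of the second approach, the subroutines could live in a
module of your own and be loaded from init_code.  The module name
MyCallbacks and the directory are hypothetical, and the module must
be installed somewhere that the web server can read:

    init_code = use lib '/usr/local/lib/gbrowse';
                use MyCallbacks;

    [EST_ALIGNMENTS]
    height = \&MyCallbacks::score_height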

Subroutines that you define in the init_code section, as well as
anonymous subroutines, will go into a package that changes
unpredictably each time you load the page. If you need a predictable
package name, you can define it this way:

   init_code = package My; sub score_height { .... }

   [EST_ALIGNMENTS]
   height = \&My::score_height

=head2 B8. Declaring New Aggregators

The Bio::DB::GFF data model recognizes a single level of "grouping" of
features, but doesn't specify how to use the group information to
correctly assemble the various individual components into a biological
object.  Aggregators are used to assemble this information.  For
example, let's say that you decide that your preferred "transcript"
data model contains three subfeature types: a set of one or more
features of method "exon", a single feature of method "TSS", and a
single feature of method "polyA".  Optionally, the data model could
contain a single "main subfeature" that runs the length of the entire
transcript.  We might give this feature a method of "primary_transc"
(for "primary transcript.")

In a GFF file, a three-exon transcript might be represented as
follows:

 Chr1 confirmed primary_transc 100 500  .  +  .  Transcript "ABC.1"
 Chr1 confirmed TSS            100 100  .  +  .  Transcript "ABC.1"
 Chr1 confirmed exon           100 200  .  +  .  Transcript "ABC.1"
 Chr1 confirmed exon           250 300  .  +  .  Transcript "ABC.1"
 Chr1 confirmed exon           400 500  .  +  .  Transcript "ABC.1"
 Chr1 confirmed polyA          500 500  .  +  .  Transcript "ABC.1"

To aggregate this, you would like to create an aggregator named
"transcript", whose "main method" is "primary_transc", and whose "sub
methods" are "TSS," "exon," and "polyA."

The way to indicate this in the configuration file is to add a
"complex aggregator" to the list of aggregators:

  aggregator = transcript{TSS,exon,polyA/primary_transc}

The format of this value is
"aggregator_name{submethod1,submethod2,.../mainmethod}".  

You can now use the name of the aggregator as the argument of the
"feature" option in a track section:

  [Transcripts]
  feature      = transcript
  glyph        = segments
  bgcolor      = wheat
  fgcolor      = black
  height       = 10
  key          = Transcripts

If you do not have a main subfeature, leave off the "/mainmethod".
For example:

  aggregator = transcript{TSS,exon,polyA}

A few formatting notes.  You are free to mix simple and complex
aggregators in the "aggregator" option.  For example, you can activate
the standard "clone" and "alignment" aggregators as well as the new
transcript aggregator with a line like this one:

 aggregator = clone
              transcript{TSS,exon,polyA/primary_transc}
              alignment

If the complex aggregator contains whitespace or apostrophes, you must
surround it with double-quotes, like this:

   "transcript{TSS,5'UTR,3'UTR,exon,polyA/primary_transc}"

Be aware that some glyphs look for particular method names when
rendering aggregated features.  For example, the standard "transcript"
glyph is closely tied to the "transcript" aggregator, and looks for
submethods named "intron", "exon" and "CDS", and a main method named
"transcript."  

Here is the list of available predefined aggregators:

     alignment
     clone
     coding
     transcript
     none
     orf
     waba_alignment
     wormbase_gene

To view the documentation for any of these aggregators, run the
command "perldoc Bio::DB::GFF::Aggregator::aggregator_name", where
"aggregator_name" is the name of the aggregator.

=head2 B9. GROUPING FEATURES

gbrowse recognizes the concept of a "group" of related features that
are connected by dotted lines. The canonical example is a pair of ESTs
that are related by being from the two ends of the same cDNA clone.
However many feature databases, including the GFF database recommended
for gbrowse, do not allow for arbitrary hierarchical grouping.  To
work around this, you may specify a feature name-based regular
expression that will be used to trigger grouping.

It works like this.  Say you are working with EST feature pairs and
they follow the nomenclature 501283.5 and 501283.3, where the suffix
is "5" or "3" depending on whether the read was from the 5' or 3' ends
of the insert.  To group these pairs by a dotted line, specify the
"group_pattern" option in the appropriate track section:

      group_pattern =  /\.[53]$/

At render time, gbrowse will strip off this pattern from the names of
all features in the EST track and group those that have a common base
name.  Hence 501283.5 and 501283.3 will be grouped together by a
dotted line, because after the pattern is removed, they will share the
same common name "501283".

This works for any embedded pattern, provided that stripping out the
pattern results in related features sharing the same name.  For
example, if the convention were "est.for.501283" and "est.rev.501283",
then this grouping pattern would have the desired effect:

      group_pattern = /\.(for|rev)\./

Don't forget to escape regular expression meta-characters and to
consider the various ways in which the regular expression might break.
It is entirely possible to create an invalid regular expression, in
which case gbrowse will crash until you comment out the offending
option.
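
Putting this together, a complete track stanza using the first
pattern might look like the sketch below.  The feature type and the
display options are assumptions made for this example:

  [ESTs]
  feature       = EST_match
  glyph         = segments
  group_pattern = /\.[53]$/
  key           = Paired ESTs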

=head2 B10. Controlling the gbrowse_details page

If a track definition's "link" option (see section B2) is set to AUTO,
the gbrowse_details script will be invoked when the user clicks on a
feature contained within the track.  This will generate a simple table
of all feature information available in the database.  This includes
the user-defined tag/value attributes set in Column 9 of the GFF for
that feature.

You can control, to some extent, the formatting of the tag value table
by providing a configuration stanza with the following format:

  [feature_type:details]
  tag1 = formatting rule
  tag2 = formatting rule
  tag3 = formatting rule

"feature_type" is the type of the feature you wish to control. For
example, "gene:sgd" or simply "gene". You may also specify a
feature_type of "default" to control the formatting for all
features. "tag1", "tag2" and so forth are the tags that you wish to
control the formatting of. The tags "Name," "Class", "Type", "Source",
"Position", and "Length" are valid for all features, while "Target"
and "Matches" are valid for all features that have a target
alignment. In addition, you can use the names of any attributes that
you have defined. Tag names are NOT case sensitive, and you may use a
tag named "default" to define a formatting rule that is general to all
tags (more specific formatting rules will override less specific ones).

A formatting rule can be a string with (possible) substitution values,
or a callback. If a string, it can contain one or more of the
substitution variables "$name", "$start", "$end", "$stop", "$strand",
"$method", "$type", "$description" and "$class", which are replaced
with the corresponding values from the current feature. In addition,
the substitution variable "$value" is replaced with the current value
of the attribute, and the variable "$tag" is replaced with the current
tag (attribute) name. HTML characters are passed through.

For example, here is a simple way to boldface the Type field,
italicize the Length field, and turn the Notes into a Google
search:

 [gene:details]
 Type   = <b>$value</b>
 Length = <i>$value</i>
 Note  = <a href="http://www.google.com/search?q=$value">$value</a>
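
The "default" feature type and the "default" tag can be combined in
the same way.  In this sketch, which uses arbitrary markup, every
value of every feature is shown in a fixed-width font unless a more
specific rule applies:

 [default:details]
 default = <tt>$value</tt>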

If you provide a callback, the callback subroutine will be invoked
with three arguments. WARNING: the three arguments are different from
the ones passed to other callbacks, and consist of the tag value, the
tag name, and the current feature:

  Note = sub {
	     my($value,$tag_name,$feature) = @_;
	     # ... format $value and return the resulting string
	     }

You can use this feature to format sequence attributes nicely. For
example, if your features have a Translation attribute which contains
their protein translations, then you are probably unsatisfied with
the default formatting of these features. You can modify this with a
callback that word-wraps the value into lines of at most 60
characters, and puts the whole thing in a <pre> section.

 [gene:details]
 Translation = sub {
     		my $value = shift;
		$value =~ s/(\S{1,60})/$1\n/g;
		"<pre>$value</pre>";
	     }

=head2 B11. Linking out from gbrowse_details

The formatting rule mechanism described in the previous section is the
recommended way of creating a link out from the gbrowse_details
page. However, an older mechanism is available for backward
compatibility.

To use this legacy mechanism, create a stanza header named
[TagName:DETAILS], where TagName is the name of the tag (attribute
name) whose values you wish to turn into URLs, and where DETAILS must
be spelled with capital letters. Put the option "URL" inside this
stanza, containing a string to be transformed into the URL.

For example, to link to a local cgi script from the following GFF line:
    
 IV	curated	exon	518	550	. + .   Transcript B0273.1; local_id 11723


one might add the following stanza to the configuration file:
    
    [local_id:DETAILS]
    URL   = http://localhost/cgi-bin/localLookup.cgi?tag=$tag;id=$value

The URL option's value should be a URL containing one or more
variables.  Variables begin with a dollar sign ($), and are
replaced at run time with the information relating to the selected
feature attribute.  Recognized variables are:

     $tag        The "tag" of the tag/value pair
     $value      The "value" of the tag/value pair

The value of URL can also be an anonymous subroutine, in which case
the subroutine will be invoked with a two-element argument list
consisting of the name of the tag and its value. This example,
provided by Cyril Pommier, will convert Dbxref tags into links to
NCBI, provided that the value of the tag looks like an NCBI GI number:

 [Dbxref:DETAILS]
 URL = sub {
         my ($tag,$value) = @_;
         if ($value =~ /NCBI_gi:(.+)/) {
           return "http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?term=$1";
         }
         return;
       }
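
For comparison, the local_id link from the first example could
probably be expressed with the recommended formatting-rule mechanism
instead. This is only a sketch: it assumes that the exon features are
covered by a details stanza named [exon:details], which the example
above does not specify.

 [exon:details]
 local_id = <a href="http://localhost/cgi-bin/localLookup.cgi?tag=$tag;id=$value">$value</a>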

=head1 C. GENERATING HISTOGRAMS

With a little bit of additional effort, you can set one or more tracks
up to display a density histogram of the features contained within the
track.  For example, the human data source in the GBrowse demo
(http://www.wormbase.org/db/seq/gbrowse/human) uses density
histograms in the chromosomal overview.  In addition, when the
features in the SNP track become too dense to view, this track
converts into a histogram.  To see this in action, turn on the SNP
track and then zoom out beyond 150K.

There are four steps for making histograms:

=over

=item 1

Generate the density data using the bp_generate_histogram.pl script.

=item 2

Load the density data using load_gff.pl or fast_load_gff.pl.

=item 3

Declare a density aggregator in the gbrowse configuration file.

=item 4

Add the density aggregator to the appropriate track in the configuration file.

=back

The first step is to generate the density data.  Currently this is
done by generating a GFF file containing a set of "bin" feature
types.  Use the bp_generate_histogram.pl script to do this.  You will
find it in bioperl under the scripts/Bio-DB-GFF directory.

Assuming that your database is named "dicty", you have a feature named
SNP, and you wish to generate a density distribution across 10,000 bp
bins, here is the command you would use:

  bp_generate_histogram.pl -merge -d dicty -bin 10000 SNP >snp_density.gff

This tells the script to use the "dicty" database (the -d option), to
use 10,000 bp bins (the -bin option) and to count the occurrences of
the SNP feature throughout the database.  In addition, the -merge
option says to merge all types of SNPs into a single bin.  Otherwise
they will be stratified by their source.  The resulting GFF file
contains a series of entries like these:

  Chr1	SNP bin 1     10000 49 + . bin Chr1:SNP
  Chr1	SNP bin	10001 20000 29 + . bin Chr1:SNP

These lines describe a series of pseudo-features of type "bin:SNP"
that occupy successive 10,000 bp regions of the genome.  The score
field contains the number of times a SNP was seen in that bin.
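
To make the binning idea concrete, here is a minimal Perl sketch that
counts positions into fixed-size bins and prints GFF-like lines.  It
is not the implementation used by bp_generate_histogram.pl; the
bin_counts() helper and the SNP positions are invented for
illustration.

 #!/usr/bin/perl
 use strict;
 use warnings;

 # Hypothetical helper: returns one [start, end, count] triple for
 # each non-empty bin, in ascending order of position.
 sub bin_counts {
     my ($bin_size, @positions) = @_;
     my %count;
     $count{ int(($_ - 1) / $bin_size) }++ for @positions;
     return map { [ $_ * $bin_size + 1, ($_ + 1) * $bin_size, $count{$_} ] }
            sort { $a <=> $b } keys %count;
 }

 # Made-up SNP positions on a single reference sequence
 my @snps = (15, 9_800, 10_000, 10_001, 19_999);
 for my $bin (bin_counts(10_000, @snps)) {
     print join("\t", 'Chr1', 'SNP', 'bin', @$bin, '+', '.', 'bin Chr1:SNP'), "\n";
 }

With 1-based coordinates, positions 1 through 10,000 fall into the
first bin and 10,001 through 20,000 into the second, matching the
layout of the entries shown above.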

You'll now load this file using load_gff.pl or fast_load_gff.pl:

  load_gff.pl -d dicty snp_density.gff

The next step is to tell GBrowse how to use this information.  You do
this by creating a new aggregator for the SNP density information.
Open the GBrowse configuration file and find the aggregators option.
Add a new aggregator that looks like this:

  aggregators = snp_density{bin:SNP}

This is declaring a new feature named "snp_density" that is composed
of subparts of type bin:SNP.

The last step is to declare a track for the density information.  You
will use the "xyplot" glyph, which can draw a number of graphs,
including histograms.  To add the SNP density information as a static
track in the overview, create a section like this one:

 [SNP:overview]
 feature       = snp_density
 glyph         = xyplot
 graph_type    = boxes
 scale         = right
 bgcolor       = red
 fgcolor       = red
 height        = 20
 key           = SNP Density

This is declaring a new constant track in the overview named "SNP
Density."  The feature is "snp_density", corresponding to the
aggregator declared earlier.  The glyph is "xyplot" using the graph
type of "boxes" to generate a column graph.

To set up a track so that the histogram appears when the user zooms
out beyond 100,000 bp but shows the detailed information at higher
magnifications, generate two track sections like these:

  [SNPs]
  feature       = snp
  glyph         = triangle
  point         = 1
  orient        = N
  height        = 6
  bgcolor       = blue
  fgcolor       = blue
  key           = SNPs

  [SNPs:100000]
  feature       = snp_density
  glyph         = xyplot
  graph_type    = boxes
  scale         = right

The first track section sets up the defaults for the SNP track.  SNPs
are represented as blue triangles pointing North.  The second track
declaration declares that when the user zooms out to over 100K base
pairs, GBrowse should display the snp_density feature using the xyplot
glyph.

=head1 D. INTERNATIONALIZATION

GBrowse is partially internationalized.  End-users whose browsers are
set to request a non-English language will see the GBrowse main and
secondary screens in their preferred language, provided that GBrowse
has the appropriate translation file.

Translation files are located in gbrowse.conf/languages/ and use the
standard two-letter language abbreviations, such as "fr" for French,
as well as the regional abbreviations, such as fr-CA for Canadian
French.  Currently there are translation files for French, Italian,
and Japanese.  If your favorite language isn't supported, you are
encouraged to create a new translation file and contribute it to the
GBrowse development effort.  Please contact Lincoln Stein
(lstein@cshl.org) for help in doing this.

If the end user does not specify a preferred language, GBrowse will
default to "en" (English).  You can change this by placing a
"language" option in the configuration file somewhere inside the
[GENERAL] section.  For example, to make Japanese the default, create
this entry:

  language = ja

GBrowse will still use the end-user's preferred language in preference
to the default if the preferred language is available.

Although GBrowse automatically changes the text and button language,
it can't automatically translate the track labels.  If you would like
the track labels to localize, you will have to provide your own
translations in the "key", "citation" and "category" options.  The
syntax is similar to that used for semantic zooming:

  [gene]
  glyph   = transcript
  feature = transcript:curated
  height  = 10
  key     = Named Gene
  key:fr  = Gènes Nommés
  key:it  = I Geni dati un nome a
  key:es  = Los Genes denominados
  category = Genes
  category:fr = Gènes

Follow the option name with a colon and the two-letter language code
to give the value that should be used when the page is displayed in
that language.  The option without the colon
is the default.  You may enter accented and umlauted characters
directly, as shown, or use the HTML entities.  Non-English character
sets, such as Japanese, should also work correctly, provided that the
translation file indicates the correct character set to use.

HELP FILES:

The GBrowse help files are in English.  Although there is support for
internationalizing the help files, no one has done this yet.  If you
are industrious and wish to translate the help files into your
favorite language, you will find the two help files in
htdocs/gbrowse/.  One is named general_help.html, while the other is
named annotation_help.html.  Translate them, and create new files with
the language code appended to the end.  For example, the French
translation of annotation_help.html would be annotation_help.html.fr.

LIMITATIONS:

- There is no localization support. For example, GBrowse will print
large numbers using commas (e.g. 1,234,567) instead of periods, even
when talking to a European browser.

- Although the HTML frame around the GBrowse genome image will use the
appropriate character set, the overview and detail images themselves
are limited to Latin alphabets.  This is because of limited native
character support in the GD library used by GBrowse.  When a non-Latin
character set is called for, such as Japanese, GBrowse will use
Japanese for the frame, but English for the image.

- The rate at which the GBrowse team adds new features to the browser
often outstrips the ability of volunteers to update the translation
files.  This means that new buttons and fields may be displayed in
English on an otherwise correctly internationalized page.

=head1 E. AUTHENTICATION AND AUTHORIZATION

You can restrict who has access to gbrowse by IP address, host name,
domain or username and password.  Restriction can apply to the
database as a whole, or to particular annotation tracks.

To limit access to a whole database, you can use Apache's standard
authentication and authorization.  Gbrowse uses a URL of this form to
select which database it is set to:

      http://your.host/cgi-bin/gbrowse/your_database

where "your_database" is the name of the currently selected database.
For example, the yeast database is
http://your.host/cgi-bin/gbrowse/yeast.

To control access to the entire database, create a <Location> section
in httpd.conf.  The <Location> section should look like this:

   <Location /cgi-bin/gbrowse/your_database>
	Order deny,allow
	deny from all
	allow from localhost .cshl.edu .ebi.ac.uk
   </Location>

This denies access to everybody except for "localhost" and browsers
from the domains .cshl.edu and .ebi.ac.uk.  You can also limit by IP
address, by username and password or by combinations of these
techniques.  See http://httpd.apache.org/docs/howto/auth.html for
the full details.

You can also limit individual tracks to certain individuals or
organizations.  Unless the stated requirements are met, the track will
not appear on the main screen or any of the configuration screens.  To
set this up, add a "restrict" option to the track you wish to make
off-limits:

	[PROPRIETARY]
	feature = etc
	glyph   = etc
	restrict = Order deny,allow
		   deny from all
		   allow from localhost .cshl.edu .ebi.ac.uk

The value of the restrict option is identical to the Apache
authorization directives and can include any of the directives
"Order," "Satisfy," "deny from," "allow from," "require valid-user" or
"require user."  The only difference is that the "require group"
directive is not supported, since the location of Apache's group file
is not passed to CGI scripts.  Note that username/password
authentication must be turned on in httpd.conf and the user must have
successfully authenticated himself in order for the username to be
available.

As with other gbrowse options, restrict can be a code subroutine.  The
subroutine will be called with three arguments consisting of the host,
ip address and authenticated user.  It should return a true value to
allow access to the track, or a false value to forbid it.  This can be
used to implement group-based authorization or more complex schemes.

Here is an example that uses the Text::GenderFromName module to allow access
if the user's name sounds female and forbids access if the name sounds
male.  (It might be useful for an X-chromosome annotation site.)

    restrict = sub {
	       my ($host,$ip,$user) = @_;
	       return unless defined $user;
	       use Text::GenderFromName qw(gender);
	       return gender($user) eq 'f';
	     }
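
The group-based authorization mentioned above could be sketched along
the following lines.  The track name and the list of user names are
hypothetical; in practice you might read the group members from a file
of your own.

    [PROPRIETARY]
    feature  = etc
    glyph    = etc
    restrict = sub {
                 my ($host,$ip,$user) = @_;
                 return unless defined $user;
                 # hypothetical group table; replace with your own user names
                 my %annotators = map { $_ => 1 } qw(alice bob carol);
                 return $annotators{$user};
               }

The subroutine returns a true value only for authenticated users who
appear in the table, so everyone else is denied access to the track.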

You should be aware that the username will only be defined if username
authentication is turned on and the user has successfully
authenticated against Apache's user database using the correct
password.  In addition, the hostname will only be defined if
HostnameLookups have been turned on in httpd.conf.  If they are turned
off, you can convert the IP address into a hostname inside the
subroutine using this piece of code:

    use Socket;
    $host = gethostbyaddr(inet_aton($ip),AF_INET);

Note that this may slow down the response time of gbrowse noticeably
if you have a slow DNS name server.

Another thing to be aware of when restricting access to an entire
database is that even though the database itself will not be
accessible to unauthorized users, the name of the database will still
be available from the popup "Data Source" menu.  If you wish even the
name to be suppressed from view by unauthorized users, add the
following line to the [GENERAL] section of the configuration file of
the database you wish to suppress:

    restrict = require valid-user

The syntax described earlier for restricting access to tracks by
hostname, IP address or username holds true for restricting the
visibility of the database on the Data Source popup menu.

=head1 F. DISPLAYING GENETIC AND RH MAPS

GBrowse can be tweaked to make it more suitable for displaying genetic
and radiation hybrid maps.  

The main issue is that the Bio::DB::GFF database expects coordinates
to be positive integers, not fractions, but genetic and RH maps use
floating point numbers.  Working around this is a bit of an ugly hack.
Before loading your data you must multiply all your coordinates by a
constant power of 10 in order to convert them into integers.  For
example, if a genetic map uses Morgan units ranging from 0 to 1.80,
you would multiply by 100 to create a map ranging from 0 to 180.

Create a GFF file containing the markers in modified coordinates and
load it as usual.  Now you must tell GBrowse to reverse these changes.
Enter the following options into the [GENERAL] section of the
configuration file:

 units = M
 unit_divider = 100

These two options tell GBrowse to use "M" (Morgan) units, and to
divide all coordinates by 100.  GBrowse will automatically display the
scale using the most appropriate units, so the displayed map will
typically be drawn using cM units.
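
The coordinate scaling described above can be sketched in a few lines
of Perl.  The marker names, the "LG1" reference name and the
source/method columns are invented for illustration; only the
multiplication by the same constant as unit_divider comes from the
text above.

 #!/usr/bin/perl
 use strict;
 use warnings;

 use constant UNIT_DIVIDER => 100;   # must match unit_divider in the config file

 # Convert a map position (e.g. 1.80 Morgans) into an integer coordinate.
 # Rounding guards against floating point artifacts such as 28.999999...
 sub scaled {
     my $position = shift;
     return sprintf('%.0f', $position * UNIT_DIVIDER);
 }

 # Hypothetical markers: name => position in Morgans
 my %markers = (D1M1 => 0.29, D1M2 => 1.05, D1M3 => 1.80);
 for my $name (sort { $markers{$a} <=> $markers{$b} } keys %markers) {
     my $pos = scaled($markers{$name});
     print join("\t", 'LG1', 'genetic', 'marker', $pos, $pos, '.', '.', '.', "Marker $name"), "\n";
 }

Rounding rather than truncating matters here, because a product such
as 0.29 * 100 is not represented exactly in floating point.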

=head1 G. CHANGING THE LOCATION OF THE CONFIGURATION FILES

If you wish to change the location of the gbrowse.conf configuration
file directory, you must manually edit the gbrowse CGI script.  Open
the script in a text editor, and find this section:

 ###################################################################
 # Non-modperl users should change this variable if needed to point
 # to the directory in which the configuration files are stored.
 #
 use constant CONF_DIR => '/usr/local/apache/conf/gbrowse.conf';
 #
 ###################################################################

Change the definition of CONF_DIR to the desired location of the
configuration files.

An alternative, for users of mod_perl only, is to add the GBrowseConf
per-directory variable to the configuration for the directory in which
the gbrowse script lives.  This variable overrides the CONF_DIR value.
For example:

 <Directory /usr/local/apache/cgi-perl>
   SetHandler      perl-script
   PerlHandler     Apache::Registry
   PerlSendHeader  On
   Options         +ExecCGI
   PerlSetVar      GBrowseConf /etc/gbrowse.conf
 </Directory>

=head1 H. USING DAS (DISTRIBUTED ANNOTATION SYSTEM) DATABASES

You may insert features from a DAS source into any named track. Create
a stanza as usual but instead of specifying the feature type using the
"feature" option, give the desired DAS URL using the "remote feature" option:

 remote feature = http://dev.hapmap.org/cgi-perl/das/t2d_testing?type=ldblock

Because DAS sources specify the glyph and visualization options, most
of the settings such as bgcolor will be ignored. However, the track
key and citation options are honored.
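
A complete track stanza might then look like the following sketch.
The track name, key and citation text are made up for illustration;
only the "remote feature" line is taken from the example above.

 [LDBLOCKS]
 remote feature = http://dev.hapmap.org/cgi-perl/das/t2d_testing?type=ldblock
 key            = LD blocks (remote DAS source)
 citation       = These features are fetched from a remote DAS server.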

You can use the same syntax to load a GFF file or a feature file in
Gbrowse upload format into a track. Just provide a URL that returns
the desired data.

You can also run GBrowse entirely off a single DAS source. To get
this support, you must use Bio::Das version 0.90 or higher, available
from http://www.biodas.org.

A sample [GENERAL] configuration section looks like this:

 [GENERAL]
 description   = Das Example Database (dicty)
 db_adaptor    = Bio::Das
 db_args       = -source http://www.biodas.org/cgi-bin/das 
	         -dsn    dicty

The db_adaptor option must be set to "Bio::Das".  The db_args option
must contain a -source pointing to the base of the remote DAS server,
and a -dsn pointing to the name of the annotation database.

The remainder of the configuration file should be configured as
described earlier.  The following short script will return a list of
the feature types known to the remote DAS server.  You can use the
output of this script as the basis for the tracks to configure.

 #!/usr/bin/perl

 use strict;

 use Bio::Das;
 my $db = Bio::Das->new('http://localhost/cgi-bin/das'=>'dicty');
 print join "\n",$db->types;

Limitations: 

The DAS implementation does not descend into subcomponents.  For
example, if the user requests features on a chromosome, but the remote
DAS server has annotated genes using contig coordinates, then the
genes will not appear on the chromosome.

The gbrowse_details script does not provide useful information because
the DAS/1 protocol does not provide a way to retrieve attribute
information on a named feature.

=head1 I. THE BioMOBY BROWSER

The BioMOBY project aims to design and deploy platforms that
enable and simplify biological database interoperability.

To date, the MOBY-Services (MOBY-S) branch of the BioMOBY
project has published a fairly stable API that is now being
used by data providers worldwide to publish their data in an
interoperable manner.  A simple MOBY browser has been written
for Gbrowse that allows the end-user to "surf" out of their
Gbrowse view and begin exploring data related to the genomic
features displayed in Gbrowse.

Configuration of the gbrowse_moby script does, at this time,
require some VERY simple code-editing, and small modifications
to your XX.organism.conf configuration file.  These are described
in detail below:

=over

=item 1 SYNOPSIS 

In 0X.organism.conf, for example:
     
 [ORIGIN]
 link         = http://yoursite.com/cgi-bin/gbrowse_moby?source=$source&name=$name&class=$class&method=$method&ref=$ref&description=$description
 feature      = origin:Sequence
 glyph        = anchored_arrow
 fgcolor      = orange
 font2color   = red
 linewidth    = 2
 height       = 10
 description  = 1
 key          = Definition line
 link_target  = _MOBY


AND/OR


 [db_xref:DETAILS]
 URL = http://yoursite.com/cgi-bin/gbrowse_moby?namespace=$tag;id=$value


Note that all you are doing in each case is to associate a
mouse click on a particular feature type with an invocation
of the gbrowse_moby script, passing a few of the common Gbrowse
variables in the GET string.

The gbrowse_moby script will take information passed from a click on
a Gbrowse feature, or a click on a configured DETAILS GFF
attribute type, and initiate a MOBY browsing session with
information from that link.  Most information is discarded.
The only useful information to MOBY is a "namespace" and an
"id" within that namespace.

Generally speaking, namespaces in Gbrowse will have to be
mapped to a namespace in the MOBY namespace ontology (which
is derived from the Gene Ontology Database Cross-Reference
Abbreviations list).  Currently, this requires editing of the
gbrowse_moby code, where a Perl hash named %source2namespace
maps the GFF source (column 2) to a MOBY namespace:
    
  $source2namespace{$source} = moby_namespace


=item 2 REQUIRED LIBRARIES

This script requires libraries from the BioMOBY project.  Currently
these are only available from CVS.  Anonymous checkout of the
BioMOBY project can be accomplished as follows:
    
  cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/moby login

When prompted for a password, type "cvs".

  cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/moby co moby-live
  cvs update -dP

You will then need to enter the moby-live/Perl folder and run "perl
Makefile.PL; make; make install" to install the MOBY libraries into
your system.

=item 3 USAGE

gbrowse_moby understands the following variables, some of
which (*) may be passed from Gbrowse through a mouse-click
into the GET string:

 * $source      - converted into a MOBY namespace by looking up
                  the GFF 'source' tag in the %source2namespace
                  hash (see the more detailed explanation in the
                  examples below)
   $namespace   - used verbatim as a valid MOBY namespace
 * $name        - used verbatim as a MOBY id interpreted in the namespace
 * $id          - used verbatim as a MOBY id interpreted in the namespace
 * $class       - this is the GFF column 9 class; used for the page title
   $objectclass - this should be a MOBY Class ontology term
                  (becomes Class 'Object' by default, and this
                  is usually correct)
   $object      - contains the raw XML of a valid MOBY object

Note that you MUST at least pass a namespace-type variable (source/namespace)
and an id-type variable (name/id) in order to have a successful MOBY
call.

=item 4 EXAMPLES 

=over

=item Simple GFF

If your GFF were:

      A22344  Genbank  origin  1000  2000  87  +  .
 
You would set your configuration file as follows:
 
     [ORIGIN]
     link         = http://yoursite.com/cgi-bin/gbrowse_moby?source=$source&name=$name&class=$class
     feature      = origin:Genbank

and you would edit the gbrowse_moby script as follows:

      my %source2namespace = (
         #   GFF-source           MOBY-namespace
            'Genbank'       =>      'NCBI_Acc',
      );

This maps the GFF source tag "Genbank" to the MOBY namespace "NCBI_Acc".


=item GFF With non-MOBY Attributes

If your GFF were:

      A22344  Genbank origin  1000  2000 87 + . Locus CDC23

You would set your configuration file as follows:
 
     [ORIGIN]
     link         = http://yoursite.com/cgi-bin/gbrowse_moby?source=$source&name=$name&class=$class
     feature      = origin:Genbank

and you might also set a DETAILS call to handle the Locus Xref:
(notice that we use the 'source' tag to force a translation of
the foreign namespace into a MOBY namespace)

     [db_xref:DETAILS]
     URL = http://brie4.cshl.org:9320/cgi-bin/gbrowse_moby?source=$tag;id=$value

then to handle the mapping of Locus to YDB_Locus as well
as the Genbank GFF source tag you would
edit the source2namespace hash in gbrowse_moby to read:

      my %source2namespace = (
         #   GFF-source           MOBY-namespace
            'Genbank'       =>      'NCBI_Acc',
            'Locus'         =>      'YDB_Locus',
      );


=item GFF With MOBY Attributes

If your GFF were (NCBI_gi is a valid MOBY namespace):

      A22344  Genbank origin  1000  2000 87 + . NCBI_gi 118746

You would set your configuration file as follows:
 
     [ORIGIN]
     link         = http://yoursite.com/cgi-bin/gbrowse_moby?source=$source&name=$name&class=$class
     feature      = origin:Genbank

and you might also set a DETAILS call to handle the NCBI_gi Xref:
(notice that we now use the 'namespace' tag to indicate that
the tag is already a valid MOBY namespace)

     [db_xref:DETAILS]
     URL = http://brie4.cshl.org:9320/cgi-bin/gbrowse_moby?namespace=$tag;id=$value

Since there is no need to map the namespace portion, we now
only need to handle the Genbank GFF source as before:

      my %source2namespace = (
         #   GFF-source           MOBY-namespace
            'Genbank'       =>      'NCBI_Acc',
      );

=back

=item 5 HINTS

-The full listing of valid MOBY namespaces is available at:

    http://mobycentral.cbr.nrc.ca/cgi-bin/types/Namespaces

-A useful mapping to make is to put the organism name into the
Global_Keyword namespace.  This will trigger discovery of MedLine
searches for papers about that organism.

=back
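
The way the namespace-type and id-type variables combine, as described
under USAGE and EXAMPLES, can be summarized in a short Perl sketch.
This is not the code of gbrowse_moby itself: the moby_call() helper is
hypothetical, and preferring "namespace" over "source" and "id" over
"name" when both are given is an assumption of this sketch.

 use strict;
 use warnings;

 # Same kind of table as the %source2namespace hash in gbrowse_moby
 my %source2namespace = (
     'Genbank' => 'NCBI_Acc',
     'Locus'   => 'YDB_Locus',
 );

 # Hypothetical helper: returns (namespace, id), or an empty list
 # when either part is missing.
 sub moby_call {
     my %param = @_;
     my $namespace = defined $param{namespace} ? $param{namespace}
                   : defined $param{source}    ? $source2namespace{ $param{source} }
                   :                             undef;
     my $id = defined $param{id} ? $param{id} : $param{name};
     return unless defined $namespace && defined $id;
     return ($namespace, $id);
 }

 print join(':', moby_call(source    => 'Genbank', name => 'A22344')), "\n";
 print join(':', moby_call(namespace => 'NCBI_gi', id   => '118746')), "\n";

A call that supplies neither a usable namespace nor an id yields an
empty list, which mirrors the requirement that both must be passed.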

=cut

=head1 J. BioMOBY Services

A selection of services are distributed with the Gbrowse
package that will allow you to serve your underlying data
using the BioMOBY Services architecture.

To enable these, simply do the following:

=over

=item 1. Set-up and fill your database 

as per the normal Gbrowse instructions

=item 2. Edit the moby.conf file 

in the /$CONFIG/gbrowse.conf/MobyServices folder. It
should be set up as follows:

=over


=item a. Reference

Your reference sequences will be based
on some type of identifier - e.g. they will be from Genbank
or from Embl or from Flybase, etc.  Look-up the BioMOBY
namespace corresponding to the type of identifier you are
using for your Reference sequences and put that
identifier here.

-The full listing of valid MOBY namespaces is available at:

    http://mobycentral.cbr.nrc.ca/cgi-bin/types/Namespaces

=item b. authURI

You are required to identify yourself when registering MOBY
Services.  Your authURI is a URI uniquely identifying you.
This is generally your domain (e.g. flybase.org)

=item c. contactEmail

You are required to provide a contact email address at which
people can contact you regarding the services you are providing.

=item d. CGI_URL

This is simply the URL to the folder from which you are serving
your gbrowse scripts.  e.g. http://flybase.org/cgi-bin/browser/
DO NOT include the script name in this parameter!  It is the 
folder only!!

=item e. [Namespace_Class_Mappings]

This section is just a list of tuples indicating the relationship
between various entities in your database (e.g. Genes, Transcripts)
and their equivalent BioMOBY namespaces.  For example, if you are
TAIR, and you have entities in your database called "Locus", you 
would add the line:
	
	Locus = TAIR_Locus

to this section of the config file.  This will allow people who
have TAIR_Locus identifiers in-hand to discover your service and
request information about that locus from your database.

You may add as many Namespace->Class mappings as you wish; one per
line.

=back

=item 3. REGISTERING SERVICES

To register your services with the MOBY Central web service registry
simply run the "register_moby_services.pl" script, located in the 
Generic-Genome-Browser/bin folder.  The script documentation can be
retrieved with POD or simple documentation can be printed by simply 
running the script with no command-line parameters.  Generally speaking
you need only run:

  perl register_moby_services.pl -register

As services are registered they will be added to a file:
registeredMOBYServices.dat.  This file is used to de-register
your services if you wish to do so.  To deregister, simply run:

  perl register_moby_services.pl -clean

If your .dat file is not available, cleaning your services will
be unsuccessful.

=item 4. Service script

Your services are served by the script 'moby_server' in your
cgi-bin folder.  This is auto-configured by the register_services 
step above, so generally speaking you do not need to edit this
script.

=back

=cut

=head1 K. FILTERING SEARCH RESULTS 

GBrowse provides a method to filter the contents of individual tracks based
on information that can be obtained from feature attributes.  For example,
suppose you have performed a blast and added all hits as similarity features
on an entry. In gbrowse, all those features can get a little crowded.
The administrator can decide to show only the top 5 blast hits.
This can easily be accomplished by adding the filter option in the conf file.
It might look like this:

  [BLAST]
  feature       = blast
  glyph         = segments
  filter = sub {
                 my $feat = shift;
                 (my $rank) = $feat->get_tag_values('rank'); # persistent Bio::SeqFeature::Generic features
                 #(my $rank) = $feat->attributes('rank'); # Bio::DB::GFF::Feature
                 $rank < 6;
               }

Another useful example is showing features that come from a plain
GenBank file.  When loaded into BioSQL the source becomes
'EMBL/GenBank/SwissProt'.  Using the Bio::DB::Das::BioSQL adaptor you
have to pass the source to the feature option. It can be rather
difficult to distinguish all the features when they all have the same
source string. This problem can be solved using the filter option. In
the following example the features are distinguished by their
primary_tag:

  [REGION]
  feature      = EMBL/GenBank/SwissProt
  filter       = sub {
                  my $feat = shift;
                  $feat->primary_tag =~ /region/i;
                 }
  key          = RefSeq Protein Domains
                                                                                
  [SIGPEPTIDE]
  feature      = EMBL/GenBank/SwissProt
  filter       = sub {
                  my $feat = shift;
                  $feat->primary_tag =~ /sig_peptide/i;
                 }
  key          = RefSeq Signal Peptide
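
A filter can also test a numeric property of the feature.  The
following sketch keeps only features whose score is at least 100.  The
track name and the threshold are made up, and it assumes that the
features returned by your adaptor provide a score() method and carry
a score.

  [BLAST_STRONG]
  feature      = blast
  glyph        = segments
  filter       = sub {
                  my $feat  = shift;
                  my $score = $feat->score;
                  defined $score && $score >= 100;
                 }
  key          = Strong BLAST hits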


=head1 L. INVOKING GBROWSE URLs (under construction)

This section describes the public CGI parameters recognized by
GBrowse.  By setting the parameters in the URL, you can get gbrowse to
do various useful things:

=over 4

=item The source argument

The last component of the gbrowse path is the symbolic name of the
data source.  For example:

   http://www.your.site/cgi-bin/gbrowse/volvox
   http://www.your.site/cgi-bin/gbrowse/yeast
   http://www.your.site/cgi-bin/gbrowse/my_testing_database

These will correspond to config files named volvox.conf, yeast.conf and
my_testing_database.conf respectively.

As noted earlier, you can place numbers in front of the configuration
file names in order to adjust the order in which they appear in the
data source menu.

NOTE: For obscure reasons involving Internet Explorer compatibility,
gbrowse will add an extra slash to the end of the URL, resulting in
URLs that look like:

  http://www.your.site/cgi-bin/gbrowse/yeast/?q=NAB2

Don't worry about this. The URL works the same with and without the
terminal slash.

=item q

The argument "q" will set the landmark or search string:

    http://www.your.site/cgi-bin/gbrowse/yeast?q=NAB2

This will have the same effect as typing "NAB2" into the gbrowse
search box.

To go immediately to the multiple hits page (which shows hits on
several overview panels), use multiple q arguments:

   http://www.your.site/cgi-bin/gbrowse/yeast?q=NAB2;q=NPY1

Alternatively, you can use a single q parameter and separate each
landmark name with a dash:

   http://www.your.site/cgi-bin/gbrowse/yeast?q=NAB2-NPY1

The rules for specifying relative offsets and object classes are the
same as in the main search field:

   http://www.your.site/cgi-bin/gbrowse/yeast?q=Gene:NAB2:1..5000

=item ref, start, stop, end

Together the "ref," "start" and "stop" arguments specify the reference
sequence and the start and end coordinates of the region of
interest. The "q" argument, if present, overrides these settings.

The "end" argument is a synonym for "stop".

=item label

The tracks to display. This parameter must contain the track names
(i.e. the names in [brackets] in the config file) separated by "+" or
"-" characters.  For example:

   http://www.your.site/cgi-bin/gbrowse/yeast?label=ORFs-tRNAs

To use the "+" character you may have to URL escape it:

   http://www.your.site/cgi-bin/gbrowse/yeast?label=ORFs%2BtRNAs

All tracks not explicitly given by the label parameter will be closed
(disabled).
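
If you generate such URLs from a script, the escaping can be handled
in one place.  The following Perl sketch uses only core features; the
url_escape() and gbrowse_url() helpers are hypothetical and are not
part of GBrowse.

 use strict;
 use warnings;

 # Percent-encode everything except the RFC 3986 unreserved characters
 sub url_escape {
     my $text = shift;
     $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
     return $text;
 }

 # Build a URL from a base and a flat list of name/value pairs,
 # joining the pairs with ";" as in the examples in this section.
 sub gbrowse_url {
     my ($base, @params) = @_;
     my @pairs;
     while (my ($name, $value) = splice(@params, 0, 2)) {
         push @pairs, $name . '=' . url_escape($value);
     }
     return @pairs ? $base . '?' . join(';', @pairs) : $base;
 }

 print gbrowse_url('http://www.your.site/cgi-bin/gbrowse/yeast',
                   'q' => 'NAB2', 'label' => 'ORFs+tRNAs'), "\n";

Because each value is escaped individually, the "+" separator in the
track list is sent as %2B while the ";" between parameters is left
untouched.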

=item enable

Tracks to enable. The tracks indicated by this parameter will be
opened in addition to any tracks that were previously opened by the
user.  The format is the same as label:

   http://www.your.site/cgi-bin/gbrowse/yeast?enable=ORFs-tRNAs

=item disable

Tracks to close. The tracks indicated by this parameter will be
disabled. Tracks not mentioned by this parameter will keep their
previous state.  The format is the same as label:

   http://www.your.site/cgi-bin/gbrowse/yeast?disable=ORFs-tRNAs

When modifying track state, the "label" parameter is processed first,
followed by the "enable" parameter and the "disable" parameter.
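
The processing order can be pictured with a small model. This is not
gbrowse's own code, only a sketch of the order described above; the
function name and the use of Python sets are assumptions made for the
illustration.

```python
def apply_track_args(open_tracks, label=None, enable=(), disable=()):
    """Model the order in which the three track arguments take effect."""
    state = set(open_tracks)
    if label is not None:
        # "label" comes first and closes every track it does not list.
        state = set(label)
    # "enable" then opens tracks in addition to those already open.
    state |= set(enable)
    # "disable" is applied last, so it wins over the other two.
    state -= set(disable)
    return state
```

For instance, a track named in both "enable" and "disable" ends up
closed, because "disable" is processed last.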

=item flip

Whether to flip the display. If set to a true value (flip=1), then the
coordinates will be reversed so that forward strand features become
reverse strand features. If set to a false value (flip=0) or absent,
then the forward strand is displayed as per usual.

=item width

Set the width of the overview, region and details images, in pixels.

=item region_size

Set the length of the region covered by the "region" panel, in base
pairs.

=item add

A feature to superimpose on top of the current image as a new
track. Multiple add arguments are allowed. The format is:

  reference+type+name+start..stop,start..stop,start..stop

Example:

  add=chr3+EST+yk802.1+500000..500010,500100..500180

=item h_feat

The name of a feature to highlight in the format
"<feature_name>@<color_name>". Example:

      h_feat=SKT5@blue

You may omit "@color", in which case the highlight will default to
yellow. You can specify multiple h_feat arguments in order to
highlight several features with distinct colors.

Passing an argument of h_feat=_clear_ will clear all feature
highlighting.

=item h_region

The name of a region to highlight in the format
"<seq_id>:start..end@color". Example:

      h_region=Chr3:200000..250000@wheat

You may omit "@color" in which case the highlight will default to
lightgrey. You can specify multiple h_region arguments in order to
highlight multiple sequence ranges with different colors.

Passing an argument of h_region=_clear_ will clear all region
highlighting.
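
To make the format concrete, here is a sketch of a parser for h_region
values. The regular expression and the function name are assumptions
made for this example (in particular, that sequence IDs contain no ":"
or "@" and that color names are plain words); gbrowse's own parsing
may differ.

```python
import re

# seq_id:start..end, optionally followed by @color
_REGION = re.compile(
    r"^(?P<seq_id>[^:@]+):(?P<start>\d+)\.\.(?P<end>\d+)(?:@(?P<color>\w+))?$"
)

def parse_h_region(value, default_color="lightgrey"):
    """Split an h_region value into (seq_id, start, end, color)."""
    match = _REGION.match(value)
    if match is None:
        raise ValueError(f"not a valid h_region value: {value!r}")
    # Fall back to the documented default when "@color" is omitted.
    color = match["color"] or default_color
    return (match["seq_id"], int(match["start"]), int(match["end"]), color)
```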

=item ks

The position of the key in the detail panel. Possible values are
"between", "beneath", "left" and "right".

=item sk

The sort order of track names in the "Tracks" panel.  Values are
"sorted" (alphabetically sorted by name) and "unsorted" (sorted by
the order of tracks as defined in the config file).

=item track_options

If true, open up the track configuration page.

=item help

Open up the specified help page. Possible values are:

     "general"     open the general help page
     "citations"   open the track description and citation page
     "link_image"  open the page that describes how to generate an
                   embedded image of the current view
     "svg_image"   open the page that describes how to generate SVGs

=item add

Upload a feature and add it in its own track. The format is
"reference+type+name+start..end", where reference is the landmark for
the coordinates (e.g. a named gene or chromosome), type is the type of
the feature, name is the name of the feature, and start..end are the
start and end coordinates. For a feature that has multiple segments,
you may use multiple start..end ranges, separated by commas. Example:

  add=chr3+miRNA+mir144+2309229..2309300,2309501..2309589

Pass multiple "add" parameters to upload several features.
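
The format can be assembled mechanically. The helper below is a
hypothetical sketch, not part of gbrowse. It produces the argument
exactly as written in this document; depending on how the URL is sent,
the "+" characters may need to be URL escaped as described under
"label".

```python
def add_arg(reference, feature_type, name, segments):
    """Format an add= argument from a list of (start, end) pairs."""
    if not segments:
        raise ValueError("at least one start..end range is required")
    # Multi-segment features list their ranges separated by commas.
    ranges = ",".join(f"{start}..{end}" for start, end in segments)
    return "add=" + "+".join([reference, feature_type, name, ranges])
```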

"add" can be abbreviated to "a" for terseness.

=item style

Specify the style for features uploaded using "add". It is a flattened
version of the style configuration sections described in this
document. Lines are separated by "+" symbols rather than newlines. The
first word in the argument is the feature type to configure, for
example "miRNA". Subsequent option=value pairs control the glyph and
glyph options.

For example, if you have added a "miRNA" annotation, then you can tell
the renderer to use a red arrow for this glyph in this way:

   style=miRNA+glyph=arrow+fgcolor=red
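
The flattening can be sketched in a few lines. The function name
style_arg is made up for this example, and the options are simply
emitted in the order given.

```python
def style_arg(feature_type, options):
    """Flatten a feature type and its option=value pairs into style=..."""
    # The feature type comes first, followed by one option=value per "line".
    parts = [feature_type] + [f"{key}={value}" for key, value in options.items()]
    return "style=" + "+".join(parts)
```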

"style" can be abbreviated to "s" for terseness.

=item id

The id is a unique session ID that will store persistent configuration
information. You do not typically need to use the id parameter unless
you wish to upload an annotation file programmatically, in which case
you should choose some large, hard-to-guess number.

=item Upload, upload_annotations, id

These three arguments must be present in order to upload a file of
external annotations to the server. "Upload" must be a true value
(such as "1"), and "upload_annotations" will contain the content of
the uploaded file. Note that you must POST the data using MIME type
"multipart/form-data" and that the "U" in upload is capitalized.

The "id" argument is used to associate the upload with a
session. Pick some long, hard to guess number. This will be associated
stably with the uploaded file(s). To see the upload information,
provide the same number in the "id" argument every time you access
gbrowse.
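
As a rough sketch, the request could be assembled like this. The
function names are invented for the example, and the request is only
built, not sent. The sketch sends "upload_annotations" as an ordinary
form field; whether gbrowse additionally expects a filename on that
part is an assumption you should verify against your installation.

```python
import secrets
from urllib.request import Request

def new_session_id():
    """Pick a large, hard-to-guess number for the id argument."""
    return str(secrets.randbits(64))

def build_upload_request(url, session_id, annotations):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = secrets.token_hex(16)
    parts = []
    # Note the capital "U" in "Upload".
    for name, value in (("Upload", "1"),
                        ("id", session_id),
                        ("upload_annotations", annotations)):
        parts.append(f"--{boundary}\r\n"
                     f'Content-Disposition: form-data; name="{name}"\r\n'
                     "\r\n"
                     f"{value}\r\n")
    parts.append(f"--{boundary}--\r\n")
    return Request(
        url,
        data="".join(parts).encode("utf-8"),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

The resulting object can be handed to urllib.request.urlopen. Keep the
session ID, since the same value must be supplied in later requests to
see the uploaded data.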

=item eurl

Specify the URL of a remote annotation source to load into the
database. You should also supply an "id" argument, as described
earlier, in order to be able to view the annotations.

=item plugin, plugin_do

These arguments run plugins. The "plugin" argument gives the name of
the plugin to activate. The name is the last component of the plugin
package name, e.g. FastaDumper. The "plugin_do" argument selects what
to do with the plugin. Possible values are "Configure", "Find" and
"Go". "Configure" launches the plugin's configure page, "Go" runs
dumper plugins' dump operation, and "Find" activates finder plugins'
find function. For find operations, you should in most cases pass the
find string in the "q" argument, but this depends on the particular
plugin.

Each plugin may have its own set of URL arguments. A plugin's
arguments are preceded by the plugin's name. For example, the
FastaDumper plugin has a parameter named "format" which controls the
output format. So to invoke this plugin and make the output plain
text, one would provide the arguments:

 http://www.your.site/cgi-bin/gbrowse/yeast?q=NUT21;plugin=FastaDumper;
             plugin_do=Go;FastaDumper.format=text

Plugins tend not to be well documented, so you may have to read
through the source code to figure out their arguments.
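
The naming convention for plugin arguments can be sketched as follows.
The helper plugin_url is hypothetical; it only strings the arguments
together and does not check that the plugin or its options exist.

```python
def plugin_url(base, plugin, action, q=None, **plugin_args):
    """Build a URL that runs a plugin with its own prefixed arguments."""
    if action not in ("Configure", "Find", "Go"):
        raise ValueError("plugin_do must be Configure, Find or Go")
    args = []
    if q is not None:
        args.append(f"q={q}")
    args += [f"plugin={plugin}", f"plugin_do={action}"]
    # Plugin-specific arguments are preceded by the plugin's name.
    args += [f"{plugin}.{key}={value}" for key, value in plugin_args.items()]
    return base + "?" + ";".join(args)
```

Called with the FastaDumper example above (q set to NUT21, action "Go"
and format set to "text"), this yields the same URL on a single line.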

=back

=head1 M. FURTHER INFORMATION

For further information, bug reports, etc., please consult the mailing
lists at www.gmod.org.  The main mailing list for gbrowse support is
gmod-gbrowse@lists.sourceforge.net.

Have fun!

Lincoln Stein & the GMOD development team
lstein@cshl.edu