File: InspectingResults.html

package info (click to toggle)
shasta 0.14.0-2
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 29,648 kB
  • sloc: cpp: 82,262; python: 2,348; makefile: 223; sh: 143
file content (212 lines) | stat: -rw-r--r-- 8,228 bytes parent folder | download | duplicates (4)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
<!DOCTYPE html>
<html>

<head>
<link rel=stylesheet href=style.css />
<link rel=icon href=CZI-new-logo.png />
</head>

<body>
<main>
<div class="goto-index"><a href="index.html">Table of contents</a></div>

<h1>Exploring assembly results</h1>

<p>
Shasta has functionality for exploring
details of many data structures used during assembly.
<b>It can be useful to understand assembler behavior
for testing, debugging, parameter optimization,
development of new algorithms, or for gaining insight
not available by visually examining the assembled sequence</b>
files.
When this functionality is activated, Shasta
behaves as an http server, and the user communicates with it
via a standard Internet browser.

<p>
Follow the direction below to activate this functionality. 



<h2 id=Linux>Starting the Shasta http server on Linux</h2>
<p>
The Shasta http server uses 
<a href="https://www.graphviz.org/">Graphviz</a> software to display graphs.
To install it, use one of the following commands depending on the Linux system you are using:
<ul>
<li>On <code>Debian</code>, <code>Ubuntu</code>, and derived distributions, use: <pre>sudo apt install graphviz</pre>
<li>On <code>CentOS</code>, <code>Red Hat</code>, <code>Fedora</code>, and 
related distributions, use: <pre>sudo yum install graphviz</pre>
<li>On a Linux distribution where neither of the above commands works, 
please file an 
<a href="https://github.com/paoloshasta/shasta/issues">issue</a> on the Shasta GitHub repository mentioning 
the distribution you are using. 
We will make an effort to update the documentation to cover that distribution.
</ul>

<p>
To start the Shasta http server on Linux, follow these steps:
<ol>
<li>Run an assembly with option <code>--memoryMode filesystem</code>.
When this option is used, binary data used by the assembly
are stored in memory mapped files that remain accessible after assembly is complete.
<ul>
<li>If you don't have root access via <code>sudo</code> on the machine
you are using, also use option <code>--memoryBacking disk</code>.
This is slower
but guarantees that the results remain permanently available
after an assembly completes, unless
you use <code>--command cleanupBinaryData</code>.
<li>If you do have root access via <code>sudo</code> on the machine
you are using and you want to maximize assembly performance, 
also use option <code>--memoryBacking 2M</code>,
which results in binary data being stored in memory on
the Linux <code>hugetlbfs</code> filesystem (2 MB "huge" pages).
These data only remain available until the next reboot
or until you clean them up via <code>--command cleanupBinaryData</code>,
but you can save them persistently on disk using <code>--command saveBinaryData</code>.
</ul>

<li>Run the assembler again, this time specifying option 
<code>--command explore</code>,
plus the same <code>--assemblyDirectory</code> option used for the assembly run
(default is <code>ShastaRun</code>).
</ol>
<p>
See 
<a href="Running.html#MemoryModes">here</a>
for more information on the <code>--memoryMode</code> and <code>--memoryBacking</code>
command line options.




<h2 id=macOS>Starting the Shasta http server on macOS</h2>
<p>
The Shasta http server uses 
<a href="https://www.graphviz.org/">Graphviz</a> software to display graphs.
It also needs command <code>gtimeout</code> which is part of the coreutils package.
To install them, use this command: <pre>brew install graphviz coreutils</pre>
<p>
To start the Shasta http server on macOS, follow these steps:
<ol>
<li>Run an assembly as usual. The macOS version of Shasta
always stores binary data on disk. This is slower
but guarantees that the binary data remain permanently available
after an assembly completes, unless
you use <code>--command cleanupBinaryData</code>.
<li>Run the assembler again, this time specifying option 
<code>--command explore</code>,
plus the same <code>--assemblyDirectory</code> option used for the assembly run
(default is <code>ShastaRun</code>).
</ol>



<h2>Using a browser to explore assembly results</h2>
<p>Following the above directions will start a Shasta process 
in a mode in which it behaves as an http server.
It will also start your default browser and point it to
the Shasta http server process. 
If you want to start additional browser sessions, just
point your browser to the URL shown when the Shasta http server starts,
usually <code>http://localhost:17100</code>.

<p>
The browser session will initially show an assembly summary page.
At the top you will see a navigation menu that allows you to explore
details of many assembler data structures.
For example, this allows you to look at local subgraphs
of the read graph, marker graph, and assembly graph 
(see <a href="ComputationalMethods.html">here</a> for more information).
It also provides details of sequence assembly 
and of the input reads used.
Finally, there are several useful links between 
the various data structures. For example,
you can easily navigate from the assembly graph to the marker
graph and vice versa, and from the read graph to the reads.


<p>
When you are done using the browser, remember to stop the server
using <code>Ctrl^C</code> in the command window in which you started the
server. You can restart the server later as many times as you like,
as long as the binary data remain available.


<h2 id=AccessControl>Access control</h2>
<p>
By default, the server only responds
to requests from the same user and computer 
running the server.
However, you can use the command line option 
<code>--exploreAccess</code> to relax this restriction:

<ul>
<li><code>--exploreAccess user</code> (default)
to only allow access on the local machine and to the user that started
the server. THIS DEFAULT OPTION IS THE ONLY ONE THAT GUARANTEES
THAT NOBODY ELSE WILL BE ABLE TO ACCESS YOUR ASSEMBLY.
YOU SHOULD USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.
<li><code>--exploreAccess local</code> 
to allow access to ALL USERS on the local machine.
YOU SHOULD <b>NOT</b> USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.
<li><code>--exploreAccess unrestricted</code> 
for completely unrestricted access from any user, even from
different computers on your local area network,
and potentially even from the entire Internet,
if you are not protected by a firewall.
Some firewall may, however, be necessary for this to work.
YOU SHOULD <b>NOT</b> USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.

</ul>

</ul>

<h2>Things you can do with the HTTP server</h2>
<ul>
<li>Look at a specific read, its run-length encoding and location of markers</li>
<li>Find all alignments for a given read</li>
<li>Align any two reads</li>
<li>Align a read with all other reads</li>
<li>Visualize individual alignments as a grid</li>
<li>Visualize alignment graphs</li>
</ul>
and several other things. 

<h2>Screenshots</h2>
<p>
Below are some sample screenshots obtained using 
<code>--command explore</code>.

<h2>Assembly Summary</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/1.png">
<hr>
<h2>Investigating individual reads</h2>
You can see the run-length representation of the read along with its markers.
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/2.png">
<hr>
<h2>Investigating Read Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/3.png">
<hr>
<h2>Investigating local Marker Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/4.png">
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/5.png">
<hr>
<h2>Investigate Assembly Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/6.png">
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/7.png">
<hr>

<div class="goto-index"><a href="index.html">Table of contents</a></div>
</main>
</body>
</html>