1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212
|
<!DOCTYPE html>
<html>
<head>
<link rel=stylesheet href=style.css />
<link rel=icon href=CZI-new-logo.png />
</head>
<body>
<main>
<div class="goto-index"><a href="index.html">Table of contents</a></div>
<h1>Exploring assembly results</h1>
<p>
Shasta has functionality for exploring
details of many data structures used during assembly.
<b>It can be useful to understand assembler behavior
for testing, debugging, parameter optimization,
development of new algorithms, or for gaining insight
not available by visually examining the assembled sequence</b>
files.
When this functionality is activated, Shasta
behaves as an http server, and the user communicates with it
via a standard Internet browser.
<p>
Follow the direction below to activate this functionality.
<h2 id=Linux>Starting the Shasta http server on Linux</h2>
<p>
The Shasta http server uses
<a href="https://www.graphviz.org/">Graphviz</a> software to display graphs.
To install it, use one of the following commands depending on the Linux system you are using:
<ul>
<li>On <code>Debian</code>, <code>Ubuntu</code>, and derived distributions, use: <pre>sudo apt install graphviz</pre>
<li>On <code>CentOS</code>, <code>Red Hat</code>, <code>Fedora</code>, and
related distributions, use: <pre>sudo yum install graphviz</pre>
<li>On a Linux distribution where neither of the above commands works,
please file an
<a href="https://github.com/paoloshasta/shasta/issues">issue</a> on the Shasta GitHub repository mentioning
the distribution you are using.
We will make an effort to update the documentation to cover that distribution.
</ul>
<p>
To start the Shasta http server on Linux, follow these steps:
<ol>
<li>Run an assembly with option <code>--memoryMode filesystem</code>.
When this option is used, binary data used by the assembly
are stored in memory mapped files that remain accessible after assembly is complete.
<ul>
<li>If you don't have root access via <code>sudo</code> on the machine
you are using, also use option <code>--memoryBacking disk</code>.
This is slower
but guarantees that the results remain permanently available
after an assembly completes, unless
you use <code>--command cleanupBinaryData</code>.
<li>If you do have root access via <code>sudo</code> on the machine
you are using and you want to maximize assembly performance,
also use option <code>--memoryBacking 2M</code>,
which results in binary data being stored in memory on
the Linux <code>hugetlbfs</code> filesystem (2 MB "huge" pages).
These data only remain available until the next reboot
or until you clean them up via <code>--command cleanupBinaryData</code>,
but you can save them persistently on disk using <code>--command saveBinaryData</code>.
</ul>
<li>Run the assembler again, this time specifying option
<code>--command explore</code>,
plus the same <code>--assemblyDirectory</code> option used for the assembly run
(default is <code>ShastaRun</code>).
</ol>
<p>
See
<a href="Running.html#MemoryModes">here</a>
for more information on the <code>--memoryMode</code> and <code>--memoryBacking</code>
command line options.
<h2 id=macOS>Starting the Shasta http server on macOS</h2>
<p>
The Shasta http server uses
<a href="https://www.graphviz.org/">Graphviz</a> software to display graphs.
It also needs command <code>gtimeout</code> which is part of the coreutils package.
To install them, use this command: <pre>brew install graphviz coreutils</pre>
<p>
To start the Shasta http server on macOS, follow these steps:
<ol>
<li>Run an assembly as usual. The macOS version of Shasta
always stores binary data on disk. This is slower
but guarantees that the binary data remain permanently available
after an assembly completes, unless
you use <code>--command cleanupBinaryData</code>.
<li>Run the assembler again, this time specifying option
<code>--command explore</code>,
plus the same <code>--assemblyDirectory</code> option used for the assembly run
(default is <code>ShastaRun</code>).
</ol>
<h2>Using a browser to explore assembly results</h2>
<p>Following the above directions will start a Shasta process
in a mode in which it behaves as an http server.
It will also start your default browser and point it to
the Shasta http server process.
If you want to start additional browser sessions, just
point your browser to the URL shown when the Shasta http server starts,
usually <code>http://localhost:17100</code>.
<p>
The browser session will initially show an assembly summary page.
At the top you will see a navigation menu that allows you to explore
details of many assembler data structures.
For example, this allows you to look at local subgraphs
of the read graph, marker graph, and assembly graph
(see <a href="ComputationalMethods.html">here</a> for more information).
It also provides details of sequence assembly
and of the input reads used.
Finally, there are several useful links between
the various data structures. For example,
you can easily navigate from the assembly graph to the marker
graph and vice versa, and from the read graph to the reads.
<p>
When you are done using the browser, remember to stop the server
using <code>Ctrl^C</code> in the command window in which you started the
server. You can restart the server later as many times as you like,
as long as the binary data remain available.
<h2 id=AccessControl>Access control</h2>
<p>
By default, the server only responds
to requests from the same user and computer
running the server.
However, you can use the command line option
<code>--exploreAccess</code> to relax this restriction:
<ul>
<li><code>--exploreAccess user</code> (default)
to only allow access on the local machine and to the user that started
the server. THIS DEFAULT OPTION IS THE ONLY ONE THAT GUARANTEES
THAT NOBODY ELSE WILL BE ABLE TO ACCESS YOUR ASSEMBLY.
YOU SHOULD USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.
<li><code>--exploreAccess local</code>
to allow access to ALL USERS on the local machine.
YOU SHOULD <b>NOT</b> USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.
<li><code>--exploreAccess unrestricted</code>
for completely unrestricted access from any user, even from
different computers on your local area network,
and potentially even from the entire Internet,
if you are not protected by a firewall.
Some firewall may, however, be necessary for this to work.
YOU SHOULD <b>NOT</b> USE THIS OPTION IF YOUR ASSEMBLY DATA
IS SUBJECT TO CONFIDENTIALITY RESTRICTIONS
OR IS NOT CLEARED OR CONSENTED FOR PUBLIC RELEASE.
</ul>
</ul>
<h2>Things you can do with the HTTP server</h2>
<ul>
<li>Look at a specific read, its run-length encoding and location of markers</li>
<li>Find all alignments for a given read</li>
<li>Align any two reads</li>
<li>Align a read with all other reads</li>
<li>Visualize individual alignments as a grid</li>
<li>Visualize alignment graphs</li>
</ul>
and several other things.
<h2>Screenshots</h2>
<p>
Below are some sample screenshots obtained using
<code>--command explore</code>.
<h2>Assembly Summary</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/1.png">
<hr>
<h2>Investigating individual reads</h2>
You can see the run-length representation of the read along with its markers.
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/2.png">
<hr>
<h2>Investigating Read Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/3.png">
<hr>
<h2>Investigating local Marker Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/4.png">
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/5.png">
<hr>
<h2>Investigate Assembly Graph</h2>
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/6.png">
<img style='width:95%;border:dashed;border-color:lightgrey;' src="InspectingResults-Images/7.png">
<hr>
<div class="goto-index"><a href="index.html">Table of contents</a></div>
</main>
</body>
</html>
|