<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html><head><title>BRAKER2 and AUGUSTUS</title>
<link rel="stylesheet" type="text/css" href="augustus.css">
<script src="tutorial.js" type="text/javascript"></script>
</head><body>
<h1>Using BRAKER2 and AUGUSTUS</h1>
This page describes how to use BRAKER2 and AUGUSTUS for training of AUGUSTUS and for gene prediction. <br><br>
<h2>General remarks</h2>
<ul>
<li> This tutorial is designed so that people with no Linux experience can follow it. If you get bored, you can try to get the code of this tutorial running on your own laptop.
<li> Here are some manuals:
<ul>
<li> <a href="README_augustus.TXT">AUGUSTUS readme</a>
<li> <a href="BRAKER_userguide.pdf">BRAKER manual</a>
<li> <a href="STARmanual_2.3.0.1.pdf">STAR manual</a>
</ul>
</ul>
<br>
<h2>0. Preparation </h2>
<h3>Introduction to Linux </h3>
1. Navigate to the directory <tt>/home/mstanke</tt> on the genotoul server and display the content of the file <tt>.gm_key</tt> on your screen. <br><br>
Helpful commands:
<ul>
<li> <tt>ssh</tt> Connect to a server via SSH. The command <tt>ssh genotoul.toulouse.inra.fr -l user_name</tt> should connect you to the genotoul server if you fill in your user name for <tt>user_name</tt>.
<li> <tt>pwd</tt> Print working directory. Prints out the absolute path of the directory where you are right now.
<li> <tt>cd</tt> Change directory
<ul>
<li> <tt>cd work</tt> Change to the directory <tt>work</tt> given that it is a sub-directory of your current directory.
<li> <tt>cd /</tt> Change to the root directory.
<li> <tt>cd ~</tt> Change to your home directory.
<li> <tt>cd ..</tt> Change to the parent directory of your current directory.
<li> <tt>cd /usr</tt> Change to the directory <tt>usr</tt> which is located in the root directory.
</ul>
<li> <tt>ls</tt> List directory. Lists the content of your current directory.
<ul>
<li> <tt>ls</tt> List the content of your current directory.
<li> <tt>ls work</tt> List the content of the directory <tt>work</tt> given it is a sub-directory of your current directory.
<li> <tt>ls -l</tt> Lists additional information about the content of your current directory
<li> <tt>ls -a ~</tt> List all files in your home directory, including hidden ones (files whose names begin with a dot).
<li> <tt>ll</tt> Short form for <tt>ls -l</tt> (an alias that is predefined on many, but not all, systems)
</ul>
<li> <tt>man</tt> Display reference manual of a command. Try e.g. <tt>man ls</tt>
<li> <tt>less</tt> Display a given file and scroll through it. Quit with <tt>q</tt>. Try e.g. <tt>less .bashrc</tt>
</ul>
<br>
2. Create a directory <tt>test</tt> in the directory <tt>work</tt> of your home directory. Create a file <tt>test.txt</tt> in this newly created directory. Edit this file using a text editor, i.e., write something into this file. Then delete the directory <tt>test</tt> including <tt>test.txt</tt>. <br><br>
Helpful commands:
<ul>
<li> <tt>mkdir</tt> Make directory. Creates the given directory. E.g. <tt>mkdir abc</tt> creates the directory <tt>abc</tt> as a sub-directory of your current directory.
<li> <tt>touch</tt> Change a file timestamp. The command <tt>touch file.txt</tt> changes the timestamp of a file <tt>file.txt</tt> to the current time and date. If <tt>file.txt</tt> does not exist yet, it is created.
<li> <tt>emacs</tt> Opens the editor emacs.
<ul>
<li> <tt>emacs -nw file.txt</tt> Opens the file <tt>file.txt</tt> with emacs inside the terminal. Without the option <tt>-nw</tt>, emacs is opened as a GUI, which does not work if you connected to a server using <tt>ssh</tt> without the option <tt>-X</tt>.
<li> <tt>Ctrl-x Ctrl-s</tt> Save the current file.
<li> <tt>Ctrl-x Ctrl-c</tt> Quit emacs.
</ul>
<li> <tt>rm</tt> Remove file / directory.
<ul>
<li> <tt>rm file.txt</tt> Removes the file <tt>file.txt</tt> in your current directory.
<li> <tt>rmdir dir</tt> Removes the directory <tt>dir</tt> given it is a sub-directory of your current directory and <tt>dir</tt> is empty. Plain <tt>rm dir</tt> refuses to remove a directory.
<li> <tt>rm -r dir</tt> Removes the directory <tt>dir</tt> even if it is not empty.
</ul>
</ul>
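One possible way to carry out exercise 2 is sketched below. To keep the sketch harmless, it works in a scratch directory created with <tt>mktemp</tt> (an assumption of this sketch; in the exercise itself you would use <tt>~/work</tt>) and writes into the file with <tt>echo</tt> instead of an interactive editor.

```shell
# Scratch directory standing in for ~/work (assumption of this sketch)
work=$(mktemp -d)

mkdir "$work/test"                          # create the directory test
touch "$work/test/test.txt"                 # create an empty file in it
echo "some text" > "$work/test/test.txt"    # write into the file without an editor
content=$(cat "$work/test/test.txt")        # read the file back
echo "file content: $content"

rm -r "$work/test"                          # delete test including test.txt
ls -a "$work"                               # only the entries . and .. remain
rmdir "$work"                               # remove the now empty scratch directory
```

Note that <tt>rm -r</tt> deletes without asking and without a trash folder, so it is worth checking the path with <tt>ls</tt> first.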
<br><br>
3. Execute the following commands
<pre class="code">
/usr/local/bioinfo/src/augustus/augustus-3.2.3/bin/augustus --version
augustus --version
./augustus --version
cd /usr/local/bioinfo/src/augustus/augustus-3.2.3/bin/; ./augustus --version
</pre>
These commands do the following:
<ol>
<li> Execute <tt>augustus</tt> (with the option <tt>--version</tt>) using the executable <tt>augustus</tt> in <tt>/usr/local/bioinfo/src/augustus/augustus-3.2.3/bin</tt>
<li> Execute <tt>augustus</tt> using the executable that the shell finds by searching the directories listed in the environment variable <tt>$PATH</tt>
<li> Execute <tt>augustus</tt> using the executable <tt>augustus</tt> in the current directory. This fails because there is no executable <tt>augustus</tt> in the current directory.
<li> Change the directory to <tt>/usr/local/bioinfo/src/augustus/augustus-3.2.3/bin</tt> and then execute <tt>augustus</tt> using the executable <tt>augustus</tt> in the current directory.
</ol>
You can find out which file is executed with a given command using <tt>which</tt>:
<pre class="code">
which augustus
# /usr/local/bioinfo/src/augustus/current/bin/augustus
which exonerate
# /usr/local/bioinfo/bin/exonerate
</pre>
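The lookup via <tt>$PATH</tt> can be tried out without touching any installed software. The following minimal sketch creates a dummy executable with the hypothetical name <tt>hello_tool</tt> in a scratch directory and shows that the shell only finds it by its bare name once the directory has been appended to <tt>PATH</tt>. It uses <tt>command -v</tt>, the shell's built-in counterpart of <tt>which</tt>.

```shell
# Scratch directory with a dummy executable (hello_tool is a made-up name)
bindir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from %s"\n' "$bindir" > "$bindir/hello_tool"
chmod +x "$bindir/hello_tool"

# The directory is not on PATH yet, so the bare name cannot be resolved
if command -v hello_tool > /dev/null; then before=found; else before=missing; fi

# Append the directory to PATH; now the bare name resolves
PATH=$PATH:$bindir
after=$(command -v hello_tool)

echo "before: $before"
echo "after:  $after"
hello_tool
```

The change to <tt>PATH</tt> only lasts for the current terminal session.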
<br>
<h3>Specific preparation </h3>
1. Copy the directory <tt>/home/mstanke/work/tutorial</tt> into your directory <tt>~/work</tt><br><br>
Helpful commands:
<ul>
<li> <tt>cp</tt> Copy files / directory
<ul>
<li> <tt>cp src.txt dest.txt</tt> Copies the file <tt>src.txt</tt> to the file <tt>dest.txt</tt>
<li> <tt>cp -r src dest</tt> Copies the directory <tt>src</tt>, including its entire content, to the directory <tt>dest</tt>
</ul>
</ul>
Unless stated otherwise, <u>it is assumed that the commands provided in this tutorial are executed in the directory <tt>~/work/tutorial</tt>.</u><br><br>
2. To improve readability of the shell commands that we will use subsequently, generate the following symbolic links in the folder <tt>tutorial/data</tt>:
<ul>
<li> A link <tt>genome.fa</tt> for the file <tt>RCC4221_Ot_outlier.fasta</tt>
<li> A link <tt>proteins.fa</tt> for the file <tt>osttaV2_active_pep_2015_ch02_ch19.tfa</tt>
</ul>
Check whether your commands worked using <tt>less</tt>.<br><br>
Helpful commands:
<ul>
<li> <tt>ln</tt> Make link between files. The command <tt>ln -s file.txt link.txt</tt> creates a symbolic link <tt>link.txt</tt> for the file <tt>file.txt</tt>. That is, you can use <tt>link.txt</tt> everywhere instead of <tt>file.txt</tt>.
</ul>
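A minimal sketch of the linking step, under the assumption that a scratch directory with a two-line toy file stands in for <tt>tutorial/data</tt> and the real genome file:

```shell
# Scratch directory standing in for tutorial/data (assumption of this sketch)
data=$(mktemp -d)
printf '>seq1\nACGT\n' > "$data/RCC4221_Ot_outlier.fasta"   # toy stand-in for the genome

cd "$data"
ln -s RCC4221_Ot_outlier.fasta genome.fa   # existing file first, link name second

ls -l genome.fa                            # the line ends in: genome.fa -> RCC4221_Ot_outlier.fasta
target=$(readlink genome.fa)               # the file the link points to
first_line=$(head -n 1 genome.fa)          # reading the link reads the target file
echo "genome.fa points to $target and starts with $first_line"
cd - > /dev/null
```

Creating the link from inside the data directory keeps the link target a plain file name, so the link stays valid if the whole directory is moved or copied.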
<br>
3. Append some additional folders to the <tt>PATH</tt> variable. <u>This has to be repeated if you close and re-open your terminal.</u>
<pre class="code">
export PATH=$PATH:/usr/local/bioinfo/src/RepeatScout/current:/usr/local/bioinfo/src/GeneMark-ET/current/gmes_petap
</pre>
<br><br>
<h2>1. Repeat-mask the genome </h2>
For
<ul>
<li>the prediction of genes using AUGUSTUS</li>
<li>the alignment of RNA-seq reads to the genome</li>
</ul>
we need to mask the repeats in the target genome. This involves several steps, in which we employ RepeatScout and RepeatMasker.<br><br>
First, we compile a repeat library using RepeatScout.
<pre class="code">
mkdir -p masking
build_lmer_table -sequence data/genome.fa -freq masking/genome.freq
RepeatScout -sequence data/genome.fa -output masking/genome.repseq.fa -freq masking/genome.freq # takes ~30s
</pre>
The file <tt>masking/genome.freq</tt> contains the table of frequent l-mers (short words of fixed length) that RepeatScout uses as starting points for candidate repeat sequences.
<pre class="code">
head -n 100 masking/genome.freq
</pre>
<pre class="code">
AAAAAGTGTGAA 4 351047
AAAACATGTGAA 8 758672
AAAAAACCTGAA 5 1473666
AAAAAGTATGAA 4 1470478
AAAACTGATGAA 3 336101
AAAACTACTTTT 5 1478803
AAAAGTTATGAA 7 822634
AAAATCGCTGAA 3 819533
AAAACTTTGGAA 3 1424203
AAAATACATTTT 3 504310
...
</pre>
If you want to have a look at the entire file, you can use
<pre class="code">
less masking/genome.freq
</pre>
The repeat library generated by RepeatScout is stored in <tt>masking/genome.repseq.fa</tt>
<pre class="code">
>R=0
GCGCGCGCGCGCGTAGGCCACGCGAACGCCTCCACACGCGTCGTCGCGCTGTCACGGTGGGCAGTACACGTCGCCACGGG
CGCGTGATGATACGTGCGAGTGCGCGACGTAGAAATACGTGCGCACGCGACTCACGTCGGGTCACCCAAAGCTCGCGCGC
TTGGCCGCGTCGCCTCGCGTCCATCCCCAGGATGCGCTCGTCGTGACGCACACGCTGTCGAGACGTATTTGAAGTGCGTT
TTCACAGACGATCTGGTACGTGTGATGGACGAACGTCGCATAGGACACAGTGTCGGCTATGAAACACGCCCAGCGGCGAT
GCGGCGTCGTTCCGCGCGCGTCCATTTTGATGGGCGCGCGCGGGCGTCACGTCTGGAAACTGCGTGCGCGCGTCGCTCGA
...
>R=1
GGATTTTGACTAATCTCGGTGTTGGTTGTGCACACCTTTTCTTGAGTCATAATCCGTCACGTCATGTATGGGTAAACAGT
ATGATCGGGGTACTCCAACTTGTGTTTTACCTTGTACAAATTTCGGCAAGTTTTCGCAACACGCGACATGTGTGCGCGGA
TAACGCGCCGACGAGGCAGCGAGGACGTGCGCAGCCATGAACAGTTGTAGATACATCCATGTTACATCAGATTCATCCAG
TTTCTGTGGCGTACGACGGACATAGCGACGTCCTCACGCAGTTCCTCGCGCTCTGGTACTCCACGCCAAGCTCTCAAGAA
ACCACGTCTAAACCCTACCTAAACCCTACCTAAACCCTAGTTAGGGTTTATGTCACATGTACACCACGTGCACTACGCGC
...
</pre>
<br><br>
Next, we remove repeats detected by RepeatScout that are too short or of low complexity.
<pre class="code">
cat masking/genome.repseq.fa | filter-stage-1.prl > masking/genome.repseq.f1.fa
</pre>
Comparing the repeat library before and after the execution of <tt>filter-stage-1.prl</tt> shows that three repeats have been removed.
<pre class="code">
diff masking/genome.repseq.fa masking/genome.repseq.f1.fa
</pre>
<pre class="code">
1c1
< >R=0
---
> >R=0 (RR=1. TRF=0.015 NSEG=0.009)
169c169
< >R=1
---
> >R=1 (RR=2. TRF=0.087 NSEG=0.000)
...
1087,1089c1087
< >R=82
< CGTATTATATCACTCATTGGAGCATAACGATACATATTACATCTGAACAC
< >R=83
---
> >R=83 (RR=83. TRF=0.000 NSEG=0.000)
</pre>
This can be confirmed with <tt>grep</tt>.
<pre class="code">
grep -c ">" masking/genome.repseq.fa masking/genome.repseq.f1.fa
# masking/genome.repseq.fa:156
# masking/genome.repseq.f1.fa:153
</pre>
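Because the filter also rewrites every header line, the <tt>diff</tt> output is noisy. A hedged sketch of how one could list only the identifiers of the removed repeats is given below. It runs on two made-up toy libraries; the file names, sequences and header annotations are assumptions of this sketch, not tutorial data.

```shell
tmp=$(mktemp -d)
# Toy stand-ins for the library before and after filtering (made-up content)
printf '>R=0\nACGTACGTAC\n>R=1\nAAAAAAAAAA\n>R=2\nGGCTTAGGCA\n' > "$tmp/before.fa"
printf '>R=0 (RR=0. TRF=0.000 NSEG=0.000)\nACGTACGTAC\n' > "$tmp/after.fa"
printf '>R=2 (RR=2. TRF=0.000 NSEG=0.000)\nGGCTTAGGCA\n' >> "$tmp/after.fa"

# Keep only the identifier (first word of each header line), sorted for comm
grep '^>' "$tmp/before.fa" | cut -d' ' -f1 | sort > "$tmp/before.ids"
grep '^>' "$tmp/after.fa"  | cut -d' ' -f1 | sort > "$tmp/after.ids"

# Identifiers present before filtering but absent afterwards
removed=$(comm -23 "$tmp/before.ids" "$tmp/after.ids")
echo "removed: $removed"
rm -r "$tmp"
```

Anchoring the pattern as <tt>'^>'</tt> makes sure that only header lines are matched.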
<br><br>
Next, we mask the genome with RepeatMasker, using the repeat library generated by RepeatScout.
<pre class="code">
RepeatMasker data/genome.fa -e ncbi -lib masking/genome.repseq.f1.fa -dir masking # takes ~1m
</pre>
The file <tt>masking/genome.fa.masked</tt> contains the masked genome.
<pre class="code">
>ch02_Contig_2T
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGACAAGAGACG
CACACAATGACAGACCCCGGCCAACATTGAGGGTGTACCAAGGCGCACAC
CACAAAACGAGGCACTAAGACACGTCTGGGGGAACCCTTATTCAAGAAAT
...
</pre>
<br><br>
Then, we filter out repeats which do not occur often enough.
<pre class="code">
cat masking/genome.repseq.f1.fa | filter-stage-2.prl --cat=masking/genome.fa.out > masking/genome.repseq.f2.fa
</pre>
<pre class="code">
grep -c ">" masking/genome.repseq.fa masking/genome.repseq.f1.fa masking/genome.repseq.f2.fa
# masking/genome.repseq.fa:156
# masking/genome.repseq.f1.fa:153
# masking/genome.repseq.f2.fa:36
</pre>
<br><br>
Finally, we run RepeatMasker using the final repeat library. Since the alignment tool STAR needs a hard-masked genome, whereas AUGUSTUS and BRAKER need it soft-masked, we generate both a soft- and a hard-masked genome.
<pre class="code">
RepeatMasker data/genome.fa -e ncbi -lib masking/genome.repseq.f2.fa -dir masking # takes ~1m
mv masking/genome.fa.masked masking/genome.fa.hardmasked
RepeatMasker data/genome.fa -e ncbi -lib masking/genome.repseq.f2.fa -dir masking -xsmall # takes ~1m
mv masking/genome.fa.masked masking/genome.fa.softmasked
</pre>
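The difference between the two masking styles is that hard-masking replaces repeat bases by <tt>N</tt>, whereas soft-masking (option <tt>-xsmall</tt>) writes them in lower case. The following sketch counts both kinds of bases with standard tools. It runs on a made-up toy sequence, which is an assumption of this sketch; to inspect the real output you would point it at <tt>masking/genome.fa.softmasked</tt> or <tt>masking/genome.fa.hardmasked</tt>.

```shell
# Toy soft-masked FASTA (made-up): lower-case letters mark repeats
fa=$(mktemp)
printf '>chrToy\nACGTacgtAC\nGTacgtACGT\n' > "$fa"

seq=$(grep -v '^>' "$fa" | tr -d '\n')                 # sequence without headers and line breaks
total=${#seq}                                          # number of bases
soft=$(printf '%s' "$seq" | tr -cd 'acgtn' | wc -c)    # lower-case, i.e. soft-masked, bases
hard=$(printf '%s' "$seq" | tr -cd 'N' | wc -c)        # bases replaced by N, i.e. hard-masked
soft=$((soft)); hard=$((hard))                         # strip padding that some wc versions add

echo "total=$total soft-masked=$soft hard-masked=$hard"
rm "$fa"
```

Note that a genome may contain <tt>N</tt> characters for reasons other than masking, for example assembly gaps, so the hard-masked count is an upper bound.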
<br><br>
<h2>2. Alignment of RNA-seq reads </h2>
We use the first ??? reads from the RNA-seq library with the SRA accession number SRX3142169 (run SRR5986291) for the gene prediction. As a preparatory step, we need to align these RNA-seq reads to our target genome. For this purpose, we use STAR. Running STAR comprises two steps: First, an index is generated for the genome to which the reads will be aligned. Second, the reads are aligned and the result is written as a sorted BAM file.
<pre class="code">
mkdir -p star
STAR --runMode genomeGenerate --genomeDir star --genomeFastaFiles masking/genome.fa.hardmasked --genomeSAindexNbases 9
STAR --genomeDir star --readFilesIn data/rna-seq.fa --outFileNamePrefix star/ --outSAMtype BAM SortedByCoordinate # takes ~12m
mv star/Aligned.sortedByCoord.out.bam star/rna-seq.bam
</pre>
You can have a look at the result with the command
<pre class="code">
samtools view star/rna-seq.bam | less
</pre>
<br><br>
<h2>3. Gene prediction </h2>
As preparation, we carry out three steps. First, we set two environment variables. <u>This has to be repeated if you close and re-open your terminal.</u>
<pre class="code">
export GENEMARK_PATH=/usr/local/bioinfo/src/GeneMark-ET/current/gmes_petap
export BAMTOOLS_PATH=/usr/local/bioinfo/bin
</pre>
Second, GeneMark requires a valid key file in the home directory, so we copy one there. We also copy the configuration data of AUGUSTUS into the tutorial directory.
<pre class="code">
cp /home/mstanke/.gm_key ~/.gm_key
mkdir -p ~/work/tutorial/augustus
cp -r /home/mstanke/work/tutorial/augustus/config ~/work/tutorial/augustus/
</pre>
Finally, we make the <tt>species</tt> directory writable for everyone so that BRAKER and AUGUSTUS can write into it. Moreover, we remove two directories whose existence would cause BRAKER and AUGUSTUS to abort. Remember to remove these two directories again if you have to repeat the first call of BRAKER below because a previous call was not successful.
<pre class="code">
chmod a+w -R augustus/config/species
rm -rf augustus/config/species/o.tauri
rm -rf braker
</pre>
<br><br>
Now, we can begin with the gene prediction itself. For this purpose, we use BRAKER and AUGUSTUS. <br><br>
Our first call of BRAKER does three things:
<ul>
<li> Generate a set of training genes, based on the RNA-seq data we prepared, using GeneMark
<li> Train AUGUSTUS with these training genes
<li> Predict genes using the RNA-seq data as hints using AUGUSTUS
</ul>
<pre class="code">
/home/mstanke/work/BRAKER_v2.0.4+/braker.pl --species=o.tauri --genome=masking/genome.fa.softmasked --bam=star/rna-seq.bam --softmasking --skipOptimize --cores 1 \
--AUGUSTUS_SCRIPTS_PATH=/usr/local/bioinfo/src/augustus/augustus-3.2.3/scripts --AUGUSTUS_BIN_PATH=/usr/local/bioinfo/src/augustus/augustus-3.2.3/bin \
--AUGUSTUS_CONFIG_PATH=/home/[user name]/work/tutorial/augustus/config # takes ~ 7m
</pre>
The options in the BRAKER call have the following meanings:
<ul>
<li> <tt>--species</tt>: The name of the species on which we carry out our gene prediction. <br>
<i>The parameters that AUGUSTUS infers from the training data and uses to parametrize its internal gene prediction model are stored in AUGUSTUS_CONFIG_PATH/species/[value of --species]. The option is called "species" although AUGUSTUS can also be used for gene prediction on different strains of the same species.</i>
<li> <tt>--genome</tt>: The genomic data on which the gene prediction is carried out.
<li> <tt>--bam</tt>: The BAM file containing the aligned RNA-seq reads used for inference of the training genes by GeneMark and as hints for the gene prediction by AUGUSTUS.
<li> <tt>--softmasking</tt>: Flag indicating that the genome is softmasked. <br>
<i>It is possible to use a hardmasked genome but this leads to an inferior performance of AUGUSTUS.</i>
<li> <tt>--skipOptimize</tt>: Flag indicating that AUGUSTUS should not try to optimize the parameters it derives from the training genes. <br>
<i>One should not use the parameters AUGUSTUS infers without optimization to generate a gene set used in a scientific project. We skip optimization here because it is very time-consuming.</i>
<li> <tt>--cores</tt>: The number of cores used by BRAKER and AUGUSTUS. <br>
<i> To reduce running time, one can increase the number of cores. For this session, however, this is probably not advisable, as the genotoul server has only 32 cores. </i>
<li> <tt>--AUGUSTUS_SCRIPTS_PATH</tt>, <tt>--AUGUSTUS_BIN_PATH</tt>, <tt>--AUGUSTUS_CONFIG_PATH</tt>: These paths specify where to find various files and executables. <br>
<i>When you install BRAKER and AUGUSTUS on a computer for which you have administrator rights, you very probably will not need to set these paths.</i>
</ul>
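The placeholder <tt>[user name]</tt> in <tt>--AUGUSTUS_CONFIG_PATH</tt> has to be replaced by your own user name. As a small sketch, and assuming that your home directory is <tt>/home/[user name]</tt> as the paths in this tutorial suggest, the shell variable <tt>$HOME</tt> can be used instead. The variable names <tt>config_path</tt> and <tt>status</tt> are made up for this sketch.

```shell
# $HOME expands to your home directory, so no user name has to be typed
config_path="$HOME/work/tutorial/augustus/config"

# Print the option as it would appear on the braker.pl command line
echo "--AUGUSTUS_CONFIG_PATH=$config_path"

# Check early whether the species directory exists (it is missing if the copy step was skipped)
if [ -d "$config_path/species" ]; then status=present; else status=missing; fi
echo "species directory: $status"
```

If the check reports <tt>missing</tt>, repeat the copy step from the preparation before calling BRAKER.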
<br>
You can determine the number of training genes generated by BRAKER using the command
<pre class="code">
grep -c LOCUS braker/o.tauri/genbank.good.gb
</pre>
<br><br>
We now carry out two further gene predictions, using hints that differ from the ones just used in the prediction step. To inspect the results just obtained, however, it is advisable to proceed to the next section first, visualize the results, and come back to this section later. <br><br>
Next, we want to carry out a gene prediction not only using RNA-seq data as hints but also protein data. Since we have already trained AUGUSTUS in the first run of BRAKER, we can now reuse the output from the first run.
<pre class="code">
/home/mstanke/work/BRAKER_v2.0.4+/braker.pl --species=o.tauri --genome=masking/genome.fa.softmasked --hints=braker/o.tauri/hintsfile.gff --prot_seq=data/proteins.fa \
--prg=exonerate --softmasking --skipOptimize --skipAllTraining --cores 1 --workingdir braker_proteins \
--AUGUSTUS_SCRIPTS_PATH=/usr/local/bioinfo/src/augustus/augustus-3.2.3/scripts --AUGUSTUS_BIN_PATH=/usr/local/bioinfo/src/augustus/augustus-3.2.3/bin \
--AUGUSTUS_CONFIG_PATH=/home/[user name]/work/tutorial/augustus/config # takes ~11m
</pre>
The following options were not used in the first run:
<ul>
<li> <tt>--hints</tt>: In the first run we provided a BAM file as hints. This file was processed and converted into a hint file. Instead of passing the BAM file to BRAKER, we can therefore now use the hint file.
<li> <tt>--prot_seq</tt>: The protein data used as hints by AUGUSTUS.
<li> <tt>--prg</tt>: The program used to align the proteins. <br>
<i>We use exonerate here out of convenience but GenomeThreader should be chosen for research projects.</i>
<li> <tt>--skipAllTraining</tt>: No training is done by GeneMark or AUGUSTUS.
<li> <tt>--workingdir</tt>: The output directory used by BRAKER. <br>
<i> Since we do not want to overwrite the results from the first run of BRAKER, we specify a separate output directory for this run.</i>
</ul>
<br>
Finally, we carry out an ab initio gene prediction using AUGUSTUS directly.
<pre class="code">
augustus --species=o.tauri --AUGUSTUS_CONFIG_PATH=/home/[user name]/work/tutorial/augustus/config masking/genome.fa.softmasked > augustus/augustus.abinitio.gff # takes ~2m
</pre>
Since we specify the species as o.tauri, AUGUSTUS automatically retrieves the parameters we estimated in the first run of BRAKER and uses them to carry out the gene prediction.
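A quick sanity check of a prediction is to count the predicted genes. In the GFF output of AUGUSTUS the third column holds the feature type, and each predicted gene has one line of type <tt>gene</tt>. The sketch below counts these lines in a made-up toy excerpt (coordinates and scores are assumptions of this sketch); for the real result, replace the toy file by <tt>augustus/augustus.abinitio.gff</tt>.

```shell
# Toy excerpt in the layout of AUGUSTUS GFF output (made-up coordinates)
gff=$(mktemp)
printf '# toy comment line\n' > "$gff"
printf 'chrToy\tAUGUSTUS\tgene\t100\t900\t0.9\t+\t.\tg1\n' >> "$gff"
printf 'chrToy\tAUGUSTUS\ttranscript\t100\t900\t0.9\t+\t.\tg1.t1\n' >> "$gff"
printf 'chrToy\tAUGUSTUS\tCDS\t100\t900\t0.9\t+\t0\ttranscript_id "g1.t1"; gene_id "g1";\n' >> "$gff"
printf 'chrToy\tAUGUSTUS\tgene\t1500\t2100\t0.8\t-\t.\tg2\n' >> "$gff"

# Count non-comment lines whose third column is exactly "gene"
n_genes=$(awk -F'\t' '!/^#/ && $3 == "gene"' "$gff" | wc -l)
n_genes=$((n_genes))                 # strip padding that some wc versions add
echo "genes: $n_genes"
rm "$gff"
```

Comparing on the third column avoids counting lines that merely contain the word "gene" somewhere else, for example in the attribute <tt>gene_id</tt>.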
<br><br><br>
<h2>4. Visualization using UCSC browser </h2>
The results of the gene prediction can be displayed via an assembly hub of the UCSC browser. Go to <a href="https://genome-euro.ucsc.edu/cgi-bin/hgHubConnect">https://genome-euro.ucsc.edu/cgi-bin/hgHubConnect</a>, select the "My Hubs" tab, enter <tt>http://bioinf.uni-greifswald.de/bioinf/otauri/hub/hub.txt</tt> into the field "URL" and click "Add Hub". Once you have been redirected to the Genome Browser Gateway page, click "GO" in the top part of the page. This will show you the RNA-Seq coverage and, if you zoom in far enough, the nucleotides of the genome as well as their translation in all frames. <br><br>
To display the results of the gene prediction and the data we used for it, click on "add custom track". Then click on the upper "Browse...", select one of the GFF files listed below, and click "Submit". Then click on the link "User Track" in the table and enter a meaningful name instead of "User Track". Click "Submit" again and then click "Go". <br><br>
The GFF files to be displayed in the UCSC genome browser can be found here:
<ul>
<li> <a href="data/introns.gff">Introns GFF</a>
<li> <a href="data/proteins.gff">Proteins GFF</a>
<li> <a href="data/augustus_ab_initio.gff">GFF of AUGUSTUS ab initio prediction</a>
<li> <a href="data/augustus_rna.gtf">GFF of BRAKER prediction using RNA-Seq data</a>
<li> <a href="data/augustus_rna_prot.gtf">GFF of BRAKER prediction using RNA-Seq and protein data</a>
</ul>
<br><br>
<h2>5. Miscellaneous </h2>
An AUGUSTUS service allowing for parameter training and gene prediction is available on <a href="http://bioinf.uni-greifswald.de/webaugustus/">http://bioinf.uni-greifswald.de/webaugustus/</a>.<br><br>
The version of BRAKER used in this tutorial has been modified so that it can be run with an RNA-seq data set that is too small to yield a gene prediction of reasonable quality. If a release version of BRAKER is applied to the tutorial data, BRAKER will probably abort with an error message. <br><br>
The presentation Ingo Bulla gave during the morning session on 13 February 2018 in Banyuls can be found <a href="presentation.pdf">here</a>.
</body></html>