1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
|
The makeflow_blasr program aligns genome sequences listed in a fastq file to the given a fasta reference. It uses the Basic Local Alignment with Successive Refinement. It aligns by partitioning the file into pieces containing one or more sequences and distributing them for individual alignment. The program uses the Makeflow and Work Queue frameworks for distributed execution on available resources.
To build Makeflow:
1. Install Blasr and all its required dependencies. Blasr can be downloaded from:
https://github.com/PacificBiosciences/blasr
2. Blasr only support hdf5 1.8.4, 1.8.11 and 1.8.12. Make sure to have the correct version of HDF5 installed.
3. Blasr must be run with HDF5. Make sure to include the "lib" directory inside of installed HDF5 in the same folder of your query, reference and other input file.
4. Install CCTools.
5. Copy makeflow_blasr from cctools/apps to the location the makeflow will be executed.
6. Run './makeflow_blasr --ref test_ref.fasta --query test.fastq --output result --makeflow Makeflow --split_gran 50000'
This produces a file called Makeflow
To run:
1. Run 'makeflow'
Note if the --makeflow option is specified with a MAKEFLOWFILE, run
makeflow <MAKEFLOWFILE>
This step runs the makeflow locally on the machine on which it is executed.
2. If you want to run with a distributed execution engine (Work Queue, Condor,
or SGE), specify the '-T' option. For example, to run with Work Queue,
makeflow -T wq
3. Start Workers
work_queue_workers -d all <HOSTNAME> <PORT>
where <HOSTNAME> is the name of the host on which the manager is running
<PORT> is the port number on which the manager is listening.
Alternatively, you can also specify a project name for the manager and use that
to start workers:
1. makeflow -T wq -N Blasr
2. work_queue_worker -d all -N Blasr
For listing the command-line options, do:
./makeflow_blasr -h
When the alignment completes, you will find the whole output from the individual partitions
in the directory the makeflow was run in.
|