#!/usr/bin/perl
#Copyright (C) 2006-2008 Keio University
#(Kris Popendorf) <comp@bio.keio.ac.jp> (2006)
#
#This file is part of Murasaki.
#
#Murasaki is free software: you can redistribute it and/or modify
#it under the terms of the GNU General Public License as published by
#the Free Software Foundation, either version 3 of the License, or
#(at your option) any later version.
#
#Murasaki is distributed in the hope that it will be useful,
#but WITHOUT ANY WARRANTY; without even the implied warranty of
#MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#GNU General Public License for more details.
#
#You should have received a copy of the GNU General Public License
#along with Murasaki. If not, see <http://www.gnu.org/licenses/>.
###############
## formats sequences to pretty Fasta -- krisp
###############
use Getopt::Long;
use Pod::Usage;
use File::Basename;
BEGIN {
unshift(@INC,(fileparse($0))[1].'perlmodules');
}
use Murasaki;
my $geneparse=getProg("geneparse");
die "Couldn't find useable geneparse program" unless $geneparse;
our ($name,$width)=(undef,75);
GetOptions('help|?' => \$help, man => \$man,
"title|name=s" => \$name, "width=i" => \$width,
);
pod2usage(1) if $help;
pod2usage(-exitstatus => 0, -verbose => 4) if $man;
($filename,$outfile)=@ARGV;
if($filename and -e $filename){
open(INF,'-|',"$geneparse -c $filename") or die "Couldn't run $geneparse: $!";
my ($basename,$path,$suffix) = fileparse($filename);
$name=$basename unless $name;
} else {
print STDERR "File $filename not found. Waiting for input from stdin.\n" unless !$filename or $filename eq "-";
open(INF,"-");
$name="stdin" unless $name;
}
if($outfile){
open(OUTF,'>',$outfile) or die "Couldn't open $outfile for writing: $!";
} else {
open(OUTF,">-");
}
print OUTF ">$name\n";
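#Buffer the input so that stripped whitespace never shortens an output line.
my $buf='';
while(read(INF,my $chunk,$width)){
$chunk=~s/\s//g; #kill all whitespace
$buf.=$chunk;
print OUTF substr($buf,0,$width,'')."\n" while length($buf)>=$width;
}
print OUTF $buf."\n" if length($buf); #flush the final partial line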
__END__

=head1 NAME

faformat.pl -- Reformat a sequence into fasta format

=head1 SYNOPSIS

faformat.pl [options] [<file-in> [<file-out>]]

=head1 OPTIONS

=over 8

=item B<--name|--title>

Sets the name to be provided on the first line of the fasta file.

=item B<< --width=<columns> >>

Column width to line wrap at (default 75).

=back

=head1 DESCRIPTION

Reads in a sequence, and puts it out in FastA format.

=cut