#!/usr/bin/perl
use Getopt::Long;
our $prior=1;
our $color=0;
my %optctl = (help => \$help, dl => \$prior, best => \$best, color => \$color, ofile => \$ofile);
&GetOptions(\%optctl,"best","dl=i","help","color","ofile=s");
printhelp() if($help==1);
require sylseg_sk::Trans_sylseg;
import sylseg_sk::Trans_sylseg;
my $ifile;
my @stack;
my (%g1_tp, %g1_bp, %g1_mp);
my @syl;
local (*INPUT, *OUTPUT);   # parentheses are needed to localize both globs
#set_debug_level($prior,$color);
process_arguments();
mess("Processing ...",1);
# Two-argument open is kept on purpose: "<-" opens standard input.
open (INPUT,"<$ifile") or
    die ("Cannot open input file $ifile: $!\n");
open (OUTPUT,">$ofile") or
    die ("Cannot create output file $ofile: $!\n") if($ofile ne "");
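while(<INPUT>)
{
    my $slovko;
    my @finlist;
    chomp;
    s/\r//g;                        # strip carriage returns (DOS line endings)
    $_=lc_1($_);
    s/ //g; $slovko=$_; s/^/-/;     # drop spaces, remember the bare word, prefix "-"
    next if(/-$/);                  # skip empty lines (only "-" is left) and lines ending in "-"
    mess("Cating $slovko ...",2);
    mess("Cating $_ ...",3);
    @stack=gen_pos_sylabels($_);
    my $flist=join(" :: ",@stack);
    mess("Cat: $flist",2);
    @finlist=calc_probs(@stack);
    foreach my $k (@finlist)
    {
        # Each entry has the form "probability::hypothesis".
        my ($prob,$hyp)=split("::",$k);
        mess("$hyp \t\t$prob",0);
        print OUTPUT "$hyp \t\t$prob\n" if($ofile ne "");
        last if($best==1);          # with --best, stop after the first entry
    }
    mess("---------------",0);
    print OUTPUT "---------------\n" if($ofile ne "");
}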
close INPUT;
close OUTPUT if($ofile ne "");
#warn "\n(c) Dodo 2003,2004,2005\n\n";
exit(0);
sub process_arguments
{
    if(@ARGV==0)
    {
        $ifile="-";
        mess("Reading from standard input ...",1);
    }
    else
    {
        $ifile=$ARGV[0];
    }
    mess("Input:\t $ifile",1);
    mess("Output:\t $ofile",1) if($ofile ne "");
}
sub printhelp
{
    print "
sylseg_sk [--best] [--color] [--dl <debug_level>] [--help] [--ofile <file_name>] [<input_file>]
\t<input_file>\t\t - List of words to segment (standard input if omitted).
\t--best\t\t\t - Print the best result only.
\t--color\t\t\t - Enable color output.
\t--dl 0..5\t\t - Set the debug level, which controls the amount of displayed information.
\t\t\t\t   Level 0 displays nothing; the maximum level 5 displays a full
\t\t\t\t   debugging report. The default debug level is 1.
\t--ofile <file_name>\t - Also write the output to the given file.
\t--help\t\t\t - Display a short help text and exit.\n";
    exit(0);
}
sub lc_1
{
    mess("Converting all uppercase letters to lowercase ...",5);
    # Per-character lookup table. The accented Slovak pairs match only when this
    # source file and the input share the same single-byte encoding.
    my %ul1=('A'=>'a','Á'=>'á','Ä'=>'ä','B'=>'b','C'=>'c','Č'=>'č','D'=>'d','Ď'=>'ď','E'=>'e',
             'É'=>'é','F'=>'f','G'=>'g','H'=>'h','I'=>'i','Í'=>'í','J'=>'j','K'=>'k',
             'L'=>'l','Ĺ'=>'ĺ','Ľ'=>'ľ','M'=>'m','N'=>'n','Ň'=>'ň','O'=>'o','Ó'=>'ó',
             'Ô'=>'ô','P'=>'p','Q'=>'q','R'=>'r','Ŕ'=>'ŕ','S'=>'s','Š'=>'š','T'=>'t',
             'Ť'=>'ť','U'=>'u','Ú'=>'ú','V'=>'v','W'=>'w','X'=>'x','Y'=>'y','Ý'=>'ý',
             'Z'=>'z','Ž'=>'ž');
    my $in=shift(@_);
    my @chars=split('',$in);
    # $char aliases each array element, so assigning to it updates @chars in place.
    foreach my $char (@chars)
    {
        $char=$ul1{$char} || $char;
    }
    $in=join('',@chars);
    return $in;
}
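
# A minimal sketch of a Unicode-aware alternative to lc_1, assuming the input
# is UTF-8 encoded. lc_utf8 is a hypothetical helper that nothing in this
# script calls; it only illustrates how the lookup table above could be
# replaced by Perl's built-in lc(). Encode is a core module.
sub lc_utf8
{
    require Encode;
    my $in=shift(@_);
    # Decode the UTF-8 bytes into characters so that lc() sees whole letters.
    my $chars=Encode::decode('UTF-8',$in);
    # lc() applies Unicode case mapping to decoded character strings,
    # so accented capitals are lowercased without a table.
    return Encode::encode('UTF-8',lc($chars));
}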