=head1 NAME

genpyt - generate the PINYIN lexicon

=head1 SYNOPSIS

B<genpyt> I<lexicon-file> I<result-file> I<log-file> I<slm-file>

=head1 DESCRIPTION

B<genpyt> generates the PINYIN lexicon.
It works only in the zh_CN.UTF-8 locale.

=head1 ARGUMENTS

=over 4

=item I<lexicon-file>

Specify a dictionary file. It should be a line-based text file in UTF-8 encoding.
Each line has the form:

   CCC  id  [pinyin'pinyin'pinyin]*

A default dictionary file can be found at F</usr/share/sunpinyin/dict.utf8>.
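
The line format above can be read with a few lines of Python. This is only a minimal sketch for inspecting a dictionary file, not the parser used by B<genpyt>: it assumes that fields are separated by whitespace, that the id is a decimal integer, and that each optional pinyin field separates its syllables with an apostrophe. The names C<LexiconEntry> and C<parse_lexicon_line> and the sample line are hypothetical.

```python
from typing import List, NamedTuple


class LexiconEntry(NamedTuple):
    word: str                  # the "CCC" field
    word_id: int               # the "id" field (assumed to be a decimal integer)
    pinyins: List[List[str]]   # zero or more pronunciations, each a list of syllables


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Parse one "CCC id [pinyin'pinyin'pinyin]*" line.

    Assumption: fields are separated by runs of whitespace.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least a word and an id: {line!r}")
    word, raw_id, *raw_pinyins = fields
    # Each remaining field is one pronunciation; split it into syllables.
    pinyins = [p.split("'") for p in raw_pinyins]
    return LexiconEntry(word, int(raw_id), pinyins)


if __name__ == "__main__":
    # Hypothetical sample line; the id value is made up for illustration.
    print(parse_lexicon_line("你好 12345 ni'hao"))
```

A line with no pinyin field yields an empty C<pinyins> list, which matches the C<*> (zero or more) in the format description.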


=item I<result-file>

The output binary PINYIN lexicon file. The lexicon contains a trie representing the PINYIN key tree, and all candidate words are sorted using the unigram information in I<slm-file>. This file can be used with sunpinyin input method engines.
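
The idea of a pinyin-keyed trie with candidates ordered by unigram score can be illustrated in memory. The sketch below is conceptual only and says nothing about the actual binary layout written by B<genpyt>: it assumes one trie edge per pinyin syllable, a plain dictionary of word probabilities standing in for the unigram data, and descending order by probability. All names and the sample values are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class PinyinTrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "PinyinTrieNode"] = {}
        # (word, unigram probability) pairs reachable by this syllable path
        self.candidates: List[Tuple[str, float]] = []


def _sort_candidates(node: PinyinTrieNode) -> None:
    # Assumption: higher unigram probability means earlier in the list.
    node.candidates.sort(key=lambda c: c[1], reverse=True)
    for child in node.children.values():
        _sort_candidates(child)


def build_trie(entries: Iterable[Tuple[str, List[str]]],
               unigram: Dict[str, float]) -> PinyinTrieNode:
    """Build a trie from (word, syllables) pairs and a word -> probability map."""
    root = PinyinTrieNode()
    for word, syllables in entries:
        node = root
        for syllable in syllables:
            node = node.children.setdefault(syllable, PinyinTrieNode())
        # Words missing from the unigram map get probability 0.0.
        node.candidates.append((word, unigram.get(word, 0.0)))
    _sort_candidates(root)
    return root


def lookup(root: PinyinTrieNode, syllables: List[str]) -> List[str]:
    """Return the candidate words for an exact syllable sequence."""
    node = root
    for syllable in syllables:
        child = node.children.get(syllable)
        if child is None:
            return []
        node = child
    return [word for word, _ in node.candidates]


if __name__ == "__main__":
    # Made-up probabilities, for illustration only.
    entries = [("时", ["shi"]), ("是", ["shi"]), ("世界", ["shi", "jie"])]
    unigram = {"是": 0.02, "时": 0.005, "世界": 0.001}
    print(lookup(build_trie(entries, unigram), ["shi"]))
```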


=item I<log-file>

Specify the file to which the log is written. The I<log-file> can be seen as a human-readable representation of the binary output file.


=item I<slm-file>

The language model from which the unigram information is retrieved. Typically, the I<slm-file> is generated by B<slmthread>.

=back

=head1 AUTHOR

Originally written by Phill.Zhang E<lt>phill.zhang@sun.comE<gt>.
Currently maintained by Kov.Chai E<lt>tchaikov@gmail.comE<gt>.

=head1 SEE ALSO

B<slmthread>(1).

=for comment
-*- indent-tabs-mode: nil -*- vim:et:ts=4