#!/usr/bin/env bash
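ASPELL=aspell
HUNSPELL=hunspell
# Root of the SCOWL tree; defaults to the parent directory.
: "${SCOWL:=..}"
SPELLER="$SCOWL/speller"
: "${UNIX2DOS:=unix2dos}"
set -e
# Force the C locale so that sort and comm agree on ordering.
export LANG=C
export LC_ALL=C
export LC_CTYPE=C
export LC_COLLATE=C
# SCOWL size level used for the standard (non-large) dictionaries.
SIZE=60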
mk-list() { "$SCOWL/mk-list" -d "$SCOWL/final" "$@"; }

prep() {
  echo prep
  sort -u "$SCOWL"/misc/{offensive.1,offensive.2,profane.1} > nosug
}
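
# doit <dict-name> <wordlist-command> [<parms file>]
doit() {
  echo "creating $1.dic"
  # Generate the sorted, de-duplicated word list.
  eval "$2" | sort -u > "$1.0"
  # Split the list into words that appear in nosug and words that do not.
  comm -12 "$1.0" nosug > "$1-nosug.1"
  comm -23 "$1.0" nosug > "$1.1"
  # Run each part through munch-list with en.aff; the nosug part is
  # additionally piped through add-no-suggest.
  "$SPELLER/munch-list" munch "$SPELLER/en.aff" < "$1-nosug.1" | "$SPELLER/add-no-suggest" > "$1.2"
  "$SPELLER/munch-list" munch "$SPELLER/en.aff" < "$1.1" >> "$1.2"
  cat "$SPELLER/en.dic.supp" >> "$1.2"
  # The .dic file starts with the entry count, followed by the sorted entries.
  wc -l < "$1.2" | tr -d '[:blank:]' > "$1.dic"
  sort "$1.2" | iconv -f iso-8859-1 -t utf-8 >> "$1.dic"
  cp "$SPELLER/en.aff" "$1.aff"
  if [ "$SCOWL_VERSION" ]; then
    fn="$1-$SCOWL_VERSION"
  else
    fn="$1"
  fi
  WHAT="$1 Hunspell Dictionary" sh "$SPELLER/README_en.txt.sh" > "README_$1.txt"
  # Record how the word list was produced: either the command itself
  # or the contents of the parms file.
  if [ -z "$3" ]; then
    echo "Wordlist Command: $2" >> "README_$1.txt"
  else
    cat "$3" >> "README_$1.txt"
  fi
  if [ -z "$3" ]; then
    mkdir -p hunspell
    #echo check
    sort -u "$1-nosug.1" "$1.1" > "$1.tocheck"
    #hunspell -l -d ./$1 < $1.dic.tocheck > misspelled
    iconv -f iso-8859-1 -t utf-8 < "$1.tocheck" | $UNIX2DOS > "$1.txt"
  fi
}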

prep
if [ "$1" = "-all" ]; then
  doit en_US "mk-list --accents=strip en_US $SIZE"
  doit en_CA "mk-list --accents=strip en_CA $SIZE"
  doit en_GB-ize "mk-list --accents=strip en_GB-ize $SIZE"
  doit en_GB-ise "mk-list --accents=strip en_GB-ise $SIZE"
  doit en_AU "mk-list --accents=strip en_AU $SIZE"
  doit en_US-large "mk-list -v1 --accents=both en_US 70"
  doit en_CA-large "mk-list -v1 --accents=both en_CA 70"
  doit en_GB-large "mk-list -v1 --accents=both en_GB-ize en_GB-ise 70"
  doit en_AU-large "mk-list -v1 --accents=both en_AU 70"
  sh "$SPELLER/README_en.txt.sh" > hunspell/README
elif [ "$1" = "-one" ] && [ -n "$2" ] && [ -n "$3" ]; then
  # With -one, the word list is read from standard input.
  doit "$2" "cat" "$3"
else
  echo "usage: $0 -all | -one <dict-name> <parms file>" >&2
  exit 1
fi
#rm eng*.dat nosug en_US*.? en_CA*.?