1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
|
Data
====
This directory contains Thai word lists.
Files listed in 'Makefile.am' will be used to construct a dictionary for
the Thai word-breaking function (thbrk).
Misspelled words
================
It's crucial to understand the distinct purposes of 'tdict-spell.txt' and
'tdict-common.txt'.
- tdict-spell.txt: This file contains non-standard spellings of words found in
'tdict-std.txt'. These are typically considered misspellings by authoritative
sources. For example, "อิเลคโทรนิคส์" (for "electronics"), found in
'tdict-spell.txt', is considered a misspelling of "อิเล็กทรอนิกส์" found in
'tdict-std.txt'.
- tdict-common.txt: This file contains words used in real-life contexts but not
defined in standard dictionaries. It also includes spelling variations of
these words, as official standards for them have not yet been established.
For example, "กิมจิ" ("kimchi"), "ครัวซอง" ("croissant"), and "ครัวซองต์"
("croissant").
- Other field-specific tdict-*.txt files: These files follow the same
principles. They contain spelling variations of words specific to particular
fields. For example, 'tdict-lang-ethic.txt' contains both "คาตากานะ" and
"คาตาคานะ" as alternative spellings for "katakana" (a Japanese script).
To construct a dictionary of "correctly spelled words", it is recommended to
omit 'tdict-spell.txt'.
'tdict-common.txt' was split from the original 'tdict.txt' on 2006-08-31.
'tdict-spell.txt' was split from the original 'tdict-std.txt' on 2006-09-01.
|