Data
====

This directory contains Thai word lists.

Files listed in 'Makefile.am' will be used to construct a dictionary for
the Thai word-breaking function (thbrk).

Misspelled words
================

'tdict-spell.txt' and 'tdict-common.txt' serve distinct purposes, and the
difference matters when choosing which word lists to include.

- tdict-spell.txt: This file contains non-standard spellings of words found in
  'tdict-std.txt'. These are typically considered misspellings by authoritative
  sources. For example, "อิเลคโทรนิคส์" (for "electronics"), found in
  'tdict-spell.txt', is considered a misspelling of "อิเล็กทรอนิกส์" found in
  'tdict-std.txt'.

- tdict-common.txt: This file contains words used in real-life contexts but not
  defined in standard dictionaries. It also includes spelling variations of
  these words, as official standards for them have not yet been established.
  Examples include "กิมจิ" ("kimchi") and the two variants "ครัวซอง" and
  "ครัวซองต์" (both "croissant").

- Other field-specific tdict-*.txt files: These files follow the same
  principles. They contain spelling variations of words specific to particular
  fields. For example, 'tdict-lang-ethnic.txt' contains both "คาตากานะ" and
  "คาตาคานะ" as alternative spellings for "katakana" (a Japanese script).

To construct a dictionary of "correctly spelled words", it is recommended to
omit 'tdict-spell.txt'.
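
The recommendation above can be sketched in a few lines of Python. This is
only an illustration, not part of the libthai build: it assumes the word lists
are UTF-8 text with one word per line, and it simply globs every tdict-*.txt
file in a directory instead of reading the list in 'Makefile.am'. The names
load_words and build_wordlist are hypothetical helpers.

```python
from pathlib import Path


def load_words(path):
    """Return the set of non-empty, whitespace-stripped lines in one file.

    Assumes UTF-8 text with one word per line.
    """
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def build_wordlist(data_dir, exclude=("tdict-spell.txt",)):
    """Merge every tdict-*.txt file in data_dir, skipping excluded names.

    By default 'tdict-spell.txt' is left out, so the result holds only
    standard spellings and accepted variants.
    """
    words = set()
    for path in sorted(Path(data_dir).glob("tdict-*.txt")):
        if path.name in exclude:
            continue
        words |= load_words(path)
    return sorted(words)
```

Calling build_wordlist(".") from this directory would return the merged list
without the misspellings, while build_wordlist(".", exclude=()) would keep
them, which may be preferable when the goal is word breaking on real-world
text rather than spell checking.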

'tdict-common.txt' was split from the original 'tdict.txt' on 2006-08-31.
'tdict-spell.txt' was split from the original 'tdict-std.txt' on 2006-09-01.