File: en-np.dat

apertium-eo-en 1.0.0~r63833-3
## List of function words that may appear inside a Named Entity
## (cf. "de" and "de la" in Spanish "Banco de Comercio", "Pedro Villacasas de la Morena")
<FunctionWords>
the
a
an
of
for
's
&
</FunctionWords>
## Punctuation mark tags after which a capitalized word
## may simply indicate the beginning of a sentence (or clause)
<SpecialPunct>
(
)
{
.
...
:
-
"
'
''
``
¿
?
¡
!
</SpecialPunct>
## Section <NounAdj> lists the tags of words that are considered a possible NP
## when they are capitalized at the beginning of a sentence
<NounAdj>
n
adj
</NounAdj>
## Closed categories. Any word belonging to these categories will not
## be considered NP, even if capitalized.
<ClosedCats>
predet
preadv
def
det
dem
pr
cnjadv
cnjcoo
cnjsub
prn
vbser
</ClosedCats>
## Tags for non-words
<DateNumPunct>
num
sent
cm
lquest
lpar
rpar
</DateNumPunct>
## Upper limit on the length of an all-caps sentence for it to be considered a NE.
<TitleLimit>
0
</TitleLimit>
## Words that may be names even though they do not comply with the default rules
<Names>
</Names>
## Words that will NOT be proper names.
## Value 0 => ignore as NE if it is a capitalized word alone.
## Value 1 => always ignore as NE.
## (e.g. "I" in English is usually not a NE when alone, but it may be in "King Henry I")
<Ignore>
i 0
i'm 1
i'll 1
i've 1
english 0
spanish 0
dutch 0
german 0
french 0
basque 0
catalan 0
january 1
february 1
march 1
april 1
may 1
june 1
july 1
august 1
september 1
october 1
november 1
december 1
monday 1
tuesday 1
wednesday 1
thursday 1
friday 1
saturday 1
sunday 1
NP 1
</Ignore>