File: LangVietnameseModel.log

uchardet 0.0.8-1
  • area: main
  • in suites: forky, sid, trixie
  • size: 2,244 kB
  • sloc: cpp: 8,045; python: 1,305; ansic: 112; sh: 75; makefile: 9
= Logs of language model for Vietnamese (vi) =

- Generated by BuildLangModel.py
- Started: 2016-02-13 03:37:17.480303
- Maximum depth: 3
- Max number of pages: 40
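
The two limits above (maximum depth 3, at most 40 pages) bound the Wikipedia crawl. The log does not show how BuildLangModel.py actually walks the links, so the following is only a minimal sketch, assuming a breadth-first traversal. The names `crawl` and `get_links` are hypothetical, and the link graph is a toy in-memory dictionary rather than real Wikipedia data.

```python
from collections import deque

def crawl(start, get_links, max_depth=3, max_pages=40):
    """Visit pages breadth-first, bounded by link depth and page count.

    get_links is a hypothetical callable returning the titles linked
    from a page; it stands in for whatever the real script uses.
    """
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_pages:
        title, depth = queue.popleft()
        visited.append(title)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(title):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited

# Toy link graph (illustrative only).
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["E"], "E": []}

def toy_links(title):
    return toy_graph.get(title, [])

pages = crawl("A", toy_links, max_depth=3, max_pages=40)
print(pages)
```

With a real crawl, whichever limit is reached first ends the traversal.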

== Parsed pages ==

Chữ_Quốc_ngữ (revision 22887853)
1651 (revision 21455247)
1773 (revision 21354755)
1815 (revision 21361292)
1838 (revision 21361314)
1865 (revision 21361338)
1869 (revision 21361342)
1888 (revision 21389506)
1902 (revision 21354811)
1918 (revision 21354828)
1919 (revision 21354829)
1938 (revision 21354849)
1945 (revision 21354857)
22 tháng 2 (revision 21376086)
26 tháng 11 (revision 22579845)
28 tháng 12 (revision 22475308)
A (revision 22549334)
ASCII (revision 22528409)
Alexandre de Rhodes (revision 22859954)
Antonio Barbosa (revision 22145269)
B (revision 22836557)
BBC (revision 22863903)
Biên khảo (revision 22531516)
Bán nguyên âm (revision 22655600)
Bình luận (revision 22117664)
Bảng chữ cái Bồ Đào Nha (revision 22887853)
Bảng chữ cái Hy Lạp (revision 21362081)
Bảng chữ cái Latinh (revision 22442448)
Bắc Kỳ (revision 22393289)
Bồ Đào Nha (revision 22620858)
C (revision 21341881)
Cao Xuân Dục (revision 22620201)
Chính tả (revision 22187359)
Chính tả tiếng Việt (revision 20897580)
Chữ Hán (revision 22889609)
Chữ Nôm (revision 22781506)
Chữ cái (revision 22169220)
Công giáo (revision 22173119)
D (revision 21447691)

== End of Parsed pages ==
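
Each entry in the list above has the form `Title (revision N)`. To post-process the list, a small parser along these lines may help. It is a sketch, not part of BuildLangModel.py, and the names `ENTRY`, `parse_pages` and `shared_revisions` are hypothetical. The duplicate check is included because two entries above (Chữ_Quốc_ngữ and Bảng chữ cái Bồ Đào Nha) carry the same revision number, 22887853.

```python
import re
from collections import defaultdict

# One log entry: a page title followed by "(revision <digits>)".
ENTRY = re.compile(r"^(?P<title>.+) \(revision (?P<rev>\d+)\)$")

def parse_pages(lines):
    """Return (title, revision) tuples for lines that match the entry form."""
    pages = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            pages.append((match.group("title"), int(match.group("rev"))))
    return pages

def shared_revisions(pages):
    """Map each revision number used by more than one title to those titles."""
    by_rev = defaultdict(list)
    for title, rev in pages:
        by_rev[rev].append(title)
    return {rev: titles for rev, titles in by_rev.items() if len(titles) > 1}

sample = [
    "Chữ_Quốc_ngữ (revision 22887853)",
    "1651 (revision 21455247)",
    "Bảng chữ cái Bồ Đào Nha (revision 22887853)",
    "== End of Parsed pages ==",
]
pages = parse_pages(sample)
print(pages)
print(shared_revisions(pages))
```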

- Wikipedia parsing ended at: 2016-02-13 03:42:06.560479

101 characters appeared 222814 times.

First 55 characters:
[ 0] Char n: 11.262308472537633 %
[ 1] Char h: 8.881398834902654 %
[ 2] Char t: 7.022898022565907 %
[ 3] Char c: 6.365398942615815 %
[ 4] Char i: 6.198443544840091 %
[ 5] Char g: 5.591210606155808 %
[ 6] Char a: 3.5998635633308496 %
[ 7] Char u: 2.8499106878382867 %
[ 8] Char m: 2.615185760320267 %
[ 9] Char o: 2.6012728105056238 %
[10] Char đ: 2.222032726848403 %
[11] Char r: 2.1102803234985235 %
[12] Char à: 2.0447548179198796 %
[13] Char v: 1.9437737305555307 %
[14] Char l: 1.9119085874316697 %
[15] Char á: 1.7539292863105551 %
[16] Char p: 1.6453185167897888 %
[17] Char b: 1.541195795596327 %
[18] Char ư: 1.4397659033992478 %
[19] Char s: 1.3760356171515256 %
[20] Char y: 1.280440187779942 %
[21] Char e: 1.2454334108269678 %
[22] Char d: 1.1251537156552103 %
[23] Char ế: 1.071745940560288 %
[24] Char k: 1.0695019163966357 %
[25] Char â: 0.9658280000359044 %
[26] Char ữ: 0.9604423420431392 %
[27] Char ê: 0.8374698178749989 %
[28] Char ệ: 0.7459136319979893 %
[29] Char ô: 0.7073164163831717 %
[30] Char ạ: 0.6727584442629277 %
[31] Char ộ: 0.6705144200992756 %
[32] Char ố: 0.6476253736300233 %
[33] Char ó: 0.6072329386842837 %
[34] Char ả: 0.5484395055965963 %
[35] Char ủ: 0.5475418959311353 %
[36] Char q: 0.5138815334763525 %
[37] Char ợ: 0.48560682901433483 %
[38] Char ờ: 0.4851580241816044 %
[39] Char ể: 0.4748355130288043 %
[40] Char ớ: 0.4676546357051173 %
[41] Char ấ: 0.418286104104769 %
[42] Char ị: 0.40212913012647317 %
[43] Char ầ: 0.3904602044754818 %
[44] Char ọ: 0.3801376933226817 %
[45] Char ề: 0.3787912788244904 %
[46] Char ơ: 0.3590438661843511 %
[47] Char í: 0.35679984202069887 %
[48] Char ụ: 0.35276059852612496 %
[49] Char ậ: 0.3469261357006292 %
[50] Char ì: 0.32762752789322036 %
[51] Char ă: 0.3253835037295682 %
[52] Char ứ: 0.29665999443482005 %
[53] Char ồ: 0.29665999443482005 %
[54] Char x: 0.2939671654384374 %

The first 55 characters have an accumulated ratio of 0.9603301408349568.
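
The table above ranks characters by their share of all counted characters, and the accumulated ratio is the sum of the top shares. A minimal sketch of that computation follows. It assumes that only alphabetic characters are counted, that text is lower-cased, and that it is NFC-normalized so each accented Vietnamese letter is one code point. The log does not confirm any of this, and `char_frequencies` and `accumulated_ratio` are hypothetical names.

```python
import unicodedata
from collections import Counter

def char_frequencies(text):
    """Return [(char, percent)] sorted by descending frequency."""
    # Assumption: NFC form, lower case, alphabetic characters only.
    normalized = unicodedata.normalize("NFC", text).lower()
    counts = Counter(ch for ch in normalized if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by count (descending), then by character for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(ch, 100.0 * n / total) for ch, n in ranked]

def accumulated_ratio(freqs, first_n):
    """Fraction (0..1) of all counted characters covered by the top first_n."""
    return sum(pct for _, pct in freqs[:first_n]) / 100.0

sample = "tiếng Việt là ngôn ngữ của người Việt"
freqs = char_frequencies(sample)
for rank, (ch, pct) in enumerate(freqs[:5]):
    print("[{:2d}] Char {}: {} %".format(rank, ch, pct))
print(accumulated_ratio(freqs, 5))
```

As a check against the log, the top entry (n, about 11.26 % of 222814 characters) corresponds to roughly 25094 occurrences.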

1494 sequences found.

First 512 (typical positive ratio): 0.9321889118082535
Next 512 (512-1024): 0.009604423420431392
Rest: 0.0068905733918831966
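
The three figures above split the observed character sequences into the 512 most frequent, the next 512, and the remainder. The sketch below shows one way such a split could be computed. It assumes a sequence is a pair of adjacent frequent characters and that each bucket is divided by the total pair count. Neither assumption is confirmed by the log: the three logged values add up to roughly 0.9487 rather than 1, so the script evidently uses a different denominator or counting rule. `sequence_ratios` is a hypothetical name.

```python
from collections import Counter

def sequence_ratios(text, frequent_chars, bucket=512):
    """Return (first, next, rest) shares of adjacent-pair occurrences.

    Assumption: only pairs whose two characters are both in
    frequent_chars are counted, and shares are relative to that total.
    """
    pairs = Counter(
        a + b
        for a, b in zip(text, text[1:])
        if a in frequent_chars and b in frequent_chars
    )
    total = sum(pairs.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    ranked = [n for _, n in pairs.most_common()]  # counts, descending
    first = sum(ranked[:bucket]) / total
    nxt = sum(ranked[bucket:2 * bucket]) / total
    rest = sum(ranked[2 * bucket:]) / total
    return (first, nxt, rest)

# Tiny illustration with a bucket size of 1 instead of 512.
first, nxt, rest = sequence_ratios("ababab c", frequent_chars="ab", bucket=1)
print(first, nxt, rest)
```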

- Processing end: 2016-02-13 03:42:07.174723
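
The three timestamps in this log (start, end of Wikipedia parsing, end of processing) give the duration of each phase. A small sketch using the standard `datetime` module, with the values copied from the log. The variable names and `parse_stamp` helper are illustrative only.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def parse_stamp(text):
    """Parse a timestamp in the form used by this log."""
    return datetime.strptime(text, STAMP_FORMAT)

started = parse_stamp("2016-02-13 03:37:17.480303")
parsing_ended = parse_stamp("2016-02-13 03:42:06.560479")
processing_ended = parse_stamp("2016-02-13 03:42:07.174723")

parsing_seconds = (parsing_ended - started).total_seconds()
model_seconds = (processing_ended - parsing_ended).total_seconds()
total_seconds = (processing_ended - started).total_seconds()

print("Wikipedia parsing:", parsing_seconds, "s")
print("Model building:   ", model_seconds, "s")
print("Total:            ", total_seconds, "s")
```

Nearly all of the run, about 289 of roughly 290 seconds, falls before the end of Wikipedia parsing.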