<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="AsciiDoc 8.6.8">
<title>Unicode</title>
<link rel="stylesheet" href="./asciidoc.css" type="text/css">
<link rel="stylesheet" href="./pygments.css" type="text/css">


<script type="text/javascript" src="./asciidoc.js"></script>
<script type="text/javascript">
/*<![CDATA[*/
asciidoc.install();
/*]]>*/
</script>
<link rel="stylesheet" href="./mlton.css" type="text/css"/>
</head>
<body class="article">
<div id="banner">
<div id="banner-home">
<a href="./Home">MLton 20130715</a>
</div>
</div>
<div id="header">
<h1>Unicode</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph"><p>The current release of MLton does not support Unicode.  We are working
on adding support for the following:</p></div>
<div class="ulist"><ul>
<li>
<p>
<span class="monospaced">WideChar</span> structure.
</p>
</li>
<li>
<p>
UTF-8 encoded source files.
</p>
</li>
</ul></div>
<div class="paragraph"><p>There is no real support for Unicode in the <a href="DefinitionOfStandardML">Definition</a>;
there are only a few throw-away sentences along the lines of "ASCII
must be a subset of the character set in programs".</p></div>
<div class="paragraph"><p>Nor is there real support for Unicode in the <a href="BasisLibrary">Basis Library</a>.
The general consensus (which includes the opinions of the
editors of the Basis Library) is that the <span class="monospaced">WideChar</span> structure is
insufficient for the purposes of Unicode.  There is no <span class="monospaced">LargeChar</span>
structure, which is itself a deficiency, since a programmer cannot
program against the largest supported character size.</p></div>
<div class="paragraph"><p>MLton has some preliminary support for 16-bit and 32-bit characters and
strings.  It is even possible to include arbitrary Unicode characters
in 32-bit strings using a <span class="monospaced">\Uxxxxxxxx</span> escape sequence.  (This
longer escape sequence is a minor extension of the Definition, which
only allows <span class="monospaced">\uxxxx</span>.)  This is by no means completely
satisfactory as Unicode support, but it is what is
currently available.</p></div>
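<div class="paragraph"><p>To illustrate why the longer escape is needed, here is a minimal sketch in
Python (not SML, and not MLton's implementation).  It only checks the
arithmetic: four hexadecimal digits reach at most U+FFFF, while Unicode
code points extend to U+10FFFF.  The helper names
<span class="monospaced">fits_in_short_escape</span> and <span class="monospaced">fits_in_bits</span> are hypothetical
and chosen for this example; Python's own <span class="monospaced">\u</span> and <span class="monospaced">\U</span> escapes
merely happen to have the same shape.</p></div>
<div class="literalblock">
<div class="content monospaced">
<pre>
```python
# Illustration only (Python, not SML): why a four-digit \uxxxx escape
# cannot name every Unicode character. All names here are hypothetical.

MAX_FOUR_HEX_DIGITS = 0xFFFF   # largest value four hex digits can express
MAX_CODE_POINT = 0x10FFFF      # largest Unicode code point


def fits_in_short_escape(code_point):
    """Return True if the code point can be written with four hex digits."""
    return code_point in range(0, MAX_FOUR_HEX_DIGITS + 1)


def fits_in_bits(code_point, bits):
    """Return True if a single character of the given bit width can hold it."""
    return code_point in range(2 ** bits)


snowman = ord("\u2603")        # U+2603, within the four-digit range
g_clef = ord("\U0001D11E")     # U+1D11E, requires the eight-digit form

print(fits_in_short_escape(snowman))
print(fits_in_short_escape(g_clef))
print(fits_in_bits(g_clef, 16), fits_in_bits(g_clef, 32))
```
</pre>
</div></div>
<div class="paragraph"><p>Under these assumptions, a character such as U+1D11E fits in a single
32-bit character but not in a single 16-bit one, which is why the
eight-digit escape is tied to 32-bit strings above.</p></div>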
<div class="paragraph"><p>There are periodic flurries of questions and discussion about Unicode
in MLton/SML.  In December 2004, there was a discussion that led to
some seemingly sound design decisions.  The discussion started at:</p></div>
<div class="literalblock">
<div class="content monospaced">
<pre>http://www.mlton.org/pipermail/mlton/2004-December/026396.html</pre>
</div></div>
<div class="paragraph"><p>There is a good summary of points at:</p></div>
<div class="literalblock">
<div class="content monospaced">
<pre>http://www.mlton.org/pipermail/mlton/2004-December/026440.html</pre>
</div></div>
<div class="paragraph"><p>In November 2005, there was a follow-up discussion and the beginning of
some coding:</p></div>
<div class="literalblock">
<div class="content monospaced">
<pre>http://www.mlton.org/pipermail/mlton/2005-November/028300.html</pre>
</div></div>
<div class="paragraph"><p>We are optimistic that support will appear in the next MLton release.</p></div>
</div>
</div>
<div class="sect1">
<h2 id="_also_see">Also see</h2>
<div class="sectionbody">
<div class="paragraph"><p>The <a href="fxp">fxp</a> XML parser has some support for dealing with Unicode
documents.</p></div>
</div>
</div>
</div>
<div id="footnotes"><hr></div>
<div id="footer">
<div id="footer-text">
</div>
<div id="footer-badges">
</div>
</div>
</body>
</html>