1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420
|
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8"/>
<style>
/*
This demo uses the method devised by Roel Nieskens for accessing
OpenType features. For further explanation, see
https://pixelambacht.nl/2019/fixing-variable-font-inheritance/.
*/
/*
font-family names this family and associates this declaration with
other possible declarations for Junicode 2.
src:url() points to the font file.
font-weight is either normal (400) or bold (700).
font-style: normal indicates that this declaration is for an upright
or roman font.
These fonts have been subsetted, so they're a fraction of their
original size.
*/
@font-face {
font-family: 'JuniusX';
src:url("./webfiles/JuniusX-Regular.woff2");
font-weight: normal;
font-style:normal;
}
@font-face {
font-family: 'JuniusX';
src:url("./webfiles/JuniusX-Bold.woff2");
font-weight: bold;
font-style:normal;
}
/*
In the :root declaration we define variablesa for the OpenType
tags we'll be using, together with their default values. We define
them here because :root is as far up the document tree as you can
go.
To use this model for your document, identify all the feature tags
you'll be using in your text and set the defaults here.
*/
:root {
--jvf-ss03: "ss03" off;
--jvf-ss17: "ss17" off;
--jvf-pcap: "pcap" off;
--jvf-ss16: "ss16" off;
--jvf-cv05: "cv05" off;
--jvf-cv09: "cv09" off;
--jvf-cv13: "cv13" off;
}
/*
Style the body. Make the font big so everybody can see what's
going on.
*/
body {
font-family: 'JuniusX';
font-size: 28px;
line-height: 150%;
margin-left: 10%;
margin-right: 10%;
}
/*
An asterisk selects any element. Every time we hit an element, we'll
call font-feature-settings to explicitly set all the OpenType features
we're using in this text. This block is called after the more specific
ones, so the variables we use in this rule will have been set in a
class definition.
*/
* {
font-feature-settings: var(--jvf-ss03),
var(--jvf-ss17),
var(--jvf-pcap),
var(--jvf-ss16),
var(--jvf-cv05),
var(--jvf-cv09),
var(--jvf-cv13);
}
/*
Define the elements we'll be using. Notice how we change the style
by setting the value of the variables in :root.
*/
h1 {
font-size: 115%;
font-weight: normal;
font-style: normal;
}
/*
pre will automatically use a fixed-width font. We won't mess with
whatever the standard is, except to make the type a bit smaller.
*/
pre {
font-size: 85%;
}
/*
Define classes. We only need to change the variable that's relevant
to each class: the others will remain at their current values.
*/
.mufitext {
margin-left: 40px;
}
/*
Note: the cvNN values are indexes into the list of values they
contain. However, "off" will also work to turn the features off.
*/
.medtext {
margin-left: 40px;
--jvf-ss03: "ss03" on;
--jvf-ss16: "ss16" on;
--jvf-cv05: "cv05" 1;
--jvf-cv09: "cv09" 1;
}
.dgrph {
--jvf-ss17: "ss17" on;
}
.longs {
--jvf-ss03: "ss03" on;
}
.pcap {
--jvf-pcap: "pcap" on;
}
.norrotunda {
--jvf-ss16: "ss16" off;
}
.dotlessi {
--jvf-cv13: "cv13" 1;
}
/* No OpenType features for this one, but just a big red letter. */
.initcap {
color: red;
font-size: 150%;
}
/*
The restoration and deletion classes provide an alternative way of
displaying MENOTA's brackets for editorial restorations and scribal
deletions, using the CSS content property. Using this method, the
brackets are not in the underlying text, but only in the displayed
text. This may or may not be desirable.
*/
.restoration:before { content: "["; }
.restoration:after { content: "]"; }
.deletion:before { content: "⸠"; }
.deletion:after { content: "⸡"; }
</style>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-DGTTVQHWG1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-DGTTVQHWG1');
</script>
</head>
<body lang="en">
<h1>Searchable web pages with MUFI and Junicode 2</h1>
<p>
Junicode 2 fully implements the recommendations of the
<a href="https://mufi.info/m.php?p=mufi">Medieval Unicode
Font Initiative</a> (MUFI), which offers an encoding scheme for medieval
texts based on <a href="https://home.unicode.org/">Unicode</a>,
the worldwide standard for text
exchange. You probably know that a computer text is a string of
numbers, each representing a character: using Unicode, you employ a
standard mapping of numbers to characters, ensuring that a text sent to you
by (say) your friend in Thailand will
look the same when it reaches you as it did when it left your friend’s
computer. MUFI does a similar job for medieval characters. For
example, if you need a Dutch libra sign, you won’t find it in Unicode;
but MUFI has it at code point U+F2EA (), and it will be recognizable
to anyone with a MUFI-compliant font (like Junicode 2).
</p>
<p>
To work its magic, MUFI selects characters of interest to medievalists
from the Unicode standard, and when a medieval character is missing
from Unicode it assigns a code point from a block of numbers set aside
for individuals and groups to use in any way they like: the
<b>Private Use Area</b>. MUFI-compliant fonts like Junicode 2 must all
agree on the mappings of code point to character defined by MUFI so
that texts can be exchanged reliably.
</p>
<p>
For example, here is the beginning of a diplomatic text of the Old
Norse <i>Vǫluspá</i>, copied and pasted from the
Medieval Nordic Text Archive (MENOTA—because Junicode 2 Italic is not yet
ready, the text is presented only in roman type):
</p>
<p class="mufitext" lang="is">
<span class="initcap">H</span>lıoðſ bið ec allar kinꝺir meiri oc miɴi maugo<br/>
heimꝺalar uilðo at ec ualꝼꜹþr uel ꝼyr telia ꝼoꝛn<br/>
ſpioll ꝼíra þꜹ er ꝼremſt um man. Ec mán iǫtna<br/>
ar um boꝛna þ⸠ꜹ̣⸡a er ꝼoꝛꝺom mic ꝼǫꝺꝺa hoꝼꝺo. nio man ec heima<br/>
nío iviþi[vr] miot uið mran ꝼyr molꝺ neðan.<br/>
</p>
<p>
Looks great, right? And it will read the same way whether it’s set in
Junicode 2, Andron Scriptor, Cardo, or Palemonas MUFI. There’s just one
problem—but it’s a significant one. To illustrate, look at the last
word in the second line:
<b>ꝼoꝛn</b>. That’s an <b>insular f</b> in the first position and an
<b>r rotunda</b> in the third—different flavors of <b>f</b> and <b>r</b>.
Now try searching
this text for the word “forn” using your browser’s search function
(Ctrl-F or Cmd-F). What happened? Your search skipped right over
<b>ꝼoꝛn</b> in the medieval text and landed on “forn” in this
paragraph. The fact that <b>f</b> and <b>r</b> are encoded differently
in this text from the way they’re encoded in plain text means that
this text can’t be searched the way most webpages can. (To be a little
more precise, some browsers will find the <b>f</b>, but none of them
will find the <b>r</b>).
</p>
<p>
MENOTA has its own powerful search tools, so that the inability to search
in the browsers is not a great loss; but this inability may also signal
problems with accessibility, which is increasingly
important to university administrations, funding agencies, and (of course)
end users. Is it possible, then, to create a MUFI-compliant webpage, rich
with medieval characters, that is both searchable and accessible?
</p>
<p>
Turns out that it is both possible and easy to create such a page
using Junicode 2. We’ll take it in two steps. The first will be to create
as plain an etext as we can get away with, and the second will be to
style this text using
<a href="https://www.w3.org/Style/CSS/Overview.en.html">CSS</a>
(“Cascading Style Sheets,” the standard for
styling web pages) and the features of Junicode 2.
</p>
<p>
Here’s my attempt to convert the passage to plain text:
</p>
<p class="mufitext" lang="is">
<span class="initcap">H</span>lioðs bið ec allar kindir meiri oc mini maugo<br/>
heimdalar uilðo at ec ualfavþr uel fyr telia forn<br/>
spioll fíra þav er fremst um man. Ec mán iǫtna<br/>
ar um borna þ⸠aṿ⸡a er fordom mic fǫdda hofdo. nio man ec heima<br/>
nío iviþi[vr] miot uið mę́ran fyr mold neðan.
</p>
<p>
If “plain text” is what you can type on a U.S. keyboard, we haven’t
quite made it: we still have
several accented characters and the Icelandic thorn and eth, not to
mention the strange brackets in line 4 (U+2E20 and U+2E21—we’ll deal
with these later). The Icelandic
characters are familiar enough to medievalists that we needn’t worry
about them. The dot under the <b>v</b> in what was an
<b class="dgrph">av</b> digraph (which we took
apart to make it searchable as <b>av</b>, but we’ll
put it back together later) is a <b>combining
diacritical mark</b>: you type it right after the base character, and
your software takes care of positioning it correctly. The accented
<b>ę́</b> was a character in the Private Use Area, but I
have changed it to <b>ę</b> (U+0119, common in modern Polish) plus
another combining mark, U+0301, the acute accent. That makes it
standard Unicode, like the other accented characters. The <b>ę</b>
is searchable as <b>e</b>, and browsers will
ignore the marks when searching.
</p>
<p>
The next step is to select what we need from among the OpenType
features of Junicode 2.
<a href="https://docs.microsoft.com/en-us/typography/opentype/spec/">OpenType</a>
is a standard for font construction that
allows fonts to do all kinds of clever things. To cite one common
example, when you type the letters “find” and the <b>f</b> and the <b>i</b> come out
joined in a ligature (“find”), an OpenType feature did that. But
Junicode 2 has more than eighty OpenType features that do a lot of useful
things.
</p>
<p>
Here are the features we’ll want to apply to the whole text:
</p>
<ul>
<li>All instances of <b>s</b> in this text are long
<span class="longs">(s)</span>: use the “Long
s” feature (hist or ss03) for that.</li>
<li>All instances of <b>d</b> are round: use
Character Variant 5 (cv05).</li>
<li>All instances of <b>f</b> are insular (ꝼ): use Character
Variant 9 (cv09).</li>
<li>Instances of <b>r</b> are round after rounded characters like <b>o</b> and <b>ꝺ</b>;
use “Contextual r rotunda” (ss16).</li>
</ul>
<p>
The features whose four-letter tags begin with <b>ss</b> are Stylistic
Sets, and they are either on or off. The Character Variant features
contain a list of variants of a particular character; we select from
this list with a numerical index (starting with 1).
To turn on these OpenType features, then, we need the following CSS rule:
</p>
<pre><code>font-feature-settings: "ss03" on, "cv05" 1, "cv09" 1, "ss16" on;</code></pre>
<p>
When we apply that to our fragment of <i>Vǫluspá</i>, here’s what we get:
</p>
<p class="medtext" lang="is">
<span class="initcap">H</span>lioðs bið ec allar
kindir meiri oc mini maugo<br/>
heimdalar uilðo at ec ualfavþr uel
fyr telia forn<br/>
spioll fíra þav er fremst um man. Ec mán iǫtna<br/>
ar um borna þ⸠aṿ⸡a er fordom mic fǫdda hofdo. nio man ec heima<br/>
nío iviþi[vr] miot uið mę́ran fyr mold neðan.
</p>
<p>
One line of CSS, and already we’re almost home! But there are several
special cases to attend to. There’s the dotless <b>i</b> in <b>Hlioðs</b>, the small
capital <b><span class="pcap">n</span></b> in <b>mi<span
class="pcap">n</span>i</b>, the <b class="dgrph">av</b> digraph that has to be
reassembled, and the <b>r rotunda</b> that should be a regular
<b>r</b> at the end of <b lang="is">ualfꜹþr</b>. For those, we devise CSS classes that turn
OpenType features on or off, and we apply these classes to individual
words or letters:
</p>
<ul>
<li>For dotless <b>i</b>, the style “dotlessi” selects the first variant
of <b>i</b> in Character Variant 13 (cv13 1).</li>
<li>For <b><span class="pcap">n</span></b>,
we need the “pcap” style, which turns on
the “petite capitals”
feature (pcap). (Petite capitals are different from regular small
caps in that they’re sized to be mixed with lowercase
letters.)</li>
<li>For ꜹ, a ligature with a phonetic value (like æ), distinct from other
ligatures, we use the “dgrph” style, which turns on “Rare Digraphs” (ss17).
</li>
<li>For the <b>r</b> that shouldn’t be rotunda, we use the “norrotunda”
style, which temporarily turns off the “Contextual r rotunda” feature (ss16).</li>
</ul>
<p>
As an example of what the HTML will look like, here’s the first line
of our text:
</p>
<pre><code><span class="initcap">H</span>l<span class="dotlessi">i</span>oðs bið
ec allar kindir meiri oc mi<span class="pcap">n</span>i maugo</code></pre>
<p>
Of course it looks illegible and hard to write, but for most projects it will
be generated automatically from an underlying text; and of course your readers
will not see the ugly HTML code or the CSS, but rather this:
</p>
<p class="medtext" lang="is">
<span class="initcap">H</span>l<span class="dotlessi">i</span>oðs bið ec allar
kindir meiri oc mi<span class="pcap">n</span>i maugo<br/>
heimdalar uilðo at ec ualf<span class="dgrph">av</span>þ<span class="norrotunda">r</span> uel
fyr telia forn<br/>
spioll fíra þ<span class="dgrph">av</span> er fremst um man. Ec mán iǫtna<br/>
ar um borna þ⸠<span class="dgrph">aṿ</span>⸡a er fordom mic fǫdda hofdo. nio man ec heima<br/>
nío iviþi[vr] miot uið mę́ran fyr mold neðan.
</p>
<p>
and if any of them should happen to hit Ctrl-F and perform a
quick-and-dirty search for “kindir” or “forn”
or “mini” or “<span lang="is">ualfavþr</span>” or “meran,” they’ll find what they’re looking for.
</p>
<p>
For one final refinement, we change the brackets in the text to CSS classes:
“deletion” and “restoration.” The brackets themselves are not inserted in
the text, but rather displayed via the CSS content property. The effect of this change
is that searches of the text will ignore the brackets: thus a user can search
for (and find) <b lang="is">iviþivr</b>. When displayed by this method, editorial interventions
can also be changed programmatically—e.g. highlighted or hidden.
The final text:
</p>
<p class="medtext" lang="is">
<span class="initcap">H</span>l<span class="dotlessi">i</span>oðs bið ec allar
kindir meiri oc mi<span class="pcap">n</span>i maugo<br/>
heimdalar uilðo at ec ualf<span class="dgrph">av</span>þ<span class="norrotunda">r</span> uel
fyr telia forn<br/>
spioll fíra þ<span class="dgrph">av</span> er fremst um man. Ec mán iǫtna<br/>
ar um borna þ<span class="dgrph deletion">aṿ</span>a er fordom mic fǫdda hofdo. nio man ec heima<br/>
nío iviþi<span class="restoration">vr</span> miot uið mę́ran fyr mold neðan.
</p>
<p>
This document is not only a demonstration, but also a how-to for
producing a searchable online document. For detailed instructions,
view the source for this page. The CSS is thoroughly commented so you
can tell exactly what each bit does.
</p>
<div style="clear: both; padding: 20px; background-color:
lightgray">
<p>
Junicode/Junicode 2 font copyright © 1998–2022 by Peter
S. Baker.
</p>
<p>
<a href="https://github.com/psb1558/Junicode-font">Development site</a> · <a href="https://psb1558.github.io/Junicode-New/">Specimen Page</a>
</p>
<p>
Licensed under the <a href="https://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL">Open Font License, v. 1.1</a>.
</p>
</div>
</body>
</html>
|