Precise pixel layouts for mosaic graphics
=========================================

Based on a 6x10 grid of pixels:

............  000000111111  ..0000..1111  
......##....  000000111111  ..0000..1111
....##..##..  000000111111  ............
..##......##  222222333333  ..2222..3333
..##......##  222222333333  ..2222..3333
..##########  222222333333  ..2222..3333
..##......##  222222333333  ............
..##......##  444444555555  ..4444..5555
............  444444555555  ..4444..5555
............  444444555555  ............

Mosaic characters are not smoothed, and smoothing also doesn't apply
between mosaic characters and adjacent alphanumeric characters.
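
As a sketch of how the layouts above might be turned into code, this
maps a pixel position in the 6x10 cell to the sixel it belongs to, for
both the contiguous and the separated layout.  The function name and
the -1 convention are my own and not taken from Bedstead itself.

```c
#include <stdbool.h>

/*
 * Return the sixel number (0-5) that pixel (x, y) of the 6x10 cell
 * belongs to, or -1 if the pixel is always background.  x runs 0-5
 * from left to right, y runs 0-9 from top to bottom.
 */
static int mosaic_sixel(int x, int y, bool separated)
{
	int col, row;

	if (x < 0 || x > 5 || y < 0 || y > 9)
		return -1;
	col = x / 3;				/* 0 = left, 1 = right */
	row = y < 3 ? 0 : y < 7 ? 1 : 2;	/* rows are 3, 4, 3 pixels tall */
	if (separated) {
		/* The left pixel of each column is blank ... */
		if (x % 3 == 0)
			return -1;
		/* ... and so is the bottom line of each row. */
		if (y == 2 || y == 6 || y == 9)
			return -1;
	}
	return row * 2 + col;
}
```

In the separated layout the top and bottom sixels end up 2x2 pixels
and the middle ones 2x3, matching the third diagram.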

Reverse engineering
===================

https://stardot.org.uk/forums/viewtopic.php?t=21608
https://github.com/lanceewing/saa5050/

Character sets
==============

The SAA5050 series is generally compatible with the G0 primary
character sets defined by ETSI EN 300 706 V1.2.1:

SAA5050: Latin with English option
SAA5051: Latin with German option
SAA5052: Latin with Swedish option
SAA5053: Latin with Italian option
SAA5054: Latin with French option

SAA5057: Cyrillic (Russian)

There's no ETSI EN 300 706 character set that corresponds to the
SAA5055.  The ETSI EN 300 706 Hebrew set is almost identical to the
SAA5056 set, except that the old Sheqel sign on the SAA5056 is
replaced by the new Sheqel sign in ETSI EN 300 706.

There appears to be an SAA5058, listed as "Afrikaans", but I haven't
found a datasheet for it and there isn't an Afrikaans character set in
ETSI EN 300 706.

ETSI EN 300 706 to Unicode mapping:

ETSI EN 300 706 just gives glyph images without names, so it's not
always obvious what the intended semantics are.  ZVBI, the Zapping
teletext decoder, has opinions on this in src/lang.c.

ZVBI uses quite a few bits of Private Use Area for characters that
aren't, or weren't, in Unicode, or where the author can't work out the
correct glyph:

U+E620..U+E67F Arabic G0 set
U+E720..U+E77F Arabic G2 set
U+E800 Turkish Lira "TL" symbol, 2/3 in the Latin Turkish subset
U+EE00..U+EE7F G1 Block Mosaics set
U+EF20..U+EF7F G3 Smooth Mosaics and line drawing
U+F000..U+F7FF Dynamically Re-definable Characters
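
A minimal sketch of a classifier for those ranges, which might be
handy when checking which ZVBI code points still need a real Unicode
equivalent.  The enum and function names are hypothetical; only the
ranges themselves come from the list above.

```c
#include <stdint.h>

enum zvbi_pua {
	ZVBI_PUA_NONE,		/* not in any of the listed ranges */
	ZVBI_PUA_ARABIC_G0,
	ZVBI_PUA_ARABIC_G2,
	ZVBI_PUA_TURKISH_LIRA,
	ZVBI_PUA_G1_MOSAIC,
	ZVBI_PUA_G3_MOSAIC,
	ZVBI_PUA_DRCS
};

/* Classify a code point according to ZVBI's Private Use Area usage. */
static enum zvbi_pua zvbi_pua_kind(uint32_t cp)
{
	if (cp >= 0xE620 && cp <= 0xE67F)
		return ZVBI_PUA_ARABIC_G0;
	if (cp >= 0xE720 && cp <= 0xE77F)
		return ZVBI_PUA_ARABIC_G2;
	if (cp == 0xE800)
		return ZVBI_PUA_TURKISH_LIRA;
	if (cp >= 0xEE00 && cp <= 0xEE7F)
		return ZVBI_PUA_G1_MOSAIC;
	if (cp >= 0xEF20 && cp <= 0xEF7F)
		return ZVBI_PUA_G3_MOSAIC;
	if (cp >= 0xF000 && cp <= 0xF7FF)
		return ZVBI_PUA_DRCS;
	return ZVBI_PUA_NONE;
}
```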

ZVBI maps Greek G0 Primary Set position 5/2 to U+0374 GREEK NUMERAL
SIGN, but I think it's a tonos to go with a subsequent capital letter.

ZVBI doesn't have a proper mapping for Arabic characters, but maybe we
can.

0x20 0x0020
0x21 0x0021
0x22 0x0022
0x23 0x00A3
0x24 0x00A4
0x25 0x0025

0x40 0xFE94 # ARABIC LETTER TEH MARBUTA FINAL FORM
0x41 0xFE80 # ARABIC LETTER HAMZA ISOLATED FORM
0x42 0xFE92 # ARABIC LETTER BEH MEDIAL FORM
0x43 0xFE8F # ARABIC LETTER BEH ISOLATED FORM (also final?)
0x44 0xFE98 # ARABIC LETTER TEH MEDIAL FORM
0x45 0xFE95 # ARABIC LETTER TEH ISOLATED FORM (also final?)
0x46 0xFE8E # ARABIC LETTER ALEF FINAL FORM
0x47 0xFE8D # ARABIC LETTER ALEF ISOLATED FORM
0x48 0xFE91 # ARABIC LETTER BEH INITIAL FORM
0x49
0x4A 0xFE97 # ARABIC LETTER TEH INITIAL FORM
0x4B 0xFE9B # ARABIC LETTER THEH INITIAL FORM
0x4C 0xFE9F # ARABIC LETTER JEEM INITIAL FORM
0x4D 0xFEA3 # ARABIC LETTER HAH INITIAL FORM
0x4E 0xFEA7 # ARABIC LETTER KHAH INITIAL FORM
0x4F
0x50 0x0631 # ARABIC LETTER REH (final and isolated)
0x51 0x0630 # ARABIC LETTER THAL (final and isolated)
0x52 0xFEB3 # ARABIC LETTER SEEN INITIAL FORM (also medial)
0x53 0xFEB7 # ARABIC LETTER SHEEN INITIAL FORM (also medial)

0x5B 0xFE9C # ARABIC LETTER THEH MEDIAL FORM
0x5C 0xFEA0 # ARABIC LETTER JEEM MEDIAL FORM
0x5D 0xFEA4 # ARABIC LETTER HAH MEDIAL FORM
0x5E 0xFEA8 # ARABIC LETTER KHAH MEDIAL FORM

0x6B 0xFE99 # ARABIC LETTER THEH ISOLATED FORM (also final?)
0x6C 0xFE9D # ARABIC LETTER JEEM ISOLATED FORM (also final?)
0x6D 0xFEA1 # ARABIC LETTER HAH ISOLATED FORM (also final?)
0x6E 0xFEA5 # ARABIC LETTER KHAH ISOLATED FORM (also final?)

Misc
====

UK patent number 1343298 (filed 1971 by Mullard) describes the
character rounding technique used by the SAA5050.

FontForge's "Expand stroke" may be useful for converting a single-line
font into something useful.  A 50-unit circular pen generates Bedstead
Plotter.  A 100×50-unit, 20° rectangular pen generates a surprisingly
good Bedstead Calligraphic.

Typography
==========

Typographers seem to like explaining how to design their languages'
unique letters:

Æ and æ:
https://medium.com/@frodefrodefrode/designing-the-letter-%C3%A6-862cffbe22b

Polish characters:
http://www.twardoch.com/download/polishhowto/

ẞ:
https://typography.guru/journal/capital-sharp-s-designs/
https://typography.guru/journal/how-to-draw-a-capital-sharp-s-r18/
http://cinga.ch/eszett/

ĿL and ŀl:
https://glyphsapp.com/learn/localize-your-font-catalan-punt-volat

Microsoft have their own suggestions:
https://docs.microsoft.com/en-gb/typography/develop/character-design-standards/

WebKit 'palt' bug
=================

As of 2024, Safari on Apple platforms mis-displays the Bedstead Web
page.  The 'palt' feature is being mis-applied to U+0020 SPACE
characters.  Most of them end up not being narrowed, but some instead
end up with negative width, as though different layers of the browser
are calculating the widths of spaces differently.  The problem doesn't
appear to affect any characters other than spaces.  Other characters,
including punctuation, get displayed correctly.

Adding a GSUB lookup that replaces space with a glyph with a smaller
advance width doesn't help: the new glyph is used, but with the
original glyph's width.  Replacing each U+0020 SPACE with U+00A0
NO-BREAK SPACE followed by U+200B ZERO WIDTH SPACE leads to all the
spaces' being too wide, but consistently so without any overlaps.

This bug appears to be https://bugs.webkit.org/show_bug.cgi?id=236307.
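
For reference, the space substitution described above could be done on
UTF-8 text along these lines.  This is only a sketch: the function
name and the buffer-size convention are made up for illustration, and
it relies on the input being valid UTF-8, where a 0x20 byte can only
ever be U+0020.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy "in" to "out", replacing each U+0020 SPACE with U+00A0
 * NO-BREAK SPACE followed by U+200B ZERO WIDTH SPACE (in UTF-8).
 * Returns the length written (excluding the terminating NUL), or
 * (size_t)-1 if "out" is too small.
 */
static size_t replace_spaces(const char *in, char *out, size_t outsize)
{
	static const char repl[] = "\xC2\xA0\xE2\x80\x8B";
	const size_t repl_len = sizeof(repl) - 1;
	size_t used = 0;

	for (; *in != '\0'; in++) {
		size_t need = (*in == ' ') ? repl_len : 1;

		/* Always leave room for the terminating NUL. */
		if (used + need + 1 > outsize)
			return (size_t)-1;
		if (*in == ' ')
			memcpy(out + used, repl, repl_len);
		else
			out[used] = *in;
		used += need;
	}
	if (used + 1 > outsize)
		return (size_t)-1;
	out[used] = '\0';
	return used;
}
```

Each space grows from one byte to five, so the output buffer needs to
allow for that.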

Choice of design unit
=====================

Bedstead uses a design unit of 1/1000 em, largely because that's
traditional for PostScript fonts.  But that does have some problems,
and in particular depends on being able to encode arbitrary rationals
in CFF fonts, which means that odd widths end up with significantly
larger font files.

So what are the constraints on design units if we want to get more
integers in the output?  Assuming we can tweak XQTR if we need to.

UnitsPerEm must be a multiple of 10 for vertical pixels.

But horizontal pixels vary in width by factors of 1/8, so UnitsPerEm
must really be a multiple of 80 (10 × 8).

XQTR and YQTR want to be about (UnitsPerEm / 10) * (1-1/sqrt(2)).
That's about (UnitsPerEm / 10) * 0.29289321881345254.
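
A quick sketch for trying out candidate values against those
constraints.  The function names are invented for illustration, and
the constant is just the 1-1/sqrt(2) figure quoted above.

```c
#include <stdbool.h>

/* 1 - 1/sqrt(2), as quoted in the text. */
#define QTR_FACTOR 0.29289321881345254

/* True if whole and 1/8-width pixels both come out as integers. */
static bool upm_is_suitable(int upm)
{
	return upm > 0 && upm % 80 == 0;
}

/* The ideal (generally non-integer) value for XQTR and YQTR. */
static double ideal_qtr(int upm)
{
	return (upm / 10.0) * QTR_FACTOR;
}

/* The nearest integer to the ideal value. */
static int rounded_qtr(int upm)
{
	return (int)(ideal_qtr(upm) + 0.5);
}
```

By this test the current 1000 isn't suitable (1000 isn't a multiple of
80), while 800 and 960 are, with rounded quarter values of 23 and 28
respectively.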

XML silliness
=============

Not sure if these are a good idea, but they're fun:

#include <stdbool.h>

#define XML_FOR_O_I(outer, init, test, step, inner)			\
	for (xml_open(outer), (init);					\
	     (test) ? (xml_open(inner), true) : (xml_close(outer), false); \
	     (step), xml_close(inner))
#define XML_FOR_I(init, test, step, inner)	\
	for (init;				\
	     (test) && (xml_open(inner), true); \
	     (step), xml_close(inner))
#define XML_FOR_O(init, test, step, outer)		\
	for (xml_open(outer), (init);			\
	     (test) || (xml_close(outer), false);	\
	     step)
#define XML_IF(test, inner) \
	XML_FOR_I(bool _done = false, !_done && (test), _done=true, inner)
#define XML(elem) XML_IF(true, elem)
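
To see what these macros do in practice, here's a small self-contained
sketch.  It assumes that xml_open() and xml_close() take an element
name as a string, which is a guess: the stand-ins below just append
tags to a buffer.  Three of the macros are repeated in function-like
form so that the block compiles by itself, and xml_demo() is purely
illustrative.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static char xml_buf[256];

/* Hypothetical stand-in: append an opening tag to xml_buf. */
static void xml_open(const char *elem)
{
	size_t len = strlen(xml_buf);

	snprintf(xml_buf + len, sizeof(xml_buf) - len, "<%s>", elem);
}

/* Hypothetical stand-in: append a closing tag to xml_buf. */
static void xml_close(const char *elem)
{
	size_t len = strlen(xml_buf);

	snprintf(xml_buf + len, sizeof(xml_buf) - len, "</%s>", elem);
}

#define XML_FOR_I(init, test, step, inner)	\
	for (init;				\
	     (test) && (xml_open(inner), true); \
	     (step), xml_close(inner))
#define XML_IF(test, inner) \
	XML_FOR_I(bool _done = false, !_done && (test), _done = true, inner)
#define XML(elem) XML_IF(true, elem)

/* Emit one <font> element containing nglyphs numbered <glyph>s. */
static const char *xml_demo(int nglyphs)
{
	xml_buf[0] = '\0';
	XML("font") {
		XML_FOR_I(int i = 0, i < nglyphs, i++, "glyph") {
			char num[16];

			snprintf(num, sizeof(num), "%d", i);
			strncat(xml_buf, num,
				sizeof(xml_buf) - strlen(xml_buf) - 1);
		}
	}
	return xml_buf;
}
```

The closing tag is emitted by the loop's step expression, so a break
inside one of these blocks would skip the matching xml_close().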

Version numbers
===============

Up until version 002.009, Bedstead's version numbers took the form of
two dot-separated three-digit numbers with leading zeroes.

The example fonts in "Adobe Type 1 Font Format" (version 1.1) have
version numbers like "001.003".  The PostScript Language Reference
Manual (3rd edition) places no requirements on font versions as stored
in the "version" entry of the FontInfo dictionary.

The "%%Version" comment in DSC 3.0 takes a <real> followed by a
<uint>.  <real> is defined broadly, and includes "-.002", "34.5",
"-3.62", "123.6e10", "1E-5", "-1.", and "0.0".

OpenType 'head' field "fontRevision" is a 16.16 fixed-point binary
number.  This is recommended to be rounded and padded to three decimal
places for display.  OpenType 'name' ID 5 is required to contain two
numbers, at most 65535 each, separated by a dot.  It's recommended
that this come just after "Version " at the start.

dpkg (like many other things) thinks a version number consists of
dot-separated integers.  It treats "002.009" as equal to "2.9", but
"1.3" as different from "1.300".
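
That behaviour can be illustrated with a deliberately simplified
comparison.  This is not dpkg's real algorithm (which also deals with
epochs, revisions, letters and tildes); it only handles dot-separated
decimal integers, and the function name is made up.

```c
#include <stdlib.h>

/*
 * Compare two versions made of dot-separated decimal integers.
 * Returns <0, 0 or >0 like strcmp().  Missing components count as
 * zero and anything after the last numeric component is ignored.
 */
static int version_cmp(const char *a, const char *b)
{
	for (;;) {
		char *ea, *eb;
		unsigned long na = strtoul(a, &ea, 10);
		unsigned long nb = strtoul(b, &eb, 10);

		if (na != nb)
			return na < nb ? -1 : 1;
		/* Stop once neither side has a further component. */
		if (*ea != '.' && *eb != '.')
			return 0;
		a = (*ea == '.') ? ea + 1 : ea;
		b = (*eb == '.') ? eb + 1 : eb;
	}
}
```

Because each component is compared as a number, leading zeroes vanish
("009" is 9) but trailing ones don't ("300" is not 3).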

So I think version numbers of the form "3.141" are sensible.  Losing
the leading zeroes and having three decimal places means they can be
formatted correctly from the 'head' table.  Not having leading or
trailing zeroes in the fractional part means that they can't be
suppressed by anything that might try to canonicalise version numbers.
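
A sketch of that round trip, assuming the fractional part is stored as
the nearest 1/65536 and displayed rounded to three decimal places.
The function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Encode e.g. (3, 141) for "3.141" as a 16.16 fixed-point number. */
static uint32_t version_to_fixed(unsigned major, unsigned thousandths)
{
	/* Round to the nearest 1/65536; thousandths must be 0-999. */
	uint32_t frac = ((uint32_t)thousandths * 65536u + 500u) / 1000u;

	return ((uint32_t)major << 16) | frac;
}

/* Format a 16.16 fixed-point number with three decimal places. */
static void fixed_to_string(uint32_t fixed, char *buf, size_t bufsize)
{
	unsigned major = fixed >> 16;
	/* Round the 16-bit fraction to the nearest thousandth. */
	unsigned thousandths =
	    ((fixed & 0xFFFFu) * 1000u + 32768u) / 65536u;

	if (thousandths == 1000) {	/* fraction rounded up to 1.000 */
		major++;
		thousandths = 0;
	}
	snprintf(buf, bufsize, "%u.%03u", major, thousandths);
}
```

For example, version_to_fixed(3, 141) gives 0x00032419, and
fixed_to_string() turns that back into "3.141".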

Inspiration from a cycle ride: sticking the year in the version number
would make it easier to see how old a release is.  3.YYn for the nth
release in 20YY seems good.  So the first one would be 3.246 (or maybe
3.251).
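
As a sketch, generating a version string under that scheme might look
like this.  The function name and the range limits are my own reading
of the constraints above: years from 2010 avoid a leading zero in the
fractional part, and release numbers 1-9 avoid a trailing zero and
keep it to three digits.

```c
#include <stdio.h>

/*
 * Write the version string for the nth release of the given year in
 * the 3.YYn scheme.  Returns 0 on success, or -1 if the release can't
 * be expressed in that form.
 */
static int year_version(int year, int n, char *buf, size_t bufsize)
{
	if (year < 2010 || year > 2099 || n < 1 || n > 9)
		return -1;
	snprintf(buf, bufsize, "3.%02d%d", year - 2000, n);
	return 0;
}
```

So year_version(2024, 6, ...) gives "3.246" and year_version(2025, 1,
...) gives "3.251", the two candidates mentioned above.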

HP 264x large characters
========================

https://drive.google.com/open?id=1rtsO2rohVpmKGR6rF42JccKgAu8f4nFS has
an explanation of how the large character pieces in the Symbols for
Legacy Computing Supplement fit together.

Mapping between ASCII and large characters
------------------------------------------

     ! 𜸚  " 𜸛  # 𜸜  $ 𜸝  % 𜸞  & 𜸟  ' 𜸠
( 𜸡  ) 𜸢  * 𜸣  + 𜸤  , 𜸥  - 𜸦  . 𜸧  / 𜸨
0 𜸩  1 𜸪  2 𜸫  3 𜸬  4 𜸭  5 𜸮  6 𜸯  7 𜸰
8 ?  9 𜸱  : 𜸲  ; 𜸳  < 𜸴  = ▚  > 𜸵  ? 𜸶
@ 𜸷  A 𜸸  B 𜸹  C 𜸺  D 𜸻  E 𜸼  F 𜸽  G 𜸾
H 𜸿  I 𜹀  J 𜹁  K 𜹂  L 𜹃  M 𜹄  N 𜹅  O 𜹆
P 𜹇  Q 𜹈  R 𜹉  S 𜹊  T 𜹋  U 𜹌  V 𜹍  W 𜹎
X 𜹏  Y 𜹐  Z ▘  [ ▝  \ ▖  ] ▗  ^ ▀  _ ▌

Looks like Unicode has somehow missed out one of them.

TIFAX
=====

The TIFAX XM11 was a 1975 Teletext decoder board made by Texas
Instruments Ltd.  It used a 5 × 9 character matrix with rounding like
an SAA5050, but implemented over rather more chips.  Unlike the
SAA5050, the XM11 had two black pixels horizontally between
characters, so the total character size was 7 × 10.

https://www.blunham.com/Radar/Teletext/PDFs/XM11-B183.pdf

The character ROM is an SN74S262, and the character shapes mostly
match the SAA5050, so treating it as a variant might not be
unreasonable.  There is some uncertainty over some characters as
explained by Neil Williamson: https://p298.net/devnotes/

The different character pitch could mostly be accommodated by having
the application add extra spacing between characters, but what about
mosaic graphics?  The XM11 generated those in a separate chip (the
X908), and they were the full 7 pixels wide, four in the left column
and three in the right.  Would we want an entire extra set?  At least
the XM11 didn't support separated graphics (added in the 1976 spec).
And what about box-drawing?

The XM11 has a dot clock the same as the Teletext bit clock: 6.9375 MHz.
Hence a nominal XPIX of about 106.3 (100 × 7.375 / 6.9375, taking
7.375 MHz as the dot clock that would give square pixels).

There was also a Swedish ROM, SN74S263, which was used in the ABC80, a
Swedish Z80-based computer, so there is a purported ROM of it
available for emulators.

Also some real screenshots; e.g. https://www.youtube.com/watch?v=Jy6tQsyd4ho

I haven't found a datasheet for the SN74S262/3, but the schematics in the
ABC80 and RML 380Z service manuals provide a pinout:

1: D4
2: D5
3: D6
4: RA (row address)
5: RB
6: RC
7: RD
8: Y1 (output)
9: Y2
10: Gnd
11: Y3
12: Y4
13: Y5
14: CS*
15: CS*
16: D0 (character address)
17: D1
18: D2
19: D3
20: Vcc