=========
Keymapper
=========
The Problem
===========
Keyboard selection is a problem for many users since most of them have no
idea which keyboard layout they have.
This program uses a different approach. We simply ask the user to press
a couple of keys and infer from their scan code which layout they belong
to, thereby lowering the number of possibilities in a classic
divide-and-conquer approach.
Some difficulties show up. For instance, asking people to press a key
they don't have is considered to be bad style. To distinguish between
German and Danish keyboards, we therefore have to offer multiple
possible symbols. This should also be helpful with trying to distinguish
between QWERTZ and AZERTY when the user has a Cyrillic keyboard. :-)
Inferring the decision tree from a large number of keyboard layouts is
not efficient, thus the tree needs to be pre-calculated.
There are two ways of selecting among possible layouts:
* Ask the user to press a key with a particular symbol
* Ask the user to choose the symbol which appears on a key, e.g. to the
right of the tab key.
This tool will only use the first approach. If it is not possible to
distinguish between keyboard layouts that way, e.g. because their only
difference is whether a particular key has an X or a Y symbol, a simple
binary question ("Do you have an X key?") may have to be asked.
Basic selection algorithm
=========================
Initially, list all keyboards. Create a list of forbidden scan codes,
which is initially empty.
Select a symbol that appears on most keyboards, in as many places as
possible, but not on any keys in the forbidden list. Remove the
keyboards that have this symbol from the keyboard list and add the scan
codes of those keys to the forbidden list.
Repeat until the list of keyboards is empty.
Display the list of symbols to the user. Each scan code we get back is
associated with at least one keyboard. Clear the list of scan codes and
repeat this process if there is more than one.
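The greedy step described above can be sketched as follows. This is a
minimal illustration, not the tool's actual code: the data model (a
dict mapping each layout name to a ``{symbol: scan code}`` dict), the
function name ``choose_symbols`` and the sample ``KEYMAPS`` are all
assumptions made for the example, and each symbol is assumed to sit on
exactly one key per layout.

```python
from collections import defaultdict

# Hypothetical sample data: layout name -> {symbol: scan code}.
KEYMAPS = {
    "us": {"q": 16, "a": 30, "z": 44, "y": 21},
    "fr": {"a": 16, "q": 30, "w": 44, "y": 21},
    "de": {"q": 16, "a": 30, "y": 44, "z": 21},
}


def choose_symbols(keymaps):
    """Pick the symbols to show in one question.

    Returns (symbols, answers), where answers maps each scan code we
    may get back to the set of layouts it is associated with.
    """
    remaining = set(keymaps)      # keyboards not yet covered by a symbol
    forbidden = set()             # scan codes already used by a chosen symbol
    symbols = []
    answers = defaultdict(set)

    while remaining:
        best = None
        candidates = sorted({s for name in remaining for s in keymaps[name]})
        for symbol in candidates:
            # Layouts that carry this symbol on a key that is not forbidden.
            covered = {
                name for name in remaining
                if symbol in keymaps[name]
                and keymaps[name][symbol] not in forbidden
            }
            if not covered:
                continue
            codes = {keymaps[name][symbol] for name in covered}
            # Prefer many keyboards first, then many distinct places.
            score = (len(covered), len(codes))
            if best is None or score > best[0]:
                best = (score, symbol, covered, codes)
        if best is None:
            break                 # leftover layouts cannot be covered
        _, symbol, covered, codes = best
        symbols.append(symbol)
        for name in covered:
            answers[keymaps[name][symbol]].add(name)
        forbidden |= codes
        remaining -= covered
    return symbols, dict(answers)
```

Calling ``choose_symbols`` again on the layouts that share the returned
scan code corresponds to the "repeat this process" step. Ties between
equally good symbols are broken alphabetically here, which is an
arbitrary choice.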
Heuristics
----------
Unmodified, this method tends to pick 'obscure' keys like @,
which tend to be in lots of different places historically.
Unfortunately, the risk of this approach is that people will not find
the key if it's not labelled as such on their keyboard.
Some possible modifications are:
* Initially, prefer ASCII letters and low-ASCII symbols.
* Strongly prefer a symbol that's not selected with AltGr or, less
strongly, Shift (unless the base key is a number). Especially in the
former case, the symbol may not even appear on the keyboard.
(DONE) (partly conflicts with the first part)
* In order to avoid an "I can't find any of these keys" problem, it
might be necessary to prepare two symbol lists in each step. The key
codes should be distinct of course.
(DONE)
* Some key choices should be either avoided or combined. For instance,
asking the user to press an A in order to find out whether the layout
is French (qwerty vs. azerty) might be a bad idea if Cyrillic
keyboards have not been either excluded (in a previous question) or
included (by internally mapping а => a, ѡ => ω, etc., and accepting
all possible answers).
A list of Unicode glyph lookalikes should be findable on the Internet.
In addition, it might make sense to always show capital letters.
(TODO)
* Do a global optimization to minimize tree depth. Currently, key
selection locally favors deciding between many layouts with one
symbol -- the problem is that the keys which end up selecting between
the rest also select from a few of those already decided on by the
first ones, thus they add more steps. A nicer strategy might be to
find a group of symbols which collectively do the same thing, only
better.
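The first two preferences in this list could be expressed as a penalty
that is used to rank candidate symbols. The sketch below is only an
illustration: the function name ``symbol_penalty``, the modifier names
and the numeric weights are assumptions, not values taken from the tool.

```python
def symbol_penalty(symbol, modifier="plain", base_is_digit=False):
    """Return a penalty for asking the user to press `symbol`.

    Lower is better. `modifier` is one of "plain", "shift" or "altgr"
    (hypothetical names); the weights are arbitrary illustration values.
    """
    penalty = 0
    if not symbol.isascii():
        penalty += 3              # prefer ASCII letters and symbols
    if modifier == "altgr":
        penalty += 10             # may not even be printed on the key
    elif modifier == "shift" and not base_is_digit:
        penalty += 2              # shifted digits are usually labelled
    return penalty


def rank_candidates(candidates):
    """Sort (symbol, modifier, base_is_digit) tuples, best first."""
    return sorted(candidates, key=lambda c: symbol_penalty(*c))
```

Since Python's sort is stable, candidates with equal penalties keep
their original order, so a caller can pre-sort by coverage and use the
penalty only to demote hard-to-find symbols.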
Tests
=====
Testing this approach is the easiest part of the problem -- the
FakeQuery class answers the questions depending on which keyboard map
was passed to it. If multiple answers are possible, one is selected
randomly.
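A stand-in of this kind might look like the sketch below. It is not the
FakeQuery class itself: the class name ``FakeQuerySketch``, its
``press`` and ``find`` methods and the ``{symbol: scan code}`` keymap
format are assumptions chosen for the example.

```python
import random


class FakeQuerySketch:
    """Answers questions as a user with a known keymap would."""

    def __init__(self, keymap, rng=None):
        self.keymap = keymap                  # {symbol: scan code}
        self.rng = rng or random.Random()     # injectable for repeatable tests

    def press(self, symbols):
        """Return the scan code for one of the requested symbols."""
        codes = sorted({self.keymap[s] for s in symbols if s in self.keymap})
        if not codes:
            raise LookupError("none of these symbols exist on this keymap")
        # If several answers are possible, pick one at random.
        return self.rng.choice(codes)

    def find(self, symbol):
        """Answer a yes/no question about whether a symbol is present."""
        return symbol in self.keymap
```

Passing a seeded ``random.Random`` makes a test run reproducible while
still exercising the "multiple possible answers" case.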
Redundancy
----------
During twenty test runs with the nine initial test keymaps, the system
asked 27 questions on average, or three per keymap. Since 2^3 is only 8,
a binary decision tree cannot identify nine keymaps in three questions
each, so the generated tree is shallower than a binary tree. This
suggests that generating two symbol lists in parallel doesn't result in
asking additional questions.
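The comparison with a binary tree can be checked with a few lines of
arithmetic. The helper name ``min_average_binary_depth`` is made up for
this example; it computes the smallest possible average number of
yes/no questions when every layout is equally likely.

```python
def min_average_binary_depth(n_leaves):
    """Smallest average leaf depth of a binary tree with n_leaves leaves."""
    low = n_leaves.bit_length() - 1           # floor(log2(n_leaves))
    deeper = 2 * (n_leaves - 2 ** low)        # leaves pushed one level down
    shallow = n_leaves - deeper
    return (shallow * low + deeper * (low + 1)) / n_leaves


measured = 27 / 9                             # questions per keymap, from the test runs
best_binary = min_average_binary_depth(9)     # 29 / 9, a little over 3.2
```

With nine keymaps the best binary tree puts seven layouts at depth 3
and two at depth 4, so its average is slightly above the three
questions per keymap measured here.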
Graph output
------------
Since the result is somewhat difficult to visualize, I added graphviz
output. The result isn't pretty in any way (specifically, edge labels
are not placed correctly) but it's a workable first step.
File output
-----------
The data file produced by this tool consists of a number of lines:
STEP ##
+++++++
This line starts a new step. Steps are numbered; the first step is
numbered zero.
Steps are not necessarily numbered consecutively, but all references
point to steps which occur later in the file. Thus it is never necessary
to keep the whole file in memory.
MAP name
++++++++
This step uniquely identifies a keymap.
The MAP command is the only command in its step.
PRESS char
++++++++++
Ask the user to press a character on the keyboard. Multiple PRESS
commands may occur in a step; the characters (encoded in utf-8)
need to be presented simultaneously ("Press one of these keys").
Auxiliary keys (shift, alt, ...) should be ignored.
CODE code ##
++++++++++++
Direct the evaluating code to process step ## next if the user has
pressed a key which returned that keycode.
Obviously, multiple CODE commands may occur in a step.
FIND char
+++++++++
Ask the user whether that character is present on their keyboard.
This is a question of last resort; it's asked only if keymaps cannot be
distinguished otherwise.
This command will be accompanied by one YES and one NO command.
FINDP char
++++++++++
Equivalent to FIND, except that the user is asked to consider only the
primary symbols (i.e. Plain and Shift).
YES ##
++++++
Direct the evaluating code to process step ## next if the user does have
this key.
NO ##
+++++
Direct the evaluating code to process step ## next if the user does not
have this key.
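Because every reference points forward, a consumer can evaluate the
file in a single pass. The sketch below illustrates this under several
assumptions: the function names, the ``press`` and ``find`` callbacks,
the sample file and its layout names are invented for the example, key
codes are assumed to be written as decimal integers, and evaluation is
assumed to start at step zero.

```python
EXAMPLE = """\
STEP 0
PRESS y
CODE 21 1
CODE 44 2
STEP 1
PRESS a
CODE 30 3
CODE 16 4
STEP 2
FIND ß
YES 5
NO 6
STEP 3
MAP us
STEP 4
MAP fr
STEP 5
MAP de
STEP 6
MAP cz
"""


def read_steps(lines):
    """Yield (step number, [(command, args), ...]) one step at a time."""
    number, commands = None, []
    for line in lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "STEP":
            if number is not None:
                yield number, commands
            number, commands = int(words[1]), []
        else:
            commands.append((words[0], words[1:]))
    if number is not None:
        yield number, commands


def identify(lines, press, find):
    """Return the keymap name, or None if the file ends first.

    press(chars) -> key code; find(char, primary_only) -> bool.
    """
    wanted = 0
    for number, commands in read_steps(lines):
        if number != wanted:
            continue                          # skip steps we were not sent to
        chars, codes, question, branches = [], {}, None, {}
        for cmd, args in commands:
            if cmd == "MAP":
                return " ".join(args)
            elif cmd == "PRESS":
                chars.append(args[0])
            elif cmd == "CODE":
                codes[int(args[0])] = int(args[1])
            elif cmd in ("FIND", "FINDP"):
                question = (args[0], cmd == "FINDP")
            elif cmd in ("YES", "NO"):
                branches[cmd] = int(args[0])
        if question is not None:
            wanted = branches["YES" if find(*question) else "NO"]
        else:
            wanted = codes[press(chars)]      # KeyError on an unexpected key code
    return None
```

A real front end would probably ask again instead of failing when the
user presses a key that no CODE line mentions; that handling is left
out to keep the sketch short.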