Human factors alphabets for higher base numbers

2022-03-30 2022-04-13

The alphabets for higher base numbers are generally chosen either arbitrarily or to satisfy certain constraints, such as avoiding characters that may cause a problem in the context of a URL or a filename. Where humans are concerned, constraints can include the need to avoid confusable characters and collisions with words, as in the case of the base 20 alphabet used for Open Location Codes.

What is not generally seen is a choice of alphabet that makes it easier for humans to convert the digits to decimal values. This is not surprising, given that

An alphabet that facilitates mnemonic conversion can be useful, however. For example, if base 60 is used for a clock then the minutes and seconds are each a single digit (so no long division). In such a context, what would be a good alphabet? This question arose for a personal project (not a clock), and led to the definition of two alphabets described below.

One approach is to form an alphabet from a few simple rules, so that it is easy to remember how to reconstruct the alphabet. This criterion is fulfilled by many existing alphabets. For example, one could take as a base 60 alphabet the first 60 digits of the base 64 alphabet, for which one needs only to remember that the digits are the concatenation of

  1. the uppercase Latin letters 'A'-'Z' (decimal values 0-25)
  2. the lowercase letters 'a'-'z' (26-51)
  3. the Arabic numerals '0'-'7' (52-59)

The problem is that while the alphabet is easy to reconstruct, mentally reconstructing it is not something one would want to have to do just to convert a particular digit.
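The three-part construction above can be sketched in a few lines of Python. The names BASE64_PREFIX_60 and digit_value are illustrative only and do not come from any library.

```python
import string

# First 60 digits of the standard base 64 alphabet:
# 'A'-'Z' (0-25), 'a'-'z' (26-51), '0'-'7' (52-59).
BASE64_PREFIX_60 = string.ascii_uppercase + string.ascii_lowercase + "01234567"


def digit_value(digit: str) -> int:
    """Return the decimal value of a single digit (hypothetical helper)."""
    return BASE64_PREFIX_60.index(digit)
```

A program finds the value by lookup, but a person reading 'q' has to work out that it is the 17th letter, hence offset 16, and then add 26 to reach 42.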

The Human Factors Decade-Congruent Alphabet

An improvement is the first of the two alphabets presented here: the Human Factors Decade-Congruent Alphabet (HFDCA). The following table presents the HFDCA in sequential decades - so 'a' is 10, 't' is 29, etc.

    The Human Factors Decade-Congruent Alphabet

    0  0123456789
    1  abcdefghij
    2  klmnopqrst
    3  ABCDEFGHIJ
    4  KLMNOPQRST
    5  vwxyzVWXYZ

This alphabet strikes a good compromise, in that the construction remains simple while it is also considerably easier to apply without full reconstruction, since each uppercase letter from 'A' to 'T' has the same value as the corresponding lowercase letter plus 20 (in the final decade, 'V'-'Z' are the corresponding lowercase letters plus 5). As a nod to reducing collisions with words, it is the vowel 'u' that is omitted (rather than 'z'); it also occurs at a natural break in the sequence.
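As a minimal sketch, the table can be written out as a 60-character string whose index is the decimal value. The names HFDCA and hfdca_value are chosen here for illustration.

```python
import string

# Decades 0-5 of the HFDCA, copied from the table above.
HFDCA = (
    string.digits      # 0-9
    + "abcdefghij"     # 10-19
    + "klmnopqrst"     # 20-29
    + "ABCDEFGHIJ"     # 30-39
    + "KLMNOPQRST"     # 40-49
    + "vwxyzVWXYZ"     # 50-59 ('u' is omitted)
)


def hfdca_value(digit: str) -> int:
    """Return the decimal value of one HFDCA digit (hypothetical helper)."""
    return HFDCA.index(digit)
```

With this layout, hfdca_value("a") is 10, hfdca_value("t") is 29, and hfdca_value("K") is hfdca_value("k") + 20.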

However, to interpret a given HFDCA digit, one is still likely to resort to counting from the start of the relevant decade, e.g., "K is 40, L is 41, ...".
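For the clock use mentioned earlier, the following sketch renders a time of day as three base 60 digits, and splits a digit into its decade and offset in the way a reader would when counting from the start of a decade. The function names and the three-digit layout are assumptions made for illustration, not a description of any existing clock.

```python
# HFDCA decades 0-5 concatenated, copied from the table above.
HFDCA = "0123456789abcdefghijklmnopqrstABCDEFGHIJKLMNOPQRSTvwxyzVWXYZ"


def clock_digits(hour: int, minute: int, second: int, alphabet: str = HFDCA) -> str:
    """Render hour, minute and second as one base 60 digit each."""
    for part in (hour, minute, second):
        if not 0 <= part < 60:
            raise ValueError(f"value out of range for one base 60 digit: {part}")
    return alphabet[hour] + alphabet[minute] + alphabet[second]


def read_digit(digit: str, alphabet: str = HFDCA) -> tuple[int, int]:
    """Return (decade, offset) for a digit, e.g. 'L' is decade 4, offset 1."""
    return divmod(alphabet.index(digit), 10)
```

Under these assumptions, 13:45:07 is rendered as "dP7", and read_digit("L") gives (4, 1), that is, 41.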

The Human Factors Decimal Morphology Alphabet

An alternative is to make the individual digits more memorable, which is the approach taken for the Human Factors Decimal Morphology Alphabet (HFDMA).

    The Human Factors Decimal Morphology Alphabet

    0  0123456789
    1  cjzwfsbvxq
    2  nltmhgdrkp
    3  CJZWFSBVXQ
    4  NLTMHGDRKP
    5  uiyeaUIYEA

Here, the points to observe for recall are:

Where it is necessary to exclude digits confusable with '1', a modified version of this alphabet replaces 'l' (lowercase 'L') with '~' and 'I' (capital 'i') with '_'.
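The HFDMA table and the modified version can be sketched the same way. The constant names are illustrative, and the modified alphabet is derived here simply by substituting the two characters named above.

```python
# Decades 0-5 of the HFDMA, copied from the table above.
HFDMA = (
    "0123456789"   # 0-9
    "cjzwfsbvxq"   # 10-19
    "nltmhgdrkp"   # 20-29
    "CJZWFSBVXQ"   # 30-39
    "NLTMHGDRKP"   # 40-49
    "uiyeaUIYEA"   # 50-59
)

# Modified version: replace the two digits confusable with '1'.
# Lowercase 'l' (value 21) becomes '~'; capital 'I' (value 56) becomes '_'.
HFDMA_NO_ONE_LOOKALIKES = HFDMA.replace("l", "~").replace("I", "_")
```

Only positions 21 and 56 differ between the two strings, so every other digit keeps its value.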

Number bases other than 60

For base X numbers, X < 60, the first X digits of the alphabet are used (so, formally, hexadecimal uses the HFDCA). Preferred letters come first, particularly in the HFDMA, for which numbers in any base up to base 50 will avoid the vowels, and thus collisions with most words.
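The prefix rule can be illustrated with a short sketch. The helpers alphabet_for_base and encode are hypothetical names; the two alphabet strings are copied from the tables above.

```python
HFDCA = "0123456789abcdefghijklmnopqrstABCDEFGHIJKLMNOPQRSTvwxyzVWXYZ"
HFDMA = "0123456789cjzwfsbvxqnltmhgdrkpCJZWFSBVXQNLTMHGDRKPuiyeaUIYEA"


def alphabet_for_base(base: int, alphabet: str) -> str:
    """Return the first `base` digits of the given alphabet."""
    if not 2 <= base <= len(alphabet):
        raise ValueError(f"unsupported base: {base}")
    return alphabet[:base]


def encode(number: int, base: int, alphabet: str) -> str:
    """Encode a non-negative integer using the first `base` digits."""
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = alphabet_for_base(base, alphabet)
    out = ""
    while True:
        number, remainder = divmod(number, base)
        out = digits[remainder] + out
        if number == 0:
            return out
```

Here alphabet_for_base(16, HFDCA) is the familiar "0123456789abcdef", and the first 50 digits of the HFDMA contain none of the letters a, e, i, o, u or y in either case.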

However, for some bases above 20, other constraints might make adjustment appropriate. For example, a base 30 alphabet could use the 3x-decade digits ("CJZ...") for the decimal values 20-29. This would avoid both lowercase 'l' and the need to remember the weaker mnemonic for the digits "nlt...".
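One way to read the base 30 adjustment is to splice the HFDMA decades together as below. This is a sketch of that reading only, and BASE30_VARIANT is a name invented for the example.

```python
HFDMA = "0123456789cjzwfsbvxqnltmhgdrkpCJZWFSBVXQNLTMHGDRKPuiyeaUIYEA"

# Values 0-19 keep their usual digits; values 20-29 borrow the
# 3x-decade digits ("CJZ...") in place of the 2x-decade digits ("nlt...").
BASE30_VARIANT = HFDMA[:20] + HFDMA[30:40]
```

In this variant each uppercase digit is worth its lowercase counterpart plus 10 (for example 'c' is 10 and 'C' is 20), and lowercase 'l' never appears.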

For X > 60, the definition of both alphabets is extended to 64 digits:

An HF base 64 alphabet seems unlikely to have much application, although a time nominally in base 60 might occasionally have an underscore '_' for a leap second.