Way Off Base (by David A. Wheeler)

2020-05-30 (original from 2000)

This little essay is an exercise in mathematical recreations; I hope you find it amusing!

Most people use base 10 for their number system. Computer people often find base 2, 8, or 16 convenient. But surely we're missing out. Why not try some really bizarre bases?!

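First, a quick definition: in a base B, each digit is multiplied by a power of B determined by its position. The digit just left of the decimal point is multiplied by B^0 (that is, one), each position further left is multiplied by one greater power of B, and each position to the right of the decimal point is multiplied by one less power of B (B^-1, B^-2, and so on). With that in mind, let's look at a few bases: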
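
To make the definition concrete, here is a minimal Python sketch. The function name `positional_value` and its parameters are hypothetical, chosen only for illustration; it takes the digits as lists of integers and uses exact fractions so that negative powers of the base do not pick up rounding errors.

```python
from fractions import Fraction

def positional_value(whole_digits, frac_digits=(), base=10):
    """Sum digit * base**position. Positions count up to the left of the
    point (0, 1, 2, ...) and down to the right of it (-1, -2, ...)."""
    base = Fraction(base)
    total = Fraction(0)
    # Left of the point: the rightmost digit has position 0.
    for position, digit in enumerate(reversed(whole_digits)):
        total += digit * base ** position
    # Right of the point: the first digit has position -1.
    for position, digit in enumerate(frac_digits, start=1):
        total += digit * base ** -position
    return total

print(positional_value([1, 0, 1], [1], base=2))   # binary 101.1  -> 11/2
print(positional_value([4, 2], [5], base=10))     # decimal 42.5  -> 85/2
```

The same function works for any base that `Fraction` accepts, which is enough to experiment with ordinary integer and fractional bases.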

Base 1

Base 1 has some interesting properties; each position is multiplied by a power of one, and every power of one is just one. If we only get one symbol, and declare it to be "0" (with the value zero), the only value you can represent is zero. We could cheat a little bit, and declare that the one symbol we get is "1" (with the value of one); this way "1" represents one, "11" represents two, and so on. It's basically like using scratch marks on a wall, but you don't get to group them. There's the problem of being able to represent zero, but by cheating further we could declare that an empty area (with no marks) represents zero. Alternatively, you could say that you can't represent zero with this number system; instead you have to represent zero through arithmetic (e.g., "1-1"). Note that the decimal point is irrelevant; "11", "1.1", and ".11" all have the same value (two) in base 1. You can still use fractions to represent numbers other than whole numbers: "1/11" is one half, "11/111" is two-thirds, and so on.

You could even cheat further by using two symbols ("0" and "1" standing for zero and one), but this kind of cheating wouldn't help you. Since one to any power is still only one, a zero used as a placeholder doesn't move the other digits to a more valuable position. For example, "10", "100", and "1000" would all have the same value (one).
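
Since every position in base 1 is worth exactly one, evaluating a base-1 numeral boils down to adding up its digits. The following Python sketch (the name `base1_value` is made up for this illustration) shows that neither the decimal point nor placeholder zeros change anything.

```python
def base1_value(numeral):
    """Value of a base-1 numeral. Every position is worth 1**k == 1,
    so the value is just the sum of the digits; the point is ignored."""
    return sum(int(ch) for ch in numeral if ch != ".")

# "11", "1.1" and ".11" are all two; "10", "100", "1000" are all one;
# the empty numeral (no marks at all) comes out as zero.
for numeral in ["11", "1.1", ".11", "10", "100", "1000", ""]:
    print(repr(numeral), "->", base1_value(numeral))
```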

xkcd has a cartoon about the marvelous powers of 1, which will only make sense if you've seen Charles and Ray Eames' "Powers of Ten" documentary (spoofed by the Simpsons).

Base pi and e

If we use base pi and we can use integer digits up to (but not including) the base, counting starts off easily enough: 0, 1, 2, 3. However, the value of four is tricky, because "10" in base pi is the value pi. Since pi is transcendental (not merely irrational), no finite string of integer digits can add up to exactly four, so the value "four" will require an infinite number of digits to represent accurately. Base e does the same sort of thing.
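
To see what "four" looks like in base pi, one option is a greedy expansion: at each position, take the largest digit that still fits. The Python sketch below is only an approximation under stated assumptions: the function name `greedy_digits` is hypothetical, and it uses floating point, so only the first several digits can be trusted.

```python
import math

def greedy_digits(value, base, frac_places=8):
    """Greedy expansion of a non-negative value in a real base > 1.
    Returns (whole_digits, frac_digits) as lists of ints.
    Floating point is used, so later digits may be inaccurate."""
    # Find the highest power of the base that still fits into the value.
    power = 0
    while base ** (power + 1) <= value:
        power += 1
    whole, frac = [], []
    for k in range(power, -frac_places - 1, -1):
        place = base ** k
        digit = int(value // place)   # largest digit that fits here
        value -= digit * place
        (whole if k >= 0 else frac).append(digit)
    return whole, frac

whole, frac = greedy_digits(4, math.pi)
print("".join(map(str, whole)) + "." + "".join(map(str, frac)))
```

Run as written, the expansion of four begins 10.22012 and never terminates; asking for more places just produces more digits.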

Base i and multiples of it

Using the square root of -1 (traditionally called "i" or "j") as a base has its own oddities. First, there's a symbol choice to be made: does a "1" represent a one or an i? Let's assume for the moment it means the traditional value of one (we'll revisit this assumption in a moment). Using just i is a lot like base 1, but worse. If we use our "cheating" trick, there are still few values we can represent:

"1" has the value of 1
"11" has the value of i^1+(1), which is i+1.
"111" has the value of i^2+(i+1), which is i.
"1111" has the value of i^3+(i), which is 0.
"11111" has the value of i^4+(0), which is 1.
"111111" has the value of i^5+(1), which is i+1.

As you can tell, it's cyclic; only four different values (1, i+1, i, and 0) can be represented in base i as a single number. You can go further with arithmetic, e.g., four would be written as "1+1+1+1", but certainly the base isn't helping.
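
The cycle is easy to check by machine. This is a small Python sketch (the name `repunit_value` is invented here) that evaluates a string of ones in base i using Python's built-in complex numbers, where i is written `1j`. The values involved are small integers, so the floating-point arithmetic is exact.

```python
def repunit_value(length, base=1j):
    """Value of a numeral made of `length` ones in the given base,
    computed left to right: value = value * base + digit."""
    value = 0
    for _ in range(length):
        value = value * base + 1
    return value

for n in range(1, 9):
    print("1" * n, "->", repunit_value(n))

# No matter how long the numeral gets, only four values ever show up.
distinct = {repunit_value(n) for n in range(1, 41)}
print(len(distinct))   # 4
```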

Do things get better if the lone symbol represents i? To keep things clear, let's use the symbol "i". It turns out that this doesn't help:

"i" has the value of i
"ii" has the value of i*i+(i), which is i-1.
"iii" has the value of i*i^2+(i-1), which is -1.
"iiii" has the value of i*i^3+(-1), which is 0.
"iiiii" has the value of i*i^4+(0), which is i.
"iiiiii" has the value of i*i^5+(i), which is i-1.

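One way to see why changing the symbol's meaning cannot help: if every digit is worth i instead of one, the whole numeral is simply i times the value it had before, so the same four-step cycle comes back, just rotated. A short Python sketch (the helper name `repdigit_value` is hypothetical) checks this for the first few numerals.

```python
def repdigit_value(length, digit, base=1j):
    """Value of a numeral that repeats one digit value `length` times."""
    value = 0
    for _ in range(length):
        value = value * base + digit
    return value

for n in range(1, 7):
    ones = repdigit_value(n, 1)    # numeral made of "1" symbols
    eyes = repdigit_value(n, 1j)   # numeral made of "i" symbols
    print("i" * n, "->", eyes)
    assert eyes == 1j * ones       # always exactly i times the "1" version
```
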
One approach to solving this problem is to use the other cheating approach we mentioned in the discussion about base 1. Basically, let's permit two symbols ("0" and "1", with their traditional meaning). This helps quite a bit; now we can count one, two, three as "1", "10001", "100010001" (using the system where "1" means one). Now at least we can represent all whole numbers - though it's not pleasant. One odd thing about this approach is that there are now many ways to represent a number - "10001" and "100000001" both represent the value two. You can even represent a few complex numbers quite easily - traditional "2+i" becomes "10011".
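
The counting scheme works because i^4 is one, so a "1" placed every fourth position contributes exactly one. Here is a minimal Python sketch of that idea; both function names are made up for this illustration, and the numeral it builds is only one of the many possible spellings of each number.

```python
def value_in_base_i(numeral):
    """Evaluate a string of 0/1 digits in base i (1j in Python)."""
    value = 0
    for ch in numeral:
        value = value * 1j + int(ch)
    return value

def whole_number_in_base_i(n):
    """One (of many) base-i numerals for a positive whole number n:
    n ones, each four positions apart, because i**4 == 1."""
    return "000".join("1" * n)

for n in range(1, 5):
    numeral = whole_number_in_base_i(n)
    print(n, "->", numeral, "->", value_in_base_i(numeral))

print(value_in_base_i("100000001"))   # another spelling of two
print(value_in_base_i("10011"))       # 2+i, which Python prints as (2+1j)
```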

Another solution, which avoids this kind of cheating, is to use a larger absolute value, but still a complex number, for the base. We can, for example, choose 10i as a base. Doing this has truly baroque impacts that are hard to characterize, and you can do even more interesting things by writing "complex" numbers and adding them with "complex" numbers multiplied by i. Multiplying such numbers in particular is bizarre. You could use symbols such as "1", "2", as representing their traditional value, or use them to represent 1i, 2i, and so on; either way they're bizarre.
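
For a taste of how strange base 10i is, here is a short Python sketch that evaluates ordinary-looking digit strings in that base. It assumes the digits "0" through "9" keep their traditional values, and the function name `value_in_base_10i` is invented for this example.

```python
def value_in_base_10i(numeral):
    """Evaluate a string of decimal digits in base 10i (10j in Python)."""
    value = 0
    for ch in numeral:
        value = value * 10j + int(ch)
    return value

# The place values run 1, 10i, -100, -1000i, 10000, ... so the digits
# alternate between real and imaginary parts, and the signs flip as well.
for numeral in ["1", "10", "100", "1000", "10000", "123"]:
    print(numeral, "->", value_in_base_10i(numeral))
```

For instance, "123" works out to 1*(10i)^2 + 2*(10i) + 3, which is -97+20i.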

Base 0 (zero)

Here we come to the truly worthless base. In theory we can have no symbols, but let's stretch and claim we can use one symbol (0). Unfortunately, without the decimal point, we can only represent the value of zero, since "0", "00", "000" and so on all evaluate to zero. Adding the decimal point makes things worse: each position to the right of the point is multiplied by a negative power of zero, so we now must evaluate 0/0. I suppose you could argue that base 0 can "represent" all numbers as 0/0, but since it can't distinguish between them it's not exactly an advantage.
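
A quick Python sketch shows where base 0 breaks down. The helper name `place_value_base0` is hypothetical; note that Python follows the common convention that 0**0 is 1, which is harmless here because the only digit available is zero anyway.

```python
def place_value_base0(position):
    """Return 0**position, or None where it is undefined because it
    would require dividing by zero."""
    try:
        return 0 ** position
    except ZeroDivisionError:
        return None

# Positions 2, 1, 0 are left of the point; -1 and -2 are right of it.
for position in [2, 1, 0, -1, -2]:
    print(position, "->", place_value_base0(position))
```

Every position to the left of the point is worth zero, and every position to the right of it has no value at all, since Python refuses to raise zero to a negative power.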

Fractional Bases

You can have fractional bases, but those are actually studied in mathematics. High school student Billy Dorminy has even developed an encryption algorithm using fractional bases, in his science project titled "Improper Fractional Base Encryption". Thus, fractional bases are too useful to be considered further in this paper :-).

Other Related Works

Other people have also thought about number bases. You could see Trinary for information about base 3 (see also this article in American Scientist, Nov-Dec 2001). There's a special form of base 3 called "balanced ternary", which uses symbols for 0, +1, and -1. The "Logical Alternative to the Existing Positional Number System" by Robert R. Forslund discusses using the digits 1 to b instead of 0 to (b-1) for a given base b. The negabinary system is base -2; it appears that this system was used by the experimental Polish computers SKRZAT 1 and BINEG in the 1950s. In negabinary, negative and positive numbers can be represented without a sign bit, but arithmetic operations are more complicated.
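
As a small illustration of the negabinary idea, here is a Python sketch that converts integers to and from base -2 using only the digits 0 and 1. The function names are made up for this example, and this is one straightforward way to do the conversion, not a description of how those computers did it.

```python
def to_negabinary(n):
    """Convert any integer (positive, negative, or zero) to a base -2
    digit string using only the digits 0 and 1."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, remainder = divmod(n, -2)
        if remainder < 0:
            # Python's remainder takes the divisor's sign (0 or -1 here);
            # move it into the digit range 0..1 and adjust the quotient.
            remainder += 2
            n += 1
        digits.append(str(remainder))
    return "".join(reversed(digits))

def from_negabinary(numeral):
    """Evaluate a base -2 digit string."""
    value = 0
    for ch in numeral:
        value = value * -2 + int(ch)
    return value

for n in range(-4, 5):
    print(n, "->", to_negabinary(n))
```

No minus sign ever appears in the output: for example, -1 comes out as "11" (that is, -2 + 1) and 2 comes out as "110" (that is, 4 - 2).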

Donald Knuth's "The Art of Computer Programming", volume 2, contains Chapter 4, "Arithmetic" (Section 4.1 covers positional number systems); that has more information than perhaps you wanted to know about implementing arithmetic on computers. I've been told that "Number: From Ahmes to Cantor" by Midhat Gazale, ISBN 0-691-00515-X, Chapter 2 discusses positional number systems in great detail. Everything Gray Code (in gzipped PostScript format) discusses Gray code, a different way to use binary digits to represent numbers.

Henry S. Warren, Jr.'s "Hacker's Delight" chapter 12 discusses some unusual bases, including base -2 (with digits 0 or 1), bases -1+i and -1-i (again, digits 0 or 1), and hints at a few others such as base 2i with digits 0, 1, 2, and 3.

I am told that the Sora language of India has a varying base, e.g., the units are base 12, but the next higher place is base 20. I've since received an email from Richard Engelbrecht-Wiggans who casts some doubt on this; this is suspiciously similar to the pile of pence coins under the old British money system (12 pence to a shilling, 20 shillings to a pound), and perhaps the natives were playing with the linguists. Further independent investigation would be great on this point.

Slightly different - and way more useful - is the Radix 2^51 trick. This is a way to speed up computation on large integers on modern computers. A modern 64-bit computer can easily add two 64-bit numbers. If you wanted to add two 256-bit numbers, you could divide each into four 64-bit pieces, but each piece's addition needs the carry out of the piece below it, so the four additions have to be done in sequence, which makes it slow. By instead dividing each number into five pieces of about 51 bits, each stored in its own 64-bit word, the spare high bits of each word can absorb the carries; the five additions no longer depend on each other, and the carries are handled separately afterwards. The overall addition can then take less time due to parallelism. I don't know if this counts as a "different base" - but it's worth mentioning.
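
The idea can be sketched in Python, with the caveat that Python's integers are unbounded, so this only models the arithmetic and says nothing about speed. All names below are hypothetical. Since five pieces of 51 bits only cover 255 bits, this sketch lets the top piece hold the remaining 52 bits.

```python
LIMB_BITS = 51
MASK = (1 << LIMB_BITS) - 1
N_LIMBS = 5

def to_limbs(x):
    """Split a 256-bit integer into five pieces ("limbs"). The lower
    four hold 51 bits each; the top one keeps whatever is left."""
    limbs = [(x >> (LIMB_BITS * i)) & MASK for i in range(N_LIMBS - 1)]
    limbs.append(x >> (LIMB_BITS * (N_LIMBS - 1)))
    return limbs

def add_limbs(a, b):
    """Add limb by limb with no carry handling. Each addition is
    independent of the others, which is what allows parallelism."""
    return [x + y for x, y in zip(a, b)]

def normalize(limbs):
    """Propagate the deferred carries so the lower limbs fit in 51 bits."""
    out, carry = [], 0
    for limb in limbs[:-1]:
        limb += carry
        out.append(limb & MASK)
        carry = limb >> LIMB_BITS
    out.append(limbs[-1] + carry)
    return out

def from_limbs(limbs):
    """Recombine limbs into a single integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

a = (1 << 256) - 1
b = 0x123456789ABCDEF0123456789ABCDEF
raw = add_limbs(to_limbs(a), to_limbs(b))
assert all(limb < (1 << 64) for limb in raw)   # still fits in 64-bit words
assert from_limbs(normalize(raw)) == a + b
```

In a real implementation the normalization step would be run only occasionally, after several additions have accumulated in the spare bits.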

Conclusions

I haven't covered negative bases; perhaps I'll add those later.

In short, there's a reason you never saw these before! Hopefully, you found a little fun in this romp through useless bases.

If you enjoyed this article, you might enjoy my article on the Four fours problem or my home page at dwheeler.com.

David A. Wheeler, 2000-09-22; revised 2012-09-08