I want to write books that teach kids how computers work at their most fundamental level.
Perhaps the most fundamental concept in computing is how integers are represented and used as a code for the processor. Before somebody can understand that they need to be familiar with the concept of numerical place value in binary. Generally K-12 curricula don’t convey enough about why binary (and hexadecimal) are so useful and fundamental to our digital world. That’s why I have a plan to improve how kids are educated about numbers.
Pretend you’re a young child who doesn’t yet understand much about numbers. Now imagine you’re sitting at a desk. A rectangular area on the desk’s surface in front of you and you see a pile of blocks outside the taped area. You also notice some speakers and a monitor which prominently displays 0000.

You move several of the smallest blocks into the taped area. Suddenly the monitor changes to 0004 and you hear the speakers say “four”.
Surprised by the change you decide to move the blocks back out of the taped area.
“Wow, it’s back to 0000 again! And it said ‘zero’.”
You add the blocks you just removed. In response the monitor goes back to 0004 and you again hear “four”.
You repeat the process a few times and decide the monitor and speakers reflect something about the blocks inside the taped area. But what exactly?
The taped area has no blocks and 0000 illuminates from the monitor. Unsure what to think you decide to move small blocks into the taped area one at a time.
0001(one)0002(two)0003(three)0004(four)
Your memory stirs. “Oh, that’s the one I saw before.”
0005(five)
You recall your parents saying the same words as they touch each of your fingers.
0006(six)
“That symbol changes every time I move a block. Will it change if I take away some blocks?”
0005(five)
“Five, yeah, I just saw that one!”
0004(four)
“I remember four too.”
After more experimentation it would be clear that these symbols and sounds map to the real world concept of quantity. Even better, the relationship between the three is continuously reinforced.
Now suppose all the blocks are taken away and replaced with rods.

0040(forty)0000(zero)0040(forty)0000(zero)0010(ten)0020(twenty)0030(thirty)0040(forty)0050(fifty)0060(sixty)0050(fifty)0040(forty)
Doing the same thing with the rods results in different sounds, but the symbol changes are remarkably similar. In fact, they’re the same changes except that the position of the symbol that changes is different.
You’re given a rod’s length of unitary blocks back. Soon you realize that either adding all those unitary blocks or adding a single rod results in the same feedback from the monitor and speakers. From what you can tell this is true regardless of how many rods are already in the taped area. Eventually you conclude that a rod and ten unitary blocks represent the same quantity.
Place value in the base 10 numbers would become intuitive with enough experimentation. This system is flexible enough to work with other bases, including binary.
Going from that point to explaining how a 32-bit ‘word’ gets processed by a CPU isn’t that hard.
Program an implementation of vectors for all dimensions with the following in mind.
Let’s give this a try in Python. To account for the arbitrary dimension requirement, we’ll have the constructor take a single parameter called components. The plan is to pass an array of numbers such as [-3.0, 8.24, 2] or [42].
class Vector:
def __init__(self, components):
self.components = components
def __add__(self, v):
pairs = zip(self.components, v.components)
return [ sum(axis_pair) for axis_pair in pairs ]
In case you’re not familiar with zip it helps you easily match up elements from different arrays at the same index. It’s useful in situations like this where we’re trying to sum vector elements component-wise.
>>> zip([1, 2, 3], [4, 5, 6]) [(1, 4), (2, 5), (3, 6)]
Let’s tinker with this implementation a bit.
>>> a = Vector([-3.0, 8.24, 2]) >>> b = Vector([10, 9, 8]) >>> a + b [7.0, 17.240000000000002, 10]
Not bad. What about this?
>>> c = Vector([3.14159]) >>> a + c [0.14158999999999988]
Gasp! The implementation accepts adding vectors from different dimensions. That’s because the __add__ implementation relies on zip, which truncates all the input arrays to the length of the shortest one. Guess we better validate the input to __add__.
def __add__(self, v):
if len(self.components) != len(v.components):
raise ValueError
pairs = zip(self.components, v.components)
return [ sum(axis_pair) for axis_pair in pairs ]
Now there’s an error when we try adding vectors with different dimensions.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "vector.py", line 7, in __add__
raise ValueError
We’re not getting wrong output anymore! The downside is we’ll have to depend on the programmers to check the dimensions themselves or handle the exception properly. Aside from that, we also have to raise an error on every vector operation where there’s a dimension constraint violation. Bugs like to creep in when there are maintenance constraints like this.
The core problem with the Python implementation above is all Vectors, regardless of their dimension, have the same type. We could have tried getting around that by making distinct Vector classes for several dimensions, but we’d probably end up with either too many or too few supported dimensions. It wouldn’t be fun maintaining those all those Vector implementations. Plus, using an array in the constructor won’t work unless the length of the array is validated against the dimension associated with the class. How could we dynamically build up constructors for each dimension vector type to accept the right number of arguments? I can imagine potential solutions in some dynamically typed languages with certain metaprogramming constructs. Maybe I’ll go into that further in a later post. Let’s move on for now.
D deserves a brief introduction because it’s not quite as popular as Python. It’s a systems language resembling C++ and, at least syntactically, modern VM-based incarnations like Java and C#. The preprocessing engine that executes before compilation supports constructs far more expressive than #define or #ifdef. As a result it’s possible to do some interesting metaprogramming without losing the benefit of compile-time static checking or relying on potentially slower runtime operations.
Before we continue, here’s a quick note of attribution: Some of the following code was taken from the phoboslinalgebra project. Many thanks to Gareth Charnock for sharing his code on GitHub. Alright, let’s dig in, shall we?
A TypeTuple is a specific-length list where each entry corresponds to a particular type. For example, TypeTuple!(int, double, string) describes the type for lists containing three elements where the first is an int, the second is a double, and the third is a string.
An n-dimensional vector is represented by nothing more than a list of n values where each is an element of the vector space’s field. To represent an n-dimensional vector in D we could really use a TypeTuple!(T, T, ..., T) where T is the field type and there are n of those Ts. How can we tell D we want the type for 3D vectors of floats, 2D vectors of ints, or 1000D vectors of mySuperSpiffyNumericType? The answer is to recursively build a type definition for our TypeTuple!(T, T, ..., T).
import std.typetuple;
template TypeNuple(T, size_t n) {
static if (n == 0) {
alias TypeTuple!() TypeNuple;
}
else {
alias TypeTuple!(T, TypeNuple!(T, n-1)) TypeNuple;
}
}
The TypeNuple template uses the static if to dynamically generate code during the early stages of compilation. Notice that the implementation recursively builds up a type tuple of Ts until it has n elements. We’re using recursion to dynamically generate the types we need at compile time!
Now let’s create a struct that utilizes our recursively defined list of types.
struct Vector(Field, uint D) {
alias TypeNuple!(Field,D) ArgList;
Field[D] components;
this(ArgList argList) {
foreach (i, arg; argList) {
components[i] = argList[i];
}
}
}
The alias call makes ArgList a tuple of D elements, each of which has type Field. Suppose we’re making a 3D vector of reals. In this context, ArgList has type (real, real, real) and an example instance could be (3.14159, 2.71828, 0). Notice how the constructor iteratively initializes the components attribute array with the values from the provided argList.
Now we’ll overload the binary operators + and - to take two Vector!(Field,D) values and produce a new Vector!(Field,D). For convenience we’ll alias that type as VecType. The following code goes inside the struct defined above.
alias Vector!(Field,D) VecType;
VecType opBinary(string op)(VecType vec) {
if (op == "+" || op == "-") {
VecType result;
for (int i = 0; i < D; i++) {
mixin("result.components[" ~ i.stringof ~ "] ="
"this.components[" ~ i.stringof ~ "]" ~ op ~
"vec.components[" ~ i.stringof ~ "];");
}
return result;
}
}
The mixin construct makes it easy to assign each element in the resulting vector to the sum of the corresponding elements in the two other vectors.
Let’s make a unit test that adds a 3D real vector to itself and checks the result:
unittest {
alias Vector!(real,3U) VecR3;
VecR3 v = VecR3(3.14159, 2.71828, 0);
VecR3 v_doubled = v + v;
assert(v_doubled.components[0] == 3.14159 + 3.14159);
assert(v_doubled.components[1] == 2.71828 + 2.71828);
assert(v_doubled.components[2] == 0 + 0);
}
When we run the unit test there’s no output. That tells us it’s working!
$ rdmd -unittest --main vector.d $
If you’re skeptical that no news is good news, try changing the asserts so they come out to false. rdmd will cry tears of garbage.
What happens when we add two vectors with different field types or dimensions? You’ll get an error at compile time. Yeah, that’s right—kill those bugs before they have a fighting chance.
vector.d(56): Error: incompatible types for ((r2) + (r3)): 'Vector!(real,2u)' and 'Vector!(real,3u)'
I imagine there are other languages that give you the same expressiveness and safety, but haven’t tested the theory. What other languages can generate recursively defined types for you at compile time? Can they also easily support useful binary operations on those types to produce a new value of the same type?