Sensitive Circuit

Text

Type-safe mathematical vectors

The Challenge

Program an implementation of vectors for all dimensions with the following in mind.

  1. Binary operations on vectors like addition don’t make sense when the vectors have different dimensions.
  2. People using your implementation shouldn’t have to write boilerplate code to accommodate how you choose to handle dimension mismatches.
  3. Rather than failing silently by handling input incorrectly, the implementation should alert the programmer as early as possible about their mistake.

Case Study: Python

Let’s give this a try in Python. To account for the arbitrary dimension requirement, we’ll have the constructor take a single parameter called components. The plan is to pass an array of numbers such as [-3.0, 8.24, 2] or [42].

class Vector:
    def __init__(self, components):
        self.components = components

    def __add__(self, v):
        pairs = zip(self.components, v.components)
        return [ sum(axis_pair) for axis_pair in pairs ]

In case you’re not familiar with zip it helps you easily match up elements from different arrays at the same index. It’s useful in situations like this where we’re trying to sum vector elements component-wise.

>>> zip([1, 2, 3], [4, 5, 6])
[(1, 4), (2, 5), (3, 6)]

Let’s tinker with this implementation a bit.

>>> a = Vector([-3.0, 8.24, 2])
>>> b = Vector([10, 9, 8])
>>> a + b
[7.0, 17.240000000000002, 10]

Not bad. What about this?

>>> c = Vector([3.14159])
>>> a + c
[0.14158999999999988]

Gasp! The implementation accepts adding vectors from different dimensions. That’s because the __add__ implementation relies on zip, which truncates all the input arrays to the length of the shortest one. Guess we better validate the input to __add__.

def __add__(self, v):
    if len(self.components) != len(v.components):
        raise ValueError
    pairs = zip(self.components, v.components)
    return [ sum(axis_pair) for axis_pair in pairs ]

Now there’s an error when we try adding vectors with different dimensions.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "vector.py", line 7, in __add__
    raise ValueError

We’re not getting wrong output anymore! The downside is we’ll have to depend on the programmers to check the dimensions themselves or handle the exception properly. Aside from that, we also have to raise an error on every vector operation where there’s a dimension constraint violation. Bugs like to creep in when there are maintenance constraints like this.

The core problem with the Python implementation above is all Vectors, regardless of their dimension, have the same type. We could have tried getting around that by making distinct Vector classes for several dimensions, but we’d probably end up with either too many or too few supported dimensions. It wouldn’t be fun maintaining those all those Vector implementations. Plus, using an array in the constructor won’t work unless the length of the array is validated against the dimension associated with the class. How could we dynamically build up constructors for each dimension vector type to accept the right number of arguments? I can imagine potential solutions in some dynamically typed languages with certain metaprogramming constructs. Maybe I’ll go into that further in a later post. Let’s move on for now.

Case Study: D

D deserves a brief introduction because it’s not quite as popular as Python. It’s a systems language resembling C++ and, at least syntactically, modern VM-based incarnations like Java and C#. The preprocessing engine that executes before compilation supports constructs far more expressive than #define or #ifdef. As a result it’s possible to do some interesting metaprogramming without losing the benefit of compile-time static checking or relying on potentially slower runtime operations.

Before we continue, here’s a quick note of attribution: Some of the following code was taken from the phoboslinalgebra project. Many thanks to Gareth Charnock for sharing his code on GitHub. Alright, let’s dig in, shall we?

A TypeTuple is a specific-length list where each entry corresponds to a particular type. For example, TypeTuple!(int, double, string) describes the type for lists containing three elements where the first is an int, the second is a double, and the third is a string.

An n-dimensional vector is represented by nothing more than a list of n values where each is an element of the vector space’s field. To represent an n-dimensional vector in D we could really use a TypeTuple!(T, T, ..., T) where T is the field type and there are n of those Ts. How can we tell D we want the type for 3D vectors of floats, 2D vectors of ints, or 1000D vectors of mySuperSpiffyNumericType? The answer is to recursively build a type definition for our TypeTuple!(T, T, ..., T).

import std.typetuple;

template TypeNuple(T, size_t n) {
  static if (n == 0) {
    alias TypeTuple!() TypeNuple;
  }
  else {
    alias TypeTuple!(T, TypeNuple!(T, n-1)) TypeNuple;
  }
}

The TypeNuple template uses the static if to dynamically generate code during the early stages of compilation. Notice that the implementation recursively builds up a type tuple of Ts until it has n elements. We’re using recursion to dynamically generate the types we need at compile time!

Now let’s create a struct that utilizes our recursively defined list of types.

struct Vector(Field, uint D) {
  alias TypeNuple!(Field,D) ArgList;

  Field[D] components;
  
  this(ArgList argList) {
    foreach (i, arg; argList) {
      components[i] = argList[i];
    }
  }
}

The alias call makes ArgList a tuple of D elements, each of which has type Field. Suppose we’re making a 3D vector of reals. In this context, ArgList has type (real, real, real) and an example instance could be (3.14159, 2.71828, 0). Notice how the constructor iteratively initializes the components attribute array with the values from the provided argList.

Now we’ll overload the binary operators + and - to take two Vector!(Field,D) values and produce a new Vector!(Field,D). For convenience we’ll alias that type as VecType. The following code goes inside the struct defined above.

  alias Vector!(Field,D) VecType;

  VecType opBinary(string op)(VecType vec) {
    if (op == "+" || op == "-") {
      VecType result;
      for (int i = 0; i < D; i++) {
        mixin("result.components[" ~ i.stringof ~ "] ="
              "this.components[" ~ i.stringof ~ "]" ~ op ~
              "vec.components[" ~ i.stringof ~ "];");
      }
      return result;
    }
  }

The mixin construct makes it easy to assign each element in the resulting vector to the sum of the corresponding elements in the two other vectors.

Let’s make a unit test that adds a 3D real vector to itself and checks the result:

unittest {
  alias Vector!(real,3U) VecR3;

  VecR3 v = VecR3(3.14159, 2.71828, 0);
  VecR3 v_doubled = v + v;

  assert(v_doubled.components[0] == 3.14159 + 3.14159);
  assert(v_doubled.components[1] == 2.71828 + 2.71828);
  assert(v_doubled.components[2] == 0 + 0);
}

When we run the unit test there’s no output. That tells us it’s working!

$ rdmd -unittest --main vector.d
$

If you’re skeptical that no news is good news, try changing the asserts so they come out to false. rdmd will cry tears of garbage.

What happens when we add two vectors with different field types or dimensions? You’ll get an error at compile time. Yeah, that’s right—kill those bugs before they have a fighting chance.

vector.d(56): Error: incompatible types for ((r2) + (r3)): 'Vector!(real,2u)' and 'Vector!(real,3u)'

I imagine there are other languages that give you the same expressiveness and safety, but haven’t tested the theory. What other languages can generate recursively defined types for you at compile time? Can they also easily support useful binary operations on those types to produce a new value of the same type?

View comments
Posted on Wednesday, June 29 2011. Tagged with: dprogrammingrecursive typestype safetymathlinear algebra
7
Notes
  1. ad38j liked this
  2. mekaj posted this

Ask me anything
Previous Next