Affine Structure in Ao
In designing models, we often end up stacking transforms on top of each other.
Considering a function that rotates a 2D shape:
(define (rotate shape angle)
(let ((ca (cos angle))
(sa (sin angle)))
(lambda (x y) (shape (+ (* ca x) (* sa y))
(+ (* (- sa) x) (* ca y))))))
Applying this transform creates a tree structure that remaps the X and Y coordinates:
Calling it a second time makes the tree twice as deep, as the top-level X'
and X'
values are plugged into the bottom of the same structure again.
Here's the tree produced for X''
, the twice-transformed X coordinate:
This is computationally inefficient: the tree could be flattened
out into a single multiple-and-add structure that rotates
by the angle a + b
.
In addition, chaining these operations makes our use of interval arithmetic less accurate.
Let's say that the bounds of our shape are given by $$ \left\{x, y \mid 0 \lt x \lt 1, 0 \lt y \lt 1\right\} $$
Rotating once by 45º brings the bounding box to $$ \left\{x, y \mid -\sqrt 2 / 2 \lt x \lt \sqrt 2 / 2, 0 \lt y \lt \sqrt 2 \right\} $$
Rotating again by 45º scales the bounding box by $\sqrt 2$ again:
After these two rotations, our calculated bounds are much larger than the actual bounds of the shape. If we collapse the two rotations into a single operation that rotates by 90º, the resulting bounds stay tight to the original shape.
These two motivations (smaller computation trees and tigher bound tracking) push for a smarter way to represent transforms. For rotations, there's an intuitive strategy for accumulating a series of operations: just sum the angles. Can we generalize further?
Rotations are an example of an affine transformation.
A rotation about the origin is the same as multiplying by the matrix $$ \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] $$
Applying a pair of rotations to a coordinate looks like the following: $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] $$
This structure hints at our plan: instead of grouping the operations as $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \times \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] $$ we can group them as $$ \left[ \begin{array}{cc} \cos b && \sin b \\ -\sin b && \cos b \end{array} \right] \left[ \begin{array}{cc} \cos a && \sin a \\ -\sin a && \cos a \end{array} \right] \times \left[ \begin{array}{c} x \\ y \end{array} \right] $$
Since all the terms in the matrix are constant, this calculation can be done in advance and applied as a single matrix to the target coordinates.
This gives us generality: rotation, translation, and scaling can all be represented as affine transforms with a 4x4 matrix.
In Ao, we introduce a new opcode, OP_AFFINE
, which is guaranteed to have
the following tree structure:
The initial coordinates X
, Y
, and Z
are constructed as OP_AFFINE
with appropriate coefficients (e.g. X
has a
as 1 and and all other constants
as 0).
Then, arithmetic obeys a simple set of rules:
- Adding or subtracting two
OP_AFFINE
trees is done coefficient-wise - Adding or subtracting an
OP_AFFINE
with a scalar accumulates in thed
term - Multiplication or division of an affine by a scalar applies to all coefficients
Each of these special-cased operations returns a new OP_AFFINE
tree, so we
avoid building deep trees when multiple transformations are applied.
If none of these rules apply, then we build up the arithmetic tree as usual.
So, what does this accomplish?
Consider the following recursive function, which builds up a stack of twisted cubes:
(define (stack i scale rot)
(let ((base (cube '(-1 -1 -1) '(1 1 1))))
(if (= i 0)
base
(union base
(rotate-z (move
(scale-xyz (stack (- i 1) scale rot) scale)
(list 0 0 (* 2 scale)))
rot)))))
Let's compare the tree depth with and without the affine representation:
Using the affine representation, the tree stays much smaller, which means
intervals will be tighter. The slope of the affine line is one level per
cube, due to the stacking of union
operations; the non-affine slope
is four levels per cube.