I was reading Complexity by M. Waldrop and I started thinking about cellular automaton. One thing led to another and I ended up building an instance of Conway’s Game of Life in the browser. I wrote it in Clojurescript and if features the ability to step forward and backwards through sequence, as well as detect end states with some basic cycle detection. It runs in near realtime which was a bit of a feat as Conway’s Game of Life is a rather CPU intensive application to run a realtime for a pattern of any real size. It’s still not quite as fast as a good GoL implementation should be, but I feel like this application nevertheless represents a bit of a case study of a performance intensive application written in Clojurescript.
Initial flailing
When I wrote the application initially, I took the naive route and represented the grid as a sequence of booleans, each representing a cell (true for alive, of course). A second variable was used to represent the width of the grid. One could find the index for cell [x y]
with the formula x+grid-width*y
. To update the grid, I looped over the sequence with the for macro and looked up the count for adjacent live cells for each cell. All in all, it was very simple:
(def ^:dynamic *world-width* 25)
(defn get-index [position]
"Given x and y coordinates, return the index in the grid array
that corresponds to those coordinates."
(let [{x :x y :y} position]
(+ x (* y *world-width*))))
(defn get-position [index]
"Converts a position into x and y coordinates"
{:x (mod index *world-width*)
:y (int (/ index *world-width*))})
(defn adjacent-life [index world]
"Returns a count of the live cells adjacent to the cell at index"
(let [xmax (dec *world-width*)
ymax (int (/ (dec (count world)) *world-width*))
{x :x y :y} (get-position index)]
(reduce +
(for [xo (range -1 2)
yo (range -1 2)
:let [i (get-index {:x (+ x xo) :y (+ y yo)})]
:when (and (not (= i index))
(<= 0 (+ x xo) xmax)
(<= 0 (+ y yo) ymax))]
(if (nth world i) 1 0)))))
(defn pass-generation [world]
"Returns a new iteration of the world according to the rules of
life."
(for [index (range (count world))]
(if (nth world index)
(case (adjacent-life index world)
(0 1 4 5 6 7 8) false
(2 3) true)
(= 3 (adjacent-life index world)))))
This approach turned out to be painfully slow. I actually thought that there was a bug in the code at first as Firefox would display the “slow script” box before the first iteration would display. I was not surprised that the naive approach would be too slow for my purposes; I had anticipated needing to optimize it, but I surprised at how slow it would actually be.
The first things I tried were some simple performance enhancements. I replaced the Clojure sequence with a Javascript array by calling into-array
at every iteration and that helped. I shrank the size of the grid and that helped as well. I eventually unrolled the loop in which I looked for adjacent live cells and that actually made the application almost bearable, but it was still very slow.
(defn adjacent-life [index world]
"Returns a count of the live cells adjacent to the cell at index.
(Unrolling the internal loop of this function speeds it up immensely.)"
(let [xmax (dec *world-width*)
ymax (dec *world-height*)
{x :x y :y} (get-position index)
decx (dec x)
decy (dec y)
incx (inc x)
incy (inc y)]
(+ (if (and (<= 0 decx xmax)
(<= 0 decy ymax)
(nth world (get-index {:x decx :y decy})))
1 0)
(if (and (<= 0 incx xmax)
(<= 0 incy ymax)
(nth world (get-index {:x incx :y incy})))
1 0)
(if (and (<= 0 decx xmax)
(<= 0 incy ymax)
(nth world (get-index {:x decx :y incy})))
1 0)
(if (and (<= 0 incx xmax)
(<= 0 decy ymax)
(nth world (get-index {:x incx :y decy})))
1 0)
(if (and (<= 0 decx xmax)
(nth world (get-index {:x decx :y y})))
1 0)
(if (and (<= 0 incx xmax)
(nth world (get-index {:x incx :y y})))
1 0)
(if (and (<= 0 decy ymax)
(nth world (get-index {:x x :y decy})))
1 0)
(if (and (<= 0 incy ymax)
(nth world (get-index {:x x :y incy})))
1 0))))
(defn pass-generation [world]
"Returns a new iteration of the world according to the rules of
life."
(into-array
(for [index (range (count world))]
(if (nth world index)
(case (adjacent-life index world)
(0 1 4 5 6 7 8) false
(2 3) true)
(= 3 (adjacent-life index world))))))
Searching around the web for some typical GoL algorithms revealed some interesting results. Some approaches, such as the Hashlife algorithm were not practical in Javascript as the memoization required would actually be too slow without some serious work. I did try a trick of incrementing/decrementing the adjacency counts when cells came to life or died rather than counting the number of adjacent live cells for each cell each iteration and that helped immensely, but the application was still too slow.
The final algorithm
The algorithm I eventually settled on was actually very simple: I maintained two sets, one was a list of live cell coordinates and the other was a list of active cell coordinates. The active cells were cells whose state might change during the next iteration. Usually this meant that they were adjacent to a cell whose states had during changed this iteration. I’d loop over the active cell list and query the live cell list to find the number of adjacent cells each time.
(def ^:dynamic *world-width* 50)
(def ^:dynamic *world-height* 40)
(defn adjacent-life [index life]
"Returns a count of the live cells adjacent to the cell at index"
(let [xmax (dec *world-width*)
ymax (dec *world-height*)
[x y] index
decx (dec x)
decy (dec y)
incx (inc x)
incy (inc y)]
(+ (if (contains? life [decx decy])
1 0)
(if (contains? life [incx incy])
1 0)
(if (contains? life [decx incy])
1 0)
(if (contains? life [incx decy])
1 0)
(if (contains? life [decx y])
1 0)
(if (contains? life [incx y])
1 0)
(if (contains? life [x decy])
1 0)
(if (contains? life [x incy])
1 0))))
(defn set-active [index active]
"Add the cells surrounding the cell at index to the set
of active cells."
(let [xmax (dec *world-width*)
ymax (dec *world-height*)
[x y] index
decx (dec x)
decy (dec y)
incx (inc x)
incy (inc y)]
(reduce conj! active
[[decx decy]
[incx incy]
[decx incy]
[incx decy]
[decx y]
[incx y]
[x decy]
[x incy]])))
(defn pass-generation [world-adj]
"Returns a new iteration of the world according to the rules of
life."
(let [world (first world-adj)
active (second world-adj)]
(loop [indexes active
nworld (transient (first world-adj))
nactv (transient #{})]
(if (not-empty indexes)
(let [index (first indexes)]
(let [adjacencies (adjacent-life index world)]
(if (contains? world index)
(if (or (= 2 adjacencies)
(= 3 adjacencies))
(recur (rest indexes) nworld nactv)
(recur (rest indexes)
(disj! nworld index)
(set-active index nactv)))
(if (= 3 adjacencies)
(recur (rest indexes)
(conj! nworld index)
(set-active index nactv))
(recur (rest indexes) nworld nactv)))))
(vector (persistent! nworld)
(persistent! nactv))))))
This approach had several advantages. The first was that the grid was theoretically infinite. There was no problem with edges interfering with the game or with a pattern interfering with itself in a toroidally wrapped grid. The second was that empty space and still lifes would passed over and skipped. This was actually very fast, still not fast enough, but getting there. The one remaining detail was to perform some optimizations by changing the datastructures used from Clojure ones to native Javascript ones which are much faster.
It’s possible to use a Javascript object as a set by using the parameter names as members and setting parameters to true or removing them altogether based on membership in the set being modeled. It turns out that this is faster than the Clojure script Hashset or Hashmap datatypes. Javascript objects do have the disadvantage that they are mutable, but I was already using transients for a speed improvement anyway. Making a copy of each object at each iteration and modifying the new versions in place was a lot faster than either using the immutable or transient Clojurescript datastructures. In one sense this shouldn’t be surprising as the Clojurescript structures are constructed out of the Javascript ones and implement a lot functionality on top of them. On the other hand, it’s rather disappointing to realize that you might have to break out of the comfort zone of pure Clojure when writing performance code in Clojurescript.
The ultimate performance tip
I don’t usually write a lot of client side code so this is the first time I’ve used a profiler on Javascript code, but that was the most important thing in getting this application as fast as it is. Firefox has a built in Javascript profiler which was actually amazingly useful. Most of the important performance improvements I implemented were because the profiler told me that the application was spending an inordinate amount of time doing something that should have been quick. So the first time I unrolled the inner loop, switching from Clojure sets to Javascript objects, and changing the cell representation from a pair to a single integer were all motivated by profiler information.
Summary
There are still some improvements that I’d like to make to this GoL implementation. The first is that there still is some significant slowdown, mainly in the graphical aspect. I’m using D3.js rather than canvas which is ultimately a mistake performance wise. The ability create the starting patterns is still inadequate. Only toggling individual cells is supported and the selector isn’t very easy to use. I’d like to create a library of common patterns. Also, I’d like to be able to read the pattern from html parameters. Maybe I’ll come back to this later and fix these things but I’m done for now.