At the end of April my niece Alexa came to visit from the UK. I love my niece, but I don’t get to see her a whole lot. One of the things I had hoped to do whilst she was here was build some toys on my Micro3D printer. We used TinkerPlay to build a really cool looking toy on my iPad and transferred it to my computer, where the estimated print time was 2 hours per piece. There were 30 pieces.

Now these pieces were small. You could easily fit 6 or 7 into a single print. This was made more obvious by the fact that the MakerBot has a file format just for that purpose. This led me on a two hour search on the web for a tool that would either convert the MakerBot format into a single STL file or combine multiple STL files into a single STL file. I found nothing.

Looking on various 3D printing forums, it seems that people who are good at this sort of thing think nothing of composing the files themselves using 3D Studio Max or Blender. However, getting good at 3D modeling turned out to be really hard, especially as the tools were really hard to use and required something that I was missing: talent.

So I was left with my only talent: coding.

My initial estimate for taking a set of STL files and collecting them into a single file was 1 day. I mean, how hard can it be? There must be readers for this common file format, and I spend all day taking lists of things and smooshing them together. This was going to be easy.

Fresh out of a Clojure gig, taking N data structures and combining them into one sounded like a perfect fit for Clojure. Also, I just love the language and wanted the chance to work with it on my own terms, applying what I had learned at Outpace to what I knew from my previous experience and seeing how it affected my designs and output.

Stats

The project ended up being split into two projects.

STL-CLJ

This is the meat of the project. You provide it with a directory filled with stl files that you would like to be grouped together and it returns a single stl file, reasonably well packed into a 2D space.

| Stat | Value | Note |
|---|---|---|
| Age | 47 days | 19 active days (40.43%) |
| Total Files | 27 | |
| Total Lines of Code | 10211 | 30104 added, 19893 removed |
| Total Commits | 94 | average 4.9 commits per active day, 2.0 per all days |
| Authors | 1 | average 94.0 commits per author |

10211 lines of code in 19 days!!

That’s 537 lines of code a day! At 10 lines a day for a 1x developer, that makes me a 54x developer. I knew Clojure was more productive than Java!!!

Wait, that doesn’t make any sense. There is no way I wrote 10211 lines of code. I used a lot of STL files in this project, and until the end they were mostly ASCII files that were many kilobytes in length.

| Extension | Files (%) | Lines (%) | Lines/file |
|---|---|---|---|
| clj | 8 (29.63%) | 499 (4.89%) | 62 |
| cljc | 5 (18.52%) | 159 (1.56%) | 31 |
| md | 1 (3.70%) | 61 (0.60%) | 61 |
| stl | 10 (37.04%) | 112 (1.10%) | 11 |
| yml | 1 (3.70%) | 4 (0.04%) | 4 |

OK, so that’s 558 lines of code in 19 days, or 30 lines per day. But that’s just the final figure. There was a great deal of churn as I played with a couple of solutions and I pulled out an entire library.

So let’s take a look at the stats for that library.

Packager

About two thirds of the way through the project I realized that I had no idea what I was doing when it came to laying out the pieces in 2D space. I had originally thought I could just distribute them across the width and then move down a row when I was done.

This turned out to be close to the truth, but far from accurate.

This led to a deep dive on packing algorithms, which led me to write packager, a naive 2D packing algorithm.
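To make the "across the width, then down a row" idea concrete, here is a minimal sketch of a naive shelf packer. It is not the code from packager: the function name `pack-shelves`, the `:w`/`:h` keys and the bed width are all assumptions made up for illustration.

```clojure
;; Naive shelf packing: place boxes left to right and start a new shelf
;; when the next box would overflow the bed width.
;; Hypothetical names throughout; this is not the packager library's API.
(defn pack-shelves
  "Returns the boxes, in order, with :x and :y offsets assigned."
  [bed-width boxes]
  (:placed
   (reduce
    (fn [{:keys [x y shelf-h placed]} {:keys [w h] :as box}]
      (if (and (pos? x) (> (+ x w) bed-width))
        ;; The box does not fit on the current shelf: move down by the
        ;; height of the tallest box on that shelf and restart at x = 0.
        {:x w
         :y (+ y shelf-h)
         :shelf-h h
         :placed (conj placed (assoc box :x 0 :y (+ y shelf-h)))}
        ;; The box fits (or the shelf is empty): place it at the cursor.
        {:x (+ x w)
         :y y
         :shelf-h (max shelf-h h)
         :placed (conj placed (assoc box :x x :y y))}))
    {:x 0 :y 0 :shelf-h 0 :placed []}
    boxes)))

(def layout
  (pack-shelves 10 [{:w 4 :h 3} {:w 5 :h 2} {:w 3 :h 1}]))
;; The first two boxes share a shelf; the third starts a new one at y = 3.
```

Even this toy version shows where the naive idea gets sloppy: one tall box sets the height of its whole shelf, so the order of the boxes matters a great deal.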

| Stat | Value | Note |
|---|---|---|
| Age | 6 days | 6 active days (100.00%) |
| Total Files | 16 | |
| Total Lines of Code | 523 | 839 added, 316 removed |
| Total Commits | 23 | average 3.8 commits per active day, 3.8 per all days |
| Authors | 1 | average 23.0 commits per author |

Now these stats feel closer to the truth… so my ego would like me to believe.

523 loc in 6 days puts me at 87 loc per day, or an 8.7x developer!

But fair is fair, let’s look at the same stats as stl-collector.

| Extension | Files (%) | Lines (%) | Lines/file |
|---|---|---|---|
| clj | 6 (37.50%) | 166 (31.74%) | 27 |
| cljc | 5 (31.25%) | 108 (20.65%) | 21 |
| md | 2 (12.50%) | 17 (3.25%) | 8 |
| yml | 1 (6.25%) | 2 (0.38%) | 2 |

Meh, OK, maybe that’s not so accurate. 4.5x, OK, that’s pretty respectable.

Now these projects overlapped, so I get 832 lines of code in 19 days or a 4.3x dev.

Test Coverage

Let’s look at test coverage:

| Name | Forms covered | Lines covered |
|---|---|---|
| stl-collector.core | 29.55 % | 45.45 % |
| stl-collector.flattener | 65.58 % | 100.00 % |
| stl-collector.reader | 64.45 % | 90.00 % |
| stl-collector.transforms | 61.12 % | 100.00 % |
| stl-collector.writer | 64.59 % | 100.00 % |
| packager.box | 66.93 % | 100.00 % |
| packager.container | 71.72 % | 100.00 % |
| packager.distributor | 64.53 % | 94.12 % |
| packager.shelf | 64.51 % | 100.00 % |

Wait, what does that even mean? How can I have 100% coverage and only 65% coverage in the same file?

So Clojure is terse, like really, really terse.

(assoc {} :id 5)

is equivalent to

Map<Object, Object> map = new HashMap<>();
map.put("id", 5);

Actually, that’s not true.

{:id 5}

is the equivalent, and that doesn’t even begin to describe the real differences in those statements. The Clojure map is both persistent and immutable. So, in the first example, I’m not actually modifying the empty map, so much as using it as the basis for the map I ultimately return. If I had assigned {} to a symbol and used it later it would still be {}, unlike map in the Java example, which now contains an id.
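A quick way to see this for yourself at the REPL is the sketch below. The names `empty-map` and `with-id` are made up for the illustration.

```clojure
;; `assoc` returns a new map; the value bound to empty-map is untouched.
(def empty-map {})
(def with-id (assoc empty-map :id 5))

(println empty-map) ; the original map, still empty
(println with-id)   ; the new map, carrying the :id
```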

But that is beside the point. What I’m hoping to establish is that you can get a lot done in a single line of Clojure, and that is before we factor in macros (code being treated as data before it is ultimately evaluated as code). Macros lead to all kinds of weirdness around figuring out how much of a line was actually tested, as that line might macroexpand out to many, many lines.
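As a small illustration of why forms and lines can disagree, here is what one of the simplest core macros does to a single line. The form being expanded is a made-up example, not code from the project.

```clojure
;; One short source form...
(def source-form '(when ready? (println "packing") (println "done")))

;; ...becomes nested `if` and `do` forms before it is evaluated.
(def expanded (macroexpand-1 source-form))

(println expanded)
;; prints (if ready? (do (println packing) (println done)))
```

If a test only ever runs that line with `ready?` false, the line counts as executed, but the forms inside the `do` never run. I assume that is roughly how a tool that counts forms can report a lower number than one that counts lines.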

But I’m pretty happy with the coverage. Hitting 100% of lines was usually enough for me to know that I had exercised all of the branches in my code which is really all I wanted.

So am I still a 6x developer?

It would be really convenient if lines of code was a decent metric for developer productivity.

My stats would have me at 4.3x. But I don’t believe for a second that I was 50% less productive in Clojure than Java.

Take a look at this:

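(defn group-pairs [xs] ; illustrative name
  ;; `into` appends each value to the vector already stored under its key
  (reduce (fn [d [k v]] (merge-with into d {k [v]})) {} xs))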

This function takes something that looks like:

[[:a 5] [:b 6] [:a 3]]

and transforms it into:

{:a [5 3]
 :b [6]}

How do you get close to that in Java? More to the point, how do you get close to that in Java and be thread safe?

10x doesn’t get close; I suspect it is closer to 100x. I used this logic a fair amount in the course of this exercise, but I only had to write it once.

You see, this code doesn’t care what the keys or the values really are. It will work for any sequence of pairs. So I could use it for boxes, shelves and containers alike, and I only had to write it once, not once for each type or combination of types.
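Here is a runnable sketch of that reuse. The function name `group-pairs` and the shelf data are made up for illustration, and this version uses `into` so that a repeated key extends the vector already stored under it.

```clojure
;; Hypothetical name; groups the values of a sequence of [key value] pairs.
(defn group-pairs [xs]
  (reduce (fn [d [k v]] (merge-with into d {k [v]})) {} xs))

;; Keywords to numbers...
(def by-keyword
  (group-pairs [[:a 5] [:b 6] [:a 3]]))

;; ...or strings to maps: the same function, no changes needed.
(def by-shelf
  (group-pairs [["shelf-1" {:w 2 :h 1}]
                ["shelf-1" {:w 1 :h 1}]
                ["shelf-2" {:w 3 :h 2}]]))
```

Nothing in the function mentions keywords, numbers, strings or maps, which is the whole point.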

It was this kind of reuse that really blew my mind during this project.

So was I more productive or not?

Unless I sat down and wrote this again in Java I’m never going to know for sure, but I strongly suspect that I was at least 3 times more productive in Clojure than Java.

What does that mean in practical terms?

Would I still be writing this code if I was doing it in Java?

I honestly don’t know. What I suspect is that my Java code would not have been as reliable, thread safe or flexible. I strongly suspect that I achieved an excellent solution in the same amount of time it would have taken me to achieve a merely working solution in Java, and that reaching the same level of excellence in Java would have taken at least 3x as long. But I would never have done that. A solution is all that is really required; its excellence beyond the original brief is just showboating.

What is interesting to me is that it is hard to write a crappy solution. The strong mathematical foundation of Lisp makes my code feel more like a formal verification than any code I’ve written in other languages. Once it works, you know it works for pretty much anything, and unit tests and schema are the blanket that keeps you warm as you sleep soundly at night.

This is in stark contrast to other languages in my arsenal. Python/Ruby/JavaScript may have allowed me to write a solution that was equivalent in crappiness to the hypothetical Java solution in less time (I doubt it for the reasons I expressed in the last post). But I would never have been able to achieve a solution that was as complete and performant. Who cares if the solution is thread safe when your language has a GIL?

It might be worth stating that I originally started this project in Python, hoping to take advantage of the numpy-stl library. But honestly, it was more hassle than it was worth, and the packing algorithms would have been a pain in the arse to reason about.

So was there anything about Clojure that makes it more productive?

Paredit

Perhaps the biggest boon when working with Clojure is paredit mode or equivalent. It’s the sort of feature that you don’t realize you’ve been missing until you go back to a language that can’t support it and you feel like you are coding in molasses.

Real REPL

I’ve used languages with REPLs in the past and never really understood what all the fuss was about. The two main features always seemed to be API discovery and maybe a bit of complicated debugging as I tried to figure out where in the global namespace my classes had been dumped. Both of these features were already integral to my workflow in Java via the IDE.

In Clojure you live in the REPL. It isn’t a library that you float into to ask a couple of questions; it is your development environment, a workshop, a lab. If you are working on server software you start your server in your REPL, and then send individual forms to the REPL until the server is doing what you would like. The same is true for client apps, where your REPL is either your runtime environment for a SeeSaw app, or inside the browser on a web client.

The REPL forms a tight relationship with your editor providing accurate code completion and refactoring tools.

Clojure’s REPL is the first development tool I’ve used that doesn’t leave me wishing for IntelliJ.

Leiningen

Leiningen is Clojure’s equivalent to Maven. It is also about as ubiquitous. It is slow to start (you get to pay Clojure’s start up cost twice), but it is incredibly reliable and significantly easier to use than Maven (which doesn’t say a whole lot). Its main purposes are dependency management and final packaging, as most builds occur inside the REPL (which is normally started by Leiningen). I have never felt like it got in my way.

Emacs

I am as productive from an ssh session as I am with a full-blown GUI IDE. The immediate consequence of that is that I can use tmux if I want to remote pair with a shared cursor, and I can still ssh in and get my own view of the code if it is on my pair’s machine.

This solution is so good, that even if I’m sitting next to the person I am pairing with it is still my preference vs two keyboards and mice on the same machine.

So far, I have only hit this level of productivity in Emacs with Clojure and other Lisps, but I’m hopeful that JavaScript, Python and maybe even Java (unlikely) will follow.

Lisp

When I started learning Lisp I wasn’t convinced at all. Now, I spend a significant amount of time trawling the internet looking for the reasons that Lisp didn’t win.

Why did Alan Kay create Smalltalk, when he was an avid fan of Lisp?

Why does JavaScript exist when Brendan Eich was brought in to put Scheme inside of Netscape?

Why did the JPL abandon Lisp despite a huge number of successful projects?

Why did Sony demand that Naughty Dog abandon the tool that made such a small team so successful?

With the rise of AI and data science why isn’t Lisp the language of choice once more?

The more I learn the more questions I have as Lisps appear to be freaking awesome.

So was it all roses?

No. During my time working on this my tweets ranged from the sublime to the ridiculous. I think I hit the nail on the head with

And boy does it like to kick you in the nuts.

Error reporting

I lost 2 hours to this:

(deftest "a test"
  (testing "some details"
    (is (= 1 1))))

The obvious mistake is that I passed a string to deftest when I should have passed a symbol. The stack trace and the REPL insisted that the problem was in my namespace declaration.
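For reference, this is the shape I should have typed. The test name `a-test` is just a placeholder.

```clojure
(require '[clojure.test :refer [deftest testing is]])

;; deftest takes a symbol that names the test;
;; the descriptive strings belong inside `testing`.
(deftest a-test
  (testing "some details"
    (is (= 1 1))))
```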

I solved these kinds of problems in my own code using Prismatic Schema for ‘typing’ (much more flexible than core.typed), and contracts (:pre and :post).
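As a rough sketch of the contract half of that, here is a function guarded by :pre and :post. The function `scale-box` and its `:w`/`:h` keys are hypothetical and not taken from the project; the Schema half is left out because it needs an external dependency.

```clojure
;; Hypothetical function; :pre and :post are checked on every call
;; (as long as assertions are enabled, which is the default).
(defn scale-box
  [box factor]
  {:pre  [(map? box) (number? (:w box)) (number? (:h box)) (pos? factor)]
   :post [(pos? (:w %)) (pos? (:h %))]}
  (-> box
      (update :w * factor)
      (update :h * factor)))

(def scaled (scale-box {:w 2 :h 3} 2))

;; A bad argument is rejected at the call site with an AssertionError,
;; rather than surfacing later as a confusing downstream error.
(def failure
  (try
    (scale-box {:w "2" :h 3} 2)
    (catch AssertionError _ :rejected)))
```

The error message names the condition that failed, which is a lot more helpful than a stack trace pointing at the wrong file.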

That didn’t entirely solve the problem as

Debugging

You can debug Clojure, but it is confusing for my tiny little brain. For a start, macro expansion means that your code really isn’t what you think it is, so the code you are stepping through is often surprising. Secondly, everything is an expression, and multiple expressions to a line is the norm. What do step over, step in and step out mean in these instances?

Cursive, a plugin for IntelliJ, comes the closest to getting it right, but I still prefer Emacs as it feels lighter and more complete.

Start up time

Ran 16 tests containing 33 assertions.
0 failures, 0 errors.

real	0m11.703s
user	0m41.000s
sys	0m1.704s

lein test is so slow that it is essentially unusable, especially compared to running the tests in the REPL.

Ran 16 tests containing 33 assertions.
0 failures, 0 errors.
"Elapsed time: 255.829821 msecs"

{:test 16, :pass 33, :fail 0, :error 0, :type :summary}

So what’s the problem?

REPL restarts also take 20-30 seconds, so if you do screw up your environment you have plenty of time to reflect on what you just did wrong… and I screw up a lot. I used to think it was just me, but having paired with a lot of Clojurists… it’s not just me.

It is this start up time that prevents me from using Clojure for command line utilities, or building mobile apps. I hope it gets fixed soon, but I suspect it is here for a while.

In the meantime, compared to stack traces and error reporting this is small potatoes.

Conclusions

So how do you measure productivity?

It’s a problem that has plagued our industry for decades as developer salaries have risen almost as quickly as bug counts.

It would be great if we could leverage lines of code as a metric. Life would be so much simpler. But it simply doesn’t make any sense at all.

So now we’re left with this really queasy situation where I am telling you that my solution is better than a hypothetical solution in other languages.

It has many features that point to ‘goodness’:

  • Immutable objects
  • Pure functions
  • Use of data structures over objects
  • Terse. The less code you write, the fewer bugs.

But I’m not leveraging those features for multithreading, as the problem isn’t especially parallel. What should be parallel? Reading from the single file system? Adding objects to the single container?

I could make assertions about the number of bugs, except I can’t. I honestly don’t have that much faith that this software is any more reliable than an equivalently tested Java application. And I did need those tests. They did catch bugs. There was nothing about the language that protected me from them. The classes of bugs that I needed the tests for were different; the majority of the problems were because Lisps are so flexible.

Passing the wrong value into a function can only be observed as a problem once the output is considered to be incorrect. Such mistakes aren’t caught as compilation failures. Even with the introduction of Prismatic Schema, you still need unit tests, if only to make sure that Prismatic Schema has checked that each function call obeys the schema.
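Here is a made-up example of the kind of mistake I mean. The helper `row-height` and the `:h` key are hypothetical, not taken from the project.

```clojure
;; Hypothetical helper: the height of a row is its tallest box.
(defn row-height [boxes]
  (apply max 0 (map #(get % :h 0) boxes)))

;; The second box says :height instead of :h. Nothing throws and
;; nothing fails to compile; the answer is simply wrong.
(def height
  (row-height [{:h 2} {:height 5} {:h 3}]))

(println height) ; prints 3, although the tallest box is 5
```

Only a test that checks the output, or a schema on the input, is going to notice that.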

But it also has features that are not great:

  • Slow start up time
  • Probably ‘slower to market’ than a Python / Ruby solution
  • Probably slower than a numpy solution as Java/Clojure is not great at matrix mathematics, although I have used a library which would allow users to plug in a fast matrix back-end (core.matrix).
  • Requires contributors to be familiar with functional programming at a minimum, and probably Clojure, which at best 1% of developers are familiar with.

So I’m left looking at a solution that I’m very happy with, that I feel I completed in a reasonable amount of time, that has shapes that make me feel like I did a good job and I have almost no way of proving it… except to release it to all of you lot and see why you hate it.