Optimize What?
Sunday, 25 November 2007
Posted by austin in: Apple, Ruby, Technology
There’s been a furore recently over an article by Ankur Kothari where he optimized CTGradient. As an exercise in optimizing code, it was fairly aggressive but otherwise pedestrian. As an article, it was inflammatory and insulting from the beginning:
CTGradient contains an incredible diversity of built-in gradients, gradient styles… For demonstration purposes, all these features are excellent. For production, this is a nightmare.
It gets no better at the end:
…The documentation already shows you how to draw gradients, yet the number of applications using CTGradient – the whole 1300 lines of it – is astonishing. Please: When you use other people’s code, don’t put it in without a thought. Go through it, understand it, and optimize it for your specific need. For the better performance and reduced RAM usage, computers will thank you.
Quite legitimately, some Mac developers spoke out about this. Hoare’s famous dictum, “Premature optimization is the root of all evil,” was pulled out early. It’s completely applicable here by any measure. This story would have ended here, and I wouldn’t be writing now, had it not been for a series of bile-filled, nonsensical articles posted at Rixstep, starting with one that called the Mac developers (such as Daniel Jalkut) who objected to this inflammatory article the “Landed Gentry of Mac Development™”.
Which Optimization?
I’ve been developing software for a long time. While I haven’t done any Objective C programming yet, there’s nothing in Ankur’s article which is unique to Objective C. When developing software, you make it work, make it right, make it fast. In that order. The goal is to ship your software, not have it languishing in interminable development. Mac developers aren’t the only ones who will use third-party source without looking for optimizations (or further optimizations, as the case may be); Unix and Windows developers do this, too. Any developer worth their salt does this, because the first goal is to make it work. At work, when we see performance and memory issues, we don’t start digging through third-party code. We don’t even start looking at our own code. We start looking at profiled performance data. Then, and only then, do we start to make something fast.
So, if the first goal of software development is make it work, what’s the first optimization you should do? You should optimize your developers’ time toward shipping the software. Window background gradients aren’t the type of thing that will sell more copies of an application, but their absence may prevent some sales because the application doesn’t look “tasty” enough. So, you start looking at how to provide gradients. You see that Core Graphics supports them, you start digging in the documentation, and you see that it’s going to cost you lots of extra time to learn the CG gradient support. And then you have to debug what you’ve written. No sensible unit testing here; this is visual inspection. If you can add a single file that reduces your time to implement a background gradient (which you don’t really care about yourself, but you know it’ll cost you sales not to have it), then you’re going to be better off entirely. Scott Stevenson of Theocacao stated it much better in 2006:
Ah. Now doesn’t that feel better? It’s not that the class does anything that is otherwise impossible, it’s just a lot cleaner because all of the goofy callbacks and whatnot are moved into their own code space. In other words, you have more free time to work on the actual application.
CTGradient is (almost) all about developer optimization. It also helps you deal with the second optimization: optimizing for correctness. It’s code that someone else has written and debugged. You know that other people who develop Mac apps are using it, and no one is screaming loudly about bugs in the code, so you feel pretty comfortable with it. So, you know that by using this drop-in library, you are not only getting this negative feature done, you’re getting it done right. At this point, you can forget about getting it done fast: no one has complained about gradient performance, because you haven’t shipped yet.
Make it work; make it right; make it fast. Optimize for developer time first (and this includes good design), and then worry about the rest when you need to.
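The “look at profiled performance data first” step can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the step names and timings below are invented, `measure` and `rank_by_share` are hypothetical helpers, and a real application would use a proper profiler rather than a hand-rolled timer.

```ruby
require 'benchmark'

# Time each named step once and return { name => seconds }.
# The steps are hypothetical stand-ins for parts of an application.
def measure(steps)
  steps.transform_values { |work| Benchmark.realtime { work.call } }
end

# Rank steps by their share of total time, largest first.
# Returns an array of [name, fraction_of_total] pairs.
def rank_by_share(timings)
  total = timings.values.sum
  return [] if total.zero?

  timings.map { |name, secs| [name, secs.to_f / total] }
         .sort_by { |_, share| -share }
end

# Invented numbers for illustration: here the gradient is not the hot spot,
# so it is not where optimization effort should go first.
sample = { load_document: 0.80, render_text: 0.15, draw_gradient: 0.05 }
rank_by_share(sample).each do |name, share|
  puts format('%-15s %5.1f%%', name, share * 100)
end
```

Only the step at the top of such a ranking is worth opening up; everything below it is a candidate for leaving alone.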
More Code Optimization = Better Developer?
Things really went off the rails with this discussion when Rixstep posted the “Landed Gentry” article. He’s posted further bilious articles attacking some of the developers involved in the discussion over CTGradient optimization, but I’m not considering those articles here, since they’re pure bile and add nothing meaningful to the discussion (that is, they’re still based on false optimizations that I’ll address in a moment). In the “Landed Gentry” article, Rixstep asks:
Who would you rather have engineer your software? Someone who’s as conscientious as Ankur Kothari? Or someone who squirms and attacks and insinuates and does the absolute utmost to avoid the actual question? You the user/paying customer can decide. Wander over to Ankur Kothari’s article on CTGradient and see who’s objecting. At least you’ll now know what the Landed Gentry of Mac Development™ think of you.
Rixstep presents a false dichotomy here. Ankur isn’t particularly conscientious in his article; there are specializations presented as optimizations, and some optimizations aren’t geared toward the most important parts of development anyway. Daniel Jalkut, on the other hand, is very conscientious toward his most important target: his customers’ time in using the program. I’d be very surprised if anyone was saying that MarsEdit is too slow because of gradient display.
Optimizing parts of code that don’t matter doesn’t make you a better developer. Optimizing the right parts of code at the right time does. If I were hiring Ankur and Daniel, I’d have to watch over Ankur more to ensure that he wasn’t working on stuff that doesn’t matter to the customer, like optimizing gradient code without actually knowing that it was a performance bottleneck. Ankur does a decent job of making sure that the resulting code meets the minimum required needs of the job (which, if you’re developing from scratch, is a good thing; after all, YAGNI). But he outright says that one should spend time going through third-party code that (probably) isn’t relevant to the primary mission of your software, and that therefore doesn’t help toward “make it work.”
In reality? They’d probably both work out as great developers. But Ankur cares no more about a user’s software experience than Daniel does, despite the insinuations of Rixstep. I do question Ankur’s judgement on the optimizations made and how they were reached, or at least how they were explained.
The Axis of Optimization
Let’s play along for a moment, though. Assume we have already determined that there’s a measurable performance problem and we don’t have any currently outstanding feature requests or other problems that are higher priority than this performance problem. Assume further that we’ve profiled our program and we’ve determined that yes, our source of program slowdowns is CTGradient.
There’s a legitimate question about whether Ankur’s effort was really optimization or specialization. When he says “Firstly, let’s remove the methods we know for sure we won’t need…”, the discerning reader should be asking why we don’t need those methods. Ankur doesn’t explain how he reached this conclusion. This means that the developer who is doing their job right won’t immediately think of eviscerating the entire CTGradient library, but rather of measuring the performance characteristics of the library.
Ankur made a mistake in his approach. Yes, simpler code is usually easier to maintain and debug, and will usually perform better. But simpler code does not necessarily mean better or more performant code. A bubble sort is simpler than quicksort; on anything but trivially small inputs, there’s no way that it’s faster. One may as well look at binary size for comparison. This isn’t to say that one shouldn’t strive for smaller binaries; the larger your binary, the more likely it is that you’ll cause something else the user uses to swap out to disk, which would be bad. Can CTGradient actively contribute to this? It’s not distributed as a Framework, so most projects compile it directly into their code.
CTGradient’s 1,172 lines of code in six source files (as determined by Sloccount) don’t even come close to adding meaningful binary code bloat. I compiled CTGradient from the latest available SVN checkout and Ankur’s “Lean Gradient” project. I compiled both as Universal using the default settings and compared the final binary size (from Contents/MacOS/) and the intermediate object file sizes. Since the projects are structured differently, it’s not a completely fair comparison. Ankur’s binary does one gradient and results in a binary size of 42,992 bytes; CTGradient does a lot of different types of gradients and results in a binary size of 78,776 bytes, a difference of a mere 35,784 bytes of Universal code. A lot of the binary size difference is overhead, though. CTGradient creates three object files per architecture: CTGradient.o (55,964 bytes), CTGradientView.o (14,160 bytes), and main.o (976 bytes), leaving an overhead slack of 7,676 bytes. optGrad just has main.o (10,912 bytes), leaving an overhead slack of 32,080 bytes. So, CTGradient.o isn’t going to add to your binary size overhead in any meaningful way.
What about memory use? Ankur posted two follow-up comments that suggested that CTGradient adds between three and ten megabytes of memory use to a program over and above his approach. That suggests that there might be room for a CTGradientLite, but making that would require extensive profiling beforehand to determine the parts that could be excised. And, of course, if all you need is a single gradient like Ankur did, and you have the time and mathematical knowledge to do what Ankur did, by all means do so. Ankur’s five minutes, though, might be five hours for you, so make sure that you don’t have something more important you should be working on.
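The binary-size comparison in this section reduces to a few subtractions, which can be checked with a short Ruby sketch. The byte counts are the ones reported above; the `slack` helper and the hash layout are just an illustrative way of arranging them, with “overhead slack” taken to mean the final binary size minus the summed object files.

```ruby
# Byte counts as reported in the comparison above.
ctgradient = { binary: 78_776, objects: [55_964, 14_160, 976] }
optgrad    = { binary: 42_992, objects: [10_912] }

# "Overhead slack": final binary size minus the summed object file sizes.
def slack(build)
  build[:binary] - build[:objects].sum
end

difference = ctgradient[:binary] - optgrad[:binary]

puts "binary difference: #{difference} bytes"          # 35784
puts "CTGradient slack:  #{slack(ctgradient)} bytes"   # 7676
puts "optGrad slack:     #{slack(optgrad)} bytes"      # 32080
```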
The Unwashed Messes
Ankur can perhaps be forgiven for the mistakes he made, as other posts by Rixstep suggest that he’s quite young. He hasn’t yet had to learn the personal interaction lessons that most of us have to learn, and that most of us do learn. There are a few who don’t, and Rixstep appears to be one of these developers who resisted learning how to behave toward other people throughout his career. He stepped into this discussion “defending” Ankur from the “Landed Gentry”, who were simply asking the same questions that any good software developer would. He’s gone further recently into personal attacks against two of the more vocal questioners. Reading further on Rixstep’s site, anything that is selling better than his software seems to end up being called the “Landed Gentry” of something. He thinks that he’s defeating dragons, but in reality he’s tilting at windmills just as usefully as Don Quixote did.
Rixstep is supposedly in the business of selling software. The content and tone of his posts have managed to do the opposite. I had recently considered purchasing some of his software; I know now that to do so would have been a mistake, because he doesn’t treat his peers well. It’s impossible for him to treat his customers well with that sort of attitude. It doesn’t take much to be generous in spirit to people, at least to start with. It doesn’t take much to realize that there are better ways to deal with conflict than the way it has been handled here.
There is a difference between taking a hard-line, hard-nosed approach to something and defending it on technical grounds, and actively attacking someone without basis. Rixstep’s attacks, starting with the “Landed Gentry” article, are entirely without merit. Ankur didn’t optimize CTGradient; he made a specialization. This is valid sometimes, but in reality something like window gradients isn’t central to your application’s purpose, and to spend more than an hour or two dealing with them would be wholly inappropriate. The good developer knows that it’s better to ship than to spend all your time optimizing someone else’s code or rewriting it yourself.
Where does Ruby Fit into This?
The astute reader will have noticed that I filed this as a Ruby article as well. While I haven’t mentioned Ruby until this point in the article, the parallels here are obvious. Rixstep and Ankur both say that the dozens of applications that use CTGradient are abusing their users because it uses too much code, too much space, too much memory, and it’s too slow. Aside from the memory use, the claims have either been exposed as false already or haven’t been supported with data. As Scott Stevenson’s quote suggests, CTGradient optimizes for developer time.
So does Ruby. I can get more written with Ruby than with any other language that I use. It won’t be the fastest software, and it might be a little more memory intensive, but I will get it written (and written correctly) faster. And that, given that shipping is the goal, is important. When Rixstep rejects CTGradient without providing a more optimized alternative with equivalent functionality, he rejects any developer-side optimization, including the use of more expressive languages, in favour of hand coding everything.
I reject that notion. And so should you.




Comments
There is no strong causal link between the size of the code and memory consumption. Write a tiny program which keeps mallocing to prove this, if you will. In any case the runtime linker loads in the necessary methods in 4K pages; it doesn’t load in the entire object unless needed. (There is a mechanism at link time, not well advertised by Apple, to create a link “order file” which places the code on disk, and in pages, in the order it is called. This requires some empirical research beforehand to work out the optimal order.) Linkers also strip out “dead code”, though I don’t think this will work for factory methods, one of the bugbears of the CTGradient haters.
Of course this 4K is insignificant: what makes a program big in RAM is data, not code. The real memory cost of CTGradient is in the Apple drawing routines, which use video memory to draw on the screen. The bigger the gradient window, the bigger this cost. So rather than optimizing the CTGradient class, the lad could have made more impact on runtime memory by using less gradient. That is far more significant than any memory increase caused by the use of a large class, most of which will not be loaded in by the loader. Even if the rest of a loaded 4K page were redundant, the real memory cost is less than 4K, because some of that code *was* needed after all.
We are not coding for calculators anymore.
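The commenter’s page arithmetic can be sketched roughly as follows. This is a back-of-the-envelope illustration, not a measurement: it assumes the 4K page size mentioned in the comment, takes the CTGradient.o size from the article as a worst case in which every page of the object file is resident, and compares it against the lower end of the three to ten megabytes claimed in the article. The `resident_bytes` helper is hypothetical.

```ruby
# Round a byte count up to whole pages (4 KiB assumed, per the comment above).
def resident_bytes(code_bytes, page_size = 4096)
  pages = (code_bytes + page_size - 1) / page_size
  pages * page_size
end

object_size = 55_964            # CTGradient.o, as reported in the article
claimed_low = 3 * 1024 * 1024   # lower end of the claimed 3 to 10 MB

# Worst case: assume every page of the object file ends up resident.
worst_case = resident_bytes(object_size)

puts "worst-case resident code: #{worst_case} bytes"
puts format('share of 3 MB: %.1f%%', 100.0 * worst_case / claimed_low)
```

Even under that pessimistic assumption, the code itself accounts for only a small fraction of the claimed memory difference, which is consistent with the comment’s point that data, not code, dominates.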