Performance up front?

In a new post, Uli tells us that performance should be built into software design. This appears to be a noble goal; but at second glance the problem is more complex than it seems.

There is in fact a class of conceptual performance problems that are directly related to the software design. The structure of an XML file, for example, makes it impossible to quickly select a specific data set: every lookup means reading the document from the front. So as long as you store your huge database in plain XML, you won’t be very happy.
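
To make the cost concrete, here is a minimal C++ sketch (not real XML handling – the record type and data are made up): a flat, sequential format forces a front-to-back scan on every lookup, while an index built over the same data answers in roughly constant time.

    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    struct Record { std::string id; std::string payload; };

    // A flat, sequential format (plain XML, a log file) can only be
    // searched front to back: O(n) per lookup.
    const Record* find_sequential(const std::vector<Record>& rows,
                                  const std::string& id) {
        for (const Record& r : rows)
            if (r.id == id) return &r;
        return nullptr;
    }

    // An index built once answers each lookup in (amortised) constant time.
    const Record* find_indexed(
            const std::unordered_map<std::string, const Record*>& index,
            const std::string& id) {
        auto it = index.find(id);
        return it == index.end() ? nullptr : it->second;
    }

    int main() {
        std::vector<Record> rows = { {"a", "first"}, {"b", "second"} };
        std::unordered_map<std::string, const Record*> index;
        for (const Record& r : rows) index[r.id] = &r;

        if (const Record* r = find_indexed(index, "b"))
            std::cout << r->payload << "\n";   // prints "second"
    }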

One would agree that this kind of problem should be addressed during design. But performance issues are not always that clear-cut and easy to spot.

They come in two flavours: complexity issues and expensive operations. Complexity is what we learnt in computer science class: if, for example, you have to look at each element of a list in order to add a new one, you’re in trouble. (Joel has a nice explanation of how this kind of problem can bring down your performance.) These problems are usually inherent in the design of the algorithm (or in the design of the algorithms your algorithm uses), and this is the place where they should be addressed.
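
Here is a minimal C++ sketch of the effect Joel describes: rebuilding a string from scratch on every append re-copies everything written so far, turning a linear job into a quadratic one.

    #include <string>

    // O(n^2): "s = s + piece" allocates and copies the whole accumulated
    // string on every iteration -- each append walks everything so far.
    std::string build_slow(int n) {
        std::string s;
        for (int i = 0; i < n; ++i)
            s = s + "x";        // full copy of s, every time
        return s;
    }

    // O(n) amortised: += appends in place; reserve() avoids regrowing
    // the buffer along the way.
    std::string build_fast(int n) {
        std::string s;
        s.reserve(n);
        for (int i = 0; i < n; ++i)
            s += "x";
        return s;
    }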

The other kind of performance problem comes from inherently slow operations. Disk access will always be slow, compared to memory access. Memory access will always be slow, compared to registers. Copying strings in memory is slow, compared to passing pointers around. And so on – I assume you get the drift.
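
A small C++ illustration of that last point (the functions are deliberately trivial): passing a string by value copies it, while passing by const reference just hands over an address.

    #include <cstddef>
    #include <string>

    // Pass by value: the caller's string is copied -- for long strings
    // that means an allocation plus a character-by-character copy,
    // just to read the length.
    std::size_t length_by_value(std::string s) { return s.size(); }

    // Pass by const reference: only an address crosses the call
    // boundary -- the moral equivalent of "passing pointers around".
    std::size_t length_by_ref(const std::string& s) { return s.size(); }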

The problem is that it’s often hard to see which operations are expensive, and what their effect will be. In spite of this, every developer will have some ideas about which operations are expensive. Uli, for example, thinks that abstractions are expensive:

Object orientation introduces performance penalties that can prevent every compiler optimisation. So each abstraction introduces a significant overhead on your system. This may decrease the performance of your code by orders of magnitude. To get rid of this problem later on, you may have to change your abstractions.

But what to do about this? The obvious approach is to avoid abstractions; but, unfortunately, abstractions are your friends. They help you keep your code clean, modular and maintainable.
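
To make the trade-off concrete, a small C++ sketch (the class names are made up): a call through a virtual interface is dispatched at run time, which usually blocks the inlining and constant folding a direct call would get.

    #include <iostream>

    struct Shape {                        // the abstraction
        virtual ~Shape() = default;
        virtual double area() const = 0;  // resolved at run time
    };

    struct Square : Shape {
        double side;
        explicit Square(double s) : side(s) {}
        double area() const override { return side * side; }
    };

    double total_area(const Shape& s) {
        // The compiler sees only "some Shape" here: the call generally
        // goes through the vtable, so it usually cannot be inlined or
        // folded away the way a direct call could.
        return s.area();
    }

    int main() {
        Square sq{3.0};
        std::cout << total_area(sq) << "\n";   // 9
    }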

If we want to avoid abstractions, we’d better do so only where it’s absolutely necessary. But which are the hot spots that really matter?

The problem with most software developers is that once they become sensitised to an issue (“abstractions are expensive”), they start to see their code in terms of possible “optimisations”. If this carries on, it becomes a deeply ingrained habit: each design decision is immediately assessed on its perceived performance. The result is that one starts messing up other aspects of the design. One of the best examples is developers who write ridiculously long functions out of fear that function calls may be “expensive”. More subtle is the notion that each level of indirection has to be “justified” somehow, because of its perceived cost.
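
For contrast, a minimal sketch – whether it is actually inlined depends on the compiler and its flags, so treat this as an illustration, not a guarantee: small helper functions are exactly what optimisers inline, so factoring a long function into pieces rarely costs anything at run time.

    // A tiny helper like this is a prime inlining candidate: at -O2,
    // mainstream compilers typically replace the call with the
    // multiplication itself, so factoring it out of a long function
    // usually costs nothing.
    static inline double square(double x) { return x * x; }

    double sum_of_squares(const double* xs, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            sum += square(xs[i]);   // no call overhead left after inlining
        return sum;
    }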

This is at the core of the “do not optimise yet” mantra. It’s not about ignorance; it’s about putting the issue aside to avoid the mental trap. Later, when you have a running piece of software, you can measure the results, identify the problems and do something about them. This way you can make your decisions based on actual knowledge, rather than on a nagging feeling at the back of your head.
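
A minimal sketch of such a measurement using std::chrono (the function under test is a placeholder; a real profiler would give a fuller picture):

    #include <chrono>
    #include <iostream>

    void suspected_hot_spot() { /* placeholder for the code under suspicion */ }

    int main() {
        using clock = std::chrono::steady_clock;

        auto start = clock::now();
        for (int i = 0; i < 100000; ++i)
            suspected_hot_spot();    // run it often enough to be measurable
        auto elapsed = clock::now() - start;

        std::cout << std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count()
                  << " us for 100000 calls\n";
    }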

And yes, this may mean you have to change your abstractions. This may be inconvenient, even difficult. But there’s a good chance that the problems are different from what you thought, and that the changes go in a different direction anyway. And with an agile process you should be able to spot the problems early on, so the changes shouldn’t doom you.


One thought on “Performance up front?”

  1. I did not say to forget about performance either. What I’m saying is that, in many cases, assessing the performance of something results in “I don’t really know”. At which point you should take your hands off until you do know.

    Let me give an example: a few days ago I was fixing a problem with a colleague, some kind of update routine. He noticed a special case where the update wasn’t really necessary because nothing important had changed. He suggested adding special code for that case, for the sake of performance. That means he suggested deliberately making the code more complex – without really being able to tell what the gain would be.

    The code in question is called on each mouse click. Is that frequent? How much of the overall execution time is spent in that mouse click handler? How expensive is the operation in question? How expensive is it compared to the other operations in the handler? On how many of an actual user’s mouse clicks would this special case be triggered? 0.5%? 5%? 95%?

    It’s virtually impossible to find the answers to this by just thinking about it. It’s reasonably easy to measure them (a sketch of such a measurement follows below). And: adding the code for the special case later on is easy; finding a bug introduced by the extra code may be hard.

    This is what I was talking about, and this is what “don’t optimise yet” is about.
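
    A minimal sketch of that kind of measurement (all names here are hypothetical, not from any real framework): count how often the special case would actually fire before writing it.

        #include <iostream>

        // Hypothetical names throughout. Count how often the proposed
        // special case would actually fire, relative to all clicks,
        // before committing to the extra code.
        static long long clicks = 0;
        static long long special_case_hits = 0;

        void on_mouse_click(bool nothing_changed) {
            ++clicks;
            if (nothing_changed)      // the case the new code would skip
                ++special_case_hits;

            // ... the existing update routine runs unchanged ...

            if (clicks % 1000 == 0)   // report the ratio now and then
                std::cerr << (special_case_hits * 100.0 / clicks)
                          << "% of clicks\n";
        }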
