Wednesday 19 November 2014

Make Concepts Explicit when Fixing Bugs

Dear Junior

To find implicit concepts and make them explicit is a very powerful way to improve code. But, sometimes it is hard to judge beforehand which concept is important enough to make explicit. We try to make good judgements, but sometimes we miss.

But, no reason to despair, we can take Time as our helpful aid. So, instead of everything perfect from start, we continuously perfect it whenever we find a reason. Two of the topmost reasons are either a bug appearing or further development of the code.

Let me for now fokus on the case of when a bug appears. If the code does not behave the way we intended it to do, then there is most often a place in the code that was to convoluted. In that convoluted state, subtle misunderstandings and simple mistakes could hide. The solution is to clarify the code until it is obvious whereof the mistake consists.

Things get clearer by examples, so let me clarify what I mean with a very minimal example, the story about a bug and its fix.

In real life, more or less a decade ago, I worked on a project where we did a batched import of records coming from an external source. The code looked roughly like this following in the class ImportCron.

public class ImportCron {

    private RecordImporter importer;
    private Logger logger;

    void runImport()  {
        List records = importer.listWaitingRecords();
        logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn
   …
    }

Now, the import was run once a minute, but most of the time there where non records waiting.  Problem was the import log became filled with rows saying "Importing records: 0" so we had to grep out the non-zero lines each time we wanted to look at what was happening.

One of those days I got tired of this, fired up the editor, and wrapped the logging in an if statement to filter out all those zero-import writes.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (records.size() == 0)
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn
   …
    }

A few days later we released during lunch, which was the best time for our users. The deploy went smooth, apps started up as expected, and soon it was time for the first import "tick".

Within a minute, import was run the first time, and the log line read:
<timestamp> Importing records: 0
While standing confused, a minute went by, and the log line read:
<timestamp> Importing records: 0
Another minute went by … and no log line.
While firing up my editor to check the code, another minute went by and a third log line appeared:
<timestamp> Importing records: 0
A colleague of mine looking in the database calmly reported "We just imported three records".
It only took me a split of a second to realise my mistake once I had the code in front of me.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (records.size() == 0)
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn

    }

Obviously my spine reflex for coding "check for zero" made me write the boolean expression the wrong way around. By the way, did you capture that mistake at first glance in the code on my first description? Confirmation bias is a nasty thing.

Now, I must say this kind of bugs are pretty uncommon, bugs that are typos or simply mispunching the keys. Most bugs in my opinion are rather that pieces of code subtly misunderstand each other.

Well there are at least two ways of fixing this code. The obvious, and fastest would be to quickly change "==" to "!=". However, humbled by the mistake I had done in the first place, I realised that that kind of hasty coding was what got me in the trouble in the first place.

One of my coding mantras since long have been "Code should mean something, not just do something". So, a better way out would be to make the code more meaning-ful. From Eric Evans I learned the phrase "Make implicit concepts explicit", which says the same thing in this context, but gives a better guiding direction forward.

I took to the challenge of fixing the bug by finding what implicit concepts had been missed, and making them explicit until it was blatantly obvious that the code was wrong. An added benefit would be to be able to make a test proving the code was wrong.

Extracting the boolean condition to a method of its own would force me to spell out the meaning of that piece of code.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (containsRecords(records))
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn

    }

    static boolean containsRecords(List records) {
        return records.size() == 0;
    }

Granted, this is more code than I started with - but I think the code is more "to the point" (phrased inspired by Rickard Öberg, another of the great programmers).

At least this made me able to write a test

    @Test
    public void shouldConsiderEmptyListNotContainingRecords() {
       Assert.assertFalse(ImportCron.containsRecords(Collections.EMPTY_LIST));
    }

Well, at least this is what the test would look today - at the time JUnit looked slightly different.

Anyway, now I have a failing test. Claiming "the empty list does not contain records" is simply false - as proven by the test. We safely update the code to fix the bug.

    static boolean containsRecords(List records) {
        return records.size() != 0;
    }

Upon which the test switched to green as expected.

During this "refactor -> put under test -> fix" something interesting has happened. The concept "import list contains records" that hitherto had been implicitly represented by the technical "records.size() != 0" has now been made explicit and given a name "containsRecords".

It might be claimed that the main benefit in this example was derived from the drive to put code under test before making a change. Undeniably, that is a point, but I think it only tells half the tale. 

Simply putting things under test can be done in a myriad of ways, and I have seen several very technology-based efforts. I do not think those efforts have paid off handsomely. Mostly the code get cut along technical lines to push in tests in the gaps. But, the code does not get more comprehensible over time.

Focusing on "make implicit concepts explicit" have been a more productive course for me when working with code.

Yours

   Dan

ps My team-mates did not force me to skip lunch to fix the bug. We went out together, and when we came back satisfied and rested, I sat down and fixed the code. We released the fix during lunch-time the day thereafter.






Monday 17 November 2014

Make Implicit Concepts Explicit in Code

Dear Junior

In our letters on software development we have touched upon the idea several times, but I think it might be worth spelling it out - the idea to make implicit concepts explicit; something I see as one central message of Domain Driven Design applied to coding.

Let me pick that phrase apart for a moment. When coding we have a lot of concepts in mind, e g when writing code that handles people I might have a class namned Person. In this case the concept of person has an explicit representation in the code - i e an explicit concept.

A person might have a birth date, represented in code as a data-field Date dayofbirth in the Person class; another example of an explicit concept.

We might have business logic restrictions that are based on the age of the person, e g you have to be at least fifteen years to access some content. So there will be code calculating this.

    void contentRequest() {
        ... 
        boolean access = ((new Date().getTime() - customer.dayofbirth.getTime()) / msPerYear) >= 15;

Again we have a concept represented in code, this time age. However, this time the representation is not explicit - there is nothing in the code saying "age". Age is an implicit concept

There is nothing strange with having implicit concepts. In the short code snippet we have several implicit concepts: current point of time, point of time of customer's birth, and age-limit for content are just three obvious. This is not a problem. The programmer reading the code will intuitively re-construct the relevant concepts.

However, the design guideline from Domain Driven Design advice us to keep an eye open for important concepts - and if we find the represented implicitly, then make that implicit concept explicit

Let us look at the code snippet again.

        boolean access = ((new Date().getTime() - customer.dayofbirth.getTime()) / msPerYear) >= 15;

Of the implicit concepts dwelling in this like, which of them seem important enough to make explicit? 

The concept of "current point of time"? Probably not.

The concept of "milliseconds since birth"? Nah, not that either.

The concept of "milliseconds since birth expressed as years, rounded downwards"? Hey! I know this! We already have a word for it - we call it "age"! And we talk about it all the time!

Well, that sounds important enough. Let us start with giving that value a name. 

    void contentRequest() {
        ... 
        long age = (new Date().getTime() - customer.dayofbirth.getTime()) / msPerYear;
        boolean access = age >= 15;

By the way: this is where I really like the IntelliJ shortcuts "cmd-w" to widen the selection until the expression is selected, then "cmd-alt-v" for the refactoring to extract the selection as a variable - naming it "age" in the pop-up. It literally takes less than 10 seconds.

Now, we actually have made the implicit concept age into an explicit concept. Mission completed.

Well, not completely. We need to find the concept a good home where it can lead a good life. Right now it is stranded in the middle of some access computation. At least we can give it a method of its own.

Applying another of my favourite refactoring "Replace Temp with Query" yields this code.

    void contentRequest() {
        ... 
        boolean access = customerAge(customer) >= 15;

    private long customerAge(Person customer) {
        return (new Date().getTime() - customer.dayofbirth.getTime()) / msPerYear;
    }

Better, but can be improved to make the concept clearer. If we talk about the "age of the customer" it will vary from time to time, and it might cause confusion: the age when the customer requested access to the content, the age when content was first accessed, when it was last accessed, the age now? Better to clarify.

Of course we could rename the method to "customerAgeNow". However, I do not feel comfortable having my concepts depend implicitly on external things like the system clock. I prefer to make those dependencies explicit. So we make the time-in-question a parameter. 

    void contentRequest() {
        ... 
        boolean access = customerAgeAt(customer, new Date()) >= 15;

    private long customerAgeAt(Person customer, Date timepoint) {
        return (timepoint.getTime() - customer.dayofbirth.getTime()) / msPerYear;
    }
 
This design also improves testability drastically. The tests can now use explicit test-data, and need not rely on checking, or changing, the system clock.

Kudos once again to the lovely refactoring support of modern IDEs. Another less-than-10-seconds-refactoring: two cmd-W to select "new Date()", cmd-alt-P to extract as parameter, change name to "timepoint" in pop-up. Enter. Done.

By now it is pretty obvious that the method "customerAgeAt" suffers from feature envy. The most relevant data it operates upon is the Customer, but it resides somewhere else. We should consider moving it.

Should we move it, the Customer class would get a method "ageAt", taking the liberty to rename it slightly. That method certainly makes sense in this context, but does it so in other contexts? As we do  not have the rest of the codebase at hand, we will just have to pretend that "ageAt" makes sense in other context - and might actually be useful there as well.

This is really one of my favourite refactoring: move method. Again extremely smooth thanks to modern IDEs. Literally two keystrokes (F6, Enter - in IntelliJ) for the move, and a few for renaming. 

    void contentRequest() {
        // ...
        boolean access = customer.ageAt(new Date()) >= 15;

class Person {
    Date dayofbirth;
    // ...
    long ageAt(Date timepoint) {
        return (timepoint.getTime() - dayofbirth.getTime()) / TimeUtil.msPerYear;
    }
}

You might not believe me, but I remember the days of yore when you actually had to do all the involved steps manually; cut-n-paste the method body and header, change the list of parameters, update the code to use the fields of "this" instead of the argument, change all client calls (which might be several).

Now, we have finally ended up with a version I feel comfortable with. The concept "person" has now an conceptual attribute "age at" which has an explicit representation in code. In code we have "enhanced" our language of what we can talk about directly. So, when we discuss the age of a customer with business people, it is likely that we consistently mean the same thing - even if we for sure still can misunderstand each other.

As a side effect the code checking the access has become a lot clearer.

        boolean access = customer.ageAt(new Date()) >= 15;

This is a line of code that can be shown some person from the business side and explained by reading it out "access is granted if the customer's age at 'now' is at least fifteen years". Given that support, they can see by themselves - something that really builds trust. 

The time invested was not huge. Coding-wise all refactorings took less then a minute together. The time to slowly realise that "age" is an important concept in this particular domain is not counted. However, that insight will have to dawn on the programmer sooner or later, and when that happens, that extra minute is well invested.

To make concepts explicit does not only apply to code, it can be fruitfully applied to other areas as well, e g capturing and writing requirements/specifications. 

As we all stand on shoulders of gigants, I also must give credit where credit is due. The phrase "Make Implicit Concepts Explicit" I have picked up from Eric Evans, the thought leader of Domain Driven Design. 

In conclusion: to make implicit concepts explicit help us to over time close the gap between the code and the understanding of the business domain. It is not uncommon to in the process find bugs or crucial misunderstandings, that then can be adressed in a proactive manner instead of popping up as nasty surprises later on. 

Yours 

   Dan





Wednesday 12 November 2014

Agile Projects should have Effect Targets

Dear Junior

Measuring projects and setting targets for them is a tricky business. And, it does not get easier when agile projects enter the scene. This is not really because agile projects are strange per se, but because they are different from non-agile projects.

Setting targets and measuring projects is also an important business. I have seen several agile initiatives fail. Most of them have failed because they did not succeed in setting goals for projects, and monitor their progress. And if you cannot do that, you quickly lose the confidence and support from upper management. Agile initiative terminated, or left to dwindle away. End of story.

However, it need not to be so. Agile projects can have measurable targets, and they can be monitored - you just have to do it right.

There are some bad news and some good news.

The Well-Established Chaos Rod for Measuring Success

The bad news is that agile projects cannot be measured using the de-facto established standard rod for project success.

The standard rod for measuring projects is "on-time and on-budget, with all features and functions as initially specified", as used by e g Standish Group in their much-to-cited Chaos Reports. Let us call this the "Chaos rod". Most organisations use some version of the Chaos rod for measuring their projects.

As we have discussed earlier it is pretty obvious that "all features implemented" is not a good way to measure "project success" - not for any project. Nevertheless, this is the standard rod.

Damned if you do, damned if you don't

Of course agile projects will fail if measured using the Chaos rod. The reason is simple - agile projects does not manage to keep their hands away from fiddling with the original specification. Agile projects remove stuff, change stuff, add stuff - agile projects are in a constant state of scope-creep.

This is no coincidence - it is how agile projects are designed.

Think of it this way: If we learn something during the run of the project - shall we let that insight effect the plan of what we intended to do? Or shall we ignore that insight, sticking to the original plan even though inferior? Of course we will adapt! Anything else would be ridiculous.

An agile project anticipates that such insights will emerge, so its processes are designed to harness and leverage upon those insights: demos, retrospectives, re-prioritisation of product backlog etc. These practises are all aimed at constantly refine and redefine the scope. But then, we have derived from the original specification "all features and functions as initially specified".

Thus, any agile project longer than a sprint will fail - by definition. That is, if you use the Chaos rod for measuring success.

To put it bluntly, if you measure an agile project using the Chaos rod, it will fail. Either the agile process will fail, or the project as defined will fail.

Either, the project adapts to its measuring rod and blindly follows the "functions as specified", throwing whatever insight gained aside. Then, "project" will succeed, but "agile" will fail.

Or,  the project will work according to its agile-minded processes, and then certainly the result of the project will not be "all features and functions as initially specified". Then, "agile" will succeed, but "project" will fail. I e project will fail as measured by Chaos rod.

Damned if you do, damned if you don't.

A New (or Old) Hope

The good news is that the Chaos rod is not the only way to measure projects. As we have discussed earlier, there has always been two ways to measure projects: feature list (Chaos rod) or measuring business effect, what we can call the Effect rod.

Let us take our earlier example with an on-line book store. They had an idea that recommendations of the type others-have-bought would increase their sales, so they set of some money for that projects.

Using the Chaos rod they would have created a feature list around others-have-bought recommendations. The project would later be evaluated on how well it implemented the list. But let us look at a better idea.

Using the Effect rod they might set a target that customers will by 0.35 more books per checkout on average. The Effect rod is used to state the value of a project. These 0.35 books can probably be converted to money, that makes the project worth the effort.

The project might start out with an idea that others-have-bought recommendations would cut the cake. After the first small stories, deployed to production and put into hands of customers, the team learns that some kind of category would be helpful. To facilitate this they implement a simplified "search similar" as one of their features.

Suddenly the number of books in the checkout carts raise to a level 0.4 above the pre-project baseline. And the level sustains, it was not just random noise - the change is statistically significant.

The project has fulfilled its target according to the Effect rod, and is declared a success.

Measuring using the Chaos rod "on time, budget, and specification" - the project would have been deemed a failure, because it did not deliver the initially specified others-have-bought.

So, agile projects can be measured. You just have to use a measurement that is well suited. And, to be honest - which is better anyway.

A sad side-note is that I know of several projects which have worked in an agile manner, and delivered enormous business benefit - but in the project report the project lead has had to apologise repeatedly for not meeting the project target as defined in the project specification - a feature list.

Granted, to measure projects on business effect is not a new idea in any way at all. The idea was certainly there before the Agile Manifesto was written around the turn-of-millennia. What is new is that this way of measuring is essential to provide rigour to agile projects.

For Agile to succeed at larger scale, we certainly need lots practises around agile-minded processes. Measuring agile projects on effects is certainly one of them.

The way to measure agile projects is to set targets for business effect.

Yours

   Dan

PS Interesting enough, there is a sub-category "agile projects" in the later Chaos Reports, but the rate of success is not 0%, it is around 40%. I wonder what is going on here.

PPS Within the agile community, the practice of projects is debated - at least in the sense of the common description "temporary organisation with limited time, resources, and ambition". So, it would probably be better to use some other term, e g talking about "initiatives". However, for convenience, I have kept the often-used and familiar term "project". Check out the twitter hashtag #NoProjects.