Friday 16 September 2011

Public final and data encapsulation


Dear Junior

We were discussing immutable value objects and using "public final" data-fields for their representation. In that discussion I started off mixing up immutability and encapsulation. Now that we have covered immutability, let us return to encapsulation.

Obviously, declaring a field as public will break data encapsulation. You lose your freedom to change data representation without having to bother the clients.

Let us have a look at a name-class with some actual usage.

public class Name {
    public final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }
}

The public API of this class consists of three parts: the construction of a name from a string, the attribute “fullname” (accessible through property data field), and the attribute “names (accessible through accessor method).

An example of what client code looks like we can find in the tests.

public class NameTest {

    private final String danbjson = "Dan Bergh Johnsson";

    @Test
    public void shouldHaveFullNameAsAttribute() {
      Assert.assertEquals(danbjson, new Name(danbjson).fullname);
    }

    @Test(expected = IllegalArgumentException.class)
    public void shouldNotAllowWeiredCharsInName() {
      new Name("#€%&/");
    }

    @Test
    public void shouldSplitIntoNamesAtSpaces()  {
      Assert.assertEquals(
        new String[] {"Dan", "Bergh", "Johnsson"},
        new Name(danbjson).names());
    }

}

Imaging that we want to change the internal data representation to using a char-array instead. Doing so will break all the clients as they rely on having that public data-field “fullname”. Thus, it is a bad design. Or?

I would argue that it is still a good design. Using modern tools it is easy to add the encapsulation when needed. Applying "Encapsulate Field" in e g IntelliJ yields.

public class Name {
    private final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }

    public String fullname() {
        return fullname;
    }
}

And of course the client code has been changed accordingly.

    @Test
    public void shouldHaveFullNameAsAttribute() {
        Assert.assertEquals(danbjson, new Name(danbjson).fullname());
    }


Now, I could have chosen "getFullname" as the method name for the new method. However, I have always found that naming convention a little bit awkward, the property is “full name” and adding a boilerplate “get” does not add any value in my opinion. By the way, JavaBeans is just one naming convention in Java, the naming convention for CORBA predates JavaBeans. In the CORBA convention if you have a property “fullname” of type String, then the way to access it was to call a method “String fullname()” and the way to change the property was to call a method  “void fullname(String)”. So the convention I use is not new to Java at all.

In some languages there is no distinction in syntax between accessing a field and calling a no-arg method. Had the code been written in Eiffel or Scala, the syntax would have been the same and there would be no change to the client code at all.

Now that we have encapsulated the usage via “fullname()” we can change the internal representation to a char array.

public class Name {
    private final char[] chars;

    public Name(String name) {
        if(!name.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.chars = name.toCharArray();
    }

    public String[] names() {
        return new String(chars).split(" ");
    }

    public String fullname() {
        return new String(chars);
    }
}

Just for reference, in Scala the corresponding change would start with this "final public" representation – here represented by the keyword “val”.

class Name(val fullname: String)
{
  if(!fullname.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();

  def names = fullname.split(' ')
}

The change would take us to the slightly more verbose char array representation. Here we have no “val” in external class declaration, but a private val-field hidden inside the class-block instead.

class Name(name: String)
{
  if(!name.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();
  private val chars = name.toCharArray

  def fullname = chars.toString

  def names = fullname.split(' ')
}


In conclusion: As the public final field is accessible by the clients, it is also a part of the API for Name: This breaks data encapsulation. If you want to change data representation, you will need to create an accessor method to which you direct the client.

In Scala or Eiffel this is a no-issue, as the new accessor could transparently have the same name as the old datafield ("fullname") and be accessed with exactly the same syntax ("name.fullname"). However, in Java you have to include an empty pair of parenthesis to call the accessor method - thus the client code must be changed.

Now, I consider this a small issue, as there is excellent refactoring support in modern IDEs that eliminate the change to a fully automated four-click, one-minute exercise. Thus, there is no point in designing for that change up front.

So, even if we *do* break data encapsulation, I would say that it is not much of a problem.

Now, there are other kinds of encapsulations that are more interesting than data encapsulation, but that analysis will have to wait for some other letter.

Yours

   Dan

Monday 12 September 2011

Public final is also Immutable, Again

Dear Junior

I must apologise as in my last letter I mixed two separate discussions into one: one about immutability, and one about encapsulations. What I was after was immutability, even though encapsulation is also interesting.

To make thinks clear, let us return to the Name class for representing value objects for a name e g "Dan Bergh Johnsson". To focus on immutability, let us simplify it.

public class Name {
    public final String fullname;

    public Name(String fullname) {
        this.fullname = fullname;
    }
}

Now, if I create an object of this class, then no client can mutate the state of that object. This is because

  1. The datafield is final so the client cannot make the reference point to some other String 
  2. Strings are immutable, so the referred object cannot be changed 

Just for reference. I mentioned that Scala has a very elegant way of defining "public immutable datafield properties". The corresponding Scala class would be a one-liner.

class Name(val fullname: String)

Elegant, isn't it? "Name is a class with the attribute value 'fullname' of type String". Scala promotes the use of immutable constructs by making it simple to declare them.

Now, as my friend Tommy Malmström pointed out to me, I should also make my class final. This is a very valid and interesting point, so let me dig into.

The problem in this case is not the objects we construct ourselves, but objects that are sent to us. We should rightfully assume that such objects are also immutable. However, someone could wittingly and deviously create a mutable subclass of Name.

Back in Java land such a subclass could for example overrride toString

package names;

class EvilName extends Name {
    public EvilName(String fullname) {
        super(fullname);
    }

    public String toString() {
        return "Voldemort";
    }

    public static void main(String[] args) {
        Name name = new EvilName("Harry Potter");
        System.out.println(name);
        System.out.println(name.fullname);
    }
}

danbj$ java names.EvilName
Voldemort
Harry Potter

You see how confusing it could be when toString is used to render the name object "Harry Potter".

So, forbidding such overrides by declaring the Name class as final is really a good idea.

On a side note, this is the reason why String is declared final. The String class is for example used to represent class and package information when loading code dynamically over the network - so imagine the consequences had it been possible to make a mutable phoney String.

Well, that was about immutability.

On the issue of encapsulation it can be argued whether "public final String fullname" breaks encapsulation or not - and that discussion also depends on what encapsulation we mean. Any way, that is a separate discussion.

Yours

   Dan

Friday 9 September 2011

"public final" is also Immutable

Dear Junior

Immutable value objects are one of my favourite programming idioms. I really like how they aid and ease the burden of the rest of the code by taking care of small pieces of complexity. When they are based on the concepts of the domain they become yet another magnitude more valuable. 

Implementation-wise they are most often a primitive type wrapped up in a protecting box. So it is pretty natural that a "name" is stored with a String containing the full name. Wrapped together in the box is probably as well some complexity, like the validation of the name and some interpretations of the data - in this case the logic to split a full name into its parts. At the end of the day, it is not so interesting to encapsulate the data as such - it is encapsulating the interpretation of the data that is crucial. In Java such a name class would look like this.

public class Name {
    public final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }
}

Now, one thing worth noting is the datafield "fullname". It plays double roles both as data storage and as an attribute. Should we not have a private field and an accessor method instead?

Well, the integrity of the object is still guaranteed as

  • the datafield is final so the referred object cannot be exchanged 
  • the referred object (String) is immutable so the referred object cannot be changed 

So, yes, people can get to the field from the outside, but they cannot break anything. Of course there is the question that if you change the data representation, then you will break the clients.

However, that is no big deal. If the situation should arise, we can apply the refactoring "encapsulate field" to introduce a new method "String fullname()" and replace every access to the field with a call to that method instead. Checking the entire codebase for accesses to "fullname" might be a large task. But, guess what, using a modern IDE there will be a menu item in the "Refactoring" menu that will do exactly that - fully automated.

The alternative would be to have that code in from the start. However, I cannot see that there is a point in paying the overhead of more lines of code in the meantime.

Making a field public does not break encapsulation. The important encapsulation is the interpretation and constraints of the data that is found in the constructor validation and the logic of the method "names()".

By the way: in Scala you would not be able to see the difference between a public field and a method with the same name. Nice.

Yours
   Dan