Friday, 16 September 2011

Public final and data encapsulation


Dear Junior

We were discussing immutable value objects and using "public final" data-fields for their representation. In that discussion I started off mixing up immutability and encapsulation. Now that we have covered immutability, let us return to encapsulation.

Obviously, declaring a field as public will break data encapsulation. You lose your freedom to change data representation without having to bother the clients.

Let us have a look at a name-class with some actual usage.

public class Name {
    public final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }
}

The public API of this class consists of three parts: the construction of a name from a string, the attribute “fullname” (accessible through property data field), and the attribute “names (accessible through accessor method).

An example of what client code looks like we can find in the tests.

public class NameTest {

    private final String danbjson = "Dan Bergh Johnsson";

    @Test
    public void shouldHaveFullNameAsAttribute() {
      Assert.assertEquals(danbjson, new Name(danbjson).fullname);
    }

    @Test(expected = IllegalArgumentException.class)
    public void shouldNotAllowWeiredCharsInName() {
      new Name("#€%&/");
    }

    @Test
    public void shouldSplitIntoNamesAtSpaces()  {
      Assert.assertEquals(
        new String[] {"Dan", "Bergh", "Johnsson"},
        new Name(danbjson).names());
    }

}

Imaging that we want to change the internal data representation to using a char-array instead. Doing so will break all the clients as they rely on having that public data-field “fullname”. Thus, it is a bad design. Or?

I would argue that it is still a good design. Using modern tools it is easy to add the encapsulation when needed. Applying "Encapsulate Field" in e g IntelliJ yields.

public class Name {
    private final String fullname;

    public Name(String fullname) {
        if(!fullname.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.fullname = fullname;
    }

    public String[] names() {
        return fullname.split(" ");
    }

    public String fullname() {
        return fullname;
    }
}

And of course the client code has been changed accordingly.

    @Test
    public void shouldHaveFullNameAsAttribute() {
        Assert.assertEquals(danbjson, new Name(danbjson).fullname());
    }


Now, I could have chosen "getFullname" as the method name for the new method. However, I have always found that naming convention a little bit awkward, the property is “full name” and adding a boilerplate “get” does not add any value in my opinion. By the way, JavaBeans is just one naming convention in Java, the naming convention for CORBA predates JavaBeans. In the CORBA convention if you have a property “fullname” of type String, then the way to access it was to call a method “String fullname()” and the way to change the property was to call a method  “void fullname(String)”. So the convention I use is not new to Java at all.

In some languages there is no distinction in syntax between accessing a field and calling a no-arg method. Had the code been written in Eiffel or Scala, the syntax would have been the same and there would be no change to the client code at all.

Now that we have encapsulated the usage via “fullname()” we can change the internal representation to a char array.

public class Name {
    private final char[] chars;

    public Name(String name) {
        if(!name.matches("[a-zA-Z\\ ]+"))
            throw new IllegalArgumentException();
        this.chars = name.toCharArray();
    }

    public String[] names() {
        return new String(chars).split(" ");
    }

    public String fullname() {
        return new String(chars);
    }
}

Just for reference, in Scala the corresponding change would start with this "final public" representation – here represented by the keyword “val”.

class Name(val fullname: String)
{
  if(!fullname.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();

  def names = fullname.split(' ')
}

The change would take us to the slightly more verbose char array representation. Here we have no “val” in external class declaration, but a private val-field hidden inside the class-block instead.

class Name(name: String)
{
  if(!name.matches("[a-zA-Z\\ ]+")) throw new IllegalArgumentException();
  private val chars = name.toCharArray

  def fullname = chars.toString

  def names = fullname.split(' ')
}


In conclusion: As the public final field is accessible by the clients, it is also a part of the API for Name: This breaks data encapsulation. If you want to change data representation, you will need to create an accessor method to which you direct the client.

In Scala or Eiffel this is a no-issue, as the new accessor could transparently have the same name as the old datafield ("fullname") and be accessed with exactly the same syntax ("name.fullname"). However, in Java you have to include an empty pair of parenthesis to call the accessor method - thus the client code must be changed.

Now, I consider this a small issue, as there is excellent refactoring support in modern IDEs that eliminate the change to a fully automated four-click, one-minute exercise. Thus, there is no point in designing for that change up front.

So, even if we *do* break data encapsulation, I would say that it is not much of a problem.

Now, there are other kinds of encapsulations that are more interesting than data encapsulation, but that analysis will have to wait for some other letter.

Yours

   Dan

4 comments:

  1. Context matters and challenges the assumption that refactorability makes public data a small issue. If you publish code, i.e., provide your classes to others whose code is beyond your influence or control, refactoring tools are irrelevant and this is a big issue. The applicability of the advice is dependent on the context.

    ReplyDelete
  2. No doubt encapsulation is great concept and if implemented correctly it helps a lot while maintaining code, e.g. you can change private variable anytime but not public variable with same confidence. the classic rule says that "encapsulate whatever changes" :) by the way fantastic post, covered the topic very well. thanks.

    Javin
    What is polymorphism in Java

    ReplyDelete
  3. Dear Kevlin

    Well spotted. The "refactoring" argument is built on the assumption that you *do* have control over the client code base. That apply to *lots* of situations, e g when different parts of your system communicate with each other.

    However, no doubt the argument breaks down when you cross the borders of control and influence.

    Yours

    Dan

    ReplyDelete
  4. Dear Javin

    Glad you liked it. I think you comment weaves well together with Kevlin's in pointing out that a public field becomes a part of the API. Thus it must be a stable domain concept, and not an implementation detail.

    Yours /Dan

    ReplyDelete