Monday, 16 May 2011

Scala Actor Waiting for Godot - Vladimir Thread Version

Dear Junior

My colleague Roozbeh Maadani and I were curious about the different actor models of Scala, so we decided to try them out. I have also had a distant interest in Akka for a while, an interest reinvigorated by Jonas Bonér's presentation at Jfokus and deepened by attending his mini-course on Akka at OP-KoKo. So, I have been toying around with Akka a little as well. I will get back to that.

Let us get started and create an application with some actors - in this case a reenactment of Waiting for Godot. The basic plot of Waiting for Godot is roughly that two characters, Vladimir and Estragon, wait for a person named Godot to show up - which he does not. So let us start with the setup.

import scala.actors.Actor

object waitingforgodot extends Application {
  println("Setting the stage")
  val vladimir = new Vladimir
  val estragon = new Estragon
}
class Estragon extends Actor {
  // here we will later read Estragon's script
  def act(): Unit = {}
}
class Vladimir extends Actor {
  // no script for Vladimir yet either
  def act(): Unit = {}
}

In the application script we create two actor objects, one "Vladimir" and one "Estragon", each defined by its own class. When started (which they are not yet), they will run their respective scripts, which are defined in their act()-methods. Admittedly, neither of them has much of a script to follow yet.

Nevertheless, we can run the application, even though the result is not very thrilling:

danbj$ scala godot.waitingforgodot
Setting the stage

Time to give one of the actors some script to follow. Let us start with Vladimir.

class Vladimir extends Actor {
   def act() = {
      println("Vladimir is waiting")
      // ... stuff happens
      println("Vladimir's wait is over")
   }
}

Running the app still gives

danbj$ scala godot.waitingforgodot
Setting the stage

Obviously, the script for Vladimir is not executed. This is because in the Scala "actors.Actor" API the basic metaphor is that an actor is (conceptually) a thread. It is not necessarily implemented that way, but the API keeps the mental model as close to a thread as possible. Thus, just like a thread, we need to start it.

object waitingforgodot extends Application {
   println("Setting the stage")
   val vladimir = new Vladimir
   val estragon = new Estragon
   println("Starting the play")
   vladimir.start
   estragon.start
   println("Main script is over")
}

Now we see the actor Vladimir acting his part as well

danbj$ scala godot.waitingforgodot
Setting the stage
Starting the play
Main script is over
Vladimir is waiting
Vladimir's wait is over

Note that the Vladimir script is actually running even after the main script is over. This is explained by the fact that "vladimir.start" starts a new thread, which keeps running even after the original main thread has finished.
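The same thing can be seen without the actor library at all. The following is only a sketch using a plain JVM thread; the names StartSketch and Performer are made up for the illustration and are not part of any actor API. Creating the object runs nothing, and the script is only performed once start is called.

```scala
object StartSketch {
  // Hypothetical stand-in for an actor: a plain JVM thread whose run() is the "script".
  class Performer extends Thread {
    @volatile var hasActed = false
    override def run(): Unit = { hasActed = true }
  }

  // Returns (acted before start, acted after start and join).
  def play(): (Boolean, Boolean) = {
    val performer = new Performer
    val before = performer.hasActed // constructing the object runs nothing
    performer.start()               // now a new thread executes run()
    performer.join()                // wait for that thread to finish
    (before, performer.hasActed)
  }

  def main(args: Array[String]): Unit = println(play()) // prints (false,true)
}
```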

This becomes even more obvious if we add some thread information to our printlns.

object waitingforgodot extends Application {
   println("Setting the stage " + Thread.currentThread.getId)
   val vladimir = new Vladimir
   val estragon = new Estragon
   ...
}

class Vladimir extends Actor {
   def act() = {
      println("Vladimir is waiting " + 
         Thread.currentThread.getId)
      println("Vladimir's wait is over " + 
         Thread.currentThread.getId)
   }
}

Which yields

danbj$ scala godot.waitingforgodot
Setting the stage 1
Starting the play
Main script is over
Vladimir is waiting 10
Vladimir's wait is over 10

The main script is run by thread 1 and the character Vladimir is run by thread 10. Curiously, this actually applies to the play itself. In the play Vladimir is the only one who has a clear sense of time: he remembers the past (which the other characters do not, at least not very well), and he has an idea of the future to come.

Now let's get over to the point of the play, which is waiting for a specific person named Godot. Vladimir's script is extended by a "wait for Godot to arrive". The actor representation of this is to send a message to the actor Vladimir, a message consisting of "Godot".

Vladimir on his part must wait for this message to arrive in his "inbox". He does so by using "receive" and a message handler. Using "receive" makes the actor check its inbox and act upon the messages it finds. The "message handler" is simply a code block that tells it what to do with the messages.

If there is no message in the inbox when the actor gets to "receive", then the actor blocks and waits for a message to arrive.

class Vladimir extends Actor {
   def threadid = {
      Thread.currentThread.getId
   }

   def act() = {
      println("Vladimir is waiting " + threadid)
      receive {
         case Godot =>
            println("Vladimir saw Godot arrive! " + threadid)
      }
      println("Vladimir's wait is over " + threadid)
   }
}

In this code "receive" causes the thread to be put in a wait state, watching the actor's "inbox". The "inbox" is the way to communicate with an actor. You should neither access its data nor call methods on the object directly. Instead you send "messages" to it, which it will receive and react upon. I think of it as corresponding by mail.

So, whenever a message arrives, the actor will run its message handler to see what it should do about it. The message handler in this case is the code block following "receive"

receive {
   case Godot =>
      println("Vladimir saw Godot arrive! " + threadid)
}

So, if there is a message that matches the value "Godot", then it will be handled by running the code

println("Vladimir saw Godot arrive! " + threadid)

Let us ponder this for a while. The actor support in Scala is a library, not a part of the language. So "receive" is not a reserved word; it is the name of a method of the Actor class (which we subclass). What actually happens is that we execute the receive-method and pass in a code block, the message handler, as the argument.

So another way to phrase it is that we use "receive" to register a message handler with the actor runtime (i.e. the thread). As the message handler is passed as a parameter, it will sit on the top of the stack. And there the processing stops (the thread goes to sleep).

At a later point in time (when there is a message in the inbox), the processing will be resumed (by the same thread). Now, the message handler is still there on the top of the stack, so we can run it, do the case pattern matching, select the "println(...)" and execute it. Thereafter, the thread pops out of the "receive {...}" and continues with the rest of the program.

So, the entire "receive, wait for message, pattern-match, run message handler" was just a slow subroutine call.
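To make the "slow subroutine call" concrete, here is a toy version of the idea. It is a sketch only, not how the Scala actor library is implemented: ToyActor, its inbox and its receive are names I made up, built on a plain thread and a blocking queue from java.util.concurrent. One deliberate simplification is that a message the handler does not match is simply dropped.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

object MailboxSketch {
  case object Godot

  // Hypothetical toy actor: one thread plus one blocking queue as the inbox.
  class ToyActor(script: ToyActor => Unit) {
    private val inbox = new LinkedBlockingQueue[Any]()
    private val thread = new Thread(new Runnable {
      def run(): Unit = script(ToyActor.this)
    })

    def start(): Unit = thread.start()
    def join(): Unit = thread.join()

    // Like "!": drop the message in the inbox and return at once.
    def !(msg: Any): Unit = inbox.put(msg)

    // Like "receive": an ordinary method call that blocks until a message
    // arrives and then runs the handler that was passed in as the argument.
    // Simplification: a message the handler does not match is dropped.
    def receive[A](handler: PartialFunction[Any, A]): A = {
      var result: Option[A] = None
      while (result.isEmpty) {
        val msg = inbox.take() // the thread sleeps here until a message arrives
        if (handler.isDefinedAt(msg)) result = Some(handler(msg))
      }
      result.get
    }
  }

  // Runs a small Vladimir script and returns what he logged, in order.
  def perform(): List[String] = {
    val log = ListBuffer.empty[String] // written by the actor thread, read after join
    val vladimir = new ToyActor(self => {
      log += "waiting"
      self.receive { case Godot => log += "saw Godot" }
      log += "wait is over"
    })
    vladimir.start()
    vladimir ! Godot
    vladimir.join()
    log.toList
  }

  def main(args: Array[String]): Unit = perform().foreach(println)
}
```

Seen this way, the call to receive in the script returns just like any other method call; it merely takes as long as it takes for a matching message to show up.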

Now over to the message.

We have one piece missing: who (or what) is Godot?

In this example Godot is a simple value. There is no need to make it an actor, because it does not play any active part in the program. Actually, in the play there is no character Godot on stage at any time - he simply never shows up. So casting the Godot part is easy - there is no actor needed.

Thus we can represent Godot with a value object defined as a case object

case object Godot

Yepp, that is all that is needed to define a message.

This defines exactly one object: Godot (a little bit like an enum value would). As there is no mutable state, this object can safely be shared by anyone interested in Godot, in the same way as there only needs to be one integer 4711 or one string "fubar".

The main advantage of case objects and case classes is that they work well with pattern matching, which is what we rely on when we receive them as messages the way we did.
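As a small illustration of the pattern matching involved, here is a sketch that matches messages outside of any actor. Godot is written as a case object, the usual way to declare a single shared value. Boy is a hypothetical message type that I added only to show a message carrying data; it is not part of the program in this letter.

```scala
object MessageSketch {
  case object Godot             // the single Godot value
  case class Boy(words: String) // hypothetical message that carries data

  // A handler in the same style as the code block passed to "receive".
  def describe(msg: Any): String = msg match {
    case Godot      => "Godot has arrived"
    case Boy(words) => "A boy says: " + words
    case other      => "Unexpected message: " + other
  }

  def main(args: Array[String]): Unit = {
    println(describe(Godot))
    println(describe(Boy("Mr. Godot will not come today")))
    println(describe(4711))
  }
}
```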

Now let us run this.

danbj$ scala godot.waitingforgodot
Setting the stage 1
Starting the play
Main script is over
Vladimir is waiting 10
^C

This time I had to kill the process as there was one thread that had not finished. Thread 10 was still active, running the Vladimir actor and fully occupied waiting for Godot. I.e. it was hanging in "receive", but as it never got "Godot" it kept hanging there.

This is by the way also a parallel to the play. In the play Vladimir is very energetic, at least compared to Estragon who also waits for Godot. Vladimir is on his feet all the time during the play, whereas Estragon often sits down and even takes a nap. Obviously, Estragon's strategy conserves resources better, but let us not hold that against Vladimir right now.

In the play, Godot never shows up, but let us be nice to poor Vladimir just for a moment and send a Godot to his inbox.

object waitingforgodot extends Application {
   println("Setting the stage " + Thread.currentThread.getId)
   val vladimir = new Vladimir
   val estragon = new Estragon
   println("Starting the play")
   vladimir.start
   estragon.start
   println("Let Godot arrive")
   vladimir ! Godot
   println("Main script is over")
}

Here we use the "!" (pronounced "bang") syntax for putting a message into the inbox of the actor referred to by "vladimir".

This will wake vladimir from his receive-wait-state.

receive {
   case Godot =>
      println("Vladimir saw Godot arrive! " + threadid)
}

Now that the actor is reactivated, it will pattern-match the message against the "case" clauses (only one in this example) in the handler and run the corresponding code.

danbj$ scala godot.waitingforgodot
Setting the stage 1
Starting the play
Let Godot arrive
Vladimir is waiting 11
Main script is over
Vladimir saw Godot arrive! 11
Vladimir's wait is over 11 

This time I did not need to kill the process as the Vladimir thread had a chance to finish its execution.

Note also that messages are sent asynchronously, so there is no guarantee that a message is handled immediately after it is sent. On the contrary, things can happen in apparently random order. In this case, the Godot-message is sent before Vladimir has started running, and the main script interleaves with the Vladimir script. Of course your code should not rely on any specific order of execution.

In this run some significant things happen from a message-passing perspective

  • "Let Godot arrive" (by thread 1) - here a message is sent to (and put in) Vladimir's inbox
  • "Vladimir is waiting" (by thread 11) - Vladimir enters the "receive", pushes the message handler on the stack, and goes into thread wait
  • "Vladimir saw Godot arrive!" (by thread 11) - the actor runtime (the same thread) realises that there is a message in the inbox; it wakes up and runs through the pattern matching; it finds that the value Godot matches the case Godot and runs the corresponding code.
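The point that a message can sit in the inbox before the receiver has even started can be forced in a small sketch. This is not the actor library; it assumes a plain blocking queue as a stand-in for Vladimir's inbox, and AsyncSendSketch is a made-up name. The message is deliberately put in the inbox before the receiving thread is started, and it is still picked up.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

object AsyncSendSketch {
  case object Godot

  def perform(): List[String] = {
    val inbox = new LinkedBlockingQueue[Any]() // hypothetical stand-in for Vladimir's inbox
    val log = ListBuffer.empty[String]         // written before start and by the thread, read after join

    val vladimir = new Thread(new Runnable {
      def run(): Unit = {
        log += "Vladimir is waiting"
        inbox.take() match { // returns at once, the message is already there
          case Godot => log += "Vladimir saw Godot arrive!"
          case other => log += ("Vladimir ignored " + other)
        }
      }
    })

    log += "Let Godot arrive"
    inbox.put(Godot) // "send": returns immediately, nobody is listening yet
    vladimir.start() // only now does the receiver begin its script
    vladimir.join()
    log.toList
  }

  def main(args: Array[String]): Unit = perform().foreach(println)
}
```

In a real run the order between sender and receiver is up to the scheduler; the sketch just pins down one of the possible orders.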

A drawback of this Vladimir design is that each actor needs to have its own thread. The thread will be in a wait state when there is no message processing going on, so no CPU will be wasted. Still, the number of threads a system can handle is quite limited, so dedicating threads to waiting actors is pretty wasteful. However, that is a letter in its own right.

In contrast to Vladimir, who is on his feet the entire play, we have the other character, Estragon. As I mentioned, Estragon is more laid back; he sits down and occasionally takes a nap. We will see that he is a pretty good picture of the other actor execution model, the event-based one. But that will also need to be the subject of another letter. And then we want to cover Akka as well ...

Nevertheless, the thread-based actor model that Vladimir represents is a pretty good starting point for understanding actors, message passing, and message-handling execution.

Yours

Dan

ps I have actually made some measurements to see how limited the thread-based execution model actually is.