C++ Actor Framework: January 2012

Tuesday, January 24, 2012

Dining Philosophers

The Dining Philosophers Problem is a well-known exercise in computer science for concurrent systems. Recently, I've found an implementation for Akka that's an adaption of this algorithm of Dale Schumacher. I think it's a pretty nice example program to introduce libcppa's event-based actor implementation. The following source code is an adaption of the Akka example implementation.

Let's start with the straightforward chopstick implementation.

#include "cppa/cppa.hpp"
using std::chrono::seconds;
using namespace cppa;
// either taken by a philosopher or available
struct chopstick : sb_actor<chopstick> {

  behavior& init_state; // a reference to available
  behavior available;

  behavior taken_by(const actor_ptr& philos) {
    // create a behavior new on-the-fly
    return (
      on<atom("take"), actor_ptr>() >> [=](actor_ptr other) {
        send(other, atom("busy"), this);
      },
      on(atom("put"), philos) >> [=]() {
        become(&available);
      }
    );
  }

  chopstick() : init_state(available) {
    available = (
      on<atom("take"), actor_ptr>() >> [=](actor_ptr philos) {
        send(philos, atom("taken"), this);
        become(taken_by(philos));
      }
    );
  }

};

The class fsm_actor uses the Curiously Recurring Template Pattern to initialize the actor with the behavior stored in the member init_state. We don't really have an initial state. Thus, init_state is a reference to the behavior available. You also could inherit from event_based_actor and override the member function init() by hand. But using an fsm_actor is more convenient. You change the state/behavior of an actor by calling become. It takes either a pointer to a member or an rvalue. The member function taken_by creates a behavior "on-the-fly". You could achieve the same by using a member storing the 'owning' philosopher. But why use member variables if you don't have to? This solution is easier to understand than manipulating some internal state.

Btw: you should always use [=] in lambda expressions for event-based actors. This copies the this pointer into the lambda and you have access to all members. You should never use references, because become always immediately returns. If your lambda finally gets called, everything that was previously on the stack is gone! All your references are guaranteed to cause undefined behavior.

The implementation of philosopher is a little bit longer. Basically, we implement the following state diagram.

/*
 *                +-------------+  {(busy|taken), Y}
 *      /-------->|  thinking   |<------------------\
 *      |         +-------------+                   |
 *      |                |                          |
 *      |                | {eat}                    |
 *      |                |                          |
 *      |                V                          |
 *      |         +-------------+ {busy, X}  +-------------+
 *      |         |   hungry    |----------->|   denied    |
 *      |         +-------------+            +-------------+
 *      |                |
 *      |                | {taken, X}
 *      |                |
 *      |                V
 *      |         +-------------+
 *      |         | wait_for(Y) |
 *      |         +-------------+
 *      |           |    |
 *      | {busy, Y} |    | {taken, Y}
 *      \-----------/    |
 *      |                V
 *      | {think} +-------------+
 *      \---------|   eating    |
 *                +-------------+
 *
 *
 * [ X = left  => Y = right ]
 * [ X = right => Y = left  ]
 */

This is a simplification of the original diagram. A philosopher becomes hungry after receiving an atom("eat") and then tries to obtain its left and right chopstick. A philosopher eats if it obtained both chopsticks, otherwise it will think again.

struct philosopher : sb_actor<philosopher> {

  std::string name; // the name of this philosopher
  actor_ptr left;   // left chopstick
  actor_ptr right;  // right chopstick

  // note: we have to define all behaviors in the constructor because
  //       non-static member initialization are not (yet) implemented in GCC
  behavior thinking;
  behavior hungry;
  behavior denied;
  behavior eating;
  behavior init_state;

A philosopher has a name, a left and a right chopstick and a bunch of possible behaviors. Hopefully, GCC has non-static member initialization in the next version. For now, we have to define all behavior in the constructor, except for wait_for, which is realized as a function similar to taken_by of chopstick.

  // wait for second chopstick
  behavior waiting_for(const actor_ptr& what) {
    return (
      on(atom("taken"), what) >> [=]() {
        // create message in memory to avoid interleaved
        // messages on the terminal
        std::ostringstream oss;
        oss << name
            << " has picked up chopsticks with IDs "
            << left->id()
            << " and "
            << right->id()
            << " and starts to eat\n";
        cout << oss.str();
        // eat some time
        delayed_send(this, seconds(5), atom("think"));
        become(&eating);
      },
      on(atom("busy"), what) >> [=]() {
        send((what == left) ? right : left, atom("put"), this);
        send(this, atom("eat"));
        become(&thinking);
      }
    );
  }

Our constructor defines all behaviors as well as the three member variables storing name and left and right and chopstick.

  philosopher(const std::string& n, const actor_ptr& l, const actor_ptr& r)
  : name(n), left(l), right(r) {
    // a philosopher that receives {eat} stops thinking and becomes hungry
    thinking = (
      on(atom("eat")) >> [=]() {
        become(&hungry);
        send(left, atom("take"), this);
        send(right, atom("take"), this);
      }
    );
    // wait for the first answer of a chopstick
    hungry = (
      on(atom("taken"), left) >> [=]() {
        become(waiting_for(right));
      },
      on(atom("taken"), right) >> [=]() {
        become(waiting_for(left));
      },
      on<atom("busy"), actor_ptr>() >> [=]() {
        become(&denied);
      }
    );
    // philosopher was not able to obtain the first chopstick
    denied = (
      on<atom("taken"), actor_ptr>() >> [=](actor_ptr& ptr) {
        send(ptr, atom("put"), this);
        send(this, atom("eat"));
        become(&thinking);
      },
      on<atom("busy"), actor_ptr>() >> [=]() {
        send(this, atom("eat"));
        become(&thinking);
      }
    );
    // philosopher obtained both chopstick and eats (for five seconds)
    eating = (
      on(atom("think")) >> [=]() {
        send(left, atom("put"), this);
        send(right, atom("put"), this);
        delayed_send(this, seconds(5), atom("eat"));
        cout << (  name
             + " puts down his chopsticks and starts to think\n");
        become(&thinking);
      }
    );
    // philosophers start to think after receiving {think}
    init_state = (
      on(atom("think")) >> [=]() {
        cout << (name + " starts to think\n");
        delayed_send(this, seconds(5), atom("eat"));
        become(&thinking);
      }
    );
  }

};

The source code is a straightforward implementation of the state diagram. Our main starts five chopsticks and five philosophers. We use an anonymous group to send the initial {think} message to all philosophers at once. You could send five single messages as well.

int main(int, char**) {
  // create five chopsticks
  cout << "chopstick ids:";
  std::vector<actor_ptr> chopsticks;
  for (size_t i = 0; i < 5; ++i) {
    chopsticks.push_back(spawn<chopstick>());
    cout << " " << chopsticks.back()->id();
  }
  cout << endl;
  // a group to address all philosophers
  auto dinner_club = group::anonymous();
  // spawn five philosopher, each joining the Dinner Club
  std::vector<std::string> names = { "Plato", "Hume", "Kant",
                                     "Nietzsche", "Descartes" };
  for (size_t i = 0; i < 5; ++i) {
    spawn_in_group<philosopher>(dinner_club,
                                names[i],
                                chopsticks[i],
                                chopsticks[(i+1) % chopsticks.size()]);
  }
  // tell philosophers to start thinking
  send(dinner_club, atom("think"));
  // real philosophers are never done
  await_all_others_done();
  return 0;
}

The full source code can be found in the examples folder on github.

Have fun!

Sunday, January 22, 2012

Minor API Changes

I've removed some function from the cppa namespace in the latest version. If your code fails to compile after git pull, follow this instructions:

self() → self
trap_exit(...) → self->trap_exit(...)
last_received() → self->last_dequeued()
link(other) → self->link_to(other)
quit(reason) → self->quit(reason)

Wednesday, January 18, 2012

Opt for `option`!

In a perfect world, each function could safely assume all arguments are valid. But this is the real world and things go wrong all the time. Especially whenever a function needs to parse a user-defined input. A good example for such a function is X to_int(string const& str). What is the correct type of X?

Well, some might argue "int of course and the function should throw an exception on error!". I would not recommend it. Throwing an exception is really the very last thing you should do. An exception is a gigantic hammer that smashes your stack to smithereens. If your user didn't read the documentation and uses your function without a try block, the exception will kill the whole program. Furthermore, exceptions are slow. All major compilers are optimized to have few overhead for entering a try block. Throwing an exception, stack unwinding and catching are expensive. You should not throw an exception unless there is nothing else you can do.

Use a bool pointer. Some libraries, e.g., the Qt library, use a bool pointer as function argument. Honestly, I don't like this approach. It forces you do declare additional variables and always returns an object, even if the function shouldn't. It's ok for integers, but what if your function returns a vector or string? Creating empty objects is a waste of time.

Return a pair. Some STL functions return a pair with a boolean and the result. The boolean indicates whether the function was successful. Again, creating empty objects is a waste of time. Thus, this approach isn't very efficient. But that's not the real issue here:

auto x = to_int("try again");
if (x.first) do_something(x.second);

Is this code correct? The answer is "I don't know". Remember, you can assign a bool to an int and you can use integers in if-statements. You have to read the documentation of to_int to see if x.first is the bool or the int.

Return a pointer. Safety issues aside, you should use the stack for doing work and the heap for dynamically growing containers. Your stack is your friend. It is fast, automatically destroys variables as they go out of scope, and did I mention fast? Allocating small objects on the heap has a significant performance impact.

Return an option. If you're familiar with Haskell or Scala, you'll know Maybe or Option. In short, a function doesn't return a value. It returns maybe a value. If your string actually is an integer, the function returns an integer. Otherwise, it returns nothing. libcppa does have an option class. I guess you'll know what this code is supposed to do:

auto x = to_int("try again");
if (x) do_something(*x);

So, what is the type of x here? It is "option<int>". You can write if (x.valid()) do_something(x.get()); instead if you prefer a more verbose style. Option supports default values: do_something(x.get_or_else(0));. If you're a user of the boost library, maybe you'll know boost::optional. In general, I would recommend boost to everyone, but boost::optional is slow. Just have a look at the implementation. cppa::option uses a union to store the value. If the option is empty, the object in the union won't get constructed. You'll have a slight overhead of returning "empty memory" but you don't pay for creating empty objects.

This are some performance results for cppa::option compared to returning an int, returning a pair, and returning a boost::optional.

return type	100,000,000 values	100,000,000 empties
int	0.488s	-
std::pair<bool, int>	2.418s	2.304s
cppa::option<int>	2.776s	1.598s
boost::optional<int>	7.419s	2.987s

The boost implementation clearly falls short. This is because the boost implementation doesn't use the stack. That's not because the boost developers don't know how to write efficient code. It's because unrestricted unions are a C++11 feature. cppa::option has a slight overhead compared to a pair for returning values but is even faster than a pair for returning empties, because the memory is uninitialized in this case. If you're already a user of libcppa: use option. If you're not (yet) a user: copy & paste the source code and use it. :)
Of course, it's C++11 only. Honestly, I really don't know why this isn't part of the STL. It should! It's general, fast, safe and really improves the readability of your source code.