Wednesday, November 14, 2012

Using Actors with Qt

Actor programming is nice and makes our lives easier, but at some point, we have to display the output of our program to the user. Not everything is a command line tool. Hence, we have to use some sort of GUI library, which raises the question how we pass messages from actors - safely! - to the GUI at runtime. Well, what if we could send an ordinary message to the GUI as if it's an actor? What if the GUI can send us ordinary messages whenever the user pushes some buttons? Briefly speaking, can we treat GUI elements as actors? Luckily, the answer is yes! We have to write some gluecode, but in fact, you can treat everything as an actor with libcppa.

Getting Started

In this article, we will focus on the Qt library, as it is the open source GUI library of choice for C++ developers. In the next article, we will have a look at the general concept of actor companions. An actor companion is an actor that co-exists with another object. In our case, this object is a QWidget-base class. The companion is used to receive and send messages, but it does not have any behavior itself. All it takes to enable a class to have such a companion is to use the actor_widget_mixin.

The following class implements a simple group-based chat widget with a QTextEdit for text output and a QLineEdit for user input (chat messages). The full source code can be found in the examples folder in the Git repository.
#include <QWidget>
#include "cppa/qtsupport/actor_widget_mixin.hpp"

class ChatWidget : public cppa::actor_widget_mixin<QWidget> {

  Q_OBJECT

  typedef cppa::actor_widget_mixin<QWidget> super;

  // ...

 public:

  ChatWidget(QWidget* parent = nullptr, Qt::WindowFlags f = 0);

  // ...

 private:

  std::string m_name;
  cppa::group_ptr m_chatroom;

};
The mixin adds the following member functions to ChatWidget:
  • actor_ptr as_actor()
    returns the companion of this widget
  • set_message_handler
    sets the partial function for message processing

Handle Messages to the Widget

In our example, the widget should handle 'join', 'setName', and 'quit' messages as well as display chat messages (received as std::string). We encode our message handling directly into the constructor of ChatWidget:
ChatWidget::ChatWidget(QWidget* parent, Qt::WindowFlags f) : super(parent, f) {
  set_message_handler (
    on(atom("join"), arg_match) >> [=](const group_ptr& what) {
      if (m_chatroom) {
        send(m_chatroom, m_name + " has left the chatroom");
        self->leave(m_chatroom);
      }
      self->join(what);
      print(("*** joined " + to_string(what)).c_str());
      m_chatroom = what;
      send(what, m_name + " has entered the chatroom");
    },
    on(atom("setName"), arg_match) >> [=](string& name) {
      send(m_chatroom, m_name + " is now known as " + name);
      m_name = std::move(name);
      print("*** changed name to "
            + QString::fromUtf8(m_name.c_str()));
    },
    on(atom("quit")) >> [=] {
      close(); // close widget
    },
    on<string>() >> [=](const string& txt) {
      // don't print own messages
      if (self != self->last_sender()) {
        print(QString::fromUtf8(txt.c_str()));
      }
    }
  );
}

Pitfalls & Things to Know

The code is pretty straight forward if you are already familiar with libcppa, but there is one important thing to note: Unhandled messages are lost. Although the widget is able to handle libcppa messages now, it does not have a mailbox. Messages are delivered to the widget as QEvent objects that are disposed afterwards.

The second thing to note is that you should never use self outside the message handler. This includes using functions such as send, because self is internally used to determine the sender of a message. The reason to this limitation is that libcppa's self "pointer" is thread-local. Using self would therefore convert the Qt thread your widget runs in to an actor, but it wouldn't address your widget's companion.

To send a message from outside of your handler, you have to tell libcppa who is the sender of the message by hand:
send_as(as_actor(), m_chatroom, "hello world!");


Have fun!

Friday, October 26, 2012

Version 0.5 released

Version 0.5 of libcppa has just been released. This release brings Log4j-like logfiles to libcppa developers (must be enabled at compile time), support for user-defined communication protocols, and actor companions.

Each class using the actor_companion_mixin has such an actor companion, which provides an easy way for active non-actor objects, i.e., objects with an own thread or event loop, to send messages to/as actor. The default use case for this feature is to treat GUI elements - widgets - as actors. A more specific mixin for Qt widgets is on its way.

By the way, libcppa now has more than a thousand commits on GitHub! :-)

Wednesday, August 22, 2012

Version 0.4 released

The main feature of version 0.4 is a streamlined and bugfixed network layer. Of course, there are some new features as well:
  • Support for User-Defined Network Layer
    The functions publish and remote_actor were overloaded to allow user-defined network layers such as OpenSSL; see this mailing list discussion and the doxygen documentation of util::input_stream, util::output_stream and util::acceptor for more details.
  • Shutdown Function
    The new shutdown() function closes all network connections, stops the scheduler and deletes all of libcppa's singletons. It is strongly recommended to call this function before returning from main(), especially if you are connected to remote actors.
  • Syntactic Sugar for Synchronous Messaging
    Synchronous message handling using futures is flexible but sometimes too verbose.
    auto future = sync_send(...);
    handle_response(future, ...);  // event-based API
    receive_response(future, ...); // blocking API
    Version 0.4 provides some syntactic sugar to make your code more compact. Whenever you send a message and immediately wait for the response, you can write the following instead.
    sync_send(...).then(...);  // event-based API
    sync_send(...).await(...); // blocking API
    Furthermore, there is a feature request for a continuation-passing style API to enable developers to easily encode "send X then receive Y then send Z" message flows (see Issue 58 on GitHub).
  • Local Groups & Remote Actors
    Local groups, as returned by calling group::get("local", ...); or group::anonymous(), are now not-so-local. It is possible to send a local group to a remote actor and let the remote actor join the group. Whenever an actor sends a message to the group, the message is send back to the owning process if needed and forwarded to all subscribers from there, including remote actors. This approach certainly does not scale for largely distributed groups, since it is N-times unicast*. However, it paves to path for more use cases of "local" groups and we are working on scalable group communication as well.

    * The N in this case is the number of hosts and not the number of remote actors.

Friday, August 10, 2012

New Functions in 0.3.3

Among some bugfixes, version 0.3.3 also includes a few new functions to add 'missing' features:
  • Forwarding of Messages
    libcppa lacked an easy and transparent way to forward messages. The new function forward_to finally adds this functionality. Furthermore, forwarding a synchronous messages is not possible without this function. Read more about forwarding in Section 5.4 of the manual.
  • Messaging with Tuples
    Matthias Vallentin pointed out that the API was somewhat inconsistent, since it did not provide functions to use a tuple as response message. We have added the following functions in version 0.3.3 to treat tuples as first-class citizen: send_tuple, sync_send_tuple, reply_tuple, delayed_send_tuple and delayed_reply_tuple.
  • Manual
    The manual is now included to the source distribution as manual.pdf and states the libcppa version.
  • Doxygen
    CMake checks whether doxygen is available on your system and adds an optional "doc" target to the Makefile. You can create your own local version of the doxygen documentation by running "make doc". Open the file html/index.html afterwards.

Thursday, July 26, 2012

libcppa on Mountain Lion

I have just upgraded my Mac and I have good news for all Apple users out there!

If you have switched to Mountain Lion or have plans to do so, you will find clang++ in version 4.0 after installing XCode. Hence, there is finally no need to get a recent compiler from an external source. libcppa 0.3 compiles and runs like a charm after installing CMake and Boost from MacPorts.

Wednesday, July 25, 2012

Version 0.3 Released

I'm glad to announce libcppa 0.3. This release has two major improvements.

Synchronous Messages

Actors can send synchronous response messages by using the function sync_send. This function returns a future to the response that can be received by using either the blocking receive_response or the event-based handle_response. Please read the section about synchronous communication in the manual for further details.

Configure Script

Thanks to Matthias Vallentin (a.k.a. mavam), libcppa has a much simpler build process now. The new configure script hides all CMake details behind a nice and clean interface.

Minor Improvements

  • Context-switching can be disabled by using the --disable-context-switching configure option for platforms without Boost.Context support (Issue 24).
  • The function tuple_cast does no longer require an additional header include (Issue 38).
  • The new "cppa_fwd.hpp" header provides forward declarations for heavily used data structures, such as actor_ptr (Issue 36).
  • become() no longer accepts pointers to avoid potential misuses and resulting memory leaks.

Friday, June 29, 2012

We Have a Beta! Read the Manual!

After months of development, I've just merged the development branch into the master branch. Get the V0.2.0 tag from Github or update your local working copy.

What's new?

  • Improved performance! (new benchark results will follow)
  • CMake instead of Automake
  • A user manual (PDF)
  • A stable API*


* I know libcppa had a quite volatile API in the past. However, this is the first official release and libcppa is no longer released as experimental version. You will seldom see code-breaking changes from now on and in major updates only! This is the perfect time to get started with libcppa. :)

That said, version 0.2 had a few changes. Here is a list of "quick-fixes" to get your code running again:
  • become_void() => quit()
  • future_send() => delayed_send()
  • stacked_event_based_actor is dead and gone; just use event_based_actor and the new policy-based API of become/unbecome (please read the manual for further details)
  • fsm_actor => sb_actor (read: "State-Based Actor"); because "real", transition-oriented FSM actors are something I am thinking about as a possible feature in a future release
  • spawn(new T(...)) => spawn<T>(...)


I will update all blog posts in the near future, so that all code examples will compile and run with version 0.2.

Have fun!

Saturday, May 12, 2012

A Little Bit of History and Why Restrictions Breed Creativity

If you think about the (original) actor model, you'll find that it's defined by what you are not allowed to do. You are not allowed to share state. You are not allowed to use everything else but message passing for communication. You are not allowed to randomly change your state, because you shall change your state only in response to a received message. If you think about functional programming, you'll find the exact same mindset. Functional programming is all about immutability (read: "you are not allowed to change objects") and prohibition of side-effects (read: "you are not allowed to manipulate your arguments or some global state"). There are good reasons for such restrictions, but let's talk about Erlang first.

If you are familiar with Erlang, you've probably noticed that the word "actor" is not used at all. Neither in function names nor in its documentation. In fact, the dissertation of Joe Armstrong (the inventor of Erlang) doesn't even include the word "actor". Why? Erlang is always referred to as the reference implementation of the actor model, isn't it?

Well, Erlang is based on a simple observation. In the "old days", people started implementing operation systems able to execute multiple programs in parallel. Mostly because punched cards and batch processing don't make fun. But this led to trouble. Serious trouble. In a naive mutlitasking-OS, you are limited by the amount of freedom you (and other developers) have, because if you are free to write everywhere you want, you can easily corrupt someone else's program. That's why we need an MMU in our computer. It's impossible to write a program if you know that someone else might manipulate your state (memory) at any point in time. So, the OS organized running programs into processes that do not share state and the world was sane again. You can start a program ten times and each of it will have a unique state. Pipes were added to allow processes to communicate. A process also can create a new process using fork. But then, people wanted concurrent execution of a single program, e.g., to keep GUI's responsive while doing background work. Threads were added, because fork is expensive and communicating via a pipe is complicated ... and the concurrency nightmare begun.

Erlang had the simple idea to fix the real issue here: processes are too expensive in modern OS's and communication between processes is too complicated. Do you know why Erlang does not have a threading library? Because no language should! Programs are written by humans and humans cannot think concurrently. Erlang's VM provides lightweight processes with a simple way to exchange messages. And it abstracts away all the dirty details. Plus, message passing is network transparent.

Message passing is something people can imagine. It is something people can reason about. Abstraction is always the answer in computer science. And abstraction means to build a simple model of something inherently complicated. Small systems communicating via message passing is something we can reason about. We can build large systems by "plugging" small systems together and we are still able to handle it. But hundreds of thousands of objects floating around, sharing state and run in parallel is hard to comprehend. To quote Rich Hickey (have a look at this channel9 video if you don't know him): "Mutable stateful objects are the new spaghetti code".

By not sharing memory, some problems just disappear. Isolation is a restriction, but it allows you to focus on your problems and you can use all your creativity and skill to do this. Threads are broken by design. Some bright minds are trying to fix them, but I think threads are best avoided. Unfortunately, we cannot "undo" threads and we cannot go back in time to implement OS's more suitable to write concurrent software. However, we can treat "std::thread" (and friends) the way we treat "goto". It's in the language/STL, but you should not use it in production code. Well, maybe someone implements something that's useful and safe on top of it (libcppa for example).

But let's get back to actors. Karl Hewitt et al. published an article about isolated computational entities, "actors", in the year 1973 (btw. there's an interesting video featuring Hewitt on channel9). It's a theoretically point of view to answer the question "what is the minimum set of axioms we need to describe concurrency?" Erlang came from the opposite direction. They said "writing concurrent, fault-tolerant software in traditional programming models is extremely difficult, how can we provide a better model?" Interestingly enough, Erlang developers came up with a programming paradigm that's in fact based on the axioms of the actor model. So to speak, the wheel was invented twice. But Armstrong didn't came up with a remarkable name for the programming paradigm while Hewitt did.

Monday, April 16, 2012

Guards! Guards!

(no, this post is not about the Discworld novel)

Guards are a new feature in libcppa. More precisely, it's an adaption of Erlangs guard sequences for patterns. A guard is an optional expression that evaluates to either true or false for a given input (message). Guards can be used to constrain a given match statement by using placeholders (_x1 - _x9) as shown in the example below.
using namespace cppa;
using namespace cppa::placeholders;
receive_loop(
  on<int>().when(_x1 % 2 == 0) >> []() {
    // int is even
  },
  on<int>() >> []() {
    // int is odd
  }
);
Guard expressions are a lazy evaluation technique (somewhat like boost lambdas). The placeholder _x1 is substituted with the first value of an incoming message. You can use all binary comparison and arithmetic operators as well as "&&" and "||". In addition, there are two functions designed to be used in guard expressions: gref and gcall. The function gref creates a reference wrapper. It's similar to std::ref but it is always const and 'lazy'. A few examples to illustrate some pitfalls:
int ival = 42;
receive(
  on<int>().when(ival == _x1) // (1) ok, matches if _x1 == 42
  on<int>().when(gref(ival) == _x1) // (2) ok, matches if _x1 == ival
  on<int>().when(std::ref(ival) == _x1) // (3) ok, because of placeholder
  on<anything>().when(gref(ival) == 42) // (4) ok, matches everything as long as ival == 42
  on<anything>().when(std::ref(ival) == 42) // (5) compiler error
);
The statement std::ref(ival) == 42 is evaluated immediately and returns a boolean, whereas gref(ival) == 42 creates a guard expression. Thus, you should always use gref instead of std::ref to avoid subtle errors.

By the way, gref is a great way to reduce verbosity of receive loops:

bool done = false;
do_receive(
  // ...
  on<atom("shutdown")>() >> [&]() {
    done = true;
  }
)
.until(gref(done));
//equal to: .until([&]() { return done; })


The second function, gcall, encapsulates a function call. It's usage is similar to std::bind, but there is also a short version for unary functions: gcall(..., _x1) is equal to _x1(...).
auto vec_sorted = [](std::vector<int> const& vec) {
  return std::is_sorted(vec.begin(), vec.end());
};
receive(
  on<std::vector<int>>().when(gcall(vec_sorted, _x1)) // equal to:
  on<std::vector<int>>().when(_x1(vec_sorted))) // ...
);
Placeholders provide some more convenience member functions besides operator(). A few code snippets:
_x1.starts_with("hello")
_x1.size() > 10
_x1.in({"abc", "def"})
_x1.not_in({0, 10, 100})
_x1.empty()
_x1.not_empty()
_x1.front() == 42
A final note: you don't have to check if a container is empty before calling _x1.front(), because front() returns an option for a reference to the first element.

Have fun!

Friday, March 9, 2012

A Few Words About Theron

Theron is currently the only existing actor library for C++ besides libcppa. It implements event-based message processing without mailbox, using registered member functions as callbacks. I wanted to include Theron to my three benchmarks, but I had some trouble to get Theron running on Linux. I've compiled Theron in release mode, ran its PingPong benchmark ... and got a segfault. GDB pointed me to the DefaultAllocator implementation and I could run the benchmarks after replacing the memory management with plain malloc/free. Thus, the shown results might be some percentages better with a fixed memory management. However, I was unable to get results for the Actor Creation Overhead benchmark because Theron crashed for more than 210 actors.

Theron has two ways of sending a message to an actor, because it distincts between references and addresses. The push method can be used only if one has a reference to an actor. This is usually only the case for the creator of an actor. The send method uses addresses and is the general way to send messages.



The results for the Mixed Scenario benchmark are as follows.



Theron yields good results on two and four cores. However, the more concurrency we add, the more time Theron needs in the mixed scenario. I don't know about Theron's internals, but this is a common behavior of mutex-based software in many-core systems and cannot be explained by the "missing" memory management.

I used Theron in version 3.03. As always, the benchmarks ran on a virtual machine with Linux using 2 to 12 cores of the host system comprised of two hexa-core Intel® Xeon® processors with 2.27GHz. All values are the average of five runs.

The sources can be found on github in the benchmark folder (theron_mailbox_performance.cpp and theron_mixed_case.cpp).

Thursday, March 8, 2012

Mailbox Part 2

It's been a while since I've posted Mailbox Part 1 and I promised a benchmark for a 1:N communication scenario for up to 12 cores. So here it is. This benchmark compares libcppa to Erlang and Scala with its two standard library implementations and Akka.

The benchmark uses 20 threads sending 1,000,000 messages each, except for Erlang which does not have a threading library. In Erlang, I spawned 20 actors instead. The minimal runtime of this benchmark is the time the receiving actor needs to process 20,000,000 messages and the overhead of passing the messages to the mailbox. More hardware concurrency leads to higher synchronization between the sending threads, since the mailbox acts as a shared resource.



Both libcppa implementations show similar performance to Scala (receive) on two cores but have a faster increasing curve. The message passing implementation of Erlang does not scale well for this use case. The more concurrency we add, the more time the Erlang program needs, up to an average of 600 seconds on 12 cores. The results are clipped for visibility purposes in the graph. The increase in runtime for libcppa is similar to the increase seen in the Actor Creation Overhead benchmark and is caused by the scheduler (the cached stack algorithm scales very well and is not a limiting factor here). The overhead of stack allocation is negligible in this use case. Thus, the run time of both libcppa implementations is almost identical.

The benchmarks ran on a virtual machine with Linux using 2 to 12 cores of the host system comprised of two hexa-core Intel® Xeon® processors with 2.27GHz. All values are the average of five runs.

The sources can be found on github (MailboxPerformance.scala, mailbox_performance.erl and mailbox_performance.cpp).

Tuesday, March 6, 2012

RIP invoke_rules

The class invoke_rules was among the first classes of libcppa. In fact, it received a lot of refactoring even before the first commit on github. However, it's finally gone. If your code fails to compile with the current version, this is how to fix your code:
invoke_rules → partial_function
timed_invoke_rules → behavior
The class invoke_rules had too much changes in the past and its name isn't very well chosen. In fact, the implemented behavior of it already was identical to a partial function. But it had a brother called timed_invoke_rules that was a partial function with a timeout. That's pretty much the definition of an actor's behavior, isn't it? It's an old remains from the time I've implemented on() and after(). The new partial_function/behavior interface is straightforward and much more intuitive.

Monday, February 13, 2012

Actor Creation Overhead

libcppa provides two actor implementations: a context switching and an event-based implementation.

The context-switching implementation is easier to use from a user's point of view. One has to write less code and receives can be nested. But there is a downside to this approach: each actor allocates its own stack. As an example for a current mainstream system: Mac OS X Lion defines the two constants SIGSTKSZ = 131072 and MINSIGSTKSZ = 32768 in its system headers. SIGSTKSZ is the recommended stack size in bytes and MINSIGSTKSZ is the minimum allowed stack size in bytes. Assuming a system with 500,000 actors, one would require a memory usage of at least 15 GB of RAM for stack space only. This would rise up to 61 with the recommended stack size instead in use. This clearly does not scale well for large systems. The event-based implementation uses fewer system resources, allowing developers to use hundreds of thousands of actors. Creating an event-based actor is cheap and lightweight but you have to provide a class-based implementation. Furthermore, you cannot use receive() since this would block the calling worker thread. However, the behavior-based approach is slightly different to use but fairly easy to understand and use (see the Dining Philosophers example).

The following benchmark measures the overhead of actor creation. It recursively creates 219 (524,288) actors, as the following pseudo code illustrates.
spreading_actor(Parent):
  receive:
    {spread, 0} =>
      Parent ! {result, 1}
    {spread, N} =>
      spawn(spreading_actor, self)) ! {spread, N-1}
      spawn(spreading_actor, self)) ! {spread, N-1}
      receive:
        {result, X1} =>
          receive:
            {result, X2} =>
              Parent ! {result, X1+X2}

main():
  spawn(spreading_actor, self)) ! {spread, 19}
  receive:
    {result, Y} =>
      assert(2^19 == Y)

This measurement tests how lightweight actor implementations are. We did not test the thread-mapped actor implementation of Scala, because the JVM cannot handle half a million threads. And neither could a native application.


It is not surprising that Erlang yields the best performance, as its virtual machine was build to efficiently handle actors. Furthermore, we can see the same increase in runtime caused by more hardware concurrency for the event-based libcppa implementation as in our previous benchmark. However, the context-switching (stacked) implementation clearly falls short in this scenario. Please note that this benchmark used the minimal stack size to be able to create half a million actors. Per default, libcppa uses the recommended stack size! Consider using event-based actors whenever possible, especially in systems consisting of a large amount of concurrently running actors.

The benchmarks ran on a virtual machine with Linux using 2 to 12 cores of the host system comprised of two hexa-core Intel® Xeon® processors with 2.27GHz. All values are the average of five runs.

The sources can be found on github (ActorCreation.scala, actor_creation.erl and actor_creation.cpp).

Monday, February 6, 2012

libcppa vs. Erlang vs. Scala Performance (Mixed Scenario)

Remark


Please note that the results of the original post are heavily outdated. The graph below illustrates some newer benchmark results using Scala 2.10, Erlang 5.10.2, and libcppa 0.9. Rather than running the benchmark on a 12-core machine, we have used a 64-core machine (4 CPUs with 16 cores each). Furthermore, we have used a slightly different set of parameters: 100 rings, 50 actors each, initial token value of 1000, and 5 repetitions. We will publish a more throughout evaluation in the near future.



Original Post


This benchmark simulates a use case with a mixture of operations. The continuous creation and termination of actors is simulated along with a total of more than 50,000,000 messages sent between actors and some expensive calculations are included to account for numerical work load. The test program creates 20 rings of 50 actors each. A token with initial value of 10,000 is passed along the ring and decremented once per iteration. A client receiving a token always forwards it to the next client and finishes execution whenever the value of the token was 0. The following pseudo code illustrates the implemented algorithm.

chain_link(Next):
  receive:
    {token, N} =>
      next ! {token, N}
      if (N > 0) chain_link(Next)

worker(MessageCollector):
  receive:
    {calc, X} =>
      MessageCollector ! {result, prime_factorization(X)}

master(Worker, MessageCollector):
  5 times:
    Next = self
    49 times: Next = spawn(chain_link, Next)
    Next ! {token, 10000}
    Done = false
    while not Done:
      receive:
        {token, X} =>
          if (X > 0): Next ! {token, X-1}
          else: Done = true
  MessageCollector ! {master_done}

Each ring consists of 49 chain_link actors and one master. The master recreates the terminated actors five times. Each master spawns a total of 245 actors and the program spawns 20 master actors. Additionally, there is one message collector and one worker per master. A total of 4921 actors (20+(20∗245)+1) are created but no more than 1021 (20+20+(20∗49)+1) are running concurrently. The message collector waits until it receives 100 (20∗5) prime factorization results and a done message from each master. Prime factors are calculated to simulate some work load. The calculation took about two seconds on the tested hardware in our loop-based C++ implementation. Our tail recursive Scala implementation performed at the same speed, whereas Erlang needed almost seven seconds.




As expected, the thread-based Scala implementation yields the worst performance though the runtime increase for eight and more cores surprises. Akka is significantly faster than both standard library implementations of Scala. Erlang performs very well, given the fact that its prime factorization is more than three times slower. The very efficient scheduling of Erlang, which is the only implementation under test that performs preemptive scheduling, is best at utilizing hardware concurrency. The current overhead of the libcppa scheduler hinders better performance results. The overhead of stack allocation and context switching is about 10-20% in this benchmark for up to six cores where the scheduler is stretched to its limits.

This benchmark shows that libcppa is competitive and performs at comparable speed to well-tested and established actor model implementations. Nevertheless, the current scheduling algorithm is not able to utilize more than six cores efficiently by now.

The benchmarks ran on a virtual machine with Linux using 2 to 12 cores of the host system comprised of two hexa-core Intel® Xeon® processors with 2.27GHz. All values are the average of five runs.

The sources can be found on github (MixedCase.scala, mixed_case.erl and mixed_case.cpp).

Tuesday, January 24, 2012

Dining Philosophers

The Dining Philosophers Problem is a well-known exercise in computer science for concurrent systems. Recently, I've found an implementation for Akka that's an adaption of this algorithm of Dale Schumacher. I think it's a pretty nice example program to introduce libcppa's event-based actor implementation. The following source code is an adaption of the Akka example implementation.


Let's start with the straightforward chopstick implementation.
#include "cppa/cppa.hpp"
using std::chrono::seconds;
using namespace cppa;
// either taken by a philosopher or available
struct chopstick : sb_actor<chopstick> {

  behavior& init_state; // a reference to available
  behavior available;

  behavior taken_by(const actor_ptr& philos) {
    // create a behavior new on-the-fly
    return (
      on<atom("take"), actor_ptr>() >> [=](actor_ptr other) {
        send(other, atom("busy"), this);
      },
      on(atom("put"), philos) >> [=]() {
        become(&available);
      }
    );
  }

  chopstick() : init_state(available) {
    available = (
      on<atom("take"), actor_ptr>() >> [=](actor_ptr philos) {
        send(philos, atom("taken"), this);
        become(taken_by(philos));
      }
    );
  }

};
The class fsm_actor uses the Curiously Recurring Template Pattern to initialize the actor with the behavior stored in the member init_state. We don't really have an initial state. Thus, init_state is a reference to the behavior available. You also could inherit from event_based_actor and override the member function init() by hand. But using an fsm_actor is more convenient. You change the state/behavior of an actor by calling become. It takes either a pointer to a member or an rvalue. The member function taken_by creates a behavior "on-the-fly". You could achieve the same by using a member storing the 'owning' philosopher. But why use member variables if you don't have to? This solution is easier to understand than manipulating some internal state.

Btw: you should always use [=] in lambda expressions for event-based actors. This copies the this pointer into the lambda and you have access to all members. You should never use references, because become always immediately returns. If your lambda finally gets called, everything that was previously on the stack is gone! All your references are guaranteed to cause undefined behavior.


The implementation of philosopher is a little bit longer. Basically, we implement the following state diagram.
/*
 *                +-------------+  {(busy|taken), Y}
 *      /-------->|  thinking   |<------------------\
 *      |         +-------------+                   |
 *      |                |                          |
 *      |                | {eat}                    |
 *      |                |                          |
 *      |                V                          |
 *      |         +-------------+ {busy, X}  +-------------+
 *      |         |   hungry    |----------->|   denied    |
 *      |         +-------------+            +-------------+
 *      |                |
 *      |                | {taken, X}
 *      |                |
 *      |                V
 *      |         +-------------+
 *      |         | wait_for(Y) |
 *      |         +-------------+
 *      |           |    |
 *      | {busy, Y} |    | {taken, Y}
 *      \-----------/    |
 *      |                V
 *      | {think} +-------------+
 *      \---------|   eating    |
 *                +-------------+
 *
 *
 * [ X = left  => Y = right ]
 * [ X = right => Y = left  ]
 */
This is a simplification of the original diagram. A philosopher becomes hungry after receiving an atom("eat") and then tries to obtain its left and right chopstick. A philosopher eats if it obtained both chopsticks, otherwise it will think again.
struct philosopher : sb_actor<philosopher> {

  std::string name; // the name of this philosopher
  actor_ptr left;   // left chopstick
  actor_ptr right;  // right chopstick

  // note: we have to define all behaviors in the constructor because
  //       non-static member initialization are not (yet) implemented in GCC
  behavior thinking;
  behavior hungry;
  behavior denied;
  behavior eating;
  behavior init_state;
A philosopher has a name, a left and a right chopstick and a bunch of possible behaviors. Hopefully, GCC has non-static member initialization in the next version. For now, we have to define all behavior in the constructor, except for wait_for, which is realized as a function similar to taken_by of chopstick.
  // wait for second chopstick
  behavior waiting_for(const actor_ptr& what) {
    return (
      on(atom("taken"), what) >> [=]() {
        // create message in memory to avoid interleaved
        // messages on the terminal
        std::ostringstream oss;
        oss << name
            << " has picked up chopsticks with IDs "
            << left->id()
            << " and "
            << right->id()
            << " and starts to eat\n";
        cout << oss.str();
        // eat some time
        delayed_send(this, seconds(5), atom("think"));
        become(&eating);
      },
      on(atom("busy"), what) >> [=]() {
        send((what == left) ? right : left, atom("put"), this);
        send(this, atom("eat"));
        become(&thinking);
      }
    );
  }
Our constructor defines all behaviors as well as the three member variables storing name and left and right and chopstick.
  philosopher(const std::string& n, const actor_ptr& l, const actor_ptr& r)
  : name(n), left(l), right(r) {
    // a philosopher that receives {eat} stops thinking and becomes hungry
    thinking = (
      on(atom("eat")) >> [=]() {
        become(&hungry);
        send(left, atom("take"), this);
        send(right, atom("take"), this);
      }
    );
    // wait for the first answer of a chopstick
    hungry = (
      on(atom("taken"), left) >> [=]() {
        become(waiting_for(right));
      },
      on(atom("taken"), right) >> [=]() {
        become(waiting_for(left));
      },
      on<atom("busy"), actor_ptr>() >> [=]() {
        become(&denied);
      }
    );
    // philosopher was not able to obtain the first chopstick
    denied = (
      on<atom("taken"), actor_ptr>() >> [=](actor_ptr& ptr) {
        send(ptr, atom("put"), this);
        send(this, atom("eat"));
        become(&thinking);
      },
      on<atom("busy"), actor_ptr>() >> [=]() {
        send(this, atom("eat"));
        become(&thinking);
      }
    );
    // philosopher obtained both chopstick and eats (for five seconds)
    eating = (
      on(atom("think")) >> [=]() {
        send(left, atom("put"), this);
        send(right, atom("put"), this);
        delayed_send(this, seconds(5), atom("eat"));
        cout << (  name
             + " puts down his chopsticks and starts to think\n");
        become(&thinking);
      }
    );
    // philosophers start to think after receiving {think}
    init_state = (
      on(atom("think")) >> [=]() {
        cout << (name + " starts to think\n");
        delayed_send(this, seconds(5), atom("eat"));
        become(&thinking);
      }
    );
  }

};
The source code is a straightforward implementation of the state diagram. Our main starts five chopsticks and five philosophers. We use an anonymous group to send the initial {think} message to all philosophers at once. You could send five single messages as well.
int main(int, char**) {
  // create five chopsticks
  cout << "chopstick ids:";
  std::vector<actor_ptr> chopsticks;
  for (size_t i = 0; i < 5; ++i) {
    chopsticks.push_back(spawn<chopstick>());
    cout << " " << chopsticks.back()->id();
  }
  cout << endl;
  // a group to address all philosophers
  auto dinner_club = group::anonymous();
  // spawn five philosopher, each joining the Dinner Club
  std::vector<std::string> names = { "Plato", "Hume", "Kant",
                                     "Nietzsche", "Descartes" };
  for (size_t i = 0; i < 5; ++i) {
    spawn_in_group<philosopher>(dinner_club,
                                names[i],
                                chopsticks[i],
                                chopsticks[(i+1) % chopsticks.size()]);
  }
  // tell philosophers to start thinking
  send(dinner_club, atom("think"));
  // real philosophers are never done
  await_all_others_done();
  return 0;
}
The full source code can be found in the examples folder on github.

Have fun!

Sunday, January 22, 2012

Minor API Changes

I've removed some function from the cppa namespace in the latest version. If your code fails to compile after git pull, follow this instructions:
self() → self
trap_exit(...) → self->trap_exit(...)
last_received() → self->last_dequeued()
link(other) → self->link_to(other)
quit(reason) → self->quit(reason)

Wednesday, January 18, 2012

Opt for option!

In a perfect world, each function could safely assume all arguments are valid. But this is the real world and things go wrong all the time. Especially whenever a function needs to parse a user-defined input. A good example for such a function is X to_int(string const& str). What is the correct type of X?

Well, some might argue "int of course and the function should throw an exception on error!". I would not recommend it. Throwing an exception is really the very last thing you should do. An exception is a gigantic hammer that smashes your stack to smithereens. If your user didn't read the documentation and uses your function without a try block, the exception will kill the whole program. Furthermore, exceptions are slow. All major compilers are optimized to have few overhead for entering a try block. Throwing an exception, stack unwinding and catching are expensive. You should not throw an exception unless there is nothing else you can do.


Use a bool pointer. Some libraries, e.g., the Qt library, use a bool pointer as function argument. Honestly, I don't like this approach. It forces you do declare additional variables and always returns an object, even if the function shouldn't. It's ok for integers, but what if your function returns a vector or string? Creating empty objects is a waste of time.


Return a pair. Some STL functions return a pair with a boolean and the result. The boolean indicates whether the function was successful. Again, creating empty objects is a waste of time. Thus, this approach isn't very efficient. But that's not the real issue here:

auto x = to_int("try again");
if (x.first) do_something(x.second);
Is this code correct? The answer is "I don't know". Remember, you can assign a bool to an int and you can use integers in if-statements. You have to read the documentation of to_int to see if x.first is the bool or the int.


Return a pointer. Safety issues aside, you should use the stack for doing work and the heap for dynamically growing containers. Your stack is your friend. It is fast, automatically destroys variables as they go out of scope, and did I mention fast? Allocating small objects on the heap has a significant performance impact.


Return an option. If you're familiar with Haskell or Scala, you'll know Maybe or Option. In short, a function doesn't return a value. It returns maybe a value. If your string actually is an integer, the function returns an integer. Otherwise, it returns nothing. libcppa does have an option class. I guess you'll know what this code is supposed to do:
auto x = to_int("try again");
if (x) do_something(*x);
So, what is the type of x here? It is "option<int>". You can write if (x.valid()) do_something(x.get()); instead if you prefer a more verbose style. Option supports default values: do_something(x.get_or_else(0));. If you're a user of the boost library, maybe you'll know boost::optional. In general, I would recommend boost to everyone, but boost::optional is slow. Just have a look at the implementation. cppa::option uses a union to store the value. If the option is empty, the object in the union won't get constructed. You'll have a slight overhead of returning "empty memory" but you don't pay for creating empty objects.


This are some performance results for cppa::option compared to returning an int, returning a pair, and returning a boost::optional.
return type 100,000,000 values 100,000,000 empties
int 0.488s -
std::pair<bool, int> 2.418s 2.304s
cppa::option<int> 2.776s 1.598s
boost::optional<int> 7.419s 2.987s
The boost implementation clearly falls short. This is because the boost implementation doesn't use the stack. That's not because the boost developers don't know how to write efficient code. It's because unrestricted unions are a C++11 feature. cppa::option has a slight overhead compared to a pair for returning values but is even faster than a pair for returning empties, because the memory is uninitialized in this case. If you're already a user of libcppa: use option. If you're not (yet) a user: copy & paste the source code and use it. :)
Of course, it's C++11 only. Honestly, I really don't know why this isn't part of the STL. It should! It's general, fast, safe and really improves the readability of your source code.