The vlingo/platform Architecture: Part 1

Several people have requested a document describing the vlingo/platform architecture, so here it is. This article is kept relatively brief, in part to emphasize the simplicity of the vlingo/platform. It’s just not complicated or difficult to describe. The other motivation for brevity is that I don’t know exactly what architects and developers are looking for in such a document. Thus, I am open to feedback and input on what else readers need.


The vlingo/platform is built fully on a reactive and message-driven foundation. The bedrock of the platform is, as described previously, vlingo/actors. This foundation of the platform implements the Actor Model of computation. The major abstractions of this Actor Model toolkit are:

  • World: This is the overall system in which actors live and operate. A World can have a number of Stages in which the lifecycles of a subset of live actors are managed. When the World is started, actors can be produced. When the World terminates, all Stages and Actors are terminated/stopped.
  • Stage: Every World has at least one Stage, known as the default. The Stage is where actors are managed and within which they play or execute. There may be multiple Stages, each with several or many actors (even millions) under its management. Each Stage has an internal Directory within which live actors are kept.
  • Actors: Each actor is an object, but one that reacts to incoming asynchronous messages and sends outgoing messages asynchronously. Each actor is assigned a Mailbox, and a Mailbox delivers messages through a Dispatcher. When the actor receives a message, it performs some business behavior internally, and then completes. All actors are type safe in that their behaviors are defined by Java method signatures rather than arbitrary object types. Every actor may implement any (practical) number of behavioral interfaces, known as actor protocols.
  • Supervision: Actors are supervised in order to deal with failure. When an actor experiences an exception while handling a message, the exception is caught and relayed to the actor’s supervisor as a message. When received by the supervisor, the exceptional message is interpreted to determine the appropriate step to correct the actor’s problem. The corrective action can be one of the following: resume, restart, stop, or escalate. In the case of supervision escalation, the exceptional message is relayed to this supervisor’s supervisor, for it to take some action. There are four kinds of supervision: direct parent, default public root, override of public root, and registered common supervisors for specific actor protocols (behavioral interfaces).
  • Scheduler: Each Stage has a scheduler that can be used to schedule future tasks by time intervals. An actor determines the timeframe that is needed, and each time the interval is reached, the target actor receives an interval signal message indicating such. The receiving actor can then execute some necessary behavior in response. The actor that creates the scheduled task need not be the target actor of the interval signal message.
  • Logging: Logging capabilities are provided to every actor, and different loggers may be assigned to different actors.
  • Plugins: The vlingo/platform in general, and vlingo/actors specifically, support a plugin architecture. New kinds of plugins can be created at any time to extend the features of vlingo/actors, and any other platform component. In particular, mailboxes and dispatchers, supervision, and logging can be extended via plugins.
  • Testkit: There is a very simple testkit that accompanies vlingo/actors, making it quite easy to test individual and collaborating actors for adherence to protocol and correctness.
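To make the Actors description above more concrete, here is a minimal sketch in plain Java of a type-safe actor protocol backed by a mailbox. It deliberately does not use the vlingo/actors API. The names Counter, CounterActor, CounterProxy, and ActorSketch are hypothetical, and a single-threaded executor is assumed as a stand-in for a Mailbox and its Dispatcher.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical actor protocol: behaviors are ordinary Java method signatures.
interface Counter {
    void increment();
    void reportTo(CompletableFuture<Integer> result);
}

// The actor itself. Its state is only ever touched by the mailbox thread.
final class CounterActor implements Counter {
    private int count = 0;

    @Override public void increment() { count++; }

    @Override public void reportTo(CompletableFuture<Integer> result) {
        result.complete(count);
    }
}

// Assumed stand-in for a Mailbox plus Dispatcher: each protocol call is
// queued as a message and delivered later, one at a time, in order.
final class CounterProxy implements Counter, AutoCloseable {
    private final Counter actor;
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    CounterProxy(Counter actor) { this.actor = actor; }

    @Override public void increment() { mailbox.execute(actor::increment); }

    @Override public void reportTo(CompletableFuture<Integer> result) {
        mailbox.execute(() -> actor.reportTo(result));
    }

    // Stops accepting new messages; already queued messages still run.
    @Override public void close() { mailbox.shutdown(); }
}

final class ActorSketch {
    // Sends the given number of increment messages, then asks for the count.
    static int run(int increments) {
        try (CounterProxy counter = new CounterProxy(new CounterActor())) {
            for (int i = 0; i < increments; i++) {
                counter.increment();
            }
            CompletableFuture<Integer> result = new CompletableFuture<>();
            counter.reportTo(result);
            return result.join(); // wait for the actor's asynchronous answer
        }
    }

    public static void main(String[] args) {
        System.out.println("count = " + run(3)); // prints "count = 3"
    }
}
```

The caller only ever sees the Counter interface, so it cannot send a message the actor does not understand. That is the sense in which a protocol expressed as method signatures is type safe.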

The following is a vlingo/actors architecture diagram.

The components seen in the above diagram can be traced back to the above names and descriptions. In the default Stage notice that there are three special actors, one that is bright yellow with a #, one that is bright yellow with a *, and one that is red with an X. Respectively, these are:

# The private root actor, which is the parent of the public root actor and the dead letters actor

* The public root actor, which is the default parent and supervisor if no others are specified

X The dead letters actor, which receives actor messages that could not be delivered
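The supervision directives described earlier (resume, restart, stop, escalate), with the public root as the last supervisor in the chain, can be sketched roughly as follows. This is plain Java, not the vlingo/actors supervision API. The Directive and Supervisor types are hypothetical, and the policies chosen here (the parent resumes on IllegalArgumentException and escalates everything else, the root restarts) are assumptions for illustration only, not documented vlingo defaults.

```java
import java.util.function.Function;

// The four corrective actions named in the text.
enum Directive { RESUME, RESTART, STOP, ESCALATE }

// Hypothetical supervisor: maps a failure to a directive and may have a parent.
final class Supervisor {
    final String name;
    final Supervisor parent; // null at the top of the chain
    final Function<Throwable, Directive> strategy;

    Supervisor(String name, Supervisor parent, Function<Throwable, Directive> strategy) {
        this.name = name;
        this.parent = parent;
        this.strategy = strategy;
    }

    // Resolves a failure, relaying it upward for as long as the answer is ESCALATE.
    Directive resolve(Throwable failure) {
        Directive directive = strategy.apply(failure);
        if (directive != Directive.ESCALATE) {
            return directive;
        }
        // Assumption: with nobody left to escalate to, the failed actor is stopped.
        return parent == null ? Directive.STOP : parent.resolve(failure);
    }
}

final class SupervisionSketch {
    static Supervisor chain() {
        // Assumed policy for the root of this sketch: always restart.
        Supervisor publicRoot = new Supervisor("public-root", null, f -> Directive.RESTART);
        // Assumed policy for a direct parent: tolerate bad arguments, escalate the rest.
        return new Supervisor("parent", publicRoot,
            f -> f instanceof IllegalArgumentException ? Directive.RESUME : Directive.ESCALATE);
    }

    public static void main(String[] args) {
        Supervisor parent = chain();
        System.out.println(parent.resolve(new IllegalArgumentException("bad input"))); // RESUME
        System.out.println(parent.resolve(new IllegalStateException("corrupt")));      // RESTART
    }
}
```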

I next discuss vlingo/cluster, and then move on to the entire platform.


The vlingo/cluster is a key component that sits on top of vlingo/actors to support the development of scalable and fault-tolerant tools and applications. In other words, the additional tools that we are developing to build out the vlingo/platform will almost always be built on top of vlingo/cluster. Additionally, we intend for you to implement and deploy your services/applications in clusters.

Generally a cluster will be composed of an odd number of multiple nodes (not just one, but for example, 3, 5, 21, or 127). The reason for choosing an odd number of nodes is to make it possible to determine whether there is a quorum of nodes (totalNodes / 2 + 1, using integer division) that can form a healthy cluster. Choosing an even number of nodes works, but the extra node neither improves the quorum determination when a node is lost, nor strengthens the cluster when all nodes are available. When a quorum of nodes is available and communicating with one another, a leader can be elected.
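The quorum arithmetic above is easy to check with a few lines of Java. This is only a sketch of the formula from the text. The class and method names are hypothetical, and the division is assumed to be integer division.

```java
final class Quorum {
    // Quorum size as given in the text: totalNodes / 2 + 1 (integer division).
    static int quorumSize(int totalNodes) {
        return totalNodes / 2 + 1;
    }

    // How many nodes can be lost while a quorum still remains.
    static int tolerableLosses(int totalNodes) {
        return totalNodes - quorumSize(totalNodes);
    }

    // True when enough nodes are live to form a healthy cluster.
    static boolean hasQuorum(int totalNodes, int liveNodes) {
        return liveNodes >= quorumSize(totalNodes);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5, 6}) {
            System.out.printf("nodes=%d quorum=%d tolerable losses=%d%n",
                n, quorumSize(n), tolerableLosses(n));
        }
    }
}
```

A cluster of 3 nodes and a cluster of 4 nodes can each lose only one node before the quorum is gone, and the same holds for 5 and 6 nodes with two losses. The extra node in an even-sized cluster raises the quorum size without raising the number of failures that can be tolerated.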

The vlingo/cluster works in full-duplex mode, with two kinds of cluster node communication channels. There are the operational channels, which the cluster nodes use to maintain the health of the cluster. There are also application channels, which the services/applications use to pass messages between nodes. Using two different channel types allows for tuning each type in different ways. For example, application channels may be assigned to a faster network than the operational channels. It also opens the possibility for either channel type to use TCP or UDP independently of the other channel type.

Besides scalable fault-tolerance, the vlingo/cluster also provides cluster-wide, synchronizing attributes. This enables the cluster to share live and mutating operational state among all nodes.

The following diagram shows a three-node vlingo/cluster.

In the above cluster, if one node is lost the cluster will still maintain a quorum. However, if two nodes are lost and only one node remains in a running state, the quorum is lost and the cluster is considered unhealthy. In that case the one remaining node goes into an idle state and waits for one or more of the other nodes to return to an operational state, which will again constitute a quorum and enable a healthy running cluster.


So far I have discussed the foundation of vlingo/actors and the scalability and fault-tolerance achieved using vlingo/cluster. But what comes next?

For a few different reasons we have decided not to disclose the roadmap for the full platform at this time. However, I can tell you that there are currently seven additional platform components planned or in development. Ultimately the platform will grow beyond nine components, but those will follow the ones that we consider essential. What we are willing to share is found in the following diagram.

The vlingo/directory supports service registration and discovery. The vlingo/auth service provides security within the vlingo/platform. The vlingo/auth may also be used by any service/application that you create, but it is not a requirement that your organization adopt it as your security standard. Does the above diagram imply strong coupling between vlingo/platform components, or between your services/applications and the platform? Of course not!

I hope that this architecture overview has helped you understand the significance of the vlingo/platform, and where we are headed.

