A Better Mousetrap (Distributed Objects)

Despite being jaded and feeling that you've seen it all before, it's important to keep your sense of wonder at the possibility of seeing something new. There are new ideas in software. Maybe the last cool idea that captured your imagination didn't turn out to be as amazing as you had hoped. That probably means that the idea didn't solve the whole problem, or had some flawed assumption. Don't give up. You now know more about the problem and can try again, and the next idea might be better.

Simplicity is the result of lots of hard work.

This is how I feel about my work on distributed objects. First of all, let me say that CORBA, RMI, and DCOM are pretty much complete failures. Ok, Ok. I know that some big systems have been built with CORBA. How could these fancy technologies have been stomped by something as wimpy as Web Services and REST? Have people lost their minds? Is there no answer at all?

I think that both CORBA and Web Services are fundamentally flawed. But that is not the interesting point. The interesting point is that they both have gotten something right. If you combine the good parts of each, and avoid the flaws, then the combination might actually be a good solution. So, what are the problems?

The problem with CORBA/RMI is granularity. Clean object-oriented design encourages fine-grained methods. But these are death to distributed systems, because every fine-grained call becomes a network round trip. So you have to use Data Transfer Objects and Remote Facades to make things work. It's a mess. Automatic proxies are an invitation to poor performance.

Web Services can use a document-oriented approach, which encourages large-granularity designs. But the programming model is terrible. You spend all your time creating documents that describe the actions you want to perform and decoding results. It's the opposite of direct manipulation.

The hidden assumption behind both these approaches is that distribution can be implemented as a library. This is such a deep assumption that we don't even realize it's there. We say, if you are going to implement distributed computing, you have to create a library to do it... and so you create proxies, documents, etc... but you haven't realized that you've already lost the game before you started. It's similar to the arguments that threads cannot be expressed as a library. I'll add to that my assertion that regular expressions cannot (effectively) be implemented as a library.

What happens if you drop this assumption? We started with a simple observation. If the problem with CORBA/RMI is granularity, let's look at that. The problem is that one call is one round-trip, and two calls are two round-trips. CORBA/RMI is not compositional from the viewpoint of communication. There is no reason why two calls should take two round-trips. So we invented a language notation for specifying groups of calls that should be performed together:

batch (r) {
    print( r.op(3) );
    print( r.get() );
}

In this example, "r" refers to a remote object. The calls to "op" and "get" are both executed in one round trip to the server. Voila! Compositional remote services. You can also execute programs with complex dependencies among the remote parts:

batch (r) {
    Foo x = r.locate(3);
    print( x.get() );
}

In this case the remote method returns a Foo object, which is then used in the next call. But since all of the action happens on the server within a batch, there is no need for the client to ever refer directly to "x". No proxies! What happens is that the program is partitioned into a remote part and a local part. The remote part is a script, which runs on the server and returns a result:

REMOTE: result = r.locate(3).get();

The local part then uses the result for the local computation:

LOCAL: print( result );

The result can be multiple values, and the script can contain loops and conditionals. See the papers for more examples. It turns out that these "remote scripts" are isomorphic to documents in document-oriented web services. Thus our new idea of "batches" combines the best parts of CORBA (objects!) and Web Services (documents!) into one.
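
The partitioning into a remote script and a local part can be imitated in a few lines of ordinary code. What follows is only a rough Python simulation, not the batch implementation from the papers: the names Server, Root, Foo, run_script, and the round-trip counter are all made up for illustration, there is no real network, and the remote script is modeled as a plain Python function for brevity even though the real batch format is far more restricted than arbitrary code.

```python
class Foo:
    """Hypothetical server-side object returned by locate()."""
    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


class Root:
    """Hypothetical remote root object, the "r" of the batch examples."""
    def locate(self, key):
        return Foo(key * 10)


class Server:
    """Simulated server that counts round trips instead of using a network."""
    def __init__(self):
        self.root = Root()
        self.round_trips = 0

    def run_script(self, script):
        # One request carries the whole remote script; one reply carries the result.
        self.round_trips += 1
        return script(self.root)


server = Server()

# REMOTE: result = r.locate(3).get();
# The intermediate Foo never leaves the server, so the client needs no proxy for it.
result = server.run_script(lambda r: r.locate(3).get())

# LOCAL: print( result );
print(result)
print("round trips:", server.round_trips)
```

In this toy model a proxy-based client would pay one round trip for locate and a second for get; here both calls share one.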

Why did it take us so long to discover it? Simplicity is the result of hard work.

Here are some papers with details:

16 comments:

Roger Sessions said...

The batch idea is interesting. However I would argue that it is no different than a single asynchronous message call that does all of the "batched" operations. Since you can't get a result within a batch (I assume), why not just use the asynchronous call? In my experience, asynchronous is always better than synchronous for remote work requests. This was another failure of CORBA; we didn't understand this at the time.

Ali Ibrahim said...

Asynchronous calls can handle batches of calls IF they do not depend on each other. Consider the case where "x.get()" depends on the successful execution of "x.update(3)".

William Cook said...

Yes, I should have mentioned that asynchrony is often proposed as the right solution. Batches *do* allow using a result within a batch. Here is a simplified example from one of our papers, which would be very slow in an asynchronous system, but is a single fast round-trip using batches.

batch (m) {
    for (Album a : m.getAlbums())
        if (a.rating() < 50) {
            print ("removed: " + a.getTitle());
            a.delete();
        }
}
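
To put rough numbers on that claim, here is a small Python simulation of the album example. It is only a sketch under assumptions: MusicServer, call, run_batch, and the sample data are hypothetical, the "network" is just a counter, and the call-at-a-time client stands in for any style in which the client must see each rating before it can decide what to do next.

```python
class MusicServer:
    """Simulated server; every client request costs one round trip."""
    def __init__(self, albums):
        # albums maps an album id to a (title, rating) pair
        self.albums = dict(albums)
        self.round_trips = 0

    def call(self, op, *args):
        """Fine-grained remote call: one round trip per operation."""
        self.round_trips += 1
        return getattr(self, "_" + op)(*args)

    def run_batch(self, script):
        """Batched execution: the whole script costs one round trip."""
        self.round_trips += 1
        return script(self)

    def _ids(self):
        return list(self.albums)

    def _rating(self, album_id):
        return self.albums[album_id][1]

    def _title(self, album_id):
        return self.albums[album_id][0]

    def _delete(self, album_id):
        del self.albums[album_id]


def remove_low_rated(server):
    """Remote part: the loop and the conditional run entirely on the server."""
    removed = []
    for album_id in server._ids():
        if server._rating(album_id) < 50:
            removed.append(server._title(album_id))
            server._delete(album_id)
    return removed


DATA = {1: ("Alpha", 80), 2: ("Beta", 30), 3: ("Gamma", 45)}

# Call-at-a-time style: every operation crosses the network.
one_by_one = MusicServer(DATA)
for album_id in one_by_one.call("ids"):
    if one_by_one.call("rating", album_id) < 50:
        print("removed: " + one_by_one.call("title", album_id))
        one_by_one.call("delete", album_id)

# Batch style: one script, one round trip, then purely local printing.
batched = MusicServer(DATA)
for title in batched.run_batch(remove_low_rated):
    print("removed: " + title)

print(one_by_one.round_trips, batched.round_trips)
```

With three albums, two of them rated below 50, the call-at-a-time client makes 8 round trips in this model and the batched client makes 1. The gap grows with the size of the collection.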

Doug said...

I like this work. It remains a minor mystery to me though why it took 10-15 years of CORBA/RMI before you and Eli took the problem seriously enough to solve it nicely. "Everyone knows" that round-trip delays are the major costs of remote communication. And in Java, everyone knows you can send code to run remotely (as a batch of sorts). Yet there is something about how CORBA/RMI came to be used that prevented solutions like yours from even being considered. Do you have any hypotheses?

William Cook said...

That's from Doug Lea. Although our batches are similar to mobile code, I want to make it clear that it is not the same as mobile code, for two reasons.

1) Batches automatically handle the issue of identifying and moving data in bulk from the client to the remote server and back. You'd have to program this explicitly with mobile code, and be careful about what else in the environment of the mobile code might get copied.

2) We use a domain-specific and language-independent format for the batch. It is a form of code, certainly, but it only allows method calls, conditionals, loops, and let/variable bindings. This format can be sent over an XML web service between radically different platforms (Ruby and Java for example).
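
To give a feel for point 2, here is a toy version of such a restricted script format, sketched in Python. It is an illustration under assumptions, not our actual wire format: the node shapes, the evaluate function, and the use of JSON rather than XML are all invented for this example. The point is only that a script limited to method calls, conditionals, loops, and let bindings is plain data that any platform can build and any server can interpret.

```python
import json


def evaluate(node, env):
    """Evaluate one script node against the variable bindings in env."""
    kind = node[0]
    if kind == "const":                  # ["const", value]
        return node[1]
    if kind == "var":                    # ["var", name]
        return env[node[1]]
    if kind == "call":                   # ["call", target, method, [args]]
        if node[2].startswith("_"):
            raise ValueError("private methods are not callable from a script")
        target = evaluate(node[1], env)
        args = [evaluate(arg, env) for arg in node[3]]
        return getattr(target, node[2])(*args)
    if kind == "let":                    # ["let", name, value, body]
        inner = dict(env)
        inner[node[1]] = evaluate(node[2], env)
        return evaluate(node[3], inner)
    if kind == "if":                     # ["if", condition, then, else]
        branch = node[2] if evaluate(node[1], env) else node[3]
        return evaluate(branch, env)
    if kind == "for":                    # ["for", name, collection, body]
        results = []
        for item in evaluate(node[2], env):
            inner = dict(env)
            inner[node[1]] = item
            results.append(evaluate(node[3], inner))
        return results
    raise ValueError("unknown node kind: " + repr(kind))


class Foo:
    """Hypothetical server-side object."""
    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


class Root:
    """Hypothetical remote root object."""
    def locate(self, key):
        return Foo(key * 10)


# Client side: the script for "Foo x = r.locate(3); x.get()" as plain data.
script = ["let", "x", ["call", ["var", "r"], "locate", [["const", 3]]],
          ["call", ["var", "x"], "get", []]]
wire = json.dumps(script)  # text on the wire; no client code or objects are shipped

# Server side: decode and interpret against the real root object.
result = evaluate(json.loads(wire), {"r": Root()})
print(result)
```

Because the script can only name methods on objects the server already exposes, nothing from the client's environment gets copied along by accident, which is the contrast with mobile code drawn in point 1.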
--- * --- * ----
As for the question of why, you would probably know better than me. My guess is that they assumed "no language changes". The assumption that "extensions must be done with libraries and external tools" is deeply held. In the case of CORBA, that assumption made sense because they wanted to connect to every language. For Java RMI the assumption makes less sense. But I think that no satisfactory solution is possible if you make this assumption. So what we have are partial solutions and lots of messy design patterns to try to patch up the problems. This applies to both CORBA and Web Services.

As an aside, I will note that AppleScript's remote communication model is actually a kind of batch mechanism! But I like this new, purer presentation of the idea much better.

Steve Vinoski said...

I believe the fundamental problem with CORBA and all that stuff is its "language first" focus: it wants to fit into just any ol' programming language, OO preferred of course, and make distributed entities as straightforward to use as local ones.

See my slides from the recent "Historically Bad Ideas" track from QCon London — there, I cover the history of RPC and explain that it came about primarily during a time when all of the following were true:

* every computer vendor owned their entire stack from network all the way up to applications

* all distributed systems research of note was based on full stacks, often including special operating systems, very often focused on building whole languages that supported distribution, and almost always with a focus on local/remote transparency

* systems were still very slow compared to today, which meant getting good performance from a language meant getting good support from your system vendor because only they could write good code generators for their proprietary hardware

* thus while there was lots of language research, only a few languages were making it big in production because the vendors' capacity to support a variety of languages was very limited

Because of all these forces, a non-trivial part of the industry that was focused on efficient, production-quality OO programming languages, specifically C++ and later Java, eventually charged headlong down the path of forcing distribution behind those popular languages, where ultimately, no matter how many tricks we might try to play, it simply doesn't fit.

These days I focus on Erlang, a very practical language designed and built specifically for distributed systems. I have never seen anything in my whole career of focusing on distributed systems that even comes close to its capabilities. Out of the box it provides facilities that we used to struggle to build for CORBA for years, and not only that, but those facilities are incredibly elegant, practical, highly scalable, and they perform exceptionally well. Erlang embraces distribution and system failure in its key abstractions; it's just incredibly well done.

I also focus heavily on REST these days, because it's extremely clear about its distribution trade-offs, much more so than any RPC-oriented system I've ever seen.

I believe in these directions so much that I left the middleware industry two years ago and moved to the media distribution industry, where the control plane simply begs for Erlang's capabilities and the data side is often the web or something much like it, obviously a perfect fit for REST.

I could go on and on, but just read the slides and maybe catch the video of my talk once the QCon guys make it available.

I don't mean to sound harsh but I think focusing on RMI, WS-* web services, or anything RPC-like, even if you're trying to correct shortcomings, is just the wrong road to be on these days. The premises on which those ideas and technologies were built largely don't exist anymore IMO.

William Cook said...

Thanks for the comment, Steve. I read your slides in detail. I completely agree that work on distributed objects and RPC had (at least) two wrong ideas:
* Using existing languages with no changes
* Focus on transparency rather than real distribution problems (like latency)

You conclude that OO languages *cannot* have a reasonable distribution model. I say that they cannot without rejecting the assumptions above. What amazes me is how hard it is to get people to realize that the current approach is broken, despite the continuous blinking red lights everywhere.

As for REST and WS, you say that WS is just CORBA/RPC all over again. Some people use it that way, but you don't have to. I am primarily interested in the document-oriented flavor of WS. I'm curious if Erlang has investigated this direction.

REST is interesting because of what it says about the network (caching! idempotency!) and these are very important. But it bothers me that REST is not "latency compositional". I have defined this to mean a system where multiple operations can be performed with the *same* latency as one operation.
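
A back-of-the-envelope model shows what "latency compositional" buys. This is a simplified Python sketch with assumed numbers: the function names, the 100 ms round trip, and the 1 ms of server work per operation are illustrative only, and real systems also pay for payload size and queuing.

```python
def sequential_latency(n_ops, round_trip_ms, work_ms):
    """Each operation pays its own round trip plus its server-side work."""
    return n_ops * (round_trip_ms + work_ms)


def batched_latency(n_ops, round_trip_ms, work_ms):
    """All operations share one round trip; only the server-side work scales."""
    return round_trip_ms + n_ops * work_ms


# Assumed figures: 100 ms round trip, 1 ms of server work per operation.
for n in (1, 2, 10):
    print(n, sequential_latency(n, 100, 1), batched_latency(n, 100, 1))
```

Under these assumptions ten operations cost 1010 ms one at a time and 110 ms as a batch. The network share of the latency stays at a single round trip no matter how many operations are composed.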

Steve Vinoski said...

I think one of the best things about REST is that it makes its trade-offs very clear and doesn't pretend that it's a general purpose architectural style for all distribution problems — it isn't, nor is it intended to be. Roy Fielding's thesis, where it's defined, is remarkable in its methodical derivation of the REST style. It would be interesting to apply similar methods to the work you're doing to see what different style you might derive given the architectural properties you're shooting for and the constraints you can impose to achieve those properties.

Roger Pack said...

JMI works well.

Rob Jellinghaus said...

William, are you familiar with the promises pattern and its support for pipelining? How would you evaluate this style of "RPC" compared to your batching framework?

William Cook said...

Promises, or futures, are a well-known technique. They allow a client program to call methods but then keep running until it needs the actual return value of the call. They are not as powerful as batches because there are many cases when the client needs a return value before it can continue: if the return value is used in a conditional, the client must know it before it can proceed. This is also true for loops, where the collection must be known before the loop can proceed. Loops also introduce the problem of managing lists of futures. Batches handle all these cases and more. Our original implementation of batches was based on an extended form of future (which we called BRMI), but we found that the batch syntax simplified using it greatly.
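
The point about conditionals and loops can be seen in a toy model. The Python sketch below is an assumption-laden simulation, not BRMI or any real promise library: SimulatedFuture, get_albums, rating, and delete are hypothetical, values are computed immediately, and the only thing measured is how often the client has to stop and wait for a value.

```python
class SimulatedFuture:
    """Stand-in for a promise; result() models the client waiting on the server."""
    waits = 0  # number of times the client had to block for a value

    def __init__(self, value):
        self._value = value

    def result(self):
        SimulatedFuture.waits += 1
        return self._value


ratings = {"Alpha": 80, "Beta": 30, "Gamma": 45}


def get_albums():
    return SimulatedFuture(list(ratings))


def rating(title):
    return SimulatedFuture(ratings[title])


def delete(title):
    # Fire and forget: the client never inspects this result.
    del ratings[title]
    return SimulatedFuture(None)


removed = []
for title in get_albums().result():      # the loop needs the collection now
    if rating(title).result() < 50:      # the conditional needs the rating now
        delete(title)                    # no wait: the result is never used
        removed.append(title)

print(removed, SimulatedFuture.waits)
```

Here the client blocks four times: once for the collection and once per album for its rating. Only the delete calls get the benefit of not waiting. Issuing all the rating requests up front would cut the waits, but that is exactly the list-of-futures bookkeeping mentioned above.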

I don't see how JMI is directly relevant, although there may be some connection.

Daniel Yokomizo said...

In Haskell we can define a DSL to express remote calls and use a monad to make it prettier:

do {
    update r 3;
    print (get r);
}

In this example, print will either be a valid server command or fail with an error (because everything inside the do must run on the server). We can specify the place where the remote batch will run:

do {
    x <- batch $ do {
        update r 3;
        return (get r);
    };
    print x
}

In this example, the outer do is the local part and the inner do is the remote call (actually just building the script); batch takes the script, invokes it remotely, and returns the results.

Similar things could be expressed using a library in Smalltalk, Lisp, Io, and any other language that supports serialization of thunks (or provides a decent syntax for DSLs); no changes to these languages are required.

William Cook said...

Thanks for stopping by, Daniel. I am quite familiar with monads and techniques for embedding DSLs. But if you look at the paper I think you will see that you have not captured the essence of the problem or its solution. In particular, you have not done anything to facilitate the communication of the results of multiple calls between client and server, or loops and conditionals.

I updated the first blog example to make the problem more clear (you should read the paper for more details):

batch (r) {
    print( r.op(3) );
    print( r.get() );
}

The "op" and "get" calls are performed in a batch on the server, then an intermediate data structure, which is a pair in this case, is returned.
We have coined the term "reforestation" to describe this technique, since it involves creating an intermediate data structure to communicate results and it is the inverse of the well-known technique of deforestation. Many pure functional programs are naturally written in reforested, or point-free, style. Hence the need for deforestation. But point-free style can be quite difficult for programmers to use. Despite its theoretical beauty, point-free style is not necessarily clearer to read.

On the other hand, I believe there is a monadic interpretation of batches. I have been working on the details, but the basic idea is to create a product monad of a reader and a writer. These monads are written in an interleaved style, but can be executed independently for distribution.

The example above would be written something like this:

do {
    x <- remote $ r.op(3);
    local $ print(x);
    y <- remote $ r.get();
    local $ print(y)
}

Where "local" and "remote" refer to the two monads in the product. The local monad reads values written by the remote monad. But they can be re-ordered. I'm not sure how this handles loops and conditionals.

Monica Sharma said...

Good article. Keep writing!

J. Suereth said...

This is very interesting. Pardon my non-academic speak, but it seems like in a JVM-based approach, you could make use of the "EJB Command Pattern". The remote types (r in the examples) would have to be defined as interfaces. The client compiler could then create some kind of closure encasing the remote batch behaviour that is sent to the remote execution engine, and returns the intermediate results. The local portion of the code would then interpret these results and use them. Is this the basic premise of your idea?

If that's the case, I would love to see a working example. You could probably implement such via a Scala library + compiler plugin. I'd also be interested in how you would remove transparency from the distribution such that developers could build fault-tolerance and/or clustering into such a solution. I'm the type who comments before reading references, so forgive me if this is detailed in your paper.

William Cook said...

I am fairly fluent in non-academic-speak too. It is clear that design patterns are concepts that cannot be properly represented in the programming language paradigm for which the design pattern applies. What's interesting is that batches eliminate the need for the three major design patterns that are the foundation of EJB:
* Data Transfer Object
* Command Pattern
* Server Facade
It is a good thing when your language doesn't need a pattern any more, because it represents the solution directly.