3 Day Startup

I was on the review panel for 3 Day Startup, an entrepreneurial experience for students here at UT Austin. It is co-organized by one of my PhD students, Thomas Finsterbusch, and I have worked with several of the other 44 students who participated. They had a good mix of technical, business, and communication experience. They also included graduate students and undergraduates. Last night they pitched 4 ideas to the panel. I thought all the ideas were interesting, but each one would have its own particular challenges in coming to market. While some may complain that doing a startup in 3 days is not realistic, I don't think that's the point. The point is that it was exactly like a large startup, but in miniature. It gives students a real sense of the psychology of startups. I do think that running something like this over a longer period of time would give the students more time to reflect and also to learn new styles of working, when they find that their first attempt doesn't work. But either way, direct experience of starting a business is not something you can get from book learning or studying cases. You have to live it.

A Better Mousetrap (Distributed Objects)

Despite being jaded and feeling that you've seen it all before, it's important to keep your sense of wonder at the possibility of seeing something new. There are new ideas in software. Maybe the last cool idea that captured your imagination didn't turn out to be as amazing as you had hoped. Don't give up. The next idea might be better. It probably means that the first idea didn't solve the whole problem, or had some flawed assumption. But now you know more about the problem and can try again.

Simplicity is the result of lots of hard work.

This is how I feel about my work on distributed objects. First of all, let me say that CORBA, RMI, and DCOM are pretty much complete failures. Ok, Ok. I know that some big systems have been built with CORBA. How could these fancy technologies have been stomped by something as wimpy as Web Services and REST? Have people lost their minds? Is there no answer at all?

I think that both CORBA and Web Services are fundamentally flawed. But that is not the interesting point. The interesting point is that they both have gotten something right. If you combine the good parts of each, and avoid the flaws, then the combination might actually be a good solution. So, what are the problems?

The problem with CORBA/RMI is granularity. Clean object-oriented design encourages fine-grained methods. But these are death to distributed systems. So you have to use Data Transfer Objects and Remote Facades to make things work. It's a mess. Automatic proxies are an invitation to poor performance.

Web Services can use a document-oriented approach, which encourages large-granularity designs. But the programming model is terrible. You spend all your time creating documents that describe the actions you want to perform and decoding results. It's the opposite of direct manipulation.

The hidden assumption behind both these approaches is that distribution can be implemented as a library. This is such a deep assumption that we don't even realize it's there. We say, if you are going to implement distributed computing, you have to create a library to do it... and so you create proxies, documents, etc... but you haven't realized that you've already lost the game before you started. It's similar to the arguments that threads cannot be expressed as a library. I'll add to that my assertion that regular expressions cannot (effectively) be implemented as a library.

What happens if you drop this assumption? We started with a simple observation. If the problem with CORBA/RMI is granularity, let's look at that. The problem is that one call is one round-trip, and two calls is two round-trips. CORBA/RMI is not compositional from the viewpoint of communication. There is no reason why two calls should take two round-trips. So we invented a language notation for specifying groups of calls that should be performed together:

batch (r) {
r.update(3);
print( r.get() );
}

In this example, "r" refers to a remote object. The calls to "update" and "get" are both executed in one round trip to the server. Voila! Compositional remote services. You can also execute programs with complex dependencies among the remote parts:

batch (r) {
Foo x = r.locate(3);
print( x.get() );
}

In this case the remote method returns a Foo object, which is then used in the next call. But since all of the action happens on the server within a batch, there is no need for the client to ever refer directly to "x". No proxies! What happens is that the program is partitioned into a remote part and a local part. The remote part is a script, which runs on the server and returns a result:

REMOTE: result = r.locate(3).get();

The local part then uses the result for the local computation:

LOCAL: print( result );

The result can be multiple values, and the script can contain loops and conditionals. See the papers for more examples. It turns out that these "remote scripts" are isomorphic to documents in document-oriented web services. Thus our new idea of "batches" combines the best parts of CORBA (objects!) and Web Services (documents!) into one.
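To make the partitioning concrete, here is a minimal Python sketch of the idea. It is not our implementation, and it only simulates the network: `Server.execute` stands in for one round trip. The names `Root`, `Foo`, `Server`, and `Batch` are invented for this illustration. The batch records calls as a small script, the server runs the whole script, and only the requested values come back to the client.

```python
class Foo:
    """Hypothetical server-side object returned by locate()."""
    def __init__(self, value):
        self.value = value

    def get(self):
        return self.value


class Root:
    """Hypothetical stand-in for the remote object "r"."""
    def __init__(self):
        self.value = 0

    def update(self, n):
        self.value = n

    def get(self):
        return self.value

    def locate(self, n):
        return Foo(n * 10)


class Server:
    """Simulated server: every call to execute() models one round trip."""
    def __init__(self, root):
        self.root = root
        self.round_trips = 0

    def execute(self, script, wanted):
        self.round_trips += 1
        results = []
        for target, method, args in script:
            # target is None for the root object, else the index of an
            # earlier result that stays on the server (no client proxy).
            obj = self.root if target is None else results[target]
            results.append(getattr(obj, method)(*args))
        return [results[i] for i in wanted]


class Batch:
    """Records calls locally instead of sending them one at a time."""
    def __init__(self, server):
        self.server = server
        self.script = []

    def call(self, target, method, *args):
        self.script.append((target, method, args))
        return len(self.script) - 1  # slot of the future result

    def run(self, wanted):
        return self.server.execute(self.script, wanted)


server = Server(Root())
batch = Batch(server)
batch.call(None, "update", 3)
got = batch.call(None, "get")
x = batch.call(None, "locate", 3)   # the Foo never leaves the server
y = batch.call(x, "get")
values = batch.run([got, y])        # REMOTE part: one round trip
print(values)                       # LOCAL part
```

The recorded script is just data, which is why it can be read as a document in the Web Services sense. A real batch construct would also have to handle loops, conditionals, and exceptions, which this sketch leaves out.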

What took us so long to discover it? Simplicity is the result of hard work.

Here are some papers with details:

Generic Syntax: Lisp parsing + C notation

I've been working with Jose Falcon, an undergrad here at UT, for almost a year on this crazy idea for an approach to syntax that combines aspects of Lisp S-Expressions and familiar Algol/C/Java notations. The idea is to use a generic syntax, as in Lisp, but extend it to include many of the common syntactic conventions found in programming languages, grammars, and style sheets. Lisp only recognizes these characters: "(.,')" and space. Our language, called Gel, recognizes {}, [], (), and arbitrary unary and infix operators. So you can write
{ a + (x ** 3) ==> x | y.z; x := 37; if: a=3 then: print(3, f[x]); }
and have it parse just as you would expect. We tag keywords with a ":", because they have to be generic too. To get operators to work right, we make spaces meaningful. Thus:
a +b == a(+b)
a+ b == (a+)b
a + b == (a)+(b)
This corresponds to common usage in Java/C and also most grammar notations:
E ::= E | ("+" E)*
also parses correctly in Gel. It's basically a "super-lexer" just as Lisp is. We will get the source code up soon so you can check it out. Here is the paper. The work will be presented at IFIP Working Conference on Domain Specific Languages (DSL WC).
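The spacing rule for operators can be sketched in a few lines of Python. This is only an illustration of the three cases listed above, not the Gel parser; the function name `classify_operator` is made up, and treating an operator with no space on either side (as in `a+b`) as infix is an assumption of this sketch.

```python
def classify_operator(text, op):
    """Classify the first occurrence of op in text by its surrounding spaces."""
    start = text.index(op)
    end = start + len(op)
    # The start and end of the input count as whitespace.
    space_before = start == 0 or text[start - 1].isspace()
    space_after = end == len(text) or text[end].isspace()
    if space_before and not space_after:
        return "prefix"    # a +b  parses as  a(+b)
    if space_after and not space_before:
        return "postfix"   # a+ b  parses as  (a+)b
    return "infix"         # a + b parses as  (a)+(b); a+b assumed infix too


for example in ["a +b", "a+ b", "a + b"]:
    print(repr(example), classify_operator(example, "+"))
```

Because the decision depends only on neighboring characters, it can be made without knowing what the operator means, which is what keeps the syntax generic.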

LINQ is the best option for a future Java query API

I have participated in this thread about LINQ for Java. There are some very good comments. I don't think that LINQ is perfect, but it is better than most alternatives. It is better than my proposal, Safe Query Objects [PDF] (aka Native Queries supported by db4o), although the constraints were different. I was trying to see how to do a type-safe query language without any changes to Java. I think it's a reasonable design. But if you allow yourself to change the language significantly, as Microsoft did, then you have to explore other possibilities.

Strategic Programming

I have been working with model-driven software development for many years, but I haven't published anything on it yet. At Allegis, we developed a complete enterprise application based on models. We found this to be much more effective than the standard object-oriented MVC approaches. One reason I returned to research, and joined academia, was to do research into this programming paradigm. I've been struggling for years with the idea, but have finally written a paper about it, with some of my students:

Strategic Programming by Model Interpretation and Partial Evaluation
William R. Cook, Benjamin Delaware, Thomas Finsterbusch, Ali Ibrahim, Ben Wiedermann

One question that we didn't address directly in the paper is "Why call it Strategic Programming?" The work is closely related to Model-Driven Software Development and also Domain-Specific Language engineering (DSL). Why not use one of those? To me, it's a matter of focus. We are all "feeling the same elephant". But I want to focus on a different part of the elephant.

In programming language work there are three important components:

* Syntax
* Programs
* Semantics

When you map these onto the model/DSL viewpoint, you see the following correspondence:

* Syntax: Domain-Specific Languages
* Programs: Models
* Semantics: Interpretations (interpreters, transformers, compilers)

Some people focus on the models, others on the DSLs, and others on transformation. But rather than name the approach after one of its parts, I wanted to use a name that focuses on the overall approach. Keep in mind that there can be more than one interpretation.

I am suggesting that underlying all this is the idea of a strategy, which guides the design of the language and the interpretation of particular models to achieve some goal. It's the strategy that binds the three components together. The other reason is that some people think models are just pictures and don't have semantics, and others think that DSLs are just syntax. So I'm trying to sell a more fundamental vision of this emerging paradigm to the academic programming languages community. I think that models/DSLs/interpretation/strategies are going to be the next big programming paradigm, and so we need to get ready for it.
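As a small illustration of the point that one model can have more than one interpretation, here is a Python sketch. It is not taken from the paper; the model format and the names `MODEL`, `validate`, and `to_sql` are invented for this example. The model is plain data, and two separate interpreters give it two different meanings.

```python
# The "program": a model written as plain data (hypothetical format).
MODEL = {
    "name": "Person",
    "fields": [
        {"name": "name", "type": "str"},
        {"name": "age", "type": "int"},
    ],
}

PY_TYPES = {"str": str, "int": int}
SQL_TYPES = {"str": "TEXT", "int": "INTEGER"}


def validate(model, record):
    """Interpretation 1: check a record against the model."""
    return all(
        isinstance(record.get(field["name"]), PY_TYPES[field["type"]])
        for field in model["fields"]
    )


def to_sql(model):
    """Interpretation 2: translate the same model into a CREATE TABLE statement."""
    columns = ", ".join(
        field["name"] + " " + SQL_TYPES[field["type"]]
        for field in model["fields"]
    )
    return "CREATE TABLE " + model["name"] + " (" + columns + ");"


print(validate(MODEL, {"name": "Ada", "age": 36}))
print(to_sql(MODEL))
```

Neither function is "the" semantics of the model. Which interpretations exist, and what the model language needs to say to support them, is the kind of decision the strategy is meant to guide.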

Scheme Debugging

I've been using Scheme recently, and have been complaining to everybody about the lack of any good visual debugger for the language. DrScheme has some debugging features, but they seem very primitive and awkward to me. Maybe they are exploring new ways to debug (e.g. drawing arrows all over the code to illustrate the call stack), but I just want a good conventional debugger. Gambit Scheme has a command-line debugger, but it's a pain to use for long periods of time and complex code.

Eclipse has a plugin architecture and lots of dynamic languages are getting IDEs based on it. SchemeWay is a plugin for Scheme, but it does not support debugging. Recently I discovered the Dynamic Languages Toolkit (DLTK). It supports the Xdebug DBGp protocol.

So I decided to make an Eclipse-based debugger for Scheme. It's called Schemeide and the alpha version is available for download now.

Any Scheme interpreter could work: all it has to do is implement the runtime side of DBGp. I implemented it, and it requires Gambit 4.2.9. Right now I include a patched version of 4.2.8 with builds for MacOS X, Win32, and Linux. The IDE also includes an indenter/reformatter for Scheme/Lisp. Other features are TBD.

How to embed news in a web page

I hunted around for a while and found out a good way to embed a news feed (RSS) in a web page. You use Google Reader to convert the feed into a little block of text that can be embedded. Here are directions on how to do it.

Middle Earth Programming Language seminar

I just got back from the Middle Earth Programming Language seminar (MEPLS) in Abilene, TX, where I gave a talk on Strategic Programming by model interpretation. I have been working on this for a long time, and it's a great feeling to finally have a working implementation and a paper. See my home page at UT Austin for a link. I am very excited about this work. I'm programming an implementation of the idea in Scheme, and that is going very well. The system is code-named "Borg" because it can assimilate ideas from lots of other systems.