Iloggable

Promise: Classes aren't Types

I wanted to start with the basic unit of construction in Promise, the lambda or closure. However, since Promise's lambdas borrow from C#, allowing Type definition in the declaration, I really need to cover the concepts of the promissory type system before I can get into that.

Classes are prototypes

Borrowing from JavaScript and Ruby, and by extension Smalltalk, Promise classes are prototypes that are used to instantiate objects with the characteristics of the class. But those instances are untyped, and while they can be inspected to see the originating class, they can also be changed with new characteristics, making them diverge from that class.

Types are contracts

Similar to Interfaces in C#, Java, etc., Types in Promise describe the characteristics of some object. Unlike those languages, classes do not implement an interface. Rather, they are similar to Interfaces in Go in that any instance that has the methods promised by the Type can be used as that Type. Go, however, is still a typed language where an instance has the Type of its class but can be cast to an Interface, while Promise doesn't require any Type.

Compile time checking of Types

I still need to play with the syntax for this part to figure out useful and intuitive behavior. Classes aren't static and neither are instances, so checking the Type contract against the instantiating Class' capabilities isn't always the final word. Furthermore, if the object is ever passed without a Type annotation, the compiler can't determine the class to inspect. Finally, classes can have wildcard methods to catch missing method calls. In order to have some kind of compile time checking on the promissory declarations, there will have to be some way to mark variables as promising a Type cast that the compiler will trust, so that Type annotated and pure dynamic instances can work together.

This last part is where the dynamic keyword in C# fails for me, since you cannot cross from dynamic into static without a way of statically declaring the capabilities of an instance. That means that in C# you cannot take a dynamic object and use it anywhere that expects a typed class, regardless of its capabilities. I wrote Duckpond for C# to only address casting classes to interfaces that they satisfy, but I should extend it to proxy dynamic objects as well.
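To make that gap concrete, here is a minimal C# sketch. The names are illustrative assumptions: Duck and IQuacker are borrowed from the DuckPond examples, and DynamicGap is a hypothetical helper. The late-bound call succeeds, yet the very same instance fails the static type test, because Duck never declared the interface.

```csharp
using System;

public interface IQuacker {
    string Quack(int times);
}

// Duck has a matching Quack method but never declares IQuacker.
public class Duck {
    public string Quack(int times) {
        return "quack x" + times;
    }
}

// Hypothetical helper, only here to contrast the two worlds.
public static class DynamicGap {
    // Dynamic side: the call is bound at run time against the actual instance.
    public static string CallDynamically(object instance) {
        dynamic d = instance;
        return d.Quack(2);
    }

    // Static side: the type test fails because Duck does not implement
    // IQuacker, regardless of which methods it actually has.
    public static bool SatisfiesStatically(object instance) {
        return instance is IQuacker;
    }
}
```

Calling DynamicGap.CallDynamically(new Duck()) works, while DynamicGap.SatisfiesStatically(new Duck()) is false, so the instance cannot be handed to anything that expects an IQuacker without some kind of proxy.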

Implicit Types and the Type/Class naming overlap

In order to avoid needless Type declarations, every Class definition generates a matching Type with the same name (based only on its non-wildcard methods) at compile time. If you create a class called Song and a lambda with the signature (Song song) => { ... };, it might look like the Class is the Type. What's really going on though is that the class definition generated a Type called Song that expects an instance with the capabilities of the compile time definition of the Song class.

The shadowing of Classes by Types is possible because Class and Type names do not collide. A class name is only used for Class definition and modification, while Type names are used for signature and variable type. And since the shadowing is implicit, it's also possible to create a Song Type by hand to use instead of the implicit one. This is particularly useful when the Song class has several similar methods handled by a single wildcard method, but those signatures should be captured in the Type.

A final effect of the naming overlap, that I won't get into until I talk about the language level IoC, is that Song.new() is not a call on the class, but on the Type -- this is also an artifact of how Promise treats Class/static methods and how the IoC container resolves instance creation. Can you see how this could be useful for mocking?

Enough with the theory already

Sorry for another dry, codeless post, but I couldn't see getting into the syntax that will use Types and Classes without explaining how Classes and Types interact in Promise first.

Next time, lambdas, lambdas, lambdas.

More about Promise

This is a post in an ongoing series of posts about designing a language. It may stay theoretical, it may become a prototype in implementation or it might become a full language. You can get a list of all posts about Promise, via the Promise category link at the top.

I made this half-pony half-monkey monster to please you

I made this half-pony half-monkey monster to please you
But I get the feeling that you don't like it
What's with all the screaming?
You like monkeys, you like ponies
Maybe you don't like monsters so much
Maybe I used too many monkeys
Isn't it enough to know that I ruined a pony making a gift for you?

Jonathan Coulton - Skullcrusher Mountain

I'm primarily a C# developer these days, but tinker in various other languages to see what others are doing better or worse. I am clearly biased towards static languages, despite lots of features I do like in various dynamic languages. This isn't some fear of the unknown, it's a preference from extensive experience -- before switching to C# by way of Java, I was a Perl developer for about 8 years. But I'm not trying to start yet another static vs. dynamic flame war here.

Playing with lots of languages got me thinking about what I'd ideally like to see in a programming language. So the next couple of posts are going to be largely a thought experiment in designing that language. I do this not because I think there aren't enough languages, but to gain better insights into how I work, how language design works and what goes into good usability design.

The conclusion of these posts may be that I find a language that already offers what I am looking for or that I start prototyping the new language in the DLR, since that's another skill I want to expand on.

Promise

The primary feature of this language, which I'm calling Promise for now, is that it's a dynamic language with a duck-typing system to allow declaration of contracts both for discoverability and compile time (AOT or JIT) verification, but without forcing strict class inheritance hierarchies on the programmer. Its mantra is "I promise that you will get an instance that does what you expect it to do".

Below are the top-level features I find desirable:

Optional Type Annotation

Some things can be completely dynamic, but I find it a lot more useful if a method can express the type of instance it expects, rather than having that information out-of-band in the documentation. As I said, I want it to be pure duck-typing: Classes aren't types and Types aren't classes; a class can be cast to a type if it satisfies the contract. I've previously written about attaching Types at the consumption side rather than the declaration side, and it's one thing I really like about Go. I want types to aid in discovery and debugging, not be the yoke they can become in purely statically typed languages.

Runs in a virtual machine

I like virtual machines that can host many languages. It allows you to write in your favorite language but still take advantage of features and libraries written in other languages without complicated and platform specific interop stories. It took a while for this promise to come to fruition, but both the JVM and CLR are turning into pretty cool polyglot ecosystems. So the language must run in one of those two environments (pending the emergence of another VM that has as much adoption).

Just-in-Time compilation

While I am a big fan of compiling and deploying bytecode, for rapid development and experimentation, I really want the language to be able to run either as bytecode or as loose files that are compiled just in time.

Object-Orientation

I will continue to structure logical units of work as classes with state and methods and pass around complex data in some kind of entity for a fair amount of the work I do. So organizing code in classes that are instantiated is a fundamental building block I rely on.

Lambdas

But just as much as object-orientation is useful for organization of responsibilities, defining anonymous functions with closures for callbacks, continuations, etc. is another essential part of the way I code and a fundamental building block for asynchronous programming.

Mix-ins

While C# 3.0 extension methods are a decent way of extending existing classes and attaching functionality to interfaces, I would prefer the mix-in style of attaching functionality to a class or instance without going down the multiple inheritance path.

Language level Inversion of Control

Having to know how to construct dependency hierarchies by hand or knowing about and managing the lifetime scopes of instances is, imho, an imperative programming concept that just needlessly complicates things. I just want to get an instance that I can work with and not have to deal with a myriad of constructors or have to know how to initialize a fresh instance otherwise. Nor do I want to deal with knowing when to get my own instance vs. a shared one. All that is plumbing that is common and fundamental enough that it should move into the language itself, rather than having to build an IoC framework that takes over constructors for you, or using Service Location to self-initialize.

With the 10k foot overview out of the way...

From these qualities, I have started to define the syntax for Promise and next time I'll go into some detailed specifications and start working through various syntax examples.

If you want to skip ahead and see my brainstorm notes, they can be found here.

If you think there already is a language that satisfies most if not even all my requirements, I'd love to hear about it as well.

Don't blow your stack: C# trampolining

A couple of weeks ago someone posted two links on Twitter suggesting the trampolining superiority of Clojure to C#. (Can't find that tweet anymore, and yes, that tweet was what finally motivated me to migrate my blog so I could post again.)

Well, the comparison wasn't really apples to apples. The C# article "Jumping the trampoline in C# – Stack-friendly recursion" is trying to do a lot more than the Clojure article "Understanding the Clojure `trampoline'". But maybe that's the point as well: that with C# people end up building much more complex, verbose machinery when the simple style of Clojure is all that is needed. That determination wanders into the subjective, where language feuds are forged, and I'll avoid that diversion this time around.

What struck me more was that you can do trampolining a lot more simply in C# if we mimic the Clojure example given. Let's start by reproducing the stack overflow handicapped example first:

Func<int, int> funA, funB = null;
funA = n => n == 0 ? 0 : funB(--n);
funB = n => n == 0 ? 0 : funA(--n);

I'd dare say that the Lambda syntax is more compact than the Clojure example :) Ok, ok, the body is artificially small, allowing me to replace it with a single expression, which isn't a realistic scenario. Suffice it to say, you can get quite compact with expressions in C#.

Now, if we call funA(4), we'd get the same call sequence as in Clojure, i.e.

funA(4) -> funB(3) -> funA(2) -> funB(1) -> funA(0) -> 0

And if you, instead, call funA(100000), you'll get a StackOverflowException.

So far so good, but here is where we diverge from Clojure. Clojure is dynamically typed, so it can return a number or an anonymous function that produces that number. We can't do that (lest we return object, ick), but we can come pretty close.

Bring in the Trampoline

The idea behind the Trampoline is that it unrolls the recursive calls into sequential calls, by having the functions involved return either a value or a continuation. The trampoline simply does a loop that keeps executing returned continuations until it gets a value, at which point it exits with that value.

What we need for C# then is a return value that can hold either a value or a continuation and with generics, we can create one class to cover this use case universally.

public class TrampolineValue<T> {
    public readonly T Value;
    public readonly Func<TrampolineValue<T>> Continuation;

    public TrampolineValue(T v) { Value = v; }
    public TrampolineValue(Func<TrampolineValue<T>> fn) { Continuation = fn; }
}

Basically it's a container for either a value T or a func that produces a new value container. Now we can build our Trampoline:

public static class Trampoline {
    public static T Invoke<T>(TrampolineValue<T> value) {
        while(value.Continuation != null) {
            value = value.Continuation();
        }
        return value.Value;
    }
}

Let's revisit our original example of two functions calling each other:

Func<int, TrampolineValue<int>> funA, funB = null;
funA = (n) => n == 0
  ? new TrampolineValue<int>(0)
  : new TrampolineValue<int>(() => funB(--n));
funB = (n) => n == 0
  ? new TrampolineValue<int>(0)
  : new TrampolineValue<int>(() => funA(--n));

Instead of returning an int, we simply return a TrampolineValue<int>. Now we can invoke funA without worrying about stack overflows like this:

Trampoline.Invoke(funA(100000));

Voila, the stack problem is gone. It may require a bit more plumbing than a dynamic solution, but not a lot more than adding type declarations, which will always be the syntactic difference between statically and dynamically typed languages. But with lambdas and inference, it doesn't have to be much more verbose.

But wait, there is more!

Using Trampoline.Invoke with TrampolineValue<T> is a fairly faithful translation of the Clojure example, but it doesn't feel natural for C# and actually introduces needless verbosity. It's functional rather than object-oriented, which C# can handle but it's not its best face.

What TrampolineValue<T> and its invocation really represent is a lazily evaluated value. We really don't care about the intermediaries, nor the plumbing required to handle them.

What we want is for funA to return a value. Whether that is the final value or lazily executes into the final value on examination is secondary. Whether or not TrampolineValue<T> contains a value or a continuation shouldn't be our concern, neither should passing it to the plumbing that knows what to do about it.

So let's internalize all this into a new return type, Lazy<T>:

public class Lazy<T> {
    private readonly Func<Lazy<T>> _continuation;
    private readonly T _value;

    public Lazy(T value) { _value = value; }
    public Lazy(Func<Lazy<T>> continuation) { _continuation = continuation; }

    public T Value {
        get {
            var lazy = this;
            while(lazy._continuation != null) {
                lazy = lazy._continuation();
            }
            return lazy._value;
        }
    }
}

The code for funA and funB is almost identical, simply replacing TrampolineValue with Lazy:

Func<int, Lazy<int>> funA, funB = null;
funA = (n) => n == 0
  ? new Lazy<int>(0)
  : new Lazy<int>(() => funB(--n));
funB = (n) => n == 0
  ? new Lazy<int>(0)
  : new Lazy<int>(() => funA(--n));

And since the stackless chaining of continuations is encapsulated by Lazy, we can simply invoke it with:

var result = funA(100000).Value;

This completely hides the difference between a Lazy<T> that needs to have its continuation triggered and one that already has a value. Now that's concise and easy to understand.
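The funA/funB pair always ends in 0, so as a further illustration here is a self-contained sketch where the trampolined result actually depends on the input. The class has the same shape as the Lazy<T> above, but is renamed LazyValue<T> (an assumption of this sketch, to avoid any confusion with System.Lazy<T>), and TrampolineDemo with its Sum, IsEven and IsOdd methods is purely hypothetical.

```csharp
using System;

// Same shape as the post's Lazy<T>; renamed for this sketch only.
public class LazyValue<T> {
    private readonly Func<LazyValue<T>> _continuation;
    private readonly T _value;

    public LazyValue(T value) { _value = value; }
    public LazyValue(Func<LazyValue<T>> continuation) { _continuation = continuation; }

    public T Value {
        get {
            var lazy = this;
            // Run continuations in a loop, so the call stack never grows.
            while (lazy._continuation != null) {
                lazy = lazy._continuation();
            }
            return lazy._value;
        }
    }
}

public static class TrampolineDemo {
    // Sums 1..n, carrying the running total in an accumulator argument.
    public static LazyValue<long> Sum(int n, long acc) {
        return n == 0
            ? new LazyValue<long>(acc)
            : new LazyValue<long>(() => Sum(n - 1, acc + n));
    }

    // Mutual recursion in the style of funA/funB, with a meaningful result.
    public static LazyValue<bool> IsEven(int n) {
        return n == 0
            ? new LazyValue<bool>(true)
            : new LazyValue<bool>(() => IsOdd(n - 1));
    }

    public static LazyValue<bool> IsOdd(int n) {
        return n == 0
            ? new LazyValue<bool>(false)
            : new LazyValue<bool>(() => IsEven(n - 1));
    }
}
```

Reading TrampolineDemo.Sum(100000, 0).Value or TrampolineDemo.IsEven(100001).Value runs the whole chain inside the Value getter's loop, with each continuation returning immediately instead of nesting another call.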

Migrated to Wordpress

A couple of months ago Google deprecated their SFTP publishing from Blogger. I hadn't been excited about Blogger for a long time, but not doing anything to change things won out over doing something about it. At least until I couldn't post anything anymore. To be fair, Google gave me plenty of warning to migrate, but I've been pretty buried with development for the Olympic release of MindTouch, so it's not until now that I've migrated my blog over to Wordpress and can finally post again.

I've got a backlog of things I've been wanting to write about, so hopefully I actually can make the time to commit those thoughts here in the coming weeks.

NoSQL is the new Arcadia

Recently there have been a lot of people lamenting the sheep-like mentality of picking RDBMS (and with it ORMs) as the way to model persistence, without first considering solutions that do not suffer the object-relational impedance mismatch.

Many of the arguments for having to use RDBMS' are easily shot down, such as the relentless requirements for ad hoc reporting against production data (If your OLTP and OLAP are the same DB you are doing it wrong™.) But just because the arguments for picking an RDBMS are often ill-considered, the reasons for abandoning it also seem to suffer from a lack of depth of consideration.

Let me be clear that I do my best to stay away from RDBMS' whenever I can. I have plenty of scars from supporting large production DB environments over the years and there are lots of pain points in writing web applications against RDBMS'. I, too, love schema-less, document and object databases. They make so much sense. I rabidly follow the MongoDB and Riak mailing lists and prototype projects with them and other NoSQL tech, such as Db4o. However, following those lists it is clear to me that a) they are still re-discovering lessons painfully learned by RDBMS folks and b) my knowledge of working with these systems when something goes wrong is woefully behind my knowledge of the same for RDBMS.

Pick the best tool

So yes, marvel at the simplicity of mapping your object model to a document model, or even serialize that object graph using an object or graph DB. But don't just concentrate on what they do better for development, ignoring the day-to-day production support issues. Take a minute and see if you can answer these questions for yourself:

Can you troubleshoot performance problems?

Every RDBMS has some kind of profiling tool and process list. And on the ORM side, Ayende's Uberprof is doing a fantastic job of bringing additional transparency to many ORMs. Do you have any similar tools for the alternative persistence layer? Do you know what's blocking your writes, your reads? What's slowing down your map/reduce? What indices, if applicable, are being hit? And if you're using a sharded setup, profiling just got an order of magnitude more complicated.

What about concurrency on non-key accesses?

Key/value stores are much faster than even primary key hits on RDBMS. And document databases let you store the entire data hierarchy instead of normalizing them across foreign key tables making graph retrieval cheap too.

But as NoSQL goes beyond simple key retrieval with query APIs and map/reduce, concurrency concerns sneak back in along with the query power. Many NoSQL stores are still using single threaded concurrency per node or at least data silos (read: table locking). In RDBMS land, MySQL was the last one to solve that and it did it 6-7 years ago.

What tools do you have to recover a corrupted data file?

Another set of tools you are guaranteed to find with any RDBMS are utilities for recovering corrupted data and index files. Or at the very least utilities for extracting data from them in case of catastrophic failure.

With many NoSQL stores using memory mapped files, corruption on power loss or DB crash is not uncommon. Does your persistence choice have ways to recover those files?

What's your backup strategy?

Most DBs have non-blocking DB dumps. Almost all have replication. Both are valid mechanisms.

Some NoSQL stores use replication to address the problem, others seem to punt on it by using redundant data duplication across nodes. But unless your redundant/replica nodes are geographically separated, it's not the same as being able to go back to a backup on catastrophic loss.

How do you know your replicas are working?

So you say, you don't care if your data gets corrupted or that you can't do live backups, because it all gets replicated to a safe server. Well, much like going back to tape only to discover that your back-up process hasn't actually backed up anything, do you have the tools to ensure that your replicas are up to date and didn't get the corruption replicated into them?

Do your sysadmins share your comfort level?

A lot of these production level and back-up related issues are not even something developers think about, because with the maturity of RDBMS' their maintenance and back-up are often tightly integrated into the sysadmin's processes. If you don't think you need to care about the above questions, chances are you have others doing it for you. And in that case, it's vital that your sysadmins are versed in the NoSQL tool you are choosing before you throw the operations requirements over the wall at them.

The tool you know

Maybe you have all those questions covered for your NoSQL tool of choice. Google, Facebook, LinkedIn do. But likely, you don't. Maybe you don't have them covered for any RDBMS that you know either. But here's the difference: These problems have been tackled in painstaking detail in thousands of RDBMS production environments. So, when you hit a wall with an RDBMS, chances are you can find an answer and get yourself out of that production mess.

The relative novelty and deployment size of most NoSQL solutions means you can't easily fall back on established production experience. Until you have that same certainty when, not if, you face problems in production, you can't really say that you objectively evaluated all choices and found NoSQL to be the superior solution to your problem.

You're an administrator, not THE Administrator

I set up a new dev machine last week and decided to give Win7 a try. My most recent dev setup was using Win2k8 server and it's still my favorite dev environment. Fast, unobtrusive, things just worked.

Win7 appeared to be a different story, reminding me of the evil days of Vista. I had expected it to be more like Win2k8 server, but it just wasn't. I was trying to be zen about the constant UAC nagging and just get used to the way it wanted me to work. But two days in, it just came to a head and after wasting countless hours trying to work within the security circus it set up, I was ready to pave the machine.

Here's just a couple of things that were killing me:

Can't save into Program Files from the web

Had to save into My Documents then move it there. Worse, it told me I had to talk to an administrator about that. I am an administrator!

Can't unzip into Program Files

Same story as above.

Have to whitelist (reserve) URIs for HttpListeners, and you can't wildcard ports

This was the final straw, since my unit tests create random port listeners so that the shutdown failures of a previous test don't hose the registration of the next.

All these things need administrator privileges. But wait, I am an administrator, so what's going on? It appears that being an administrator is more like being in the sudoers file on unix. I have the right to invoke commands in the context of an administrator, but my normal actions don't run in that context. I tried to work around this with registry hacks, shortcuts set to run as administrator and so on, to try to get things to start up with administrator privs by default, but Visual Studio 2k8 just refused to play along. You cannot set it up so that you can double-click on a solution and have it launch the solution as administrator in Win7. And even if you start VS as administrator, you cannot drag&drop files to it since it's now running in a different context than Explorer.

And if you ask MS Connect about this you'll find that, like anything of value, the issue has been closed as "By Design." Ok, look buddy, just because you designed a horrible user experience doesn't mean the problem can just be dismissed.

But why was Win2k8 so much better an experience, a nagging voice kept asking. Turns out that on Win2k8, I just run as Administrator. Win7 never gave that option (and you have to do some cmdline foo to enable the account.) Being a unix guy as well, running dev in what is root just felt distasteful. But distaste or not, it's the key for actually being able to do productive development work in Windows. As soon as I became THE Administrator, instead of an administrator, all was smooth again.

Stupid lesson learned.

Duckpond: Lightweight duck-typing for C#

Edit: Changed As to AsImplementationOf since it's an extension method on object and too likely to collide.

A while back I was talking about Interface Segregation and proposed using either LinFu's DynamicObject or delegate injection. While I played a bit more with delegate injection, in practical use it turned out to be rather ugly and did not really improve readability.

So, I've come back to wanting to cast an object to an interface regardless of what interfaces that object implements. I wanted this to be as simple and lightweight as possible, so rather than using a dynamic proxy framework, I simply rolled my own IL and wrote a pure proxy that does nothing but call the identical method on the class it wraps.

Introducing DuckPond

DuckPond is a very simple and focused library. It currently adds only a single extension method: object.AsImplementationOf<Interface>

Given a class Duck that implements a number of methods, including Quack:

public class Duck {
  public void Quack(double decibels) {
    ...
  }

  //... various other methods ...
}

we can easily cast Duck to a more limited interface that the class doesn't implement such as:

public interface IQuacker {
  void Quack(double decibels);
}

using the `object.AsImplementationOf<T>` extension method:

using Droog.DuckPond;

...

var quacker = new Duck().AsImplementationOf<IQuacker>();

That's all there is to it.

But is it fast?

Honestly, I don't know yet. I have not benchmarked the generated classes against virtual method dispatches or LinFu's and Castle's dynamic proxy. I assume it is, since unlike with dynamic proxy, DuckPond doesn't use an interceptor. Instead it emits Intermediate Language for each call in the interface, dispatching the call against the wrapped instance's counterpart.
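To picture what such a proxy amounts to, here is a hand-written C# equivalent. This is only a sketch of the idea, not DuckPond's actual generated output: the DuckAsQuacker name and the LastDecibels field are assumptions added for illustration, and the real library emits the forwarding class as IL at run time rather than from source.

```csharp
using System;

public interface IQuacker {
    void Quack(double decibels);
}

// Duck does not implement IQuacker; LastDecibels exists only so the
// forwarding can be observed in this sketch.
public class Duck {
    public double LastDecibels;

    public void Quack(double decibels) {
        LastDecibels = decibels;
    }
}

// Hand-written stand-in for an emitted proxy: it implements the interface
// and forwards each call straight to the identically named method on the
// wrapped instance. No interceptor, no reflection per call.
public sealed class DuckAsQuacker : IQuacker {
    private readonly Duck _inner;

    public DuckAsQuacker(Duck inner) {
        _inner = inner;
    }

    public void Quack(double decibels) {
        _inner.Quack(decibels);
    }
}
```

Each interface call costs one extra direct method call on top of the interface dispatch, which is the reason to expect it to be cheaper than an interceptor-based proxy, though that remains unmeasured.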

Try it, fork it, let me know what you think

The code is available now at GitHub: http://github.com/sdether/duckpond

Linq2MongoDB: Building a Linq Provider for MongoDB

This weekend has been a hack-a-thon, trying to build a simple Linq provider for MongoDB. I'm using Sam Corder, et al.'s excellent C# MongoDB Driver as the query pipeline, so my provider really is just a translator from Linq syntax to Mongo Document Query syntax. I call it a hack-a-thon, because it's my first Linq provider attempt and, boy, is that query translator state machine ugly already. However, I am covering every bit of syntax with tests, so that once I understand it all better, I can rewrite the translator in a cleaner fashion.

My goal for this provider is to replace a document storage layer I've built for a new notify.me project using NHibernate against MySQL. This is in no way a judgment against NHibernate. It just happens that for this project, my schema is a heavily denormalized JSON document database. While Fluent NHibernate made it a breeze to let me map it into MySQL, it's really an abuse of an RDBMS. It was a case of prototyping with what you know, but now it's time to evaluate whether a document database is the way to go.

Replacing existing NHibernate code does mean that, eventually, I want the provider to work with POCO entities and use a fully strong-typed query syntax. But that layer will be built on top of the string-key based version I'm building right now. The string-key based version will be the primary layer, so that you never lose any of the schema-less flexibility of MongoDB, unless you choose to.

Basic MongoDB queries

So, lacking an entity with named properties to map against, what does the syntax look like right now? First thing we need is an IQueryable<Document> which is created like this:

var mongo = new Mongo();
var queryable = mongo["db"]["collection"].AsQueryable();

Given the queryable, the queries can be built using the Document indexer like this:

var q = from d in queryable where (string)d["foo"] == "bar" select d;

The Document returns an object, which means a cast is unfortunately required on one side of the conditional. Alternatively, Equals, either the static or instance version, also works, alleviating the need for a cast:

var q = from d in queryable where Equals(d["foo"], "bar") select d;
// OR
var q = from d in queryable where d["foo"].Equals("bar") select d;

Better, but it's not as nice as operator syntax would be, if we could get rid of the casts.

As it turns out there are a number of query operators in MongoDB that don't have an equivalent syntax in Linq, so a helper class to generate query expressions was already needed. The helper is instantiated via the Document extension method .Key(key), giving us the opportunity to overload operators for the various types recognized by MongoDB's BSON. This allows for the following conditional syntax:

var q = from d in queryable
        where d.Key("type") == "customer" &&
              d.Key("created") >= DateTime.Parse("2009/09/27") &&
              d.Key("status") != "inactive"
        select d;

IN and NOT IN

In addition to normal conditional operators, the query expression helper class also defines IN and NOT IN syntax:

// "in" is a reserved word in C#, so the variable needs a different name
var isIn = from d in queryable where d.Key("foo").In("bar", "baz") select d;

var notIn = from d in queryable where d.Key("foo").NotIn("bar", "baz") select d;

The helper will be the point of extension to support more of MongoDB's syntax, so that most query definitions will use the d.Key(key) syntax.

findOne, limit and skip

Linq has matching counterparts to MongoDB's findOne(), limit() and skip() in First or FirstOrDefault, Take and Skip respectively, and the current version of the Linq provider already supports them.
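As a rough illustration of those three operators, the sketch below runs them against an in-memory IQueryable<int>. That source is purely a stand-in assumed for this example; with the provider, the same operators would be applied to the IQueryable<Document> created earlier. PagingDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo {
    // In-memory stand-in for a collection holding the values 1 through 10.
    public static IQueryable<int> Source() {
        return Enumerable.Range(1, 10).AsQueryable();
    }

    // Skip/Take play the role of MongoDB's skip() and limit().
    public static List<int> Page(int skip, int take) {
        return Source().Skip(skip).Take(take).ToList();
    }

    // FirstOrDefault plays the role of findOne(): the first match, or the
    // type's default value (0 for int) when nothing matches.
    public static int FirstOver(int threshold) {
        return Source().Where(x => x > threshold).FirstOrDefault();
    }
}
```

Here PagingDemo.Page(3, 2) yields 4 and 5, PagingDemo.FirstOver(7) yields 8, and PagingDemo.FirstOver(100) falls back to 0, where First would have thrown instead.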

What's missing?

There is a lot in Linq that will likely never be supported, since MongoDB is not a relational DB. That means joins, sub-queries, etc. will not be covered by the provider. Anything that does map to MongoDB's capabilities, though, will be added over time. The low hanging fruit are Count() and order by, with group by following thereafter.

Surprisingly, || (or) conditionals are not going to happen as quickly, since, aside from or-type queries using the .In syntax, they are not directly supported by MongoDB. In order to perform || queries, the query has to be written as a JavaScript function, which would basically mean that as soon as a single || shows up in the where clause, the query translator would have to rewrite all other conditions in JavaScript as well. So, that's a bit more on the nice-to-have end of the spectrum of priorities.

Ready to go!

I will most likely concentrate on the low hanging fruit and then work on the POCO query layer next, since my goal is to be able to try out MongoDB as an alternative to my NHibernate code.

All that said, the code described above works now and is ready for some test driving. It's currently only in my branch on github, but I hope it will make it into the master soon.

About Concurrent Podcast #3: Coroutines

Posted a new episode of the Concurrent Podcast over on the MindTouch developer blog. This time Steve and I delve into Coroutines, a programming pattern we use extensively in MindTouch 2009 and one that I'm also trying out as an alternative to my actor based Xmpp code in Notify.me.

Since there isn't a native coroutine framework in C#, we're using the one provided by MindTouch Dream. It's built on top of the .NET iterator pattern (i.e. IEnumerable and yield) and makes the assumption that all Coroutines are asynchronous methods using Dream's Result<T> object for coordinating the producer and consumer of a return value. Steve's previously blogged about Result. Since those posts there have also been a lot of performance and capability improvements to Result committed to trunk, primarily providing robust cancellation with resource cleanup callbacks. For background on coroutines, you can also check out previous posts I've written.
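For readers who have not seen iterators used this way, here is a toy C# sketch of the underlying idea. It is not Dream's API: there is no Result<T> and nothing asynchronous here, and CoroutineDemo, Worker and RunAll are hypothetical names. It only shows how yield lets a method suspend itself and be resumed later by a driver.

```csharp
using System;
using System.Collections.Generic;

public static class CoroutineDemo {
    // A coroutine as an iterator method: every yield return is a point where
    // it suspends and hands control back to whoever is driving it.
    public static IEnumerator<string> Worker(string name, int steps, List<string> log) {
        for (var i = 1; i <= steps; i++) {
            log.Add(name + " step " + i);
            yield return name;
        }
    }

    // Round-robin driver: resumes each coroutine in turn until all finish.
    public static void RunAll(params IEnumerator<string>[] coroutines) {
        var queue = new Queue<IEnumerator<string>>(coroutines);
        while (queue.Count > 0) {
            var current = queue.Dequeue();
            if (current.MoveNext()) {
                // Not finished yet: put it back at the end of the line.
                queue.Enqueue(current);
            }
        }
    }

    // Interleaves two workers and returns the order in which steps ran.
    public static List<string> Run() {
        var log = new List<string>();
        RunAll(Worker("a", 2, log), Worker("b", 1, log));
        return log;
    }
}
```

The two workers run interleaved on a single thread. In an asynchronous framework the driver would resume a coroutine when the result it is waiting on completes, rather than on a fixed rotation.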

The cool thing about asynchronous coroutines compared to an actor model is that call/response based actions can be written as a single linear block of code, rather than separate message handlers whose contiguous flow can only be determined by examining the message dispatcher. With a message dispatcher that can correlate message responses with suspended coroutines, sending and waiting for a message in a coroutine can be made to look like a method call without blocking the thread. That is vital, especially with message passing concurrency, since a response isn't in any way guaranteed to happen.
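To make the "linear block of code" point concrete, here is a minimal sketch of an iterator-based coroutine. It does not use Dream's actual API: Pending<T>, Ask and the hand-rolled dispatcher steps in Run are hypothetical names invented for this illustration, and Pending<T> is only a toy stand-in for the role Result<T> plays.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical placeholder for a pending response; NOT Dream's Result<T>.
public class Pending<T> {
    public bool HasValue { get; private set; }
    public T Value { get; private set; }
    public void Return(T value) { Value = value; HasValue = true; }
}

public static class CoroutineSketch {

    // Reads as one linear block: send, wait, use the response.
    public static IEnumerator<object> Ask(Queue<Pending<string>> outbox, List<string> log) {
        var reply = new Pending<string>();
        outbox.Enqueue(reply);          // "send" a request
        log.Add("sent");
        yield return reply;             // suspend here without blocking a thread
        log.Add("received " + reply.Value);
    }

    // Plays the part of the dispatcher that correlates a response
    // with the suspended coroutine and resumes it.
    public static List<string> Run() {
        var log = new List<string>();
        var outbox = new Queue<Pending<string>>();
        var coroutine = Ask(outbox, log);
        coroutine.MoveNext();               // runs up to the first yield
        log.Add("dispatcher working");      // the thread is free for other work
        outbox.Dequeue().Return("pong");    // the response arrives
        coroutine.MoveNext();               // resume after the yield
        return log;
    }
}
```

Calling CoroutineSketch.Run() yields the log "sent", "dispatcher working", "received pong". If the response never arrives, the coroutine simply stays suspended at the yield and no thread is tied up waiting for it.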

I'm due to write another post on how to use Dream's coroutine framework, but in the meantime I highly recommend checking out Dream from MindTouch's svn. Lots of cool concurrency stuff in there. trunk is under heavy development, as we work towards Dream profile 2.0, but 1.7.0 is stable and production proven.

Composing remote and local linq queries

One of the cool things with Linq is that queries are composable, i.e. you can add further query constraints by selecting from an existing query. Nothing is executed until you try to read from the query. This allows IQueryable to compose all the added constraints and transform it into the underlying query structure, most commonly SQL.
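A small sketch of that deferral, using Linq to Objects rather than a real IQueryable provider so that it runs on its own. Source is a hypothetical stand-in for a database that logs every item it hands out; the same "nothing runs until you enumerate" behavior is what a query provider relies on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo {

    // Hypothetical stand-in for a database: logs each item as it is read.
    public static IEnumerable<int> Source(List<string> log) {
        for(var i = 1; i <= 5; i++) {
            log.Add("read " + i);
            yield return i;
        }
    }

    public static void Main() {
        var log = new List<string>();
        var q = from i in Source(log) where i > 1 select i;

        // Compose a further constraint by selecting from the existing query.
        var q2 = from i in q where i % 2 == 1 select i;
        Console.WriteLine(log.Count);                   // prints 0: nothing read yet

        var result = q2.ToList();                       // enumeration triggers the reads
        Console.WriteLine(string.Join(",", result));    // prints 3,5
        Console.WriteLine(log.Count);                   // prints 5
    }
}
```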

However, it does come with the pitfall that a lot of things that are legal in Linq expressions will die at run time. This happens because an expression may not have an equivalent syntax in the transformed language, like calling a method as part of a where clause.

This does not mean that you can't use linq for the portion of the query that is not executable by the provider. As long as you know what expression is affected, you can use query composition to build a query that executes some part remotely and some part against the object model in memory.

Let's suppose we wish to execute the following query:

var q = from e in session.Queryable<Entry>()
    where e.Created > DateTime.Parse("2009/9/1")
        && e.Created < DateTime.Now
        && e.Tags.Contains("foo")
    select e;

But our query provider doesn't understand the extension method that allows us to check the list of Tags. In order for this query to work, that portion must be executed against the result set of the date range query. We could coerce the first portion to a list or array and then query that portion, but that would just force the date query to be materialized before we could prune the set. Instead we want to feed the stream of matching entries into a second query, composing a new query that contains both portions as a single query and won't access the database until we iterate over it.

To accomplish this I created an extension method that coerces the query into a sequence that yields each item as it is returned by the database query:

public static class LinqAdapter {
    public static IEnumerable<T> AsSequence<T>(this IEnumerable<T> enumerable) {
        foreach(var item in enumerable) {
            yield return item;
        }
    }
}

UPDATE: As Scott points out in the comments, my AsSequence just re-implements what is already available in Linq as AsEnumerable. So the above really just serves to explain how AsEnumerable defers execution to enumeration rather than query definition.

Anyway, AsSequence or AsEnumerable allows me to compose the query from server and local expressions like this:

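// typed as IEnumerable<Entry> so the local query below can be assigned back to q
IEnumerable<Entry> q = from e in session.Queryable<Entry>()
    where e.Created > DateTime.Parse("2009/9/1")
        && e.Created < DateTime.Now
    select e;
// everything after AsSequence() is evaluated in memory, item by item
q = from e in q.AsSequence() where e.Tags.Contains("foo") select e;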

When q is enumerated, the first expression is converted to SQL and executes against the database. Each item returned from the database is then fed into the second query, which checks its expression and yields the item to the caller, should the expression match. Since q.AsSequence() is used as part of query composition, it does not force the first expression to execute at the time of query definition as q.ToList() would. The additional benefit is that even when q.AsSequence() is executed, it never builds the entire result set in memory as a list to iterate over, but rather just streams each database query result item through its own expression evaluation.
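The streaming behavior can be observed with a self-contained sketch. There is no real database here: Rows is a hypothetical stand-in that logs each row as it is pulled, and AsQueryable() merely plays the role of the provider query. The interleaved log shows each row passing through the local filter before the next one is fetched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo {

    // Hypothetical stand-in for rows streaming back from a database.
    public static IEnumerable<string> Rows(List<string> log) {
        foreach(var row in new[] { "a", "foo", "b" }) {
            log.Add("db:" + row);
            yield return row;
        }
    }

    public static List<string> Run() {
        var log = new List<string>();
        var remote = Rows(log).AsQueryable();           // plays the role of the provider query
        var local = from r in remote.AsEnumerable()     // switch to in-memory operators
                    where r.Contains("o")
                    select r;
        foreach(var r in local) {
            log.Add("got:" + r);
        }
        return log;     // db:a, db:foo, got:foo, db:b
    }
}
```

Had the first query been forced with ToList(), all three "db:" entries would have been logged before the first "got:" entry.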

Of course, this still has the performance implications of sending data across the wire and filtering it locally. However, this is not an uncommon problem when SQL alone cannot provide all the filtering. The benefit of this approach is reduced memory pressure on execution, better control over when execution occurs and the ability to use Linq syntax to do the secondary filtering.