Linq2MongoDB: Building a Linq Provider for MongoDB
This weekend has been a hack-a-thon, trying to build a simple Linq provider for MongoDB. I'm using Sam Corder, et al.'s excellent C# MongoDB Driver as the query pipeline, so my provider really is just a translator from Linq syntax to Mongo Document Query syntax. I call it a hack-a-thon because it's my first Linq provider attempt and, boy, is that query translator state machine ugly already. However, I am covering every bit of syntax with tests, so that once I understand it all better, I can rewrite the translator in a cleaner fashion.
My goal for this provider is to replace a document storage layer I've built for a new notify.me project using NHibernate against mysql. This is in no way a judgment against NHibernate. It just happens that for this project, my schema is a heavily denormalized json document database. While fluent NHibernate made it a breeze to map it into mysql, it's really an abuse of an RDBMS. It was a case of prototyping with what you know, but now it's time to evaluate whether a document database is the way to go.
Replacing existing NHibernate code does mean that, eventually, I want the provider to work with POCO entities and use a fully strongly typed query syntax. But that layer will be built on top of the string-key based version I'm building right now. The string-key based version will be the primary layer, so that you never lose any of the schema-less flexibility of MongoDB, unless you choose to.
Basic MongoDB queries
So, lacking an entity with named properties to map against, what does the syntax look like right now? The first thing we need is an IQueryable<Document>, which is created like this:
Given the queryable, the queries can be built using the Document indexer like this:
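The original sample is not reproduced here, so the following is only a minimal sketch of the shape of such a query. It assumes a stand-in Document class (a plain string-keyed dictionary) and uses LINQ to Objects in place of the provider's queryable; neither is the driver's or the provider's actual API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the driver's Document type (assumption): string keys, untyped values.
public class Document : Dictionary<string, object> { }

public static class IndexerQueryDemo
{
    public static List<Document> Run()
    {
        // LINQ to Objects stands in for the provider's IQueryable<Document>.
        IQueryable<Document> queryable = new List<Document>
        {
            new Document { { "type", "customer" }, { "name", "bob" } },
            new Document { { "type", "order" }, { "name", "widget" } },
        }.AsQueryable();

        // The indexer returns object, so one side is cast to string.
        // Without the cast, == would compile as a reference comparison.
        var query = from doc in queryable
                    where (string)doc["type"] == "customer"
                    select doc;

        return query.ToList();
    }
}
```

Because the where clause is captured as an expression tree, a provider can inspect the indexer call and the constant instead of executing them, which is what makes the translation to a Mongo query document possible.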
The Document indexer returns an object, which means a cast is unfortunately required on one side of the conditional. Alternatively, Equals, either the static or instance version, also works, alleviating the need for a cast:
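As a hedged illustration of the two Equals forms, again using a stand-in Document class and LINQ to Objects rather than the real provider (both are assumptions made so the sketch runs on its own):

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the driver's Document type (assumption).
public class Document : Dictionary<string, object> { }

public static class EqualsQueryDemo
{
    private static IQueryable<Document> Source()
    {
        return new List<Document>
        {
            new Document { { "type", "customer" }, { "name", "bob" } },
            new Document { { "type", "order" }, { "name", "widget" } },
        }.AsQueryable();
    }

    // Static version: object.Equals(object, object) compares by value, no cast needed.
    public static List<Document> StaticEquals()
    {
        return (from doc in Source()
                where object.Equals(doc["type"], "customer")
                select doc).ToList();
    }

    // Instance version: the virtual Equals dispatches to string.Equals at runtime.
    public static List<Document> InstanceEquals()
    {
        return (from doc in Source()
                where doc["type"].Equals("customer")
                select doc).ToList();
    }
}
```

Both forms show up in the expression tree as ordinary method calls, so a translator can recognize them by method name and treat them like an equality test.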
Better, but it's not as nice as operator syntax would be, if we could get rid of the casts.
As it turns out, there are a number of query operators in MongoDB that don't have an equivalent syntax in Linq, so a helper class to generate query expressions was already needed. The helper is instantiated via the Document extension method .Key(key), giving us the opportunity to overload operators for the various types recognized by MongoDB's BSON. This allows for the following conditional syntax:
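To make the operator-overloading idea concrete, here is a rough sketch. The KeyValue class, its operators, and the Document stand-in are hypothetical names invented for this illustration; only the .Key(...) extension method name comes from the text above. The operators are evaluated in memory here so the example runs by itself, whereas a real provider would only inspect them in the expression tree.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the driver's Document type (assumption).
public class Document : Dictionary<string, object> { }

// Hypothetical helper wrapping the value stored under one key.
public sealed class KeyValue
{
    private readonly object _value;
    public KeyValue(object value) { _value = value; }

    // Typed overloads remove the need for casts at the call site.
    public static bool operator ==(KeyValue k, string s) { return object.Equals(k._value, s); }
    public static bool operator !=(KeyValue k, string s) { return !(k == s); }
    public static bool operator >(KeyValue k, int n) { return k._value is int && (int)k._value > n; }
    public static bool operator <(KeyValue k, int n) { return k._value is int && (int)k._value < n; }

    public override bool Equals(object obj)
    {
        var other = obj as KeyValue;
        return other != null && object.Equals(_value, other._value);
    }

    public override int GetHashCode() { return _value == null ? 0 : _value.GetHashCode(); }
}

public static class DocumentExtensions
{
    // A missing key yields a helper wrapping null, so comparisons simply fail.
    public static KeyValue Key(this Document doc, string key)
    {
        object value;
        doc.TryGetValue(key, out value);
        return new KeyValue(value);
    }
}

public static class KeyQueryDemo
{
    public static List<Document> Run()
    {
        var docs = new List<Document>
        {
            new Document { { "type", "customer" }, { "name", "bob" }, { "age", 30 } },
            new Document { { "type", "customer" }, { "name", "alice" }, { "age", 19 } },
            new Document { { "type", "order" }, { "name", "widget" } },
        }.AsQueryable();

        return (from doc in docs
                where doc.Key("type") == "customer" && doc.Key("age") > 21
                select doc).ToList();
    }
}
```

Since user-defined operators are recorded in the expression tree together with their method, a translator can map each overload to the matching MongoDB comparison without ever calling it.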
IN and NOT IN
In addition to normal conditional operators, the query expression helper class also defines IN and NOT IN syntax:
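A small sketch of what IN and NOT IN could look like on such a helper. The KeyValue class and the In/NotIn signatures are assumptions for illustration (the text only confirms that an .In syntax exists), and the checks run in memory instead of being translated; in MongoDB itself these correspond to the $in and $nin query operators.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the driver's Document type (assumption).
public class Document : Dictionary<string, object> { }

// Hypothetical helper; In/NotIn signatures are illustrative only.
public sealed class KeyValue
{
    private readonly object _value;
    public KeyValue(object value) { _value = value; }

    // True when the stored value equals any of the candidates.
    public bool In(params object[] candidates) { return candidates.Contains(_value); }

    // True when the stored value equals none of the candidates.
    public bool NotIn(params object[] candidates) { return !In(candidates); }
}

public static class DocumentExtensions
{
    public static KeyValue Key(this Document doc, string key)
    {
        object value;
        doc.TryGetValue(key, out value);
        return new KeyValue(value);
    }
}

public static class InQueryDemo
{
    private static IQueryable<Document> Source()
    {
        return new List<Document>
        {
            new Document { { "type", "customer" } },
            new Document { { "type", "order" } },
            new Document { { "type", "invoice" } },
        }.AsQueryable();
    }

    public static int CountIn()
    {
        return Source().Count(doc => doc.Key("type").In("customer", "order"));
    }

    public static int CountNotIn()
    {
        return Source().Count(doc => doc.Key("type").NotIn("customer", "order"));
    }
}
```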
The helper will be the point of extension to support more of MongoDB's syntax, so that most query definitions will use the
findOne, limit and skip
Linq has matching counterparts for MongoDB's findOne, limit and skip in First, Take and Skip respectively, and the current version of the Linq provider already supports them.
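How a translator might pick paging values out of a query can be sketched with the standard ExpressionVisitor class. This is not the provider's actual translator; PagingVisitor and PagingDemo are hypothetical names, and the sketch ignores the fact that the order of Skip and Take changes the meaning of a Linq query.

```csharp
using System.Linq;
using System.Linq.Expressions;

// Hypothetical visitor that collects constant Skip/Take arguments from a query.
public sealed class PagingVisitor : ExpressionVisitor
{
    public int? Skip { get; private set; }
    public int? Limit { get; private set; }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Queryable.Skip(source, count) and Queryable.Take(source, count)
        // carry the count as their second argument.
        if (node.Method.DeclaringType == typeof(Queryable) && node.Arguments.Count == 2)
        {
            var constant = node.Arguments[1] as ConstantExpression;
            if (constant != null && constant.Value is int)
            {
                if (node.Method.Name == "Skip") Skip = (int)constant.Value;
                if (node.Method.Name == "Take") Limit = (int)constant.Value;
            }
        }
        // Keep walking so that calls nested in the source are seen too.
        return base.VisitMethodCall(node);
    }
}

public static class PagingDemo
{
    public static PagingVisitor Read()
    {
        // Any IQueryable works here; the query is inspected, never executed.
        var query = Enumerable.Range(0, 100).AsQueryable().Skip(10).Take(5);
        var visitor = new PagingVisitor();
        visitor.Visit(query.Expression);
        return visitor;
    }
}
```

The collected values would then be handed to the driver as the skip and limit of the underlying find call.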
There is a lot in Linq that will likely never be supported, since MongoDB is not a relational DB. That means joins, sub-queries, etc. will not be covered by the provider. Anything that does map to MongoDB's capabilities, though, will be added over time. The low hanging fruit is order by, with group by following thereafter.
|| (or conditionals) are not going to happen as fast, since aside from or type queries using the .In syntax, it is not directly supported by MongoDB. In order to perform
|| shows up in the
Ready to go!
I will most likely concentrate on the low hanging fruit and then work on the POCO query layer next, since my goal is to be able to try out MongoDB as an alternative to my NHibernate code.