2009

db4o 7.4 binaries for mono

As I talked about recently, the standard binaries for db4o have some issues with mono, so I recompiled the unmodified source with the MONO configuration flag. I've packed up both the debug and release binaries and you can get them here. These are just the binaries (plus license), not the full db4o package. If you want the full package, get it directly from the db4o site, set the MONO config flag and have Visual Studio rebuild the package.

This package should show up on the official db4o mono page shortly as well.

db4o indexing and query performance

Indexing in db4o is a bit non-transparent, imho. There's barely a blurb in their Documentation app, and it just tells you how to create an index and how to remove it. You can't easily inspect whether one exists, or whether it's being used. So I spent a good bit of time today trying to figure out why my queries were so slow: was an index created and, if so, was it being used? The final answer is, if querying is slow in db4o, you're not using an index, because, OMG, you'll know when you do an indexed query.

Index basics

Given an object such as

public class Foo
{
  public string Bar;
}

you create an index, globally (meh) for that object on all databases you create thereafter, with this call:

Db4oFactory.Configure().ObjectClass(typeof(Foo)).ObjectField("Bar").Indexed(true);

So far, straightforward enough. But what if you're using a property? Well, db4o does its magic by inspecting your underlying storage fields, so you have to index those, not the properties that expose them. That means if our object were to have a read-only property Bar, like this:

public class Foo
{
  private string bar;
  public Foo(string bar)
  {
    this.bar = bar;
  }
  public string Bar { get { return bar; } }
}

then the field you need to index is actually the private member bar:

Db4oFactory.Configure().ObjectClass(typeof(Foo)).ObjectField("bar").Indexed(true);
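Put together, the whole round trip might look like the minimal sketch below. It assumes the db4o 7.4 .NET assembly is referenced; the file name foo.db4o and the class name IndexSketch are made up for illustration, and the configuration call is placed before the file is opened because, as noted above, the global configuration applies to databases created thereafter.

```csharp
using System;
using System.Collections.Generic;
using Db4objects.Db4o;

public class Foo
{
  private string bar;
  public Foo(string bar)
  {
    this.bar = bar;
  }
  public string Bar { get { return bar; } }
}

public static class IndexSketch
{
  public static void Main()
  {
    // Index the private storage field "bar", not the Bar property.
    // Done before opening, since the global config affects containers opened afterwards.
    Db4oFactory.Configure().ObjectClass(typeof(Foo)).ObjectField("bar").Indexed(true);

    // "foo.db4o" is a hypothetical file name for this sketch.
    IObjectContainer db = Db4oFactory.OpenFile("foo.db4o");
    try
    {
      db.Store(new Foo("hello"));

      // Native query written against the public property.
      IList<Foo> hits = db.Query<Foo>(delegate(Foo f) { return f.Bar == "hello"; });
      Console.WriteLine("matches: " + hits.Count);
    }
    finally
    {
      db.Close();
    }
  }
}
```

Whether that query actually uses the index is a separate matter, covered under query performance.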

Given this idiosyncrasy, the obvious question is "what about automatic properties?". Well, as of right now the answer is no such luck, because you'd have to reflect over the class to find the underlying storage field that the compiler creates and index that, and you don't get any guarantee that the field is named the same from compiler to compiler or version to version. That probably also means that automatic properties are dangerous all around, because you may never get your data back if the storage changes, although on that conclusion I'm just speculating wildly.
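To see what you'd be up against, here is a small sketch that uses plain .NET reflection (no db4o involved) to list the hidden fields behind an automatic property. AutoFoo and BackingFieldSketch are made-up names. Microsoft's C# compiler currently produces a name along the lines of <Bar>k__BackingField, but that is an implementation detail, which is exactly the problem.

```csharp
using System;
using System.Reflection;

public class AutoFoo
{
  // Automatic property: the compiler generates the storage field.
  public string Bar { get; set; }
}

public static class BackingFieldSketch
{
  public static void Main()
  {
    // List the non-public instance fields, which include the generated backing field.
    FieldInfo[] fields = typeof(AutoFoo).GetFields(BindingFlags.Instance | BindingFlags.NonPublic);
    foreach (FieldInfo field in fields)
    {
      // The printed name depends on the compiler that built this assembly.
      Console.WriteLine(field.Name);
    }
  }
}
```

Whatever name gets printed is what you'd have to hand to ObjectField(), and it's only good for as long as the compiler keeps generating it that way.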

Query performance

Index in hand, I decided to populate a DB, always checking whether the item already existed, using a db4o native query. The query time started at 1 ms and then increased linearly with every item added. That sure didn't seem like an indexed search to me. I finally discovered a useful resource on the db4o site, but unfortunately it's behind a login, so Google didn't help me find it and my link to it will only take you to the login. That's a shame because this bit of information ought to be somewhere in big bold letters!

You must have the following DLLs available for Native Queries to be optimized into SODA queries, which apparently is the format that hits the index:

  • Db4objects.Db4o.Instrumentation.dll
  • Db4objects.Db4o.NativeQueries.dll
  • Mono.Cecil.dll
  • Cecil.FlowAnalysis.dll

The query will execute fine regardless of their presence, but the performance difference between the optimized, index-using query and the unoptimized native query is orders of magnitude. My queries went from 100-500ms to 0.01ms, just by dropping those DLLs into my executable directory. Yeah, that's a useful change.
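Since a missing DLL fails silently, a startup check can save some head scratching. The sketch below is plain .NET and only assumes that the assemblies sit next to the executable, as they did for me; it won't notice copies installed elsewhere (the GAC, for instance). OptimizerDllCheck is a made-up name.

```csharp
using System;
using System.IO;

public static class OptimizerDllCheck
{
  // The assemblies the native query optimizer needs, as listed above.
  static readonly string[] Required =
  {
    "Db4objects.Db4o.Instrumentation.dll",
    "Db4objects.Db4o.NativeQueries.dll",
    "Mono.Cecil.dll",
    "Cecil.FlowAnalysis.dll"
  };

  public static void Main()
  {
    // Directory the executable was started from.
    string baseDir = AppDomain.CurrentDomain.BaseDirectory;
    foreach (string name in Required)
    {
      bool present = File.Exists(Path.Combine(baseDir, name));
      Console.WriteLine("{0}: {1}", name, present ? "found" : "MISSING");
    }
  }
}
```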

Interestingly enough, the same is not required for Linq queries. They seem to hit the index without the extra help (although just to even run, Mono.Cecil and Cecil.FlowAnalysis need to be present, so here you at least get an error). There currently appears to be about 1ms of overhead for parsing Linq into SODA, but I'll take that hit for the syntactic sugar.

Conclusions

I'm pretty happy with the simplicity and performance of db4o so far. It seems like an ideal local, queryable persistence layer. The way it works does make me want to abstract my data model into simple data objects that are then converted into business entities. I'd rather have the attribute-based markup of ActiveRecord, but that's not a deal breaker.

Db4o on .NET and Mono

After failing to get a cross-platform sample of NHibernate/Sqlite going, I decided to try out Db4o. This is for a simple, local object persistence layer anyhow, nothing more than a local cache, so db4o sounded perfect.

The initial DLLs for 7.4 worked beautifully on .NET but ran into problems on Mono. Apparently db4o imports FlushFileBuffers from kernel32.dll if your build target is not CF or mono. And in its call to FlushFileBuffers it uses FileStream.SafeFileHandle.DangerousGetHandle(), which is not yet implemented under Mono, resulting in this exception:

Unhandled Exception: System.NotImplementedException: The requested feature is not implemented.
  at System.IO.FileStream.get_SafeFileHandle () [0x00000]
  at Sharpen.IO.RandomAccessFile.Sync () [0x00000]
  at Db4objects.Db4o.IO.RandomAccessFileAdapter.Sync () [0x00000]
  ...

I found this page on the Db4o site, which suggested just falling back to FileStream.Handle. However, for me that just resulted in this:

Unhandled Exception: System.EntryPointNotFoundException: FlushFileBuffers
  at (wrapper managed-to-native) Sharpen.IO.RandomAccessFile:FlushFileBuffers (intptr)
  at Sharpen.IO.RandomAccessFile.Sync () [0x00000]
  at Db4objects.Db4o.IO.RandomAccessFileAdapter.Sync () [0x00000]
  ...

So, I simply defined MONO as a compilation symbol in Visual Studio and rebuilt it. I figure the only time this code will run on Windows is during testing, so treating it as mono is fine. And that did solve my issues and I now have a DLL for db4o 7.4 that works beautifully across .NET and mono from a single build.
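For anyone who hasn't used compilation symbols, the sketch below shows the general pattern. This is NOT db4o's actual source, just my hypothetical illustration of how a MONO symbol can compile the kernel32 import out of existence; SyncSketch and the temp file name are made up. Building it with MONO defined (in the project's conditional compilation symbols, or via the compiler's define switch) leaves only the managed Flush().

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class SyncSketch
{
#if !MONO
  // Windows-only native call; only compiled in when MONO is not defined.
  [DllImport("kernel32.dll", SetLastError = true)]
  static extern bool FlushFileBuffers(SafeFileHandle handle);
#endif

  public static void Sync(FileStream stream)
  {
    // Managed flush, available on every platform.
    stream.Flush();
#if !MONO
    // Extra OS-level flush on Windows builds only.
    FlushFileBuffers(stream.SafeFileHandle);
#endif
  }

  public static void Main()
  {
    // "sync-sketch.tmp" is a hypothetical scratch file.
    using (FileStream stream = new FileStream("sync-sketch.tmp", FileMode.Create))
    {
      stream.WriteByte(42);
      Sync(stream);
    }
  }
}
```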

Being a Linq nut, I immediately decided to skip the Native Query syntax and dive into using the Linq syntax instead. That worked great on mono 2.0.1, but unfortunately on the current Red Hat rpm (stuck back in 1.9.1 land), the Linq implementation isn't quite complete and you get this:

Unhandled Exception: System.NotImplementedException: The requested feature is not implemented.
  at System.Linq.Expressions.MethodCallExpression.Emit (System.Linq.Expressions.EmitContext ec) [0x00000]
  at System.Linq.Expressions.LambdaExpression.Emit (System.Linq.Expressions.EmitContext ec) [0x00000]
  at System.Linq.Expressions.LambdaExpression.Compile () [0x00000]
  at Db4objects.Db4o.Linq.Expressions.SubtreeEvaluator.EvaluateCandidate (System.Linq.Expressions.Expression expression) [0x00000]
  ...

But falling back from this syntax:

var items = from RosterItem r in db where r.CreatedAt > DateTime.Now.Subtract(TimeSpan.FromMinutes(10)) select r;

to the Native Query syntax (with delegates replaced by lambdas) gets around the problem:
var items = db.Query<RosterItem>(r => r.CreatedAt > DateTime.Now.Subtract(TimeSpan.FromMinutes(10)));

It's still a fairly compact and straightforward syntax, so until I finish setting up my own CentOS mono RPMs, I'll stick to this syntax.

I need to run db4o through some more serious paces, but I like what I see so far.