Iloggable

Searching a Tree of Objects with Linq

UPDATE: Posted a follow-up here. I've finally had legitimate use for LINQ to Objects, not just to make the syntax cleaner, but also to simplify the underlying code and provide me a lot of flexibility without significant work.

The scenario

I have a tree of objects that have both a type and a name. The name is unique; the type is not. The interface is this:

public interface INode
{
  int Id { get; }
  string Name { get; set; }
  string Type { get; set; }
  List<INode> Children { get; }
}

I want to be able to find a single named Node in the tree and I want to be able to retrieve a collection of all nodes for a particular type. The searchable interface could be expressed as this:

public interface ISearchableNode : INode
{
  INode FindByName(string name);
  IEnumerable<INode> FindByType(string type);
}

Both require me to walk the tree and examine each node, so clearly I just want to have one walk routine and generically evaluate the node in question. In C# 2.0 parlance, that means I could pass an anonymous delegate into my generic find routine and have it recursively iterate through all the children. I also pass along a resultset to be populated.

The signature for the evaluation delegate looks like this:

delegate bool FindDelegate(INode node);

but since I'm using C# 3.0 (i.e. .NET 3.5) I can use lambda expressions to avoid creating a delegate and simplify my syntax. Instead of FindDelegate, I can simply use Func<INode,bool>:

// Instead of this:
private void Find(INode node, List<INode> resultSet, FindDelegate findDelegate);
// called like this for a Name search:
Find(this, resultSet, delegate(INode node) { return node.Name == name; });

// I can use this:
private void Find(INode node, List<INode> resultSet, Func<INode, bool> f)
// called like this:
Find(this, resultSet, node => node.Name == name);

Thus giving me the following implementation for ISearchableNode:

public INode FindByName(string name)
{
  List<INode> resultSet = new List<INode>();
  Find(this, resultSet, x => x.Name == name);
  return resultSet.FirstOrDefault();
}

public IEnumerable<INode> FindByType(string type)
{
  List<INode> resultSet = new List<INode>();
  Find(this, resultSet, x => x.Type == type);
  return resultSet;
}

private void Find(INode node, List<INode> resultSet, Func<INode, bool> f)
{
  if (f(node))
  {
    resultSet.Add(node);
  }
  foreach (INode child in node.Children)
  {
    Find(child, resultSet, f);
  }
}

Problem solved, move on... Well, except there is significant room for improvement. Here are the two main issues that ought to be resolved:

  1. The API is limited to two types of searches, and exposing the generic Find makes for ugly calling code. It would be much nicer if queries against the tree could be expressed in LINQ syntax.
  2. It's also inefficient for the Name search, since I'm walking the entire tree even if the first node matches the criteria.

LINQ to Hierarchical Data

In order to use LINQ to objects, I need to either create a custom query provider or implement IEnumerable. The latter is significantly simpler and could be expressed using the following interface:

public interface IQueryableNode : IEnumerable<INode> { }

Ok, ok, I don't even need an interface, I could just implement IEnumerable... But what does that actually mean? In the simplest sense, I'm iterating over the node's children; however, I also want to descend into the children's children and so on. So a simple foreach won't do. I could do the same tree walking with a resultset as I did above and return the enumerator of the resulting list to implement the interface, but C# 2.0 introduced a much more useful way to implement non-linear iterators: the yield keyword. Instead of building a list to be iterated over, yield lets the iterating code return values as they are found, which means it can be used for recursive iteration. Thus GetEnumerator is implemented simply as follows:

#region IEnumerable<INode> Members
public IEnumerator<INode> GetEnumerator()
{
  yield return this;
  foreach (INode child in Children)
  {
    // Recursing via the child's own enumerator walks its whole subtree
    foreach (INode subchild in (IEnumerable<INode>)child)
    {
      yield return subchild;
    }
  }
}
#endregion
#endregion

#region IEnumerable Members
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
  return this.GetEnumerator();
}
#endregion

Nice and simple and ready for LINQ.

Searching for all Nodes of a type becomes

var allBar = from n in x
                where n.Type == "bar"
                select n;
foreach (INode n in allBar)
{
  // do something with that node
}

and the search for a specifically named node becomes

INode node = (from n in x
              where n.Name == "i"
              select n).FirstOrDefault();

But the real benefit of this approach is that I don't have hard-coded search methods, but can express much more complex queries in a very natural syntax without any extra code on the Node.

Deferred execution

As it turns out, using yield for the recursive iteration also solved the second issue. As yield returns values as it encounters them during iteration, the search doesn't happen until the query is executed. And one of the side effects of LINQ syntax is that creating a query does not execute it until the result set is iterated over. Therefore, FirstOrDefault() actually short-circuits the query as soon as the first match (and in case of Name, it's going to be the only match) is hit.
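The short-circuiting is easy to demonstrate with a small sketch. The counting sequence below is a stand-in for the node tree (any iterator behaves the same way); it shows that nothing is enumerated until the query result is consumed, and that First() stops as soon as a match is hit:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DeferredDemo
{
  // Yields 0..99, invoking onVisit for every element it hands out
  static IEnumerable<int> Numbers(Action onVisit)
  {
    for (int i = 0; i < 100; i++)
    {
      onVisit();
      yield return i;
    }
  }

  static void Main()
  {
    int visited = 0;
    var query = from n in Numbers(() => visited++)
                where n == 3
                select n;

    // Defining the query enumerated nothing yet
    Console.WriteLine(visited); // 0

    int match = query.First();

    // First() pulled elements only until the match was found
    Console.WriteLine(visited); // 4
  }
}
```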

Threading: Mail.app vs. Thunderbird

I generally prefer message forums over mailing lists for community discussions because of the better separation of topics and implicit threading. Not that forums are ideal for threads, since they generally cannot easily spawn sub-threads.

So, when reading discussions on mailing lists, I try to use the threading mode of the mail client to bring some clarity to the discussion and filter out discussions I don't care about. I am a big Imap proponent and read mail with Thunderbird on the PC and Mail.app on the Mac. Overall I like Mail.app better, but sometimes it does exhibit the Mac app tendency of "if you don't like the way we do things, well, sucks for you", vs. Thunderbird's more liberal configurability.

I'll leave the whole subject of how well clients group messages into threads alone... Ok, just one stab at that subject. A subject line of "Hey" is not too uncommon. So all messages with the subject "Hey", which are almost guaranteed to be unrelated, get lumped into a thread together. Happens in both readers. I know that threading in mail is ad-hoc, so I am not faulting the clients. It's just a silly artifact.

But here's some behavior that I find not only unintuitive but downright tedious:

On both Mail.app and Thunderbird, I can collapse and expand threads by using the left and right arrows. So when I decide that a discussion is not of interest to me, I just collapse the thread and hit delete. On Mail.app, the thread is deleted, as I'd expect. On Thunderbird, the current message in the thread is deleted and the next message becomes the head of the thread. So to delete a thread I have to open the thread, select all messages manually and then delete. Bah!

I've dug around the config and even advanced config, but can't find a way to change this behavior.

Stupid ExtensionMethod tricks

I have yet to decide whether Extension Methods in C# are a boon or a bane. I've already been frustrated several times by Intellisense not showing me a method that was legal somewhere else, until I could figure out what using statement brought that extension method into scope. On one hand Extension Methods can be used to simplify code; on the other, I see them as a source of much confusion as they become popular.

Worse yet, they have the potential for more abuse than the blink tag, imho.

The one place I see extension methods being instrumental is in defining fluent interfaces, yet another practice I have yet to decide whether I am in favor of or not. Partially, because I don't see them as intrinsically easier to read. Partially because they allow for much syntactic abuse.

So today, I created a fluent interface for an operation that I wish was just supported in the language in the first place -- the between operator. It exists in some SQL dialects and is a natural part of so many math equations. I wish I could just write:

if( 0.5 < x < 1.0 )
{
  // do something
}

Instead, I'll settle for this:

if( x.Between(0.5).And(1.0) )
{
  // do something
}

The first part is easy: it's just an Extension Method on double. And if I just had it take the lower and upper bound, then we would have been done. But this is where the fluent interface bug bites me and I want to say And. This means that Between can't return a boolean. It needs to return the result of the lower bound test along with the value being tested. That means that Between returns a helper class, which has one method, And, which finally returns the boolean value.

public static class DoubleExtensions
{

  public static BetweenHelper Between(this double v, double lower)
  {
    return new BetweenHelper(v > lower, v);
  }

  public struct BetweenHelper
  {
    public bool passedLower;
    public double v;

    internal BetweenHelper(bool passedLower, double v)
    {
      this.passedLower = passedLower;
      this.v = v;
    }

    public bool And(double upper)
    {
      return passedLower && v < upper;
    }
  }
}

That's a lot of code for a simple operation and it's still questionable whether it really improves readability. But it is a common enough operation, if you do a lot of bounds checking, that it might be worth throwing into a common code dll. I've yet to make up my mind; I mostly did it because I wanted to play with the syntax.

One week of Windows on Macbook

Spent this past week doing development in XP under VMware Fusion. I hook up a Windows keyboard and mouse when I'm stationary, and when in the Windows Space there is no way to tell that I'm not on a native machine. I actually had a harder time on the Mac side, since I had to remember to hit the Windows key to get the normal Apple-Command behavior.

The latest thing I tried, and seriously didn't expect to work, was hooking up my HTC Apache. As much trouble as I've had with ActiveSync on my desktop machine, I figured that either the USB connection wouldn't even be seen by Windows or that ActiveSync would just bomb. But instead, ActiveSync found the phone and synced everything. Now I even have Standard Time with me on the laptop, instead of my old setup of desktop and phone for time tracking.

My new Visual Studio Dev workhorse: Macbook Pro

I'm just starting some new contract work that requires a bit more on-site than usual and instead of syncing up my various dev environments all the time, I decided that it's time for a new laptop. My current laptop is a 15" Powerbook G4. It's been a great machine, but for the past couple of years it's mostly been an expensive Email/Browser appliance with the occasional Eclipse/Java diversion.

This time I needed something to let me do VS.NET development and all the other MS related things that come across my path. Now, obviously my previous laptop choice shows I'm not impartial, but I certainly wasn't going to settle for less than what I had. I went through this last time as well, and the conclusion is not only the same, but decidedly more in favor of a Macbook this time around.

Go ahead, try it... Find a laptop like this: slim profile, powerful CPU, large widescreen LCD, light and sturdy. By the time you find the PC equivalent, it doesn't end up being any cheaper. And for some reason, PCs still come mostly in two flavors: 1) light with a small screen or low power, and 2) giant hulking powerhouses that are not that much more convenient than the original PC Portables. Add to that, I have yet to find another laptop with a shell as sturdy as the Macs' -- if you saw some of the dents my old Powerbook has sustained, you'd realize that I would have cracked open any plastic-shelled laptop a long time ago.

Ok, enough with the rationalization already, buy your Mac, be a fanboy. Don't come whining when it can't cut it as your dev machine...

I have to admit that the last 3 days have included more reboots than I'd care to admit, but I finally have everything configured just right and I don't mind saying that this rig is freakin' sweet!

The setup is as follows: 15" 2.4GHz Core 2 Duo Macbook Pro w/ 2GB RAM & 160GB HD. 140GB Leopard partition. 20GB Boot Camp partition (a short-sighted mistake on sizing) with Win XP Pro, Visual Studio Orcas, etc. VMware Fusion 1.1 Beta running the Boot Camp partition as a virtual image under Leopard.

So what pitfalls have I come across?

Bootcamp was a pain to get going

VMware Fusion was just about the most painless Windows install I've ever done. It asked me for the key before it started and took care of everything until it booted into XP. Bravo!

Boot Camp on the other hand complained about my disk being corrupt (brand new Mac, mind you). Rebooted from CD, ran disk repair, was told there was nothing to repair. Tried Boot Camp again. Success! Started the XP install on the formatted partition that Boot Camp set up. Got to the reboot early in the install, the Mac rebooted, complained it couldn't boot from CD and that the HD had errors, please press any key -- but no keys produced results. Hard rebooted, ejected the CD, tried to restart the install via Boot Camp, but hit the same problem. Removed the Boot Camp partition, started over, this time manually formatting the disk during install to be sure. Some more issues aside, Boot Camp finally installed. Only then did I find out that VMware Fusion can use the Boot Camp partition as a virtual image. Now that's useful. Except I had sized it as the emergency fall-back Windows install. Doh... Well, I'll just mount the Mac disk on the Windows side for storage. Then all my files are inside FileVault as well.

XP Activation with Boot Camp/VMware Fusion

After finding out I could use a single install both natively and as a virtual instance, I was thrilled and started up the Boot Camp partition. VMware had to tweak it some, but it came up. XP activation got invalidated because I had apparently changed the hardware too much. The same happened when I rebooted into XP directly. Now it seems ok, but Redmond has received about 4 activations from the same XP install in 2 days. No, I'm not frantically installing on all my friends' PCs, I'm just trying to get one install stable.

VMware Fusion Unity and Leopard Spaces have some issues

Now, I have no right to expect something as funky as Unity to work with an OS feature that was released 4 days ago, so I'm not really bitching, just warning. For me, using Unity with Spaces caused Spaces to switch back and forth automatically, about twice a second, whenever I had two Windows windows in different Spaces. And my Mac apps all lost their windows. So for now, XP runs fullscreen in its own Space and it's all wonderfully stable.

WPF likes hardware acceleration

So in the middle of the night I wake up in a panic. Everything was working so wonderfully, but had I missed something? Well, I kept saying "I don't care that virtualization doesn't support the GPU, I'm not planning to play games on this machine". Hahaha... But what about WPF? It uses the GPU to render all that fancy vector goodness! Did I just buy another email appliance? I fired up some WPF samples and they worked fine. As they got fancier, things got a bit choppier and the 3D was a slideshow. But work, it did. So WPF gracefully falls back to software-only mode. I rebooted into native XP and WPF was running in all its glory. Yay, all is good.

Ok, this is day 3 with my new rig and I'm very happy on both the Mac and Windows side. I even have a single dev machine that can test all browsers currently supported by Silverlight. And once Moonlight drops, I'll just fire up a VM of Fedora and cover that use case as well. Let's see if the euphoria lasts.

DynamicProxy rocks!

Recently, I've had a need to create proxy objects in two unrelated projects. What they had in common was that they both dealt with .NET Remoting. In one it was traditional Remoting between machines, in the other working with multiple AppDomains for Assembly flexibility. In both scenarios, I had a need for additional proxies other than the Remoting created Transparent proxy and the Castle project's DynamicProxy proved invaluable and versatile. It's a bit sparse on documentation, but for the most part, you should be able to figure things out by playing with it and lurking around their forum.

If you're not familiar with the Castle project, check it out, because DynamicProxy is really only a supporting player in an excellent collection of useful tools. From their Inversion of Control Containers MicroKernel/Windsor to MonoRail and ActiveRecord (built on NHibernate), there is a lot there that can make your life easier.

But why would I need to create proxies, when the Remoting infrastructure in .NET takes care of this for you with Transparent Proxies? Well, let me describe both scenarios:

Resilient Client/Server operations with Remoting

From my experience with .NET Remoting, it's dead simple to do something simple. But it really is best suited for WebService stateless calls, because there isn't a whole lot exposed to add quality of service plumbing. The transparent proxy truly is transparent until something fails. Same is true on the server side, where you don't get a lot of insight into the clients connected to you.

And then there are events, which, imho, are one of the greatest things about doing client/server programming with Remoting. Those are painful in two ways: having to have a MarshalByRefObject helper proxy the event calls, and pruning dead clients on the server, which you won't find out about until you try to invoke their event handlers.

But those shortcomings are not enough reason to fall back to sockets and custom/stateful wire protocols. Instead I like to wrap my transparent proxy in another proxy that has all the plumbing for maintaining the connection, pinging the server and intelligently handling failure. Originally I created them by hand, but I just converted my codebase over to use DynamicProxy the other day.

Using CreateInterfaceProxyWithoutTarget and having the proxy derive from my plumbing baseclass automagically provides me with an object that looks like my target interface but wraps the Remoting proxy with my quality of service code.

public static RemotingClient<T> GetClient(Uri uri)
{
  Type t = typeof(T);
  ProxyGenerator g = new ProxyGenerator();
  ProxyGenerationOptions options = new ProxyGenerationOptions();
  options.BaseTypeForInterfaceProxy = typeof(RemotingClient<T>);
  RemotingClient<T> proxy = 
    (RemotingClient<T>)g.CreateInterfaceProxyWithoutTarget(
      t, new Type[0], options, new ProxyInterceptor<T>());
  proxy.Uri = uri;
  return proxy;
}

I'll do another post later just on Remoting, since the whole business of getting events to work was a bit of a labor and isn't documented that well.

Loading and unloading Assemblies on the fly

The second project started with wanting to be able to load and unload plug-ins dynamically and has now devolved into a framework for auto-updating components of an App at runtime.

This involves loading the assemblies providing the dynamic components into new AppDomains, so that we can unload the assemblies again by unloading the AppDomain. Instances of the components class are then marshaled across the AppDomain boundary by reference so that the main AppDomain never loads the dynamic assembly. This way, when a component needs to be updated, I can dump that AppDomain and recreate it.
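The load/unload cycle itself looks roughly like the sketch below. The assembly and type names are hypothetical placeholders, and the plug-in type is assumed to derive from MarshalByRefObject, with the contract interface living in a common assembly loaded by both AppDomains:

```csharp
using System;

// The plug-in contract lives in a common assembly loaded by both AppDomains
public interface IPlugin
{
  void DoWork();
}

class PluginHost
{
  static void Main()
  {
    // Load the dynamic assembly into its own AppDomain...
    AppDomain pluginDomain = AppDomain.CreateDomain("plugins");

    // ...and create the component inside it. Only a remoting proxy crosses
    // back, so the main AppDomain never loads the dynamic assembly itself.
    IPlugin plugin = (IPlugin)pluginDomain.CreateInstanceAndUnwrap(
      "MyPlugins",            // assembly name (hypothetical)
      "MyPlugins.SomePlugin"  // type name (hypothetical)
    );

    plugin.DoWork();

    // To update the component, drop the whole AppDomain, which unloads
    // the assembly with it, then recreate both
    AppDomain.Unload(pluginDomain);
  }
}
```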

The resilience of the connection isn't in question here, although there is need for quite a bit of plumbing to get references to the remoted components disposed before the AppDomain can be unloaded. Again, a DynamicProxy can help on the main AppDomain side by acting as a facade to the actual reference, so that you can reload the underlying assembly and recreate the reference without the main application having to be aware of it.

But I haven't gotten that far, nor have I decided whether that would be the best way, rather than an explicit disposal and recreation of the references.

Where DynamicProxy comes to the rescue here is when the object to be passed across the boundary isn't derived from MarshalByRefObject. This could be a scenario where you are trying to use a component that's already derived from another baseclass, so adding MarshalByRefObject isn't an option. Now DynamicProxy, with its ability to provide a baseclass for the proxy, gives you a sort of virtual multiple inheritance.

This is also an ideal scenario for using Castle.Windsor as the provider of the component. Using IoC, I was able to create a generic common dll that contains all the interfaces as well as a utility class for managing the components that need to be passed across the boundary. This class never has to know anything about the dynamic assembly, other than that the assembly stuffs its components into the AppDomain's Windsor container.

The resulting logic for creating an instance that can be marshaled across AppDomain boundaries looks something like this:

public T GetInstance<T>()
{
  Type t = typeof(T);
  if (!t.IsInterface)
  {
    throw new ArgumentException("Type must be an interface");
  }
  try
  {
    T instance = container.Resolve<T>();
    if (typeof(MarshalByRefObject).IsAssignableFrom(instance.GetType()))
    {
      return instance;
    }
    else
    {
      ProxyGenerator generator = new ProxyGenerator();
      ProxyGenerationOptions generatorOptions = new ProxyGenerationOptions();
      generatorOptions.BaseTypeForInterfaceProxy = typeof(MarshalByRefObject);
      return (T)generator.CreateInterfaceProxyWithTarget(t, instance, generatorOptions);
    }
  }
  catch (Castle.MicroKernel.ComponentNotFoundException)
  {
    return default(T);
  }
}

All in all, DynamicProxy is something that really should be part of the .NET framework. Proxying and facading patterns are just too common to only support them via the heavier, MarshalByRefObject-dependent remoting infrastructure.

The version of DynamicProxy I was using was DynamicProxy2, which is part of the Castle 1.0 RC3 install. It's a lot more versatile than the first DynamicProxy and deals with generics, but mix-ins are not yet implemented in this version. However, if you just need a single mix-in, specifying the base class for your proxy can go a long way to solving that limitation.

Vista vs. XP

About a month ago, I finally gave in and upgraded my main dev machine. It was getting a little slow for my various build tasks, and running multiple instances of LFS for my test environment just didn't work. That Bioshock had just been released had nothing to do with it, really!

Anyway, I didn't actually upgrade as much as build a new machine and turn my old dev box into another test target. So I figured I might as well go for Vista on this machine. I'd heard plenty of anecdotes, usually sticking to one end of the love/hate spectrum or the other, and I wanted to find out where I'd fit.

Well, it's been a month with Vista, with regular excursions to XP on other machines, and overall I am right at the meh centerline of the spectrum. Vista gives me no trouble, any more than any new OS install does until I tweak it to my liking. There are various UI aspects that I like. But if I were forced to use only XP, I'd just shrug my shoulders. I really can't find anything about the OS that I care deeply enough about to make me prefer or dislike it more than XP.

So no, it's not the worst product MS has ever put out and a clear sign that they're finally losing their stranglehold on the desktop, as you'll hear from one extreme; nor is it a next-generation OS solving all the problems we've been having with the previous generation of OSs. It's just another version of Windows, if you ask me. From my experience I have to discount the tales of massive downgrade rushes as FUD that's becoming somewhat self-fulfilling. Sure, you ought to have good hardware for Vista. But what software put out today doesn't work better on today's hardware vs. the machine you bought when XP was fresh?

Even for my latest Redhat Fedora install I finally had to admit that my trusty home server from 5 years ago was a bit too weak to handle it. Sure, proportionally Fedora needs much less hardware, but then I don't try to put it through heavy UI lifting, which is, imho, where most horsepower, for better or for worse, goes these days.

Automatic properties syntax wish

Just a quick thought on Automatic Properties in C# 3.0 (.NET 3.5, Orcas, whathaveyou). I, like most .NET developers, have spent too much time writing

private string foo;

public string Foo
{
 get { return foo; }
 set { foo = value; }
}

This can now be replaced with

public string Foo { get; set; }

Now that's nice and all, but for the most part it doesn't seem like a big step up from just making your field public. You do get the benefit of being able to use the property in places where reflection is used to discover properties, i.e. in most data-binding scenarios, which don't accept public fields. And it lets you implement interfaces, since they also can't define public fields. But what's a lot more useful, at least to me, is replacing

private string foo;

public string Foo { get { return foo; } }

with

public string Foo { get; private set; }

Now you can create those read-only properties with very simple, clean syntax. That'll clean up a lot of code for me.

But we still have the scenario where properties have side-effects, like calling the event trigger for INotifyPropertyChanged. This cannot be expressed with automatic properties, and we're back to defining a separate private field. What I would love to see is some syntactic sugar along the lines of

public string Foo
 : private _foo
{
 get;
 set
 {
   string previous = _foo;
   _foo = value;
   if(previous != _foo)
   {
     OnPropertyChanged("Foo");
   }
 }
}

I know this doesn't buy that much in terseness (the get is shorter and you don't have to declare the type of _foo), but it means that the private storage used by the property is part of the property definition, which makes the definition that much easier to read, imho. That, or allow pre and post conditions for set in Automatic Properties, although the syntax I can think of for that doesn't seem any cleaner than the above.
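For comparison, this is the full manual pattern we're stuck writing today for a property with change-notification side-effects (the class name is just a placeholder):

```csharp
using System.ComponentModel;

public class Item : INotifyPropertyChanged
{
  // the separate private field that automatic properties can't give us
  private string _foo;

  public string Foo
  {
    get { return _foo; }
    set
    {
      if (_foo != value)
      {
        _foo = value;
        OnPropertyChanged("Foo");
      }
    }
  }

  public event PropertyChangedEventHandler PropertyChanged;

  protected void OnPropertyChanged(string name)
  {
    if (PropertyChanged != null)
    {
      PropertyChanged(this, new PropertyChangedEventArgs(name));
    }
  }
}
```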

TDD & you can't test what you can't measure

Recently I've been dealing with a lot of bug fixing that I can't find a good way to wrap tests around. Which is really annoying, because it means that as things get refactored these bugs can come back. These scenarios are almost always UI related, whether it's making sure that widgets behave as expected, or monitoring the visual state of an external application I'm controlling (Live For Speed, in this case). What all these problems have in common is that the existence of a bug can only be divined by an operator, because somewhere there is a lack of instrumentation that could programmatically tell me the state I'm looking for. And to paraphrase the old management saying, "You can't test what you can't measure".

My best solution is the usual decoupling of business logic from UI. Make everything a user can do an explicit function of your model, and now you can test those functions with unit tests. At least your business logic is solid, even if your presentation gets left out in the cold. And depending on your UI layer, a lot of instrumentation can still be strapped on. WinForms controls are generally so rich that nearly everything you would want to test for can probably be queried from the controls. But things that you can test by eye within a second may take a lot of lines of code to test programmatically, and of course those tests go right out the window when you refactor your UI. And if you're trying to test the proper visual states of a DataGridView and its associated bindings, then you're in for some serious automation pains.
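As a trivial sketch of what I mean (the names are illustrative, not from a real project): instead of burying a validation rule in a button's Click handler, make it an explicit function of the model, and the rule becomes unit-testable without any UI in sight:

```csharp
// The rule lives in the model, not in a Click handler
public class LoginModel
{
  public bool CanSubmit(string user, string password)
  {
    return !string.IsNullOrEmpty(user) && !string.IsNullOrEmpty(password);
  }
}

// A plain unit test covers the rule without instantiating a single control:
//   Assert.IsTrue(new LoginModel().CanSubmit("bob", "secret"));
//   Assert.IsFalse(new LoginModel().CanSubmit("", "secret"));
// The form's OK button then just calls model.CanSubmit(...) to decide
// whether to enable itself.
```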

I know that business logic is the heart of the application, but presentation is what people interact with and if it's poor, then it doesn't matter how kick ass your back end is. So for the time being that means that UI takes a disproportionate amount of time to run through its possible states and it's something I would like to handle more efficiently.

LFSLib 0.16b w/ InSim Relay support released

With this version, LFSLib.NET gains support for the InSim Relay currently in testing (see this Forum thread for details). The InSim Relay allows you to get a listing of servers connected to the relay and to send and receive InSim packets to them without having to connect to them directly. To create an InSimHandler for the relay, simply call:

IInSimRelayHandler relay = InSimHandler.GetMasterRelayHandler();

This server isn't up yet, so for testing Victor has set up the following server:

IInSimRelayHandler relay = InSimHandler.GetRelayHandler("vic.lfs.net", 47474);

IInSimRelayHandler implements a subset of the main InSimHandler, discarding methods, properties and events that don't make sense for the relay and adding a couple of relay-specific ones.

The remainder of the changes are minor tweaks and bug fixes:

  • Added auto-reconnect feature to handler (really only works for TCP, since UDP is connectionless)
  • CameraPositionInfo now allows ShiftU & related properties to be set
  • Updated TrackingInterval to accept values as low as 50ms
  • BUGFIX: Fixed race condition where a LFSState event could be fired on a separate thread before the main thread noticed that LFS was connected, allowing for invalid NotConnected exceptions to occur.
  • BUGFIX: Fixed Ping event (was looking for the wrong packet)
  • BUGFIX: RaceTrackPlayer.HandicapIntakeRestriction always returned zero

Full details are in the release notes.

All links, docs, etc are at lfs.fullmotionracing.com