

The problem with Frameworks

Over the years, I've developed a dislike for frameworks, especially ORMs and web stacks such as Rails. But aside from complaining about "magic" and a vague icky feeling, I could never eloquently explain why. Meanwhile, all my spare-time web projects have been done with Express and, man, I haven't had more fun in years, again without being able to eloquently explain why.

So when it came time to pick a stack for a Scala web app, I was staring at a sea of options. Play seemed to be the obvious choice, being part of the Typesafe stack, but looking at their documentation I got that icky feeling again. I searched for some vs. posts to hear what people like/dislike about the many options available. And that's where I came across a paragraph that was the best explanation of why I steer clear of frameworks when I can:

"[...] frameworks are fine, but it is incumbent upon you to have an intimate understanding of how such frameworks abstract over this infrastructure, as well as the infrastructure itself. Some will disagree with that, arguing that the very reason you want these abstractions is so you don't need this understanding. If you buy that then I wish you luck; you'll surely need it should you decide to develop a non-trivial application."

                                             -- Scalatra vs Unfiltered vs Lift vs Play

Frameworks mean you have to know more, not less, about what you are doing

It all comes down to the fallacy that an abstraction alleviates the need to understand what it abstracts. Sure, for the scenarios where what you are trying to do maps 100% to the target scenarios of the framework and you have no bugs, frameworks can really cut down on that boilerplate. This is why scaffolding examples of most frameworks are so magically concise. They exercise exactly what the framework was built to be best at. But the moment you step out of their wheelhouse or need to troubleshoot something (even if the bug is in your own code), you'll have to know not only how the abstracted system should work but also how the abstraction works, giving you two domains to master rather than just one.

The ORM example

I used to be a huge proponent of ORMs. I even wrote two and a half of them. My original love for them stemmed from believing that I could curb bad SQL getting into production by providing an abstraction. I was violently disabused of that notion and have since come to accept that badly written code isn't a structural, but rather a cultural problem of developers either not understanding what is bad or not caring because they are insulated from the pain their code causes.

But I was still convinced that ORMs were an inherent win. I went on to evangelize NHibernate, Linq-2-SQL, LLBLGen Pro. Entity Framework never got a chance, since I was already burned out by the time it stopped sucking.

Over time it became obvious ORMs were not saving me time or preventing mistakes. Quite the opposite: setup became increasingly complex, as you had to design schemas AND write mapping code to get them to work right. Usage went the same way: I knew what SQL I wanted to execute and now had a new dance of tweaking the ORM queries to do what I wanted. I was an abstraction whisperer. And that's not even counting the hours wasted on bugs. Debugging through the veil of abstraction usually ended up showing not an error in business logic or data structure but simply in usage or configuration of the ORM.

I had resisted it for several years, but finally had to admit that Ted Neward was right: ORMs are the Vietnam of Computer Science.

But... DRY, damnit!

One of the attractions of frameworks is that they drastically cut down on repetitive boilerplate code. I admit that using component libraries, you will likely do more manual wiring in your initial setup than with a framework, but this is generally a one-time setup per project. In my experience that one-time cost of explicit wire-up pales in comparison to the benefits of having explicit configuration that you can trace down when problems do occur. So my stance is that wire-up isn't repeated boilerplate but rather explicit specification and so does not violate the spirit of DRY.

For the most part I favor components that simplify working in an infrastructure's domain language over frameworks that try to hide that domain and expose it as a different one. Sooner or later you will have to understand what's happening under the hood and when that time comes, having a collection of helpers for working with the native paradigm beats trying to diagnose the interactions of two (or more) domains.
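To make the "helpers for the native paradigm" idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. The `query` helper, the `posts` table and its columns are all made up for illustration; the point is only that the SQL stays visible and the helper does nothing but remove the row-unpacking chore.

```python
import sqlite3

def query(conn, sql, params=()):
    """Run the SQL exactly as written and return rows as plain dicts.

    Hypothetical helper: it hides no SQL, it only saves the tuple unpacking.
    """
    cur = conn.execute(sql, params)
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]

# Explicit, one-time wire-up: an in-memory database and a made-up schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO posts (title, views) VALUES (?, ?)",
    [("ORMs", 120), ("Namespaces", 45)],
)

# The query you wanted to run is the query that runs.
popular = query(
    conn,
    "SELECT title, views FROM posts WHERE views > ? ORDER BY views DESC",
    (100,),
)
print(popular)
```

When something goes wrong here, there is one domain to debug: the SQL on the screen.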

Namespaces: Obfuscating Xml for fun and profit

One reason Xml is hated by many is namespaces. While the concept is incredibly useful and powerful, the implementation, imho, is a prime example of over-engineered flexibility: It's so flexible that you can express the same document in a number of radically different ways that are difficult to distinguish with the naked eye. This flexibility then becomes the downfall of many users, as well as simplistic parsers, trying to write XPath rather than walking the tree looking at localnames.

Making namespaces confusing

Conceptually, it seems very useful to be able to specify a namespace for an element so that documents from different authors can be merged without collision and ambiguity. And if this declaration was a simple unique map from prefix to Uri, it would be a useful system. You see a prefix, you know it has a namespace that was defined somewhere earlier in the document. Ok, it could also be defined in the same node -- that's confusing already.

But that's not how namespaces work. In order to maximize flexibility, there are a number of aspects to namespacing that can make them ambiguous to the eye. Here are what I consider the biggest culprits in muddying the waters of understanding:

Prefix names are NOT significant

Let's start with a common misconception that sets the stage for most comprehension failures that follow, i.e. that the prefix of an element has some unique meaning. The below snippets are identical in meaning:

<xsl:stylesheet version="1.0" xmlns:xsl="">
  <xsl:template match="/">

<a:stylesheet version="1.0" xmlns:a="">
  <a:template match="/">

The prefix is just a short alias for the namespace uri. I chose xsl because there are certain prefixes like xsl, xhtml, dc, etc., that are used so consistently with their namespace uris that a lot of people assume that the name is significant. But it isn't. Someone may give you a document with their favorite prefix and on first look, you'd think the xml is invalid.
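A quick way to convince yourself is to let a parser resolve the names. This is a small sketch using Python's standard xml.etree.ElementTree, which reports each element as `{uri}localname`. The namespace uri `urn:example:xslt` is a placeholder I made up for the demonstration, not the real XSLT uri.

```python
import xml.etree.ElementTree as ET

NS = "urn:example:xslt"  # placeholder uri, assumed for illustration only

doc_xsl = (
    f'<xsl:stylesheet version="1.0" xmlns:xsl="{NS}">'
    '<xsl:template match="/"/>'
    "</xsl:stylesheet>"
)
doc_a = (
    f'<a:stylesheet version="1.0" xmlns:a="{NS}">'
    '<a:template match="/"/>'
    "</a:stylesheet>"
)

def qualified_names(xml_text):
    """Return every element name in document order as {uri}localname."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

# The prefix is gone after parsing; only the uri and local name remain.
print(qualified_names(doc_xsl))
print(qualified_names(doc_a))
```

Both calls print the same list, because the parser throws the prefix away as soon as it has looked up the uri.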

Default Namespaces

Paradoxically, default namespaces likely came about to make namespacing easier and encourage their use. If you want your document to not conflict with anything else, it's best to declare a namespace:

<my:a xmlns:my="ns1"/>

But that's just tedious. I just want to say "assume that everything in my document is in my namespace":

<a xmlns="ns1"/>

Beautiful. I love default namespaces!

Ah, but wait, there's more! A default namespace can be declared on any element and governs all its children. Yep, you can override previous defaults and elements at the same hierarchy level could have different namespaces without looking different:

<a xmlns="ns1">
  <b xmlns="ns2"><c/></b>
  <b xmlns="ns3"><c/></b>
</a>

Here it looks like we have an element a with two child elements b, each with an element c. Except not only is the first b really {ns2}b and the second b {ns3}b, but even worse, the c elements which have no namespace declaration are also different, i.e. {ns2}c and {ns3}c. This smells of someone being clever. It looks like a feature serving readability when it does exactly the opposite. Use this in larger documents with some more nesting and the only way you can determine whether and what namespace an element belongs to is to use a parser. And that defeats the human readability property of Xml.
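Since a parser is the only reliable way to tell, here is what one says. This is a sketch with Python's standard xml.etree.ElementTree; the document is the nested-default-namespace case described above, written out with the c children and closing tags.

```python
import xml.etree.ElementTree as ET

doc = (
    '<a xmlns="ns1">'
    '<b xmlns="ns2"><c/></b>'
    '<b xmlns="ns3"><c/></b>'
    "</a>"
)

# iter() walks the tree in document order; tag is {uri}localname.
names = [el.tag for el in ET.fromstring(doc).iter()]
for name in names:
    print(name)
```

The two c elements look identical in the text, yet they come out as {ns2}c and {ns3}c, inheriting whatever default was in force at their parent.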

Attributes do not inherit the default namespace

As if default namespaces didn't provide enough obfuscation power, there is a special exception to them and that's attributes:

<a xmlns="ns1">
  <b c="who am i">blah</b>
</a>

So you'd think this is equivalent to:

<x:a xmlns:x="ns1">
  <x:b x:c="who am i">blah</x:b>
</x:a>

But you'd be wrong. @c isn't @x:c, it's just @c. It's without namespace. The logic goes like this: Namespaces exist to uniquely identify nodes. Since an attribute is already inside a uniquely identifiable container, the element, it doesn't need a namespace. The only way to get a namespace on an attribute is to use an explicit prefix. Which means that if you wanted @c to be in the namespace {ns1}, but not force every element to declare the prefix as well, you'd have to write it like this:

<a xmlns="ns1">
  <b x:c="who am i" xmlns:x="ns1">blah</b>
</a>

Oh yeah, much more readable. Thanks for that exception to the rule.
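The attribute exception is easy to verify with a parser. A small sketch with Python's standard xml.etree.ElementTree, comparing the plain attribute with the explicitly prefixed one:

```python
import xml.etree.ElementTree as ET

plain = '<a xmlns="ns1"><b c="who am i">blah</b></a>'
prefixed = '<a xmlns="ns1"><b x:c="who am i" xmlns:x="ns1">blah</b></a>'

b_plain = ET.fromstring(plain)[0]
b_prefixed = ET.fromstring(prefixed)[0]

# The element inherits the default namespace in both documents...
print(b_plain.tag, b_prefixed.tag)

# ...but only the explicitly prefixed attribute ends up in {ns1}.
print(b_plain.attrib)
print(b_prefixed.attrib)
```

The element is {ns1}b either way, while the attribute key is a bare c in the first document and {ns1}c in the second.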

Namespace prefixes are not unique

That last example is a perfect segue into the last, oh, my god, seriously?, obfuscation of namespacing: You can declare the same namespace multiple times with different prefixes and, even more confusingly, you can define the same prefix with different namespaces.

<x:a xmlns:x="ns1">
  <x:b xmlns:x="ns2">
    <x:c xmlns:x="ns1">you don't say</x:c>
  </x:b>
  <y:b xmlns:y="ns1">
    why would you do this?
  </y:b>
</x:a>

Yes, that is legal AND completely incomprehensible. And yes, people aren't likely to do this on purpose, unless they really are sadists. But I've come across equivalent scenarios where multiple documents were merged together without paying attention to existing namespaces. In fairness, trying to understand existing namespaces on merge is a pain, so it might have been purely done in self-defense. This is the equivalent of spaghetti code and it's enabled by needless flexibility in the namespace system.

XPath needs unambiguous names

So far I've only addressed the ambiguity in authoring and in visually parsing namespaced Xml, which has plenty of pain points just in itself. But now let's try to find something in one of these documents.

<x:a xmlns:x="ns1">
  <x:b xmlns:x="ns2">
    <x:c xmlns:x="ns1">you don't say</x:c>
  </x:b>
  <y:b xmlns:y="ns1">
    why would you do this?
  </y:b>
</x:a>

Let's get the c element with this xpath:


But that doesn't return any results. Why not? The main thing to remember with XPath is that, again, prefixes are NOT significant. That means, just because you see a prefix used in the document doesn't actually mean that XPath can find it by that name. Again, why not? Indeed. After all, the x prefix is defined, so why can't XPath just use that mapping? Well, remember that in this example x means something different depending on where you are in the document. XPath doesn't work contextually, it needs unique names to match. Internally, XPath needs to be able to convert the element names into fully qualified names before ever looking at the document. That means what it really wants is a query like this:


Since namespaces can be used in all sorts of screwy ways to use the same prefixes to mean different things contextually, the prefixes seen in the text representation of the document are useless to XPath. Instead, you need to define manual, unique mappings from prefix to namespace, i.e. you need to provide a unique lookup from prefix to uri. Gee, unique prefix... Why couldn't the Xml document spec for namespaces have respected that requirement as well?
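Here is roughly what that manual mapping looks like in practice. This is a sketch with Python's standard xml.etree.ElementTree, which supports only a subset of XPath but takes the same kind of prefix-to-uri lookup. The prefixes `one` and `two` are names I picked for the query side; they appear nowhere in the document.

```python
import xml.etree.ElementTree as ET

doc = """<x:a xmlns:x="ns1">
  <x:b xmlns:x="ns2">
    <x:c xmlns:x="ns1">you don't say</x:c>
  </x:b>
  <y:b xmlns:y="ns1">why would you do this?</y:b>
</x:a>"""
root = ET.fromstring(doc)

# Reusing the document's x prefix with one fixed meaning finds nothing:
# with x bound to ns1, x:b is the element written as y:b, which has no c child.
naive = root.find("x:b/x:c", {"x": "ns1"})
print(naive)

# A unique, query-side mapping from prefix to uri (prefix names are my own).
PREFIXES = {"one": "ns1", "two": "ns2"}
c = root.find("two:b/one:c", PREFIXES)
other_b = root.find("one:b", PREFIXES)
print(c.text)
print(other_b.text)
```

The query prefixes only have to be unique within the lookup you hand to the XPath engine; what the document happened to call things is irrelevant.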

Namespace peace of mind: Be explicit and unique

The best you can do to keep namespacing nightmares at bay is to follow 2 simple rules for formatting and ingesting Xml:

  1. Only use default namespacing on the root node
  2. Keep your prefixes unique (preferably across all documents you touch)

There, done, ambiguity is gone. Now make sure you normalize every Xml document that passes through your hands by these rules and bathe in the light of transparency. It's easier to read, and you can initialize XPath with that global nametable of yours so that your XPath representation will match your rendered Xml representation.
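A normalization pass along these lines can be sketched with Python's standard xml.etree.ElementTree: register one prefix per uri, then re-serialize. The name table below (`p1`, `p2`) is hypothetical, and this sketch covers rule 2 only; it uses no default namespace at all rather than one on the root.

```python
import xml.etree.ElementTree as ET

MESSY = """<x:a xmlns:x="ns1">
  <x:b xmlns:x="ns2">
    <x:c xmlns:x="ns1">you don't say</x:c>
  </x:b>
  <y:b xmlns:y="ns1">why would you do this?</y:b>
</x:a>"""

# Hypothetical global name table: exactly one prefix per namespace uri.
NAME_TABLE = {"p1": "ns1", "p2": "ns2"}
for prefix, uri in NAME_TABLE.items():
    ET.register_namespace(prefix, uri)  # note: this registry is process-wide

# Re-serializing declares each namespace once, on the root, with our prefixes.
normalized = ET.tostring(ET.fromstring(MESSY), encoding="unicode")
print(normalized)

# The same table now drives queries, so the query reads like the document.
c = ET.fromstring(normalized).find("p2:b/p1:c", NAME_TABLE)
print(c.text)
```

After the round trip, the rendered Xml and the query use the same prefixes, which is the whole point of keeping a global table.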

Oh, you can keep the disc, but the bits ain't yours

After writing "Maybe it's time to stop pretending we buy software?", I talked to a couple of gamer friends about their purchasing habits. The concept of DLC that has a strong "withheld content" smell came up, and whether this was a "first buyer bonus" or "selling crippled products" had no straightforward and agreed answer. But what did emerge is that pricing of games is a key factor in why used sale and purchase are considered a right. The sentiment is that at ~$60 there is an expectation that the game has a residual value you can recoup should the game not justify itself as a keeper. Which, of course, itself is an indicator that our usage patterns of games are much closer aligned with a rental than purchase relationship. In particular, one friend uses Amazon's Trade-In store as a form of game rental. Depending on how many games you play, it is a better deal than Gamefly.

Now it turns out that arguing about used games and whether they are crippled or not may not even be an issue in the future. Ars Technica did a great summary called "No, you don't own it: Court upholds EULAs, threatens digital resale" of the US Court of Appeals for the Ninth Circuit ruling re: Vernor v. Autodesk. The gist is that EULAs are enforceable and that you may really only own a non-transferable license. In the light of keeping your upgrade while selling the old version, that makes sense to me. Of course fairness should dictate that you can sell your license. Then again fairness should also dictate that you don't make copies of the software for all your friends. So, given the unenforceability of fairness, software companies use draconian licensing EULAs and consumers have chosen to ignore them out of hand. This legal decision, however, has the potential to escalate this conflict, and if companies go after used game stores, used DVD stores, etc., I predict that piracy will really run rampant, as consumers will take the only option available to them in fighting rules that violate their sense of fairness.

Physical products engender First Sale Doctrine expectations

I personally have not bought a new Xbox game, relying on my Amazon wishlist for those. Of all the games I've played on the Xbox, only GTA4 has felt like it justified its full price. The ones I have bought were used and the ones that had no replay value I sold. After all, I had a box with a disc sitting there, so of course I can sell that box.

I have, however, bought plenty of games on Steam. It's a digital sale -- I can install it on any computer when I want to play it but I can't ever sell it or even let someone borrow it. Yet I am happy about those purchases. Ok, what is wrong with me? The difference to me, if I were to try to put a finger on it, lies in a combination of pricing and purchasing experience.

New PC games are usually at least $10 cheaper. Whether you claim that this price difference is historical or because consoles are recouping hardware costs, it makes a new game easier to digest. Add to that that Steam has mastered the art of the long tail, reducing prices for older games, frequently doing brief yet radical sales and even adding games not originally released on Steam along with patches and support for newer features such as cloud save game storage. Finally, with Steam (even if this is more Valve itself than anyone else), you usually get a longer life out of the game, with updates and free multiplayer.

The purchasing itself further severs the physical ownership bond you have with boxed games. Aside from a less painful price point, it's simple, immediate gratification, being able to buy a game at any time of day or night. You also generally don't run an installer, you just buy and wait until Steam tells you that the game is ready to play. In all respects the experience feels like a service not a product, which reduces the feeling that you own something that you can resell.

As a game dev, what seems like a better way to deal with the fact that only some percentage will buy your game at full price? Only make money at full sale and try to encourage new purchase by devaluing used games with exclusive DLC, etc? Or sell electronically and cut out used games entirely, but attract those not willing to pay full price by offering sales later? After all, 9 months after release $19.99 (like Left 4 Dead right now) still beats not seeing a dime.

While the Vernor v. Autodesk decision may embolden publishers to crack down on used sales, I sure hope more will follow Valve's model. After all, gamers generally don't talk fondly of publishers, but Valve is almost uniformly a hero to the community and that's while preventing gamers from selling their games. Sounds like they've got a good model going.

People have no business driving on the highway

Image courtesy of Atwater Village Newbie

I'm going to go rather deeply off-topic and venture into tl;dr territory: Every time I drive through LA or am on the long 4-lane interstate corridors of Barstow-Las Vegas or the central valley, my mind spends a lot of time contemplating how highway driving is such an inefficient process. It's a perfect scenario of continuous lanes with defined entry/exit/merge points. You get on at one point and off at some other point. The whole having to drive the vehicle between the two points is not only a misapplication of resources but human nature seems to ensure that it'll always be slower than it has to be.

Why autonomous highway vehicles won't happen (anytime soon)

Before I make the case why and how autonomous highway travel could happen, let's just get the naysaying out of the way. Won't may be strong, but since my objections are based on people, not on technology, I can't foresee this change happening in the near term. Long after the technological hurdles are crossed, the psychological ones, fear and self-determination, are likely to linger.

Fear of yielding control to machines is as old as machines. We're deeply suspicious of anything that wants to take over a task we believe requires our own skillset. Even if repeatedly proven wrong, we believe that a human being can do things better than any machine. Disregarding the thousands of people that die in car accidents due to their own failings (exceeding their or their car's reaction capabilities, driving impaired, etc.), we are willing to accept those deaths as the cost of driving. But should a single person die because of a computer malfunction, all computer controlled cars should immediately be suspended. We only have to look at the recent, false, accusation that Priuses were running amok because of a faulty on-board computer and the public outcry as proof.

And even if we trusted cars to be better drivers, we still would not yield control because we want to be the ones that decide what to do next. This is more true in car cultures like the US, but the need for self-determination means that we want to be able to drive where and how we want at all times (ignoring that we already have agreed to a myriad of rules of the road). Maybe we want to cross three lanes to get off at that exit. Or we want to weave through traffic. After all, our superior cognitive skills will get us there faster than flowing with the computer controlled pack, right?

How could it work?

There's just too much variability and unpredictability involved in driving for computers to take over. Well, not so fast. On surface streets that's certainly true. There are so many unexpected factors, such as bikes, pedestrians, ambiguous signage, bad directions, etc., that require decisions not based on hard rules, and they will keep daily driving out of the reach of autonomous vehicles for a while. But highways are different. 99% of all unexpected decision making on highways is due to humans driving in the first place. If you didn't have to react to the unpredictable cars around you, it's a simple set of rules: There's lanes, there's entrance and exit points, there's lane merges and splits, and with communication at lightspeed, reacting to conditions created by another car would be far more reliable than the visual detection and reaction of a driver.

So let's say highways are a controlled environment that can be managed by today's technology. How would something like this come to pass, especially since we can't just set up a new, separate highway system and can't turn it on overnight?

Autonomous vehicles

One fear and realistic obstacle in computer controlled cars is the central control of all traffic, that even with redundancy is seen as a single point of failure. Also extending trust in computers to trusting some massive government controlled computer is a special leap that's spawned a hundred dystopian sci-fi stories. For this system to have a chance, each car needs to be in control of itself. People will trust their own cars before they trust an outside entity.

You would pull onto the entrance ramp, determine where you want to get off and the car would take over, merge into the traffic flow and on exit at your destination, the car would hand control back over or stop if it sensed that you weren't acknowledging transfer of control. I'll cover how this is possible next, but the important concept is that it's really just an auto-pilot for your car.

Recognition of the static environment

In order for your car to work on auto-pilot, it needs to have a way to recognize entrances, exits, lanes, etc. This could be done with a combination of GPS markers and RFID: GPS for the layout of major features, such as interchanges, entrances and exits, and RFID to determine boundaries, etc. This static environment can be built out and expanded one highway at a time and the combination of GPS and RFID means that there is a general expectation with a local verification of that expectation, i.e. a physical safeguard to override outdated data.

Recognition of the dynamic environment

Just as important as recognizing the lanes is recognizing cars and other obstacles. By using RFID, radar and/or visual recognition and WIFI networking, cars would be able to detect surrounding cars as well as communicate speed changes and negotiate merges. This communication would also allow the forwarding of conditions far ahead without requiring a central traffic computer. It's basically peer-to-peer traffic control. Since the computers would lack the ego of drivers, merges would not require sudden stops that ripple for miles behind and cars could drive faster and closer while still being safer.

The awareness of all other autonomous vehicles and the propagation of information also allows the detection and avoidance of out-of-system obstacles, such as physical objects, cars with malfunctioning systems or rogue drivers who are controlling their cars manually. Once one of these conditions is detected, it might trigger manual control for everyone, which would just return us to the crappy situation we already have, but it still wouldn't be sudden since traffic ahead would warn our cars long before we'd encounter it.

Oh, right, then there's reality

All the technology to bring this about exists today. Mapping our highways for GPS is already done. Implanting RFID markers is no more complicated than existing highway maintenance. Converting the fleet will take a while, but we could easily start with HOV lanes as autonomous lanes and add more lanes until the entire highway and fleet is converted. Sorry, classic cars, you will be relegated to surface streets or require transport. But considering your polluting nature, that's a good thing.

But let's say the government did decide to undertake this, the implementation reality would be lobbying by large government contractors to create their proprietary systems, attach patents to the tech and create inferior technology (just look at voting machines). They'd create unreliable crap that would erode any trust in autonomous vehicles that people could muster. Maybe the government would require some standard but the development of a standard would be a pissing match between car conglomerates that ends up with something as useless as Cablecard and still lock out any innovative application. Finally, the hunger for data would mean that all this peer-to-peer communication and travel data would be an irresistible analytics goldmine for upselling car, travel, etc. products and services, turning the autonomous system into some kind of giant big brother of movement. Of course, considering present consumer behavior, the big brother scenario would probably not act as an obstacle.

I guess I'm going to continue to be stuck behind the guy in the left lane whose speed is the righteous amount over the limit and who only accelerates when his ego is threatened by me passing him on the right. And I'll continue to have to hit the brakes or react to someone else having to hit their brakes because someone decided that their lane change was of higher priority than the flow of the remaining traffic. All of which is completely unnecessary and counter-productive to everyone on the road, when highway travel could be as simple as treating your car as your personal travel compartment in a massive compartment routing system. Well, a geek can dream.

Maybe it's time to stop pretending we buy software?

There's been a lot of noise about comments made by THQ's Cory Ledesma about used games. Namely,

"I don't think we really care whether used game buyers are upset because new game buyers get everything. So if used game buyers are upset they don't get the online feature set I don't really have much sympathy for them." -cvg

Well, this has gotten a lot of gamers upset, and my immediate reaction was something like "dude, you are just pissing off your customers." And while Cory may have been the one to say it out loud, actions by EA and others in providing free DLC only to the original buyer and similar original buyer incentives show that the industry in general agrees with his sentiments.

Holding steadfast to my first-sale doctrine rights, I, like most gamers, software and media purchasers, strongly believe that we can sell those bits we bought. Of course, EULAs have said nuh-uh to that belief for just as long. We purchasers of bits only own a license to those bits, we don't own a product. But just as nobody reads an EULA, everybody believes those EULAs to be unenforceable. I own those bits, man!

So I continue to believe that when I purchase a product, let's say some bits on a DVD, I can sell it again or buy such a product from someone else. It wasn't until I read Penny Arcade earlier this week, that I had to admit that, first-sale doctrine notwithstanding, I am not their customer.

Penny Arcade - Words And Their Meanings

But, I thought, just like buying CDs used, I am actually contributing to a secondary market that promotes the brand of the artist. Buying that old CD used makes it more likely that I will buy the next one new, or that I will go to their show when they come to town, etc. Put aside whether this secondary market really has the magical future revenue effects I ascribe to it, for games there is no such secondary market. As Tycho said in his post accompanying the strip:

"If I am purchasing games in order to reward their creators, and to ensure that more of these ingenious contraptions are produced, I honestly can't figure out how buying a used game was any better than piracy. From the the perspective of a developer, they are almost certainly synonymous." - tycho, penny arcade

Ok, maybe you think the secondary market is sequels that you will buy new because you bought the original used. Never mind that most sequels are farmed out to another development house by the publisher; buying used games, at best, actively encourages the endless milking of sequels rather than new IP. But it's even worse for games, because virtually all games now include some multi-player component and keeping that running costs real money. You paying for Xbox Live doesn't mean the publisher isn't still paying more cash to Microsoft to run those servers. So every used Modern Warfare player costs the publisher money while only Gamestop made any cash on the sale. So, sure, you own that disc, but you're insane if you think that the developer/publisher owes you anything.

Now, let's extend this to the rest of the software market. Here you can argue a bit more for a secondary market, since software regularly comes out with new versions, encouraging you to upgrade. If you look at that boxed software revenue cycle it becomes clear that the added features and version revving just exist to extend a product into a recurring revenue stream. And if that's the motivation, it also means we're encouraging developers to spend less on quality and bug fixes (because nobody wants to pay for those), and more on bells and whistles, 'cause those justify the version rev and with it the upgrade price. In reality, if you use Photoshop professionally you've long ago stopped being a purchaser of boxed software and are instead a subscriber to the upgrade path.

This fickle revenue stream also has an effect on pricing. You may only use PowerPoint once in a while, but you paid to use it 24/7. Or maybe because you don't use it enough you've rationalized pirating it, which only serves to justify a high price tag, since the paying customers are subsidizing the pirates. Either way, the developer inflates the price to smooth out the revenue stream.

The sooner we stop pretending that we buy software and just admit that really we just want to rent it, the better. Being addicted to high retail prices, some publishers certainly will try to keep the same pricing as they move to the cloud, but the smart ones will adjust their pricing to attract those buyers who would never have bought the boxed version. Buying metered or by subscription has the potential for concentrating on excellence rather than bloat and the responsiveness and frequent updates of existing services seem to bear that promise out already. It's really in our favor to let go of the idea of wanting a boxed product with a resale value.

You're an administrator, not THE Administrator

I set up a new dev machine last week and decided to give win7 a try. My most recent dev setup was using win2k8 server and it's still my favorite dev environment. Fast, unobtrusive, things just worked.

Win7 appeared to be a different story, reminding me of the evil days of Vista. I had expected it to be more like Win2k8 server, but it just wasn't. I was trying to be zen about the constant UAC nagging and just get used to the way it wanted me to work. But two days in, it just came to a head and after wasting countless hours trying to work within the security circus it set up, I was ready to pave the machine.

Here's just a couple of things that were killing me:

Can't save into Program Files from the web

Had to save into My Documents then move it there. Worse, it told me I had to talk to an administrator about that. I am an administrator!

Can't unzip into Program Files

Same story as above.

Have to whitelist/reserve Uris for HttpListeners, and you can't wildcard ports

This was the final straw, since my unit tests create random port listeners so that the shutdown failures of a previous test don't hose the registration of the next.
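For anyone unfamiliar with the random-port trick, here is the general idea as a Python sketch using plain sockets. This is not my .NET test code and has nothing to do with HttpListener's Uri reservations; the `listen_on_free_port` helper is a made-up name that just shows how each test can get its own port from the OS.

```python
import socket

def listen_on_free_port(host="127.0.0.1"):
    """Open a TCP listener on a port chosen by the OS; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "pick any unused port for me"
    sock.listen()
    return sock, sock.getsockname()[1]

# Two "tests" each get their own listener, so a listener that failed to
# shut down in one test cannot block the registration of the next.
first, first_port = listen_on_free_port()
second, second_port = listen_on_free_port()
print(first_port, second_port)

first.close()
second.close()
```

With a fixed port, a leaked listener from one test poisons every test after it; with a fresh port per test, the damage stays contained.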

All these things need administrator privileges. But wait, I am an administrator, so what's going on? It appears that being an administrator is more like being in the sudoers file on unix. I have the right to invoke commands in the context of an administrator, but my normal actions aren't. I tried to work around this with registry hacks, shortcuts set to run as administrator and so on, to try to get things to start up with administrator privs by default, but Visual Studio 2k8 just refused to play along. You cannot set it up so that you can double-click on a solution and have it launch the solution as administrator in Win7. And even if you start VS as administrator, you cannot drag&drop files to it since it's now running in a different context than Explorer.

And if you ask MS Connect about this you'll find that like anything of value the issue has been closed as "By Design." Ok, look buddy, just because you designed a horrible user experience doesn't mean the problem can just be dismissed.

But why was win2k8 so much better an experience, a nagging voice kept asking. Turns out that on win2k8, I just run as Administrator. Win7 never gave that option (and you have to do some cmdline foo to enable the account.) Being a unix guy as well, running dev in what is root just felt distasteful. But distaste or not, it's the key for actually being able to do productive development work in windows. As soon as I became THE Administrator, instead of an administrator, all was smooth again.

Stupid lesson learned.

Teh Shiny

For technology geeks, one of the greatest enemies of productivity is teh Shiny, i.e. some algorithm, framework, language that is a possible solution to the problem at hand, but mostly attracts because of its dazzling awesomeness. When something seems like the cool way to do it, chances are you are about to step into a big pile of YAGNI.

Often, the simplest possible solution or, even more importantly, using what you know, rather than what appears to be the "best", is the way to go.

A good example is the boolean algebra parser for my tag queries. Once I realized I had a mini-DSL on my hands, my google fingers were refreshing my knowledge of Coco/R and evaluating Irony in order to build a parser to build me my AST. It took stepping back to realize that to get this to the prototype stage, some regex and a simple state machine could handle the parsing much more quickly (mostly because it avoided the tooling learning curve). It may not be the final solution, but it's one that works well enough to serve as a placeholder until the rest of the tech proves out.
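To give a feel for how little code the "regex plus state machine" route needs, here is a rough sketch in Python. It is not the original parser: the query syntax (uppercase AND, OR, NOT, no parentheses, AND binding tighter than OR) and the names parse_query and matches are assumptions made purely for illustration.

```python
import re

# Hypothetical token syntax: uppercase operators, everything else is a tag.
TOKEN = re.compile(r"\s*(?:(AND|OR|NOT)(?![\w-])|([\w-]+))")


def parse_query(query):
    """Parse a flat tag query into OR-ed groups of AND-ed (tag, negated) pairs."""
    groups = [[]]      # each inner list is one AND group
    state = "term"     # "term": expect a tag or NOT; "op": expect AND or OR
    negate = False
    query = query.strip()
    pos = 0
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        pos = match.end()
        op, tag = match.groups()
        if state == "term":
            if op == "NOT":
                negate = not negate
            elif tag is not None:
                groups[-1].append((tag, negate))
                negate = False
                state = "op"
            else:
                raise ValueError(f"expected a tag, got {op}")
        else:
            if op == "AND":
                state = "term"
            elif op == "OR":
                groups.append([])  # start a new AND group
                state = "term"
            else:
                raise ValueError("expected AND or OR")
    if state != "op":
        raise ValueError("query ended where a tag was expected")
    return groups


def matches(groups, tags):
    """True if any AND group is fully satisfied by the given set of tags."""
    return any(all((tag in tags) != negated for tag, negated in group)
               for group in groups)
```

A query such as "python AND NOT web OR scala" parses into two groups, and matches() can then be run against each item's tag set. Parentheses would need a real parser, which is exactly the point at which a tool like Coco/R or Irony starts to earn its learning curve.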

Don't get me wrong, this does not mean, hack up the simplest bit of code to get the job done. A straight path to the solution is likely to result in an unmaintainable mess. Certain standards of code quality need to be met, so that you can still test the result and be able to refactor it without starting from scratch. But if you meet those criteria, getting it done fast first buys you the time to evaluate whether a) the problem you're solving is the right problem, b) the solution provided is sufficient for the intended use and c) the shiny tech really would improve the implementation.

Determining whether a path is just shiny or appropriate isn't always easy. One bit of tech that's been consuming a bit more time than I wish is basing my persistence story on NHibernate, instead of rolling my own. The temptation to use what I know is strong, but fortunately (or unfortunately?), I have enough experience from rolling my own ORM and hand-coded SQL to know that down that road lies madness. So, be sure not to mistake "learning curve" for YAGNI.

Software Activation vs. Virtualization, Part 3

Part of an ongoing saga.

Rebooted back into VMware Fusion and yeah, Illustrator activation was indeed screwed there as well. Office 2007 too, but at least it just lets me reactivate (no doubt noting me as a repeat offender somewhere). So I called Adobe and was told that "it's a sensitive piece of software". No it's not. Illustrator can take any beating you give it. It's the "anti-piracy" crap that's sensitive. I got an "emergency activation code" to get it going again and was advised to deactivate before I switch VM setups and then re-activate after the reboot. OMFG. Seriously, just give me a USB dongle if you are so sensitive about it. That would be infinitely more convenient.

Dug around the net a bit and it seems that if I fake my MAC address to be the same between Boot Camp and the VM boot, it'll not invalidate my activation. Might try that next. Of course, the same board I found that on also noted that if I just got a crack for my legally purchased product, all troubles would be gone as well. Yes, once again, anti-piracy crap is not stopping pirates but legitimate customers. You'd figure someone might have spotted the pattern here, but maybe those DRM-colored glasses filter reality a bit too well.

Software Activation vs. Virtualization (and multiple PC ownership)

Just as virtualization is finally becoming a useful technology, everybody and their uncle has decided that software activation is the new hot way to stop theft. Of course, like all anti-piracy tools, the paying customers get screwed, because the pirates have already patched their copies to not require activation. Bravo! You know, I'd prefer friggin USB dongles to this big brother activation business.

I've talked about these problems before, but I've got more fun with the VM vs. bootcamp image activation troubles. I just got Adobe CS3 and for a programmer with occasional Photoshop/Illustrator needs, that's a pretty serious expense. I mean it costs me more than MSDN and gets used a fraction of the time. But I need it. And forget that I have three different computers I use at different times and I really ought to be able to install my purchased software on all of these machines, since I, the owner of the license, will never be using two computers at once. But that's a whole other story.

Back to the re-activation on hardware change business... I've been running Windows under VMware for the last couple of weeks, but for the Illustrator work I need to do right now, it was a bit sluggish. No problem, reboot into Boot Camp! Mind you, this isn't a different install of Windows. This is the same physical disk partition, but booted natively vs. via VMware. What happens? Illustrator bitches about activation, as does Office, because it saw the hardware change. Let me guess, when I reboot in the virtual machine it'll bitch yet again. Sooner or later it'll just shut me down as a serial offender. Thanks! Way to reward my purchase.

A case for XML

XML gets maligned a lot. It's enterprisey, bloated, overly complex, etc. And the abuses visited upon it, like trying to express flow control or whole DSLs in it, or proposing it as some sort of panacea for all interop problems, only compound this perception. But as long as you treat it as what it is, data storage, I generally can find little justification to use something else. Not because it's the best, but because it's everywhere.

If you are your own consumer and you want more efficient data storage, just go binary already. If you're not, then I bet your data consumers are just tickled that they have to add another parser to their repository of data ingestors. James Clark probably put it best when he said:

"For the payload format, XML has to be the mainstay, not because it's technically wonderful, but because of the extraordinary breadth of adoption that it has succeeded in achieving. This is where the JSON (or YAML) folks are really missing the point by proudly pointing to the technical advantages of their format: any damn fool could produce a better data format than XML."

Ok, I won't get religious on the subject, but mostly wanted to give a couple of examples, where the abilities and the adoption of XML have been a godsend for me. All this does assume you have a mature XML infrastructure. If you're dealing with XML via SAX or even are doing the parsing and writing by hand, then you are in a world of hurt, I admit. But unless it's a memory constraint there really is no reason to do that. Virtually every language has an XML DOM lib at this point.

I love namespaces

One feature a lot of people usually point to when they decry XML to me is namespaces. They can be tricky, I admit, and a lot of consumers of XML don't handle them right, causing problems. Like Blend puking on namespaces that apparently weren't hardcoded into its parser. But very simply, namespaces let you annotate an existing data format without messing with it.

<somedata droog:meta="some info about somedata">
  <droog:metablock>And a whole block of extra data</droog:metablock>
</somedata>

Here's the scenario. I get data in XML and need to reference metadata for processing further down the pipeline. I could have ingested the XML and then written out my own data format. But that would mean I'd also have to do the reverse if I wanted to pass the data along or return it after some modifications, and I'd have to define yet another data format. By creating my own namespace, I am able to annotate the existing data without affecting the source schema and I can simply strip out my namespace when passing the processed data along to someone else. Every data format should be so versatile.
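As a rough illustration of that annotate-then-strip round trip, here is a sketch using Python's standard xml.etree.ElementTree rather than the .NET classes discussed in this post. The namespace URI and the function names annotate and strip_annotations are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this example.
DROOG_NS = "urn:example:droog"
ET.register_namespace("droog", DROOG_NS)


def annotate(root, info, block_text):
    """Attach metadata in our own namespace without touching the source data."""
    root.set(f"{{{DROOG_NS}}}meta", info)
    block = ET.SubElement(root, f"{{{DROOG_NS}}}metablock")
    block.text = block_text


def strip_annotations(root):
    """Remove every attribute and element that lives in our namespace."""
    prefix = f"{{{DROOG_NS}}}"
    # Snapshot the tree first so removing children does not disturb iteration.
    for elem in list(root.iter()):
        for name in [n for n in elem.attrib if n.startswith(prefix)]:
            del elem.attrib[name]
        for child in [c for c in elem
                      if isinstance(c.tag, str) and c.tag.startswith(prefix)]:
            elem.remove(child)
```

After strip_annotations, the document serializes the same as it did before annotate was called, so whoever receives it downstream never needs to know the extra namespace existed.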

Transformation, Part 1: Templating

When writing webapps, there are literally dozens of templating engines and new ones are constantly emerging. I chose to learn XSLT some years back because I liked how Cocoon and AxKit handled web pages. Just create your data in XML and then transform it using XSLT according to the delivery needs. So far, nothing especially unique compared to other templating engines. Except unlike most engines, it didn't rely on some program creating the data and then invoking the templating code. XSLT works with dynamic apps as easily as with static XML or third-party XML.

Since those web site roots, I've had need for email templating and data transformation in .NET projects and was able to leverage the same XSLT knowledge. That means I don't have to pick up yet another tool to do a familiar task just a little differently.

What's the file format?

When I first started playing with Xaml, I was taking Live For Speed geometry data and wanted to render it in WPF and Silverlight. Sure, I had to learn the syntax of the geometry constructs, but I didn't have to worry about figuring out the data format. I just used the more than familiar XmlDocument and was able to concentrate on geometry, not file formats.

Transformation, Part 2: Rewriting

Currently I'm working with Xaml again for a Silverlight project. My problem was that I had data visualization in Xaml format (coming out of Illustrator), as well as associated metadata (a database of context data) and I needed to attach the metadata to the geometry, along with behavior. Since the first two are output from other tools, I needed a process that could be automated. One way would be to walk the visual tree once loaded, create a parallel hierarchy of objects containing the metadata and behavior and attach their behavior to the visual tree. But I'd rather have the data do this for itself.

<Canvas x:Name="rolloverContainer_1" Width="100" Height="100">
  <!-- Some geometry data -->
</Canvas>

<!-- becomes -->

<droog:RolloverContainer x:Name="rolloverContainer_1" Width="100" Height="100">
  <!-- Some geometry data -->
</droog:RolloverContainer>

So I created custom controls that subclassed the geometry content containers. I then created a post-processing script that simply loaded the Xaml into the DOM and rewrote the geometry containers as the appropriate custom controls using object naming as an identifying convention. Now the wiring happens automatically at load, courtesy of Silverlight. Again, no special parser required, just using the same XmlDocument class I've used for years.
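The post-processing step is small enough to sketch. The version below is in Python with the standard xml.etree.ElementTree, not the XmlDocument script described above, and everything specific in it is an assumption for illustration: the droog namespace URI, the CONVENTIONS table and the rewrite_containers name.

```python
import xml.etree.ElementTree as ET

X_NS = "http://schemas.microsoft.com/winfx/2006/xaml"
DROOG_NS = "clr-namespace:Droog.Controls"  # hypothetical custom-control namespace
ET.register_namespace("x", X_NS)
ET.register_namespace("droog", DROOG_NS)

NAME_ATTR = f"{{{X_NS}}}Name"

# Hypothetical convention: x:Name prefix -> custom control that replaces the Canvas.
CONVENTIONS = {"rolloverContainer_": "RolloverContainer"}


def rewrite_containers(xaml_text):
    """Retag Canvas elements whose x:Name follows a known naming convention."""
    root = ET.fromstring(xaml_text)
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" part
        if local_name != "Canvas":
            continue
        name = elem.get(NAME_ATTR, "")
        for prefix, control in CONVENTIONS.items():
            if name.startswith(prefix):
                # Only the tag changes; attributes and children stay as they are.
                elem.tag = f"{{{DROOG_NS}}}{control}"
                break
    return ET.tostring(root, encoding="unicode")
```

Because only the element name is swapped, the geometry inside each container and attributes such as Width and Height pass through untouched.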

And finally, Serialization

I use XML serialization for over the wire transfers as well as data and configuration storage. In all cases, it lets me simply define my DTOs and use them as part of my object hierarchy without ever having to worry about persistence. I just save my object graph by serializing it to XML and rebuild the graph by deserializing the stream again.

I admit that this last bit does depend on some language-dependent plumbing that's not all that standard. In .NET, it's built in and lets me mark up my objects with attributes. In Java, I use Simple for the same effect. Without this attribute-driven markup, I'd have to walk the DOM and build my objects by hand, which would be painful.
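For a sense of what that plumbing does, here is a toy sketch in Python. It is not how XmlSerializer or Simple work internally. It assumes a made-up @dto decorator that registers dataclasses, writes scalar fields as XML attributes and writes list fields as wrapped child elements.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

REGISTRY = {}  # element name -> DTO class, filled in by the @dto decorator


def dto(cls):
    """Hypothetical marker: turn a class into a dataclass and register it."""
    cls = dataclass(cls)
    REGISTRY[cls.__name__] = cls
    return cls


def to_xml(obj):
    """Scalar fields become attributes, list fields become wrapped children."""
    elem = ET.Element(type(obj).__name__)
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.type is list:
            wrapper = ET.SubElement(elem, f.name)
            wrapper.extend([to_xml(item) for item in value])
        else:
            elem.set(f.name, str(value))
    return elem


def from_xml(elem):
    """Rebuild the object graph using the field types declared on the DTO."""
    cls = REGISTRY[elem.tag]
    kwargs = {}
    for f in fields(cls):
        if f.type is list:
            wrapper = elem.find(f.name)
            kwargs[f.name] = [from_xml(c) for c in wrapper] if wrapper is not None else []
        else:
            kwargs[f.name] = f.type(elem.attrib[f.name])  # e.g. int("100")
    return cls(**kwargs)


@dto
class Container:
    name: str
    width: int


@dto
class Page:
    title: str
    containers: list
```

Saving is then ET.tostring(to_xml(page)) and loading is from_xml(ET.fromstring(text)). The DTO classes themselves carry no persistence code, which is the appeal of the attribute-driven approach.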

Sure, for data, binary serialization would be cheaper and more compact, but that misses the other benefits I get for free. The data can be ingested and produced by a wide variety of other platforms, I can manually edit it, or easily build tools for editing and generation, without any specialized coding.

For my Silverlight project, I'm currently using JSON as my serialization layer between client and server, since there currently is no XmlSerializer or even XmlDocument in Silverlight 1.1. It, too, was painless to generate and ingest and, admittedly, much more compact. But then I added this bit to my DTO:

List<IContentContainer> Containers = new List<IContentContainer>();

It serialized just fine, but then on the other end it complained about there not being a no-argument constructor for IContentContainer. Ho hum. Easily enough worked around for now, but I will be switching back to XML for this once Silverlight 2.0 fleshes out the framework. Worst case, I'll have to build XmlSerializerLite, or something like that, myself.

All in all, XML has allowed me to do a lot of data-related work without having to constantly worry about yet another file format or parser. It's really not about being the best format, but about being virtually everywhere and being supported with a mature toolchain across the vast majority of programming environments, and that pays a lot of dividends, imho.