The State of SimpleDB Clones

Over the past couple weeks I’ve been updating the DataMapper SimpleDB adapter for DataMapper 0.10, an effort which yesterday culminated in the release of version 1.0.0. Testing against Amazon SimpleDB can be an excruciatingly slow process – not only can the connections take a long time to set up, but SimpleDB’s “eventual consistency” model means that after a write you often have to wait a second or more until the data you just wrote is available for reading.
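
To make the consistency delay concrete, here is a minimal Ruby sketch of the kind of polling a test ends up doing after every write. Both `wait_until` and `LaggyStore` are hypothetical names invented for this illustration; neither is part of the adapter, DataMapper, or RightAWS, and the store merely simulates a write that becomes readable after a fixed lag.

```ruby
# Polls the given block until it returns a truthy value or the timeout
# elapses. Returns the block's value, or nil if the timeout is reached.
def wait_until(timeout: 5.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Hypothetical stand-in for an eventually consistent store (an assumption
# for illustration only): a written value is readable after `lag` seconds.
class LaggyStore
  def initialize(lag:)
    @lag = lag
    @items = {}
  end

  def put(key, value)
    @items[key] = [value, now + @lag]
  end

  def get(key)
    value, visible_at = @items[key]
    return nil if visible_at.nil? || now < visible_at
    value
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

store = LaggyStore.new(lag: 0.2)
store.put("item", "hello")
store.get("item")                                    # not visible yet
wait_until(timeout: 2.0) { store.get("item") }       # polls until readable
```

Multiply that wait by every write in a test suite and the slowness adds up quickly.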

I began to wonder if there was any way I could accelerate the tests by running them against a local SimpleDB clone. I surveyed the field of projects which mimic the SimpleDB API and found two which seemed promising. Unfortunately, neither one panned out.

The first I tried was simpledb-dev, a Python script which claims to fully implement the SimpleDB API. I had high hopes for this one because it had been used to test a previous version of the DataMapper SimpleDB adapter.

Unfortunately, simpledb-dev only supports an outdated version of the SimpleDB API. Specifically, it does not implement the SELECT operation. Since the adapter uses SELECT for database reads, simpledb-dev was out of the running. That was a pity, because I found it trivially easy to get simpledb-dev up and running.

The same cannot be said of the second solution I tried, M/DB. M/DB is an ambitious project to provide a production-level drop-in replacement for SimpleDB. It is distributed only as a .deb package for Debian/Ubuntu systems – Mac users are out of luck. Apparently it used to be distributed as a VMware virtual appliance, but that is no longer the case.

When I installed the package, M/DB took over port 80 on my development machine. Configuring it involves editing a configuration file in a non-standard location, and then hitting a special path on the local web server it starts up. After that you can start using it as a SimpleDB-like service.

However, M/DB requires that all SimpleDB API calls be made against the URL http://localhost/mdb/request.mgwsi. The RightAWS library used in the adapter, while it accepts a configurable host and port for the service, expects to be able to hit the host root directly, and provides no way to specify a base path other than /.
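
The mismatch is easier to see side by side. The sketch below uses only Ruby's standard `uri` library; `endpoint_uri` is a hypothetical helper written for this post, not a RightAWS method, and its `path:` parameter stands in for the configuration option I was missing.

```ruby
require "uri"

# Hypothetical helper (an assumption for illustration, not RightAWS API):
# builds a service endpoint from a host, a port, and a base path.
def endpoint_uri(host:, port: 80, path: "/")
  path = "/#{path}" unless path.start_with?("/")
  URI::HTTP.build({ host: host, port: port, path: path })
end

# With only host and port configurable, requests go to the host root.
root = endpoint_uri(host: "localhost")

# M/DB expects requests at a fixed, non-root path.
mdb = endpoint_uri(host: "localhost", path: "/mdb/request.mgwsi")

puts root
puts mdb
```

A client that exposed something like the `path:` option could target either endpoint without a rewrite rule in front of the server.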

So the upshot is that while there are some promising attempts at SimpleDB-alikes out there, I couldn’t find one that quite met my needs for local testing. If you know of an alternative SimpleDB clone that I missed, please clue me in.

10 comments

  1. A few comments about M/DB if I may:

    – we're working on an rpm installer. Since M/DB is an application written on top of the GT.M database, it will run on any platform supported by GT.M, which for the open source version, means GNU Linux. I'm afraid there won't be a Mac version as a result. You could of course run it as a Linux VM inside OSX using eg Parallels or similar.

    – however M/DB is compatible with the Cache' database which is available natively on OSX, but Cache' is a commercially licensed database product

    – Yes the default configuration for Apache is set up as port 80 but you can change this to whatever you like in the Apache config file. Similarly it ought to be a simple task to add some mod_rewrite rules to map the M/DB path (/mdb/request.mgwsi) to a simple path that DataMapper can handle.

    – Alternatively get in touch with the folks who wrote the RightAWS library. Getting a change made to the endpoint URL should be pretty trivial for them. By comparison the standard off-the-shelf Python SimpleDB interface (boto) was already configurable for M/DB by using some appropriate parameters in the connect_sdb() function.

    Hope this helps

    Rob Tweed
    M/Gateway Developments Ltd

    1. Thanks very much for the reply. You're right that with enough fiddling (and maybe a RightAWS patch) I probably could have gotten it working – it was just one of those cases where the payoff didn't justify the investment. I might try again before attempting to add the next round of features to the gem.

      1. Yes it's a difficult balancing act creating something like M/DB: how much do you provide ready-fixed and how much do you leave configurable so users can adapt it to their needs. As an Open Source product, my view was to provide a simple basic “out of the box” configuration that others could adapt as needed, and build it in such a way that such adaptation was possible.

        With M/DB soon to be a pre-built image provided by Canonical in their Ubuntu Enterprise Cloud Image Store, I'm hoping that more of the SDB client authors will be motivated to provide M/DB configurability as a matter of course. I think that's the real solution you need!

        Rob

      1. I'd suggest getting a standard off the shelf Debian or Ubuntu pre-built VirtualBox VM and then just apply the M/DB installer to it. I know VMWare have standard pre-built Linux VMs in their marketplace – not sure if these can be adapted for VirtualBox or whether an equivalent exists elsewhere.

        As you noted we used to provide M/DB as a pre-built VMWare VM but to be honest it was more trouble than it was worth once we had the installer instead. The installer gives you a lot more flexibility and it's a lot easier for us to manage and maintain.

        Rob
