I’ve been mostly ignoring the so-called ‘NoSQL movement’ for a while now. I’ve been so heavily invested in relational databases that I couldn’t imagine ever moving to something else. Document-based databases seemed interesting but felt like they required a different way of thinking about data. CouchDB has its REST interface, which is kind of cool, and most of the other NoSQL databases tout map/reduce, which, again, is kind of cool. I don’t think I’ve ever had a computational problem so large that it would have benefited from map/reduce, so it didn’t pique my interest.
The other day, I was researching various file storage systems and came across MongoDB’s GridFS. GridFS is a layer that sits on top of MongoDB: it takes file storage requests, splits each file into binary chunks and stores the chunks as separate documents (and reverses that process when retrieving files). Conventional wisdom and prior experience told me that you never ever want to store files in a database, but after reading about how MongoDB implemented it, it kind of made sense and felt OK. The data is stored as binary JSON (BSON) objects, which I like. JSON has never done me wrong; it’s actually my preferred data format.
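To make the chunking idea concrete, here is a minimal sketch in plain Python. It is not MongoDB's implementation and never talks to a server; it only mimics the split-and-reassemble step using ordinary dicts. The function names `split_into_chunks` and `reassemble` are made up for illustration, the tiny chunk size is chosen so the example is easy to follow, and the field names are only loosely modeled on what GridFS stores.

```python
import uuid

# Deliberately tiny, hypothetical chunk size so the demo is easy to follow;
# a real file store would use chunks of many kilobytes.
CHUNK_SIZE = 4


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split raw bytes into one 'file' document plus numbered chunk documents."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    file_id = uuid.uuid4().hex  # stand-in for a database-generated id
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return file_doc, chunks


def reassemble(file_doc, chunks):
    """Rebuild the original bytes from chunk documents, in any order."""
    own = sorted(
        (c for c in chunks if c["files_id"] == file_doc["_id"]),
        key=lambda c: c["n"],
    )
    data = b"".join(c["data"] for c in own)
    if len(data) != file_doc["length"]:
        raise ValueError("missing or corrupt chunks")
    return data


if __name__ == "__main__":
    file_doc, chunks = split_into_chunks(b"hello gridfs!")
    print(len(chunks), "chunks")
    print(reassemble(file_doc, chunks))
```

Because every chunk carries its file id and sequence number, the chunks can be stored and fetched as independent documents and still be stitched back together in order, which is the part that made storing files in a database feel reasonable to me.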
So, I downloaded the pre-compiled MongoDB build for OS X and decided to experiment with it for a bit. My previous experiment with Hadoop made my head spin, so I had low expectations, but I was so wrong. It was so drop-dead simple that I was inserting and retrieving documents and files literally within minutes. Both the client and server software consist of a few small executable files. Nice. The PHP extension was easy to enable, and the driver even implements fluent interfaces! Nice. The Mongo client uses JavaScript to interface with the server, which makes sense given that it stores the data as BSON; it also makes it feel very ‘web-oriented’. After an hour of playing around with MongoDB, I was able to get a master-slave setup working, as well as a sharded data set. I was impressed. Very nice. I started imagining how my current databases could fit into Mongo, and all the different methods of scaling and replicating that it offers.
Although I was originally looking into MongoDB for its GridFS capabilities, I quickly realized the beauty of its capabilities as a real database. There is even work being done on some Godfather-esque auto-sharding, which is awesome, a bit scary and very interesting. MongoDB is open source and is being actively developed and supported by a local NYC company, 10gen, which makes me feel all the more confident about considering rolling this out at work.
In the software engineering world, I think there are all kinds of development philosophies, and when you come across software that is in line with your philosophy, it is so much easier to grok, and it feels good.