Mertz: This is David Mertz reporting from OSCON 2011 for IBM® developerWorks. Today I had the opportunity to interview Dwight Merriman, the founder of 10gen and creator of the MongoDB database, which is proving enormously popular. In my talk with Dwight, I had an opportunity both to ask him general questions about MongoDB, its architecture, and how it works, and to have him walk me through various features of MongoDB in a screencast. This second of two parts is a screencast of MongoDB in action.
Merriman: This is a demo here. So there are a couple of components in Mongo. One is mongod, which is the main database. And then the other component is this administrative shell. So mongod is the database, mongo is the shell. And then if you're doing sharding you would also use a mongos process. So we're going to do a little demo with a single mongod, so we don't need the mongos routing process.
So first of all let's just start up a database. So we'll need the data directory. So let's create that. I'll start the database. Whoops. Okay. So now the database is running. So let's start it up. And we will now go over to my tab, and we'll use the shell to do some demoing and manipulations.
So by default the shell will connect to the database on localhost, so we'll just do that. Now we're connected. Once we're in the shell we can do various operations and get help if we want.
Now, in Mongo there is a query language and, in fact, we represent queries as JSON, but that's data driven. So we need some way to represent the basic verbs we want, to which we will pass JSON query expressions. So, you know, we need an update method, an insert method, a query method. And that's really what the Mongo shell here is for.
Merriman: And there's a predefined variable, db, which is our connection to a particular database, so we can then use that. And so, for example, we could do, first of all ... if we did just help, that will show you top-level help, but db.help() will give us help on the methods of the database object. And so we can use some of those, right?
So we could do something like an insert to a collection. So db.users.insert, name Jane, address ...
In fact, with show indexes ... actually, if we get the help on the collection object's methods, we'll see that we can do ... since I didn't remember it, we can do getIndexes, and that will show us the indexes on this collection. So an index on _id is automatically created. And the reason for this is that in Mongo documents there's always an _id field, which is your object ID.
It's sort of like a primary key. So they're unique; there's a unique value for each document. If you do not supply one, one will be created for you. We can also supply a value for _id; as long as it's unique, it can be of any data type. But in this case we didn't, so it created one for us.
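The _id behavior described here can be sketched in plain JavaScript. This is only an in-memory stand-in under stated assumptions, not MongoDB's implementation: the TinyCollection class is hypothetical, and a simple counter stands in for the ObjectId that the real database would generate.

```javascript
// Hypothetical in-memory stand-in for a collection; not MongoDB code.
class TinyCollection {
  constructor() {
    this.docs = new Map(); // maps a serialized _id to its document
    this.nextId = 1;       // assumption: a counter stands in for ObjectId generation
  }

  insert(doc) {
    // If the caller supplied no _id, create one for them.
    const id = "_id" in doc ? doc._id : `auto-${this.nextId++}`;
    // Serialize the key so _id values of different types (42 vs "42") stay distinct.
    const key = JSON.stringify(id);
    if (this.docs.has(key)) {
      throw new Error(`duplicate _id: ${key}`);
    }
    const stored = { _id: id, ...doc };
    this.docs.set(key, stored);
    return stored;
  }
}

const users = new TinyCollection();
const jane = users.insert({ name: "Jane" });        // _id is generated
const bob = users.insert({ _id: 42, name: "Bob" }); // caller-supplied _id
console.log(jane._id, bob._id);
```

Inserting a second document with _id 42 into this sketch throws, which mirrors the uniqueness rule; a string "42" is accepted because it is a different value of a different type.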
Mertz: Is that something like an MD5 hash of all...?
Merriman: Yes. So this particular object ID we're seeing here is a BSON ObjectId data type. So Mongo is basically a JSON-style document-oriented database, but the actual internal representation of the data is in a format called BSON, which stands for binary JSON. And it's a binary representation of JSON.
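For reference, a BSON ObjectId is a 12-byte value, usually shown as 24 hexadecimal characters. It is not computed from the document's contents: its first four bytes hold a creation time in seconds since the Unix epoch, and the remaining bytes serve to keep IDs generated in the same second distinct. The sketch below reads that timestamp back out. The function name objectIdTimestamp and the sample ID are made up for illustration.

```javascript
// Read the creation time embedded in a 24-hex-character ObjectId string.
// The first 4 bytes (8 hex digits) are a big-endian count of seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Hypothetical ID, invented for this example (not taken from the screencast).
const when = objectIdTimestamp("4e3f0c000000000000000000");
console.log(when.toISOString());
```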
And the reason we're using a binary format is not for compactness, right, because actually JSON is quite compact. Sometimes a number is just one byte long, right? It's actually for fast scanability; that's the main reason. And the other thing is that in BSON they've added a couple of extra data types, right, so there's a good set of data types in JSON: numbers, strings, null.
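The scanability point can be illustrated with a small sketch. In BSON, a document starts with its total byte length and each string value carries its own length prefix, so a reader can jump over fields it does not care about without parsing them. The code below is a simplified illustration, not a full encoder: it handles only flat documents whose values are strings (BSON element type 0x02), and the function names encodeStringDoc and findStringField are hypothetical.

```javascript
// Minimal sketch: encode a flat document of string fields as BSON bytes.
// Assumption: only the BSON string element (type 0x02) is supported here.
function encodeStringDoc(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8");   // element name as a C string
    const text = Buffer.from(value + "\0", "utf8"); // value bytes plus trailing NUL
    const len = Buffer.alloc(4);
    len.writeInt32LE(text.length, 0);               // length prefix counts the NUL
    parts.push(Buffer.from([0x02]), name, len, text);
  }
  const body = Buffer.concat(parts);
  const total = Buffer.alloc(4);
  total.writeInt32LE(4 + body.length + 1, 0);       // whole-document length, including itself
  return Buffer.concat([total, body, Buffer.from([0x00])]);
}

// Scan for one field, skipping every other value via its length prefix.
function findStringField(bytes, wanted) {
  let pos = 4; // skip the 4-byte document length
  while (bytes[pos] !== 0x00) {
    if (bytes[pos] !== 0x02) throw new Error("sketch handles only string elements");
    const nameEnd = bytes.indexOf(0x00, pos + 1);
    const name = bytes.toString("utf8", pos + 1, nameEnd);
    const valueLen = bytes.readInt32LE(nameEnd + 1);
    const valueStart = nameEnd + 5;
    if (name === wanted) {
      return bytes.toString("utf8", valueStart, valueStart + valueLen - 1);
    }
    pos = valueStart + valueLen; // jump over the value without decoding it
  }
  return undefined;
}

const bytes = encodeStringDoc({ name: "Jane", state: "New York" });
console.log(bytes.length, findStringField(bytes, "state"));
```

Note that the encoded form is not smaller than the JSON text, which matches the remark that compactness is not the goal.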
Mertz: Can you define an arbitrary class that defines a new data type?
Merriman: So you cannot define new data types, but what you can do is store anything you want as binary data, and you can tag that with a subtype. But normally what you would do is just create an embedded JSON object, right?
So what we could do here is we could do something like add a new user, and we could do something like location, which is in some arbitrary coordinate system here, right, so you could kind of think of that as an object, right? But we just represent that embedded object as JSON. And in fact, there is a geospatial 2D indexing facility in Mongo, by the way. But if that were something else, we could query on that, too.
So one interesting thing with these queries: if I do a query like the following, let's say name=Jane, okay, this is how we express that. So the query expressions in Mongo, there is sort of a query language, but it's data driven. And it's JSON. So we can do that. So that's doing a query where name is Jane.
So we could also do one where we reach in a little bit deeper, so we could do a query such as address.state=New York. So now we're reaching into an embedded object. So this is a good example of how this is not just a key-value store.
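The two queries described here are plain data: in the shell they would be passed to find as db.users.find({name: "Jane"}) and db.users.find({"address.state": "New York"}). To show what "reaching into an embedded object" means, here is a minimal sketch of matching such a query document against documents in memory. It is not how the server evaluates queries; the helpers getPath and matches are hypothetical, they support only exact equality, and the sample users are made up.

```javascript
// Hypothetical helper: follow a dotted path such as "address.state" into nested objects.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && typeof value === "object" ? value[key] : undefined),
    doc
  );
}

// Hypothetical matcher: every field in the query document must equal the value at that path.
function matches(doc, query) {
  return Object.entries(query).every(([path, expected]) => getPath(doc, path) === expected);
}

// Made-up sample data, standing in for the users collection.
const users = [
  { name: "Jane", address: { city: "Albany", state: "New York" } },
  { name: "Joe", address: { city: "Austin", state: "Texas" } },
];

// The query is just a JSON-style object, so it can be built and passed around like any data.
const byName = users.filter((u) => matches(u, { name: "Jane" }));
const byState = users.filter((u) => matches(u, { "address.state": "New York" }));
console.log(byName.length, byState.map((u) => u.name));
```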
So in Mongo what you typically do is you create relatively rich documents and then there's functionality in the database to manipulate those documents. So we can do that. In addition, I could create an index on this field.
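In the shell of that era, an index on the embedded field would be requested with a call along the lines of db.users.ensureIndex({"address.state": 1}). What an index buys can be sketched in memory: instead of scanning every document, a lookup structure maps each value of the field to the documents that hold it. The sketch below uses a JavaScript Map purely as a stand-in for the database's real index structure; the helper names and the sample data are hypothetical.

```javascript
// Hypothetical helper: follow a dotted path into nested objects.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value !== null && typeof value === "object" ? value[key] : undefined),
    doc
  );
}

// Sketch of an index: map each value found at `path` to the documents holding it.
// Assumption: a Map stands in for the database's actual index structure.
function buildIndex(docs, path) {
  const index = new Map();
  for (const doc of docs) {
    const key = getPath(doc, path);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// Made-up sample data.
const users = [
  { name: "Jane", address: { state: "New York" } },
  { name: "Joe", address: { state: "Texas" } },
  { name: "Ann", address: { state: "New York" } },
];

const byState = buildIndex(users, "address.state");
const newYorkers = byState.get("New York") || []; // one lookup, no full scan
console.log(newYorkers.map((u) => u.name));
```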
Mertz: Thanks for listening again to IBM developerWorks coverage of OSCON. This has been David Mertz, and please tune in for future installments.