If you're a typical PHP developer, it doesn't take a thorough review of past projects to pick out a telling pattern: In most (if not all) cases, you're probably getting PHP to talk to a database back end for all that dynamic data goodness; in 99 percent of those instances, you're using MySQL.
Now, there's nothing wrong with using a relational database. If you're working with highly structured data with lots of relationships, it's the way to go. You can happily (or unhappily, depending on your familiarity and comfort with SQL) go through the process of working up schemas, normalizing data relations, setting up tables, and all the rest.
However, every once in a while, you work on a project where you probably think to yourself, "Why am I doing all this work?" The project you're working on contains very simple bits of data or data that's difficult to predict — you might get different data fields on different days or even from transaction to transaction. If you were to create a schema to predict what's coming down the pike at you, you'd end up with tables that have lots of empty fields or lots of mapping tables.
For those projects, you need a different approach — something that doesn't involve a relational database. What you need in these situations is a document-based, schema-free, ad-hoc database with a flat address space. In short, you need Apache CouchDB.
CouchDB is (according to the Apache CouchDB Web site):
- A document database server, accessible via a RESTful JSON API.
- Ad-hoc and schema-free, with a flat address space.
- Distributed, featuring robust, incremental replication with bi-directional conflict detection and management.
- Query-able and index-able, featuring a table-oriented reporting engine that uses JavaScript as a query language.
What this means is that you can create a CouchDB database that accepts JSON documents. Each document gets a unique ID and a revision ID and has its own structure, with all documents stored in the same flat collection. For example, say you're setting up a resume collection. The first resume comes in with fields for first name, last name, phone number, e-mail address, Twitter account, a list of qualifications, and a detailed work history. The second resume just comes in with first name, last name, e-mail address, and a work history that's a sparse list. The differences there might be enough to make a relational database very unhappy, but from CouchDB's point of view, it's just another day at the office.
In short, a CouchDB document is an object consisting of named fields. Those field values may be strings, Boolean values, numbers, dates, ordered lists, or associative maps. Listing 1 shows a sample resume document.
Listing 1. A simple CouchDB document
{
  "Firstname": "Tom",
  "Lastname": "Myer",
  "Twitter": "@myerman",
  "Email": "tom@example.com",
  "Skills": ["php","couchdb","xml","json"],
  "Work History": ....
}
So far, nothing too far out here if you're used to working with JSON. Even if you aren't, you can easily map this document to something more comfortable, like a PHP array. In fact, you'll see that you can use the built-in JSON encode/decode functions to work with CouchDB, or you can use a more object-oriented path.
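As a rough illustration of that mapping, the sketch below decodes a resume document like the one in Listing 1 into a PHP array with json_decode() and turns it back into JSON with json_encode(). The Work History field is left out because its contents are elided in Listing 1; the appended skill is made up for the example.

```php
<?php
// A resume document like Listing 1, minus the elided "Work History" field.
$json = '{"Firstname":"Tom","Lastname":"Myer","Twitter":"@myerman",'
      . '"Email":"tom@example.com","Skills":["php","couchdb","xml","json"]}';

// Passing true as the second argument returns associative arrays
// instead of stdClass objects.
$resume = json_decode($json, true);

echo $resume['Firstname'] . ' ' . $resume['Lastname'] . "\n"; // Tom Myer
echo count($resume['Skills']) . " skills listed\n";           // 4 skills listed

// Add a (made-up) skill, then turn the array back into JSON for storage.
$resume['Skills'][] = 'javascript';
$encoded = json_encode($resume);
echo $encoded . "\n";
```

Omitting the second argument to json_decode() gives you a stdClass object instead, which is the shape the PHP wrapper later in this article hands back.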
To query information from a collection, you can use various comfort-inducing query methods via a RESTful JSON API. The fact that you're working in JSON simplifies a lot of issues. For one thing, as a Web developer familiar with JavaScript, Ajax, and JSON, you don't have to know SQL to get anything done.
Before moving on, it would be good to hit the pause button for a minute to emphasize a few points. CouchDB isn't a relational database. You've heard me say that, but it's a good idea to emphasize the point. Don't try to use CouchDB in a relational way, like inserting ID fields that help you make relationships clear between documents. Instead of creating relationships, stuff the content you want into your document and keep moving.
Here's something else that CouchDB isn't: an object-oriented database. It's not some kind of native object, persistent data layer you can use as the underpinning for your object-oriented structures. Don't do it.
If you're on Mac OS X, the installation process for CouchDB is pretty simple:
- Open a Terminal window and type sudo port install couchdb.
- When prompted, type your administrator password.
- Let MacPorts download and install the necessary CouchDB packages.
- From the Terminal window, run the following command to retrieve any last-minute changes or dependencies: sudo port upgrade couchdb.
- To get CouchDB up and running, type the following command in Terminal:
sudo launchctl load -w /opt/local/Library/LaunchDaemons/org.apache.couchdb.plist
This launches the CouchDB server and keeps it running persistently, so it will start up if you ever restart your Mac.
To see CouchDB in action, open http://127.0.0.1:5984/_utils/index.html in your browser. The Futon utility appears, as shown in Figure 1.
Figure 1. The Futon utility
On a Windows® system, the process is more convoluted. You need to install Cygwin, the Microsoft® C compiler, and a number of other prerequisites (such as cURL, ICU, and SpiderMonkey). Then you download the source code for Erlang and CouchDB, configure both according to their README files, and do a full installation. This process is described in the CouchDB wiki (see Resources), which also has instructions for Linux®, Berkeley Software Distribution (BSD), and other environments.
Before jumping into PHP, it's a good idea to get a feel for the CouchDB API, which is accessible over HTTP using standard request methods (GET, POST, PUT, and DELETE) and returns data in JSON format. This setup makes it easy to store and retrieve data from your Web application regardless of the language you're using: PHP, Microsoft Active Server Pages (ASP), Ruby, Python, or even simple jQuery Ajax functions.
This section shows how to use the cURL command-line tool to issue GET, PUT, and DELETE requests to CouchDB. Once you've got the hang of the API, a PHP wrapper helps you simplify your development tasks.
The first thing you need to run (again, from a Terminal window) is this command: curl http://127.0.0.1:5984/. You should see a response similar to {"couchdb":"Welcome","version":"0.10.0"}. This simply tells you that CouchDB is up and running and which version you're using. If you don't see this message, go back through your installation and configuration process to get CouchDB up and running.
Now, try listing all the databases set up in CouchDB. Run curl -X GET http://127.0.0.1:5984/_all_dbs. If this is a fresh installation of CouchDB, you should see the response [], which means that no databases are present (the empty pair of square brackets is an empty JSON array). Note that this cURL command uses the -X option to specify a GET operation explicitly.
So let's solve that problem by creating a database:
curl -X PUT http://127.0.0.1:5984/songs
When you run this, you will see the response {"ok":true}, so you can check for an ok attribute to confirm success. Running curl -X GET http://127.0.0.1:5984/_all_dbs again results in a non-empty array: ["songs"]. Your CouchDB instance now has one database in it: songs.
Now try to create another database called songs. If you run curl -X PUT http://127.0.0.1:5984/songs again, you'll get this error message:
{"error":"file_exists","reason":"The database could not be created, the file already exists."}
So it's just as easy to check for an error attribute to see whether any problems have occurred.
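In PHP, that check might look like the following minimal sketch. The couchResult() helper is hypothetical (it is not part of CouchDB or any wrapper library); it only inspects the ok, error, and reason attributes seen in the responses above.

```php
<?php
/**
 * Hypothetical helper: decide whether a raw CouchDB JSON response
 * signals success. Returns array(bool $ok, string $message).
 */
function couchResult($rawJson)
{
    $data = json_decode($rawJson, true);
    if (!is_array($data)) {
        return array(false, 'response was not a JSON object');
    }
    if (isset($data['error'])) {
        // Error responses carry an "error" code and usually a "reason".
        $reason = isset($data['reason']) ? $data['reason'] : 'no reason given';
        return array(false, $data['error'] . ': ' . $reason);
    }
    // Success responses from database creation carry "ok": true.
    return array(!empty($data['ok']), '');
}

list($created, $createdMsg) = couchResult('{"ok":true}');
list($again, $againMsg) = couchResult(
    '{"error":"file_exists","reason":"The database could not be created, the file already exists."}'
);

echo $created ? "created\n" : "failed: $createdMsg\n";
echo $again ? "created\n" : "failed: $againMsg\n";
```

Returning the message alongside the flag keeps the calling code free to log it, show it, or ignore it.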
Create a second database called foobar:
curl -X PUT http://127.0.0.1:5984/foobar
If you ran curl -X GET http://127.0.0.1:5984/_all_dbs, it would result in the response ["songs","foobar"]. To get rid of this second database, pass in a DELETE call:
curl -X DELETE http://127.0.0.1:5984/foobar
Running curl -X GET http://127.0.0.1:5984/_all_dbs reveals that you're back to ["songs"].
Now go ahead and create some documents inside your songs database. True to form, you want to store some songs in this database, with fields for song title, artist name, and album name. To create a document, follow this pattern:
curl -X PUT http://127.0.0.1:5984/songs/*id* -d '{ *json_data* }'
Notice that you call the name of your database, followed by some kind of ID (which needs to be unique within this database, and preferably across all CouchDB instances if you can manage it), followed by your JSON data.
Why the unique ID? It is the document's address: the last segment of the URL you use to fetch, update, or delete it. You could use a UUID (or a GUID), create some kind of natural key that combines various bits of data (for example, the name of a song with underscores instead of spaces, combined with a timestamp), or have CouchDB create a unique ID for you (this is a slow process). Any of these approaches is good; just don't use an auto-increment value like you would in the MySQL environment.
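The natural-key idea can be sketched in a few lines of PHP. The makeDocId() function below is a hypothetical helper, not part of any library, and it assumes titles written in plain ASCII; anything other than a letter or digit collapses into an underscore.

```php
<?php
/**
 * Hypothetical helper: build a natural-key document ID from a title.
 * Lowercases the title, replaces each run of non-alphanumeric characters
 * with a single underscore, and optionally appends a Unix timestamp.
 */
function makeDocId($title, $timestamp = null)
{
    $slug = strtolower(trim($title));
    $slug = preg_replace('/[^a-z0-9]+/', '_', $slug);
    $slug = trim($slug, '_');
    return $timestamp === null ? $slug : $slug . '_' . $timestamp;
}

echo makeDocId('Whatever You Like'), "\n";           // whatever_you_like
echo makeDocId('In the Meantime', 1266082749), "\n"; // in_the_meantime_1266082749
```

Appending the timestamp makes collisions between two songs with the same title less likely, at the cost of an ID you can no longer guess from the title alone.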
Now, enter a song into your database:
curl -X PUT http://localhost:5984/songs/whatever_you_like -d \
'{"title":"Whatever You Like", "artist":"T.I.","album":"Paper Trail"}'
{"ok":true,"id":"whatever_you_like","rev":"1-1d915e4c209a2e47e5cf05594f9f951b"}
Notice that I took a very simple approach to the unique ID (a simplified version of the song name with underscores instead of spaces). This simple approach is probably OK for right now. Luckily, the wrappers you'll use in PHP will help you create better IDs. Also notice that I immediately received a response of "ok" with my document ID and a rev attribute that tells me the document's current revision.
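If you would rather not hand-write the JSON body, PHP can assemble it for you. The sketch below uses a hypothetical buildPutDocument() helper that only builds the URL and body for the request shown above; actually sending it is left to cURL or a wrapper library.

```php
<?php
/**
 * Hypothetical helper: assemble the pieces of a "create document" request.
 * It builds the URL and the JSON body only; it does not send anything.
 */
function buildPutDocument($baseUrl, $db, $id, array $fields)
{
    return array(
        'method' => 'PUT',
        // rawurlencode() keeps unusual characters in names from breaking the URL.
        'url'    => rtrim($baseUrl, '/') . '/' . rawurlencode($db) . '/' . rawurlencode($id),
        'body'   => json_encode($fields),
    );
}

$request = buildPutDocument('http://localhost:5984/', 'songs', 'whatever_you_like', array(
    'title'  => 'Whatever You Like',
    'artist' => 'T.I.',
    'album'  => 'Paper Trail',
));

echo $request['method'], ' ', $request['url'], "\n";
echo $request['body'], "\n";
```

Letting json_encode() produce the body avoids the quoting mistakes that creep into hand-typed JSON.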
To view the document you just added, try this:
curl -X GET http://localhost:5984/songs/whatever_you_like
{"_id":"whatever_you_like","_rev":"1-1d915e4c209a2e47e5cf05594f9f951b",
"title":"Whatever You Like", "artist":"T.I.", "album":"Paper Trail"}
If you're following along in Futon, you should be able to click the songs database name and see a listing for whatever_you_like in the documents list. Clicking that link displays the details of the document in question, as illustrated in Figure 2.
Figure 2. Document details
You're getting the idea — make RESTful requests using JSON, and things happen.
Now, all of this looks pretty good, but if you're a PHP developer, you're wondering how to tie this all in to something you feel comfortable with. The next section introduces you to some PHP wrappers for CouchDB.
For the next step, you need to download PHP-on-Couch from GitHub (see Resources). Place the contents of the extracted /lib folder into your development area. When you have your work area set up, create a simple PHP application that talks to the CouchDB database you've already set up (your songs database). Create a new file, and call it index.php. Put the code in Listing 2 into it.
Listing 2. CouchDB connection settings
<?php
$couch_dsn = "http://localhost:5984/";
$couch_db = "songs";
require_once "./lib/couch.php";
require_once "./lib/couchClient.php";
require_once "./lib/couchDocument.php";
$client = new couchClient($couch_dsn,$couch_db);
?>
This code serves as your connection code to CouchDB and includes all the relevant classes that you'll need to work with the database. Go ahead and list all the information related to your database, as shown in Listing 3.
Listing 3. Getting database information
try {
    $info = $client->getDatabaseInfos();
} catch (Exception $e) {
    echo "Error:".$e->getMessage()." (errcode=".$e->getCode().")\n";
    exit(1);
}
print_r($info);
What you should get is something close to Listing 4.
Listing 4. Database information
stdClass Object
(
[db_name] => songs
[doc_count] => 2
[doc_del_count] => 0
[update_seq] => 2
[purge_seq] => 0
[compact_running] =>
[disk_size] => 8281
[instance_start_time] => 1266082749089965
[disk_format_version] => 4
)
Next, retrieve a document from your song database. Listing 5 shows the code to do so.
Listing 5. Retrieving a song from the database
try {
    $doc = $client->getDoc('whatever_you_like');
} catch (Exception $e) {
    // Exception exposes its code through getCode(); there is no code() method.
    if ( $e->getCode() == 404 ) {
        echo "Document not found\n";
    } else {
        echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
    }
    exit(1);
}
print_r($doc);
Listing 6 shows the response.
Listing 6. The retrieved song
stdClass Object
(
[_id] => whatever_you_like
[_rev] => 1-1d915e4c209a2e47e5cf05594f9f951b
[title] => Whatever You Like
[artist] => T.I.
[album] => Paper Trail
)
Great, but what if you need to make updates to a document? You can make two different kinds of updates: changing existing field values or adding new fields with their own values. You do either using arrow notation (for example, $doc->new_field), then commit your changes with storeDoc(). Listing 7 shows the code for updating a document.
Listing 7. Updating a document
$doc->genre = 'hip-hop';
$doc->year = 2008;
try {
    $response = $client->storeDoc($doc);
} catch (Exception $e) {
    echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
    exit(1);
}
After running this code, retrieve the document by its ID again, and you get back the response in Listing 8.
Listing 8. Updated document
stdClass Object
(
[_id] => whatever_you_like
[_rev] => 2-12513a362693b300928aa45f82faed83
[title] => Whatever You Like
[artist] => T.I.
[album] => Paper Trail
[genre] => hip-hop
[year] => 2008
)
Notice that the numeric prefix of the _rev attribute has been incremented: it is now 2-whatever, where before it was 1-whatever. It's an easy way for you to tell which revision you're looking at.
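If you want that number in code, a small sketch like the following pulls it out of the _rev string. The parseRev() helper is hypothetical, and it assumes only what the listings show: a number, a dash, and then a string that should be treated as opaque.

```php
<?php
/**
 * Hypothetical helper: split a _rev string such as "2-12513a36..." into
 * its numeric prefix and the opaque string that follows the first dash.
 */
function parseRev($rev)
{
    list($number, $hash) = explode('-', $rev, 2);
    return array('number' => (int) $number, 'hash' => $hash);
}

$before = parseRev('1-1d915e4c209a2e47e5cf05594f9f951b');
$after  = parseRev('2-12513a362693b300928aa45f82faed83');

echo "Revision went from {$before['number']} to {$after['number']}\n";
```

Always send the full _rev string back to CouchDB when you update a document; the numeric prefix on its own is only useful for display.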
What if you want to store a new document in the database? You instantiate a new object and use arrow notation to fill in the document's fields. Listing 9 shows the code to do so.
Listing 9. Creating a new document
$song = new stdClass();
$song->_id = "in_the_meantime";
$song->title = "In the Meantime";
$song->album = "Resident Alien";
$song->artist = "Spacehog";
$song->genre = "Alternative";
$song->year = 1995;
try {
    $response = $client->storeDoc($song);
} catch (Exception $e) {
    echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
    exit(1);
}
print_r($response);
The response would look like Listing 10.
Listing 10. Response from creating a new document
stdClass Object
(
[ok] => 1
[id] => in_the_meantime
[rev] => 1-d65b03a9fe2f3c8095b08883e7cd97df
)
At this point, you have more than enough information to get started with CouchDB and PHP. It should be easy for you to build basic forms that create new documents or update existing ones in a database. The PHP-on-Couch package also gives you access to other methods for creating and deleting databases, working with CouchDB views, and more. With any luck, there's enough information in this article to get you off to a good start.
Learn
- Visit the CouchDB project site.
- CouchDB: The Definitive Guide is a free online version of the definitive guide for CouchDB.
- Check out the CouchDB wiki for answers to your CouchDB questions.
- Read Installing CouchDB on Windows for instructions for installing the database on the Windows platform.
- Visit Installing CouchDB on Linux/BSD to find instructions for installing the database on computers running Linux or BSD.
- Check out The CouchDB API Reference to learn more about the API used in this article.
- Read "Exploring CouchDB" to learn more about what makes CouchDB tick.
- PHP.net is the central resource for PHP developers.
- Check out the "Recommended PHP reading list."
- Browse all the PHP content on developerWorks.
- Follow developerWorks on Twitter.
- Expand your PHP skills by checking out IBM developerWorks' PHP project resources.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- Using a database with PHP? Check out the Zend Core for IBM, a seamless, out-of-the-box, easy-to-install PHP development and production environment that supports IBM DB2 V9.
- The My developerWorks community is an example of a successful general community that covers a wide variety of topics.
- Stay current with developerWorks' Technical events and webcasts.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products, as well as our most popular articles and tutorials.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
Get products and technologies
- Download the PHP-on-Couch library from GitHub.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
- Download IBM product evaluation versions or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- Participate in developerWorks blogs and get involved in the developerWorks community.
- Participate in the developerWorks PHP Forum: Developing PHP applications with IBM Information Management products (DB2, IDS).

Thomas Myer is a consultant, author, and speaker based in Austin. He runs Triple Dog Dare Media and tweets as @myerman on Twitter. You can reach him at tom@tripledogs.com.




