CouchDB basics for PHP developers

A look at CouchDB from a PHP developer's point of view

Author Thomas Myer shows experienced PHP developers how to add CouchDB to their technical toolboxes.

Thomas Myer, Principal, Triple Dog Dare Media

Thomas Myer is a consultant, author, and speaker based in Austin. He runs Triple Dog Dare Media and tweets as @myerman on Twitter. You can reach him at tom@tripledogs.com.



23 March 2010


If you're a typical PHP developer, it doesn't take a thorough review of past projects to pick out a telling pattern: In most (if not all) cases, you're probably getting PHP to talk to a database back end for all that dynamic data goodness; in 99 percent of those instances, you're using MySQL.

Now, there's nothing wrong with using a relational database. If you're working with highly structured data with lots of relationships, it's the way to go. You can happily (or unhappily, depending on your familiarity and comfort with SQL) go through the process of working up schemas, normalizing data relations, setting up tables, and all the rest.

However, every once in a while, you work on a project where you probably think to yourself, "Why am I doing all this work?" The project you're working on contains very simple bits of data or data that's difficult to predict — you might get different data fields on different days or even from transaction to transaction. If you were to create a schema to predict what's coming down the pike at you, you'd end up with tables that have lots of empty fields or lots of mapping tables.

Frequently used acronyms

  • Ajax: Asynchronous JavaScript + XML
  • API: Application programming interface
  • GUID: Globally Unique Identifier
  • HTTP: Hypertext Transfer Protocol
  • JSON: JavaScript Object Notation
  • REST: Representational State Transfer
  • SQL: Structured Query Language
  • UUID: Universally Unique Identifier

For those projects, you need a different approach — something that doesn't involve a relational database. What you need in these situations is a document-based, schema-free, ad-hoc database with a flat address space. In short, you need Apache CouchDB.

What is CouchDB?

CouchDB is (according to the Apache CouchDB Web site):

  • A document database server, accessible via a RESTful JSON API.
  • Ad-hoc and schema-free, with a flat address space.
  • Distributed, featuring robust, incremental replication with bi-directional conflict detection and management.
  • Query-able and index-able, featuring a table-oriented reporting engine that uses JavaScript as a query language.

What this means is that you can create a CouchDB database that accepts JSON documents. Each document gets a unique ID and a revision ID and has its own structure, with all documents stored in the same flat collection. For example, say you're setting up a resume collection. The first resume comes in with fields for first name, last name, phone number, e-mail address, Twitter account, a list of qualifications, and a detailed work history. The second resume just comes in with first name, last name, e-mail address, and a work history that's a sparse list. The differences there might be enough to make a relational database very unhappy, but from CouchDB's point of view, it's just another day at the office.

In short, a CouchDB document is an object consisting of named fields. Those field values may be strings, Boolean values, numbers, dates, ordered lists, or associative maps. Listing 1 shows a sample resume document.

Listing 1. A simple CouchDB document
{
    "Firstname": "Tom",
    "Lastname": "Myer",
    "Twitter": "@myerman",
    "Email": "tom@example.com",
    "Skills": ["php","couchdb","xml","json"],
    "Work History": ....
}

So far, nothing too far out here if you're used to working with JSON. Even if you aren't, you can easily map this document to something more comfortable, like a PHP array. In fact, you'll see that you can use the built-in JSON encode/decode functions to work with CouchDB, or you can use a more object-oriented path.
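As a minimal sketch of that mapping, the snippet below runs a resume-like document through PHP's built-in json_decode() and json_encode() functions. The sample values and variable names are assumptions chosen for illustration; nothing here talks to CouchDB yet.

```php
<?php
// A resume-like JSON document (sample values are illustrative only).
$json = '{"Firstname":"Tom","Lastname":"Myer","Skills":["php","couchdb","xml","json"]}';

// Passing true as the second argument returns an associative array
// instead of a stdClass object.
$resume = json_decode($json, true);

echo $resume['Firstname'], "\n";      // Tom
echo count($resume['Skills']), "\n";  // 4

// Add a field that other documents in the collection may not have,
// then turn the array back into a JSON string.
$resume['Twitter'] = '@myerman';
$encoded = json_encode($resume);
echo $encoded, "\n";
```

Because each document carries its own structure, adding the Twitter field to one resume requires no change to any other document.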

To query information from a collection, you can use various comfort-inducing query methods via a RESTful JSON API. The fact that you're working in JSON simplifies a lot of issues. For one thing, as a Web developer familiar with JavaScript, Ajax, and JSON, you don't have to know SQL to get anything done.

Before moving on, it would be good to hit the pause button for a minute to emphasize a few points. CouchDB isn't a relational database. You've heard me say that, but it's a good idea to emphasize the point. Don't try to use CouchDB in a relational way, like inserting ID fields that help you make relationships clear between documents. Instead of creating relationships, stuff the content you want into your document and keep moving.

Here's something else that CouchDB isn't: an object-oriented database. It's not some kind of native object, persistent data layer you can use as the underpinning for your object-oriented structures. Don't do it.


Installing CouchDB

If you're on Mac OS X and have MacPorts installed, the installation process for CouchDB is pretty simple:

  1. Open a Terminal window and type sudo port install couchdb.
  2. When prompted, type your administrator password.
  3. Wait while MacPorts downloads and installs the necessary CouchDB packages.
  4. From the Terminal window, run the following command to retrieve any last-minute changes or dependencies: sudo port upgrade couchdb.
  5. To get CouchDB up and running, type the following command in Terminal:
    sudo launchctl load -w /opt/local/Library/LaunchDaemons/org.apache.couchdb.plist

    This launches the CouchDB server and keeps it running persistently, so it will start up if you ever restart your Mac.

Installing in Linux

Your developerWorks editor was able to install CouchDB on his Ubuntu Linux laptop in two steps:

sudo apt-get install couchdb
sudo /etc/init.d/couchdb start

The software was already in the repository and loaded with no surprises.

To see CouchDB in action, type http://127.0.0.1:5984/_utils/index.html in your browser. The Futon utility appears, as shown in Figure 1.

Figure 1. The Futon utility

On a Windows® system, your process will be a bit more convoluted: you'll need to install Cygwin, the Microsoft® C compiler, and a number of other prerequisites (like cURL, ICU, and SpiderMonkey), download and install the source code for Erlang and CouchDB, configure those according to the README files, then do a full installation. This process is described in the CouchDB wiki (see Resources). You will also find instructions for Linux®, Berkeley Software Distribution (BSD), and other environments.


Using the CouchDB API

Before jumping into PHP, it might be a good idea to get a feel for the CouchDB API, which is accessible via standard HTTP requests (GET, POST, PUT, and DELETE) and returns data in JSON format. This setup makes it easy to store and retrieve data from your Web application regardless of the language you're using: PHP, Microsoft Active Server Pages (ASP), Ruby, Python, or even simple jQuery Ajax functions.

This section shows how to use the cURL command-line tool to issue GET, POST, PUT, and DELETE requests to CouchDB. Once you've got the hang of the API, a particular PHP wrapper helps you simplify your development tasks.

The first thing you need to run (again, from a Terminal window) is this command: curl http://127.0.0.1:5984/. What you should see is a response similar to {"couchdb":"Welcome","version":"0.10.0"}. This simply tells you that CouchDB is up and running and which version you're using. If you don't see this message, go back through your installation and configuration process to get CouchDB up and running.

Now, try listing all the collections set up in CouchDB. Run curl -X GET http://127.0.0.1:5984/_all_dbs.

If this is a fresh installation of CouchDB, you should see the response [], which means that no collections or databases are present (the pair of square brackets signifies an empty JSON array). Note that with this cURL command, you use the -X option to specify a GET operation explicitly; cURL issues a GET by default, so the option is there only for clarity.

So let's solve that problem by creating a database:

curl -X PUT http://127.0.0.1:5984/songs

When you run this, you will see the response {"ok":true}. You now know that you can check for an ok attribute to make sure of success. Running curl -X GET http://127.0.0.1:5984/_all_dbs again results in a non-empty array: ["songs"]. Specifically, your CouchDB instance has one database in it: songs.

Now try to create another database called songs. If you run curl -X PUT http://127.0.0.1:5984/songs again, you'll get this error message:

{"error":"file_exists","reason":"The database could not be created, 
    the file already exists."}

So it's easy for you to check for an error attribute to see if any problems have occurred.
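A rough sketch of that check in PHP might look like the following. The couchResponseOk() helper is hypothetical (it is not part of CouchDB or of any wrapper library); it simply decodes a response body and looks for the ok and error attributes shown above.

```php
<?php
/**
 * Decide whether a raw CouchDB response body reports success.
 * Hypothetical helper for illustration; not part of CouchDB or any wrapper.
 */
function couchResponseOk(string $json): bool
{
    $response = json_decode($json, true);
    if (!is_array($response)) {
        return false;              // the body was not a JSON object or array
    }
    if (isset($response['error'])) {
        return false;              // CouchDB reported a problem
    }
    return !empty($response['ok']);
}

var_dump(couchResponseOk('{"ok":true}'));  // bool(true)
var_dump(couchResponseOk('{"error":"file_exists","reason":"The database could not be created, the file already exists."}'));  // bool(false)
```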

Create a second database called foobar:

curl -X PUT http://127.0.0.1:5984/foobar

If you ran curl -X GET http://127.0.0.1:5984/_all_dbs, it would result in the response ["songs","foobar"]. To get rid of this second database, pass in a DELETE call:

curl -X DELETE http://127.0.0.1:5984/foobar

Running curl -X GET http://127.0.0.1:5984/_all_dbs reveals that you're back to ["songs"].
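Because _all_dbs returns a plain JSON array of names, checking for a database from PHP takes only a few lines. The sketch below works on the response strings shown above rather than on a live server, and the databaseExists() helper is a hypothetical name introduced for this example.

```php
<?php
/**
 * Check whether a database name appears in an _all_dbs response body.
 * Hypothetical helper for illustration only.
 */
function databaseExists(string $allDbsJson, string $name): bool
{
    $names = json_decode($allDbsJson, true);
    // Use a strict comparison so only exact string matches count.
    return is_array($names) && in_array($name, $names, true);
}

var_dump(databaseExists('["songs","foobar"]', 'foobar'));  // bool(true)
var_dump(databaseExists('["songs"]', 'foobar'));           // bool(false)
var_dump(databaseExists('[]', 'songs'));                   // bool(false)
```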

Now go ahead and create some documents inside your songs database. True to form, you want to store some songs in this database, with fields for song title, artist name, and album name. To create a document, follow this pattern:

curl -X PUT http://127.0.0.1:5984/songs/*id* -d '{ *json_data* }'

Notice that you specify the name of your database, followed by an ID (which must be unique within the database and, if you plan to replicate between CouchDB instances, should preferably be unique across all of them), followed by your JSON data.

What should you use for the unique ID? You could use a UUID (or a GUID), or you could create some kind of natural key that combines various bits of data (for example, the name of a song with underscores instead of spaces combined with a timestamp), or you can have CouchDB create a unique ID for you (this is a slow process). Any of these approaches is good; just don't use some kind of auto-increment value like you would in the MySQL environment.
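The natural-key idea could be sketched in PHP as follows. The makeDocId() helper and the exact ID format (a lowercase title with underscores, followed by a timestamp) are assumptions made for illustration; any scheme that yields unique IDs works just as well.

```php
<?php
/**
 * Build a document ID from a song title and a timestamp.
 * Hypothetical helper: the ID format is an assumption, not a CouchDB rule.
 */
function makeDocId(string $title, int $timestamp): string
{
    // Lowercase the title, then collapse every run of characters that are
    // not letters or digits into a single underscore.
    $slug = preg_replace('/[^a-z0-9]+/', '_', strtolower($title));
    // Trim stray underscores; IDs that begin with an underscore are
    // reserved by CouchDB.
    $slug = trim($slug, '_');
    return $slug . '_' . $timestamp;
}

echo makeDocId('Whatever You Like', 1266082749), "\n";  // whatever_you_like_1266082749
```

In real code you would typically pass time() as the second argument; it is a parameter here so that the function stays predictable.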

Now, enter a song into your database:

curl -X PUT http://localhost:5984/songs/whatever_you_like -d \
	'{"title":"Whatever You Like", "artist":"T.I.","album":"Paper Trail"}'

{"ok":true,"id":"whatever_you_like","rev":"1-1d915e4c209a2e47e5cf05594f9f951b"}

Notice that I took a very simple approach to the unique ID (using a simplified version of the song name with underscores instead of spaces). This simple approach is probably OK for right now. Luckily, the wrappers you'll use in PHP will help you create better IDs. Also notice that I immediately received a response of "ok" with my document ID and a rev attribute to tell me what the revision version is set to.
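The same PUT request can be prepared from plain PHP, without any wrapper, by using the built-in HTTP stream wrapper. This is only a sketch under stated assumptions: the couchContextOptions() helper is hypothetical, and the lines that actually send the request are left as comments because they need a CouchDB server listening on localhost:5984.

```php
<?php
/**
 * Build stream-context options for an HTTP request that carries a JSON body.
 * Hypothetical helper for illustration only.
 */
function couchContextOptions(string $method, array $document): array
{
    return [
        'http' => [
            'method'        => $method,
            'header'        => "Content-Type: application/json\r\n",
            'content'       => json_encode($document),
            // Return the response body even when the status code is 4xx or 5xx,
            // so that CouchDB's error message can be inspected.
            'ignore_errors' => true,
        ],
    ];
}

$song    = ['title' => 'Whatever You Like', 'artist' => 'T.I.', 'album' => 'Paper Trail'];
$options = couchContextOptions('PUT', $song);

// With CouchDB running locally, the request could then be sent like this:
// $context  = stream_context_create($options);
// $reply    = file_get_contents('http://localhost:5984/songs/whatever_you_like', false, $context);
// $response = json_decode($reply);  // carries ok, id, and rev on success

echo $options['http']['content'], "\n";
```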

To view the document you just added, try this:

curl -X GET http://localhost:5984/songs/whatever_you_like

{"_id":"whatever_you_like","_rev":"1-1d915e4c209a2e47e5cf05594f9f951b", 
	"title":"Whatever You Like", "artist":"T.I.", "album":"Paper Trail"}

If you're following along in Futon, you should be able to click the songs database name and see a listing for whatever_you_like in the documents list. Clicking that link displays the details of the document in question, as illustrated in Figure 2.

Figure 2. Document details
Screenshot of a listing for the song entry, showing all fields populated

You're getting the idea — make RESTful requests using JSON, and things happen.

Now, all of this looks pretty good, but if you're a PHP developer, you're wondering how to tie this all in to something you feel comfortable with. The next section introduces you to some PHP wrappers for CouchDB.


Working with PHP

For the next step, you need to download PHP-on-Couch from GitHub (see Resources). Place the extracted /lib folder contents into your development area. When you have your work area set up, create a simple PHP application that will talk to the CouchDB database you've already set up (your songs collection). Create a new file, and call it index.php. Put the code in Listing 2 into it.

Listing 2. CouchDB connection settings
<?php
$couch_dsn = "http://localhost:5984/";
$couch_db = "songs";

require_once "./lib/couch.php";
require_once "./lib/couchClient.php";
require_once "./lib/couchDocument.php";


$client = new couchClient($couch_dsn,$couch_db);
?>

This code serves as your connection code to CouchDB and includes all the relevant classes that you'll need to work with the database. Go ahead and list all the information related to your database, as shown in Listing 3.

Listing 3. Getting database information
try {
	$info = $client->getDatabaseInfos();
} catch (Exception $e) {
	echo "Error:".$e->getMessage()." (errcode=".$e->getCode().")\n";
	exit(1);
}
print_r($info);

What you should get is something close to Listing 4.

Listing 4. Database information
stdClass Object 
( 
	[db_name] => songs 
	[doc_count] => 2 
	[doc_del_count] => 0 
	[update_seq] => 2 
	[purge_seq] => 0 
	[compact_running] => 
	[disk_size] => 8281 
	[instance_start_time] => 1266082749089965 
	[disk_format_version] => 4 
)

Next, retrieve a document from your song database. Listing 5 shows the code to do so.

Listing 5. Retrieving a song from the database
try {
	$doc = $client->getDoc('whatever_you_like');
} catch (Exception $e) {
	if ( $e->getCode() == 404 ) {
		echo "Document not found\n";
	} else {
		echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
	}
	exit(1);
}
print_r($doc);

Listing 6 shows the response.

Listing 6. The retrieved song
stdClass Object
(
    [_id] => whatever_you_like
    [_rev] => 1-1d915e4c209a2e47e5cf05594f9f951b
    [title] => Whatever You Like
    [artist] => T.I.
    [album] => Paper Trail
)

Great, but what if you need to make updates to a document? You can make two different kinds of updates: by making changes to existing field values or by adding new fields with their own values. You do this using arrow notation (for example, $doc->new_field), then committing your changes with storeDoc(). Listing 7 shows the code for updating a document.

Listing 7. Updating a document
$doc->genre = 'hip-hop';
$doc->year = 2008;
try {
        $response = $client->storeDoc($doc);
} catch (Exception $e) {
        echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
        exit(1);
}

After running this code, retrieve the document again by its ID; you get back the response in Listing 8.

Listing 8. Updated document
stdClass Object
(
    [_id] => whatever_you_like
    [_rev] => 2-12513a362693b300928aa45f82faed83
    [title] => Whatever You Like
    [artist] => T.I.
    [album] => Paper Trail
    [genre] => hip-hop
    [year] => 2008
)

Notice that the _rev attribute has been incremented to 2-whatever. Before, it was 1-whatever. It's an easy way for you to tell what revision is there.
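If you need the revision counter as a number, you can split it off the front of the _rev string. The revisionNumber() helper below is hypothetical and assumes the "counter-hash" format seen in the listings above. Keep in mind that CouchDB accepts an update only when the document you send carries its current _rev; otherwise it reports a conflict.

```php
<?php
/**
 * Extract the numeric prefix from a CouchDB revision string
 * such as "2-12513a362693b300928aa45f82faed83".
 * Hypothetical helper for illustration only.
 */
function revisionNumber(string $rev): int
{
    // Everything before the first hyphen is the revision counter.
    // If there is no hyphen, strstr() returns false, which casts to 0.
    return (int) strstr($rev, '-', true);
}

echo revisionNumber('1-1d915e4c209a2e47e5cf05594f9f951b'), "\n";  // 1
echo revisionNumber('2-12513a362693b300928aa45f82faed83'), "\n";  // 2
```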

What if you want to store a new document in the database? You would instantiate a new object and use arrow notation to fill in the fields in the document. Listing 9 shows the code to do so.

Listing 9. Creating a new document
$song = new stdClass();
$song->_id = "in_the_meantime";
$song->title = "In the Meantime";
$song->album = "Resident Alien";
$song->artist = "Spacehog";
$song->genre = "Alternative";
$song->year = 1995;

try {
	$response = $client->storeDoc($song);
} catch (Exception $e) {
	echo "Error: ".$e->getMessage()." (errcode=".$e->getCode().")\n";
	exit(1);
}
print_r($response);

The response would look like Listing 10.

Listing 10. Response from creating a new document
stdClass Object
(
    [ok] => 1
    [id] => in_the_meantime
    [rev] => 1-d65b03a9fe2f3c8095b08883e7cd97df
)

Conclusion

At this point, you have more than enough information to get started with CouchDB and PHP. It should be easy for you to create your basic update forms that then allow you to create or update existing documents in a database. The PHP-on-Couch package also gives you access to other methods for creating and deleting databases, working with CouchDB views, and more. With any luck, there's enough information in this article to get you off to a good start.

Resources

Learn

Get products and technologies

Discuss
