Use Node.js as a full cloud environment development stack

Embrace the concurrency model using asynchronous I/O via callbacks, and build a chat server

Explore Node.js, an event-driven I/O framework for the V8 JavaScript™ engine on UNIX®-like platforms, designed for writing scalable network programs such as web servers. This article examines the framework and the ecosystem surrounding it (including cloud offerings), and wraps up with a complete example of how to build a chat server in Node.js.

Noah Gift, Associate Director Engineer, AT&T Interactive

Noah Gift is an experienced technical leader and software developer at AT&T Interactive. He solves interesting problems in a variety of languages including Python/IronPython, Erlang, F#, C#, and JavaScript. (He's also worked at Caltech, Disney Feature Animation, Sony Imageworks, and Weta Digital.) A member of the Python Software Foundation, he is also an author of many developerWorks articles and the co-author of Python For Unix and Linux System Administration. He earned a BS in Nutritional Science from Cal Poly San Luis Obispo, an MS in Computer Information Systems from CSULA, and is an MBA Candidate at UC Davis specializing in business analytics, finance, and entrepreneurship. In his spare time, he composes for the piano and runs in marathons. Find him at his web site, on Twitter, or for consulting.



Jeremy Jones, Senior Systems Engineer, Predictix

Jeremy Jones is a Senior Systems Engineer for Predictix. He is the author of numerous articles online as well as the co-author of Python for UNIX and Linux System Administration. He most commonly tackles problems with Python, but has been known to reach for bash, JavaScript, Perl, C#, and Java. He enjoys troubleshooting and resolving non-obvious problems in various problem domains. He spends his spare time building things out of wood and spending time with his family.



25 April 2011


As technology innovation continues to advance at a seemingly exponential rate, some new ideas just make sense from day one. Server-side JavaScript is one of those ideas. Node.js, an event-driven I/O framework for the V8 JavaScript engine on UNIX-like platforms, intended for writing scalable network programs such as web servers, is the execution of that good idea.

Instead of fighting JavaScript, Node.js embraces it as the full development stack, from server-side code all the way to the browser. Node.js also embraces another innovative idea: The concurrency model involving asynchronous I/O through callbacks.

Node.js cloud platforms

A significant benefit to using the Node.js framework arises when you use it in a cloud environment. For application developers, this typically boils down to using either Platform as a Service (PaaS) or Infrastructure as a Service (IaaS) models. The most abstracted and arguably most convenient approach for a developer is to use a PaaS provider. Figure 1 provides a very simple look at the structures of the PaaS and IaaS models.

Figure 1. The PaaS and IaaS structures

Recently, an exciting open source project, Cloud Foundry, released the code to create a private PaaS that is capable of running Node.js. The same hosting engine is also available as a public, commercial cloud, and the project accepts software patches.

It is truly an exciting time to be a developer as the pain of managing infrastructure can be outsourced (forever!) to providers who in turn can capitalize on economies of scale, whether it be in source code or physical hardware resources.


Using the Node.js shell

Before we dive into writing a full Node.js example, let's start with an introduction to using the interactive shell. If you don't have Node.js installed, you can refer to the resources section and either follow the instructions to install it or use one of the online interactive Node.js sites that let you enter the code directly into a browser.

To run JavaScript interactively in Node.js, type node at a command-line prompt, as shown:

lion% node
> var foo = {bar: 'baz'};
> console.log(foo);
{ bar: 'baz' }
>

In this example, the object foo was created and then printed out to the console by calling console.log. That was very powerful and fun, but the real fun begins when you use tab completion to explore foo, as in the following example. If you type foo.bar. and then press the tab key, you see the methods available on the object.

> foo.bar.
[...output suppressed for space...]
foo.bar.toUpperCase           foo.bar.trim
foo.bar.trimLeft              foo.bar.trimRight

The toUpperCase method looks interesting to try. Here is what it does:

> foo.bar.toUpperCase();
'BAZ'

As you can see, that method converted the string to uppercase. This style of interactive development is ideal for working with an event-driven framework like Node.js.

With this simple intro out of the way, it is time to move into actually building something.


Building a chat server in Node.js

Node.js makes it easy to write event-based network servers. As an example, let's create a couple of chat servers. The first one is trivial, nearly featureless, and has no exception handling.

A chat server allows multiple clients to connect to it. Each client can write messages that are then broadcast to all other users. Here is the code for the simplest possible chat server.

var net = require('net');

var sockets = [];

var s = net.Server(function(socket) {

    sockets.push(socket);

    socket.on('data', function(d) {

        for (var i=0; i < sockets.length; i++ ) {
            sockets[i].write(d);
        }
    });
});

s.listen(8001);

In less than 20 lines of code (actually, there are only eight that are doing anything), you have built a functional chat server. Here is the flow of this simple program:

  • When a socket connects, append the socket object to an array.
  • When the client writes to their connection, write that data to all sockets.

Now let's walk through the code and explain how the example fulfills the defined expectation of what a chat server is and does. The first line allows access to the contents of the net module:

var net = require('net');

Let's use Server from this module.

You'll need a place to hold all of the client connections so that you can write to every one of them when data arrives. Here is the variable that holds all of the client socket connections:

var sockets = [];

The next line starts a code block that dictates what happens when each client connects.

var s = net.Server(function(socket) {

The only argument that you pass into the Server is a function that will be called for each client connection. Inside this function, add the client connection to the list of all client connections:

sockets.push(socket);

This next piece of code sets up an event handler to dictate what happens when a client sends data:

socket.on('data', function(d) {

    for (var i=0; i < sockets.length; i++ ) {
        sockets[i].write(d);
    }
});

The socket.on() call registers an event handler with Node.js so that it knows what to do when certain events happen. Node.js calls this particular event handler when data is received from the client. Other events that you can handle include connect, end, timeout, drain, error, and close.
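
Sockets get on() from Node.js's EventEmitter, so the registration mechanism can be tried without a network. The following sketch is an illustration only: fakeSocket is a plain emitter standing in for a real connection, and the events are emitted by hand rather than by Node.js.

```javascript
var EventEmitter = require('events').EventEmitter;

// A plain emitter stands in for a socket so the sketch runs without a network.
var fakeSocket = new EventEmitter();
var log = [];

// on() only registers the callback; nothing runs yet.
fakeSocket.on('data', function (d) {
    log.push('data: ' + d);
});
fakeSocket.on('close', function () {
    log.push('closed');
});

// With a real socket, Node.js emits these when bytes arrive or the connection ends.
fakeSocket.emit('data', 'hello');
fakeSocket.emit('data', 'world');
fakeSocket.emit('close');

console.log(log); // [ 'data: hello', 'data: world', 'closed' ]
```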

The structure of the socket.on() call is similar to the Server() call mentioned earlier. You pass in a function to both of them; the function is called when something happens. This callback approach is common in asynchronous networking frameworks, and it is usually the main stumbling block for people with procedural programming experience who are starting out with an asynchronous framework like Node.js.

In this case, when any client sends data to the server, this anonymous function is called and the data is passed into the function. It iterates over the list of socket objects you have been accumulating and sends the same data to them all. Each client connection, including the sender, will receive the data.

This chat server is pretty simple. It is so simple that it is lacking some very basic features such as an identification of who sent each message or handling the case where a client disconnects. (If a client disconnects from this chat server and anyone sends a message, the server will crash.)
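
One way to close the disconnect gap is to forget a socket as soon as it closes or fails. The sketch below is a minimal illustration under assumed names (track() and broadcast() are hypothetical helpers, not functions from the article's download), and it also skips the sender when broadcasting.

```javascript
var sockets = [];

// Add a connection and arrange for it to be forgotten when it goes away.
function track(socket) {
    sockets.push(socket);
    function forget() {
        var i = sockets.indexOf(socket);
        if (i !== -1) {
            sockets.splice(i, 1);
        }
    }
    socket.on('close', forget);
    // An 'error' event with no listener is thrown and stops the process.
    socket.on('error', forget);
}

// Send data to every tracked connection except the sender.
function broadcast(sender, data) {
    sockets.forEach(function (socket) {
        if (socket !== sender) {
            socket.write(data);
        }
    });
}
```

To try these with the server above, you could call track(socket) in the connection callback and broadcast(socket, d) in the 'data' handler.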

Here is the source code (called chat2.js in the download examples) for an improved socket server with enhanced functionality and code to deal with "bad things happening" (such as clients disconnecting).

var net = require('net');

var sockets = [];
// maps an identified user name to that user's socket
var name_map = {};
var chuck_quotes = [
    "There used to be a street named after Chuck Norris, but it was changed because " +
        "nobody crosses Chuck Norris and lives.",
    "Chuck Norris died 20 years ago, Death just hasn't built up the courage to tell " +
        "him yet.",
    "Chuck Norris has already been to Mars; that's why there are no signs of life.",
    "Some magicians can walk on water, Chuck Norris can swim through land.",
    "Chuck Norris and Superman once fought each other on a bet. The loser had to start " +
        "wearing his underwear on the outside of his pants."
];

function get_username(socket) {
    var name = socket.remoteAddress;
    for (var k in name_map) {
        if (name_map[k] == socket) {
            name = k;
        }
    }
    return name;
}

function delete_user(socket) {
    var old_name = get_username(socket);
    if (old_name != null) {
        delete(name_map[old_name]);
    }
}

function send_to_all(message, from_socket, ignore_header) {
    var username = get_username(from_socket);
    for (var i=0; i < sockets.length; i++ ) {
        if (from_socket != sockets[i]) {
            if (ignore_header) {
                send_to_socket(sockets[i], message);
            }
            else {
                send_to_socket(sockets[i], username + ': ' + message);
            }
        }
    }
}

function send_to_socket(socket, message) {
    socket.write(message + '\n');
}

function execute_command(socket, command, args) {
    if (command == 'identify') {
        delete_user(socket);
        name = args.split(' ', 1)[0];
        name_map[name] = socket;
    }
    if (command == 'me') {
        name = get_username(socket);
        send_to_all('**' + name + '** ' + args, socket, true);
    }
    if (command == 'chuck') {
        var i = Math.floor(Math.random() * chuck_quotes.length);
        send_to_all(chuck_quotes[i], socket, true);
    }
    if (command == 'who') {
        send_to_socket(socket, 'Identified users:');
        for (var name in name_map) {
            send_to_socket(socket, '- ' + name);
        }
    }
}

function send_private_message(socket, recipient_name, message) {
    var to_socket = name_map[recipient_name];
    if (! to_socket) {
        send_to_socket(socket, recipient_name + ' is not a valid user');
        return;
    }
    send_to_socket(to_socket, '[ DM ' + get_username(socket) + ' ]: ' + message);
}

var s = net.Server(function(socket) {
    sockets.push(socket);
    socket.on('data', function(d) {
        var data = d.toString('utf8').trim();
        // check if it is a command
        var cmd_re = /^\/([a-z]+)[ ]*(.*)/g;
        var dm_re = /^@([a-z]+)[ ]+(.*)/g;
        var cmd_match = cmd_re.exec(data);
        var dm_match = dm_re.exec(data);
        if (cmd_match) {
            var command = cmd_match[1];
            var args = cmd_match[2];
            execute_command(socket, command, args);
        }
        // check if it is a direct message
        else if (dm_match) {
            var recipient = dm_match[1];
            var message = dm_match[2];
            send_private_message(socket, recipient, message);
        }
        // if none of the above, send to all
        else {
            send_to_all(data, socket);
        }

    });
    socket.on('close', function() {
        sockets.splice(sockets.indexOf(socket), 1);
        delete_user(socket);
    });
});
s.listen(8001);
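
The heart of this server is how it classifies each line a client sends. The sketch below pulls that logic into a standalone function so you can experiment with it; parseLine() and the shape of the objects it returns are illustrative assumptions, while the two regular expressions are taken from the listing above.

```javascript
// Classify one line of client input the way the 'data' handler above does.
function parseLine(line) {
    var text = line.trim();
    // "/command args" is a command such as /identify, /me, /chuck, or /who
    var cmd = /^\/([a-z]+)[ ]*(.*)/.exec(text);
    if (cmd) {
        return { type: 'command', command: cmd[1], args: cmd[2] };
    }
    // "@name message" is a direct message to one identified user
    var dm = /^@([a-z]+)[ ]+(.*)/.exec(text);
    if (dm) {
        return { type: 'dm', recipient: dm[1], message: dm[2] };
    }
    // anything else is broadcast to everyone
    return { type: 'broadcast', message: text };
}

console.log(parseLine('/identify noah'));
console.log(parseLine('@jeremy lunch?'));
console.log(parseLine('hello everyone'));
```

The first call yields the command identify with the argument noah, the second a direct message for jeremy, and the third a broadcast. Because the direct-message pattern accepts only lowercase letters, a user who identifies with digits or capitals in the name cannot be reached with @name.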

A bit more advanced: Load balancing a chat server

Often, the reason for deploying to the cloud includes the intention to scale up as load increases. Such deployments require some sort of load balancing mechanism.

Most of the lightweight web servers like nginx and lighttpd are able to perform load balancing for multiple HTTP servers, but if you want to balance among non-HTTP servers, nginx might not be the way to go. And while there are generic TCP load balancers, you may not like the load balancing algorithm they use. Or you might want some feature that they didn't include. Or maybe you just want the fun of rolling your own load balancer.

Here is the simplest load balancer possible. It doesn't do any failover. It expects all of the destinations to be available. And it doesn't do any error handling. It is spartan. The basic idea is that it receives a socket connection from the client, it randomly picks a destination server to connect to, connects, and forwards all data from the client to that server and all data from the server back to the client.

var net = require('net');

var destinations = [
    ['localhost', 8001],
    ['localhost', 8002],
    ['localhost', 8003],
]

var s = net.Server(function(client_socket) {
    var i = Math.floor(Math.random() * destinations.length);
    console.log("connecting to " + destinations[i].toString() + "\n");
    var dest_socket = net.Socket();
    dest_socket.connect(destinations[i][1], destinations[i][0]);

    dest_socket.on('data', function(d) {
        client_socket.write(d);
    });
    client_socket.on('data', function(d) {
        dest_socket.write(d);
    });
});
s.listen(9001);

The definition of destinations is the configuration for the back-end servers we are going to balance between. This is a simple array of arrays with the hostname as the first element, port number as the second.

The definition of the Server() is similar to the chat server example. You create a socket server and get it to listen on a port. This time, it will listen on 9001.

The callback for the Server() definition first randomly selects a destination to connect to:

var i = Math.floor(Math.random() * destinations.length);

You could have used a round-robin or done some extra work and gone with a "least connections" algorithm, but we wanted to keep it as simple as possible.
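
For comparison, here is a rough sketch of what those two alternatives might look like. The function names and the active counter array are hypothetical; in particular, the least-connections version assumes the caller remembers the returned index and calls release() when that connection closes.

```javascript
var destinations = [
    ['localhost', 8001],
    ['localhost', 8002],
    ['localhost', 8003]
];

// Round-robin: hand out destinations in order, wrapping at the end.
var next = 0;
function pickRoundRobin() {
    var choice = destinations[next];
    next = (next + 1) % destinations.length;
    return choice;
}

// Least connections: count open connections per destination and
// return the index of the destination with the fewest.
var active = destinations.map(function () { return 0; });
function pickLeastConnections() {
    var best = 0;
    for (var i = 1; i < active.length; i++) {
        if (active[i] < active[best]) {
            best = i;
        }
    }
    active[best]++;
    return best;
}

// Call this when the connection to destinations[index] closes.
function release(index) {
    active[index]--;
}
```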

There are two named socket objects in this example: client_socket and dest_socket.

  • client_socket is the connection between the load balancer and the client.
  • dest_socket is the connection between the load balancer and the balanced servers.

These two sockets each handle one event: data received. When either of them receives data, it writes the data to the other socket.

Let's walk through the full cycle of what happens when a client connects to a generic network server through the load balancer, sends data, and then receives data.

  1. When a client connects to the load balancer, Node.js creates a socket between the client and itself. We'll refer to that as the client_socket.
  2. After the connection is made, the load balancer picks a destination and creates a socket connection to the destination. We'll refer to this as the dest_socket.
  3. When the client sends data, the load balancer pushes that same data to the destination server.
  4. When the destination server responds back and writes some data to the dest_socket, the load balancer pushes that data back to the client over the client_socket.

Improvements that can be made to this load balancer include error handling, embedding another server in the same process so that destinations can be added and removed dynamically, adding different balancing algorithms, and adding some fault tolerance.
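
As a starting point for the error handling, the sketch below wires the two sockets together and tears down both sides when either one fails or closes. The link() helper is a hypothetical name, not code from the article's download. It relies only on the write(), destroy(), and on() methods that Node.js sockets provide.

```javascript
// Relay data both ways and tear down both sides when either one fails or ends.
function link(client_socket, dest_socket) {
    function relay(from, to) {
        from.on('data', function (d) {
            to.write(d);
        });
        // For example, a destination that is down emits 'error' on dest_socket.
        from.on('error', function (err) {
            console.log('connection error: ' + err.message);
            to.destroy();
        });
        from.on('close', function () {
            to.destroy();
        });
    }
    relay(client_socket, dest_socket);
    relay(dest_socket, client_socket);
}
```

In the load balancer above, a call to link(client_socket, dest_socket) could take the place of the two 'data' handlers.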


Beyond homegrown solutions: The Express web framework

Node.js comes equipped with HTTP server capabilities, but they are low-level abilities. If you are considering building a web application in Node.js, you might want to look at Express, a web application development framework built for Node.js. It fills in some of Node.js's gaps.

In the next example, let's focus on a couple of the obvious advantages of using Express over unadorned Node.js. One is request routing. Another is registering a handler for an HTTP "verb" type, such as "get" or "post."

Following is a very simple web application. It doesn't do anything except to demonstrate some of the basic capabilities of Express.

var app = require('express').createServer();

app.get('/', function(req, res){
  res.send('This is the root.');
});

app.get('/root/:id', function(req, res){
  res.send('You sent ' + req.params.id + ' as an id');
});

app.listen(7000);

The two calls to app.get() register handlers that are triggered when a GET request comes in. The first argument of each call is a route pattern, which Express compiles into a regular expression, specifying the URL that the client may request. The second argument is a function that actually handles the request.

The route pattern is the routing mechanism. If the request type (GET, POST, etc.) and the resource (/, /root/123) match, the handler function is called. In the first app.get() call, / is simply specified as the resource. In the second, /root is specified followed by an ID. The colon (:) character before a path segment in the pattern identifies that segment as a parameter that can be used later.

The handler function is called when the request type and the route pattern match. This function takes two arguments: a request (req) and a response (res). The parameter mentioned earlier is attached to the request object as req.params.id, and the web server's message back to the user is sent through the response object.
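
To see how a pattern such as /root/:id can become a regular expression with a named parameter, consider the following simplified sketch. It is an illustration of the idea only, not Express's own implementation. compileRoute() and matchRoute() are hypothetical names, and the sketch does not escape special characters or support optional segments.

```javascript
// Turn a pattern such as '/root/:id' into a regular expression plus the
// list of parameter names. Illustration only; not Express's own code.
function compileRoute(pattern) {
    var names = [];
    var source = pattern.replace(/:([A-Za-z_]+)/g, function (match, name) {
        names.push(name);
        return '([^/]+)';
    });
    return { regex: new RegExp('^' + source + '$'), names: names };
}

// Return the captured parameters, or null when the URL does not match.
function matchRoute(route, url) {
    var m = route.regex.exec(url);
    if (!m) {
        return null;
    }
    var params = {};
    route.names.forEach(function (name, i) {
        params[name] = m[i + 1];
    });
    return params;
}

var route = compileRoute('/root/:id');
console.log(matchRoute(route, '/root/123')); // { id: '123' }
console.log(matchRoute(route, '/other'));    // null
```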

This is a very simple example, but it is easy to see how "real applications" can take advantage of this framework to build richer, fuller bundles of functionality. If you plugged in a templating system and some data engine (either a traditional or a NoSQL database), you could easily build out some set of functionality to meet the requirements for a real application.

One of the attributes of Express is high performance. That, along with attributes that are common to other rapid web application frameworks, can position Express nicely in the arena of cloud deployments where high performance and massive scalability are critical.


A final bit of knowledge before we wrap up

Two more concepts/trends to be aware of are:

  • The sudden popularity of key/value databases.
  • Other asynchronous web paradigms.

Key/value databases ... why suddenly popular?

Because JavaScript is the lingua franca of the web, a discussion of JavaScript Object Notation (JSON) is often not far behind any JavaScript-related conversation. JSON is the most common way of exchanging data between JavaScript and some other language. A JSON object is essentially a set of key/value pairs; as a result, it is natural for JavaScript and Node.js developers to show interest in key/value databases. After all, if you can store data in JSON format, it makes the life of a JavaScript developer that much easier.
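
As a small illustration of why this is convenient, the built-in JSON object converts between JavaScript objects and JSON text in one call each way. The user object here is made-up sample data.

```javascript
var user = { name: 'noah', languages: ['Python', 'JavaScript'] };

// Serialize to a JSON string, for example to send to a browser or a data store.
var text = JSON.stringify(user);
console.log(text); // {"name":"noah","languages":["Python","JavaScript"]}

// Parse it back into an object and read a value by its key.
var copy = JSON.parse(text);
console.log(copy.languages[1]); // JavaScript
```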

In a somewhat unrelated trend, key/value databases are also talked about in the context of NoSQL databases. The CAP theorem, which is also known as Brewer's theorem, states that a distributed system cannot provide more than two of these three properties at the same time: consistency, availability, and partition tolerance. This theorem is one of the driving forces behind the NoSQL movement, in that it provides a basis for trading away some of the features of a traditional relational database in exchange for, typically, higher availability. A few popular NoSQL databases are Riak, Cassandra, CouchDB, and MongoDB.

Asynchronous web paradigms

Event-driven, asynchronous web frameworks have been around for quite some time. One of the more popular and recent asynchronous web frameworks is Tornado, which is written in Python and is used internally at Facebook. Following is an example of what hello_world looks like in Tornado (called hello_tornado.py in the downloads).

import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

Twisted.web, which is also written in Python, acts in a very similar manner.

Finally, in terms of the actual web server itself, there is nginx, which, unlike Apache, doesn't use threads but instead uses an event-driven (asynchronous) architecture to handle requests. It is very common to see asynchronous web frameworks use nginx as their web server.


In conclusion

Node.js does present a compelling story for web developers. It allows a development team to write JavaScript on both the client and the server side. They can also tap into the powerful technologies available in the JavaScript ecosystem: jQuery, V8, JSON, and event-driven programming. In addition, there are also ecosystems developing on top of Node.js, like the Express web framework.

As compelling as the Node.js story is, it is worth mentioning a few drawbacks as well. If your workload is CPU bound, then you won't get the benefits from the non-blocking I/O that Node.js gives you. There are architectures that work around this, such as forking a pool of processes, each running an instance of Node.js. It is up to you as the developer to implement it.
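
One possible shape for such a pool is sketched below. It uses the cluster module, which the original article does not mention and which newer Node.js releases include, so treat it as an assumption rather than the authors' approach. startPool() and isParent() are hypothetical names, and port 8000 in the usage comment is arbitrary.

```javascript
var cluster = require('cluster');
var http = require('http');

// Newer Node.js releases call the parent "primary"; older ones call it "master".
function isParent() {
    return cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
}

// In the parent: fork workerCount children and replace any that exit.
// In a child: run an HTTP server; the children share the listening port.
function startPool(workerCount, port) {
    if (isParent()) {
        for (var i = 0; i < workerCount; i++) {
            cluster.fork();
        }
        cluster.on('exit', function (worker) {
            console.log('worker ' + worker.process.pid + ' exited; starting another');
            cluster.fork();
        });
    } else {
        http.createServer(function (req, res) {
            res.end('handled by process ' + process.pid + '\n');
        }).listen(port);
    }
}

// Uncomment to run, for example with one worker per CPU core:
// startPool(require('os').cpus().length, 8000);
```

Each worker is a separate Node.js process with its own event loop, so a CPU-heavy request in one worker does not block the others.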


Download

Description                     Name              Size
Sample code for this article    nodejs_src.zip    6KB

Resources

Learn

Get products and technologies

  • See the product images available on the IBM Smart Business Development and Test on the IBM Cloud.
