
Second Life client, Part 3: Adding simple translation to Second Life

Or, how not to say you are a jelly doughnut

Peter Seebach (developerworks@seebs.plethora.net), Freelance writer, Plethora.net
Peter Seebach has been using computers for years and is gradually becoming acclimated. He still doesn't know why mice need to be cleaned so often, though.

Summary:  In the last part of our exploration of the Second Life software, learn how to plug a simple command-line program into Second Life that provides a language translation function.


Date:  30 Apr 2007
Level:  Intermediate

Whether you're simply converting all-caps text to lower case to spare delicate ears, or trying to make everyone sound like the Swedish Chef, simple translation software has been a popular theme of computer environments for years. More sophisticated solutions exist as well. This article looks at the technical issues involved in translating chat messages in Second Life.

More in this series

  • Part 1: Hacking Second Life:
    What happens when a company releases proprietary software to the open source community? Find out here. We cover the build process and some of the stepping stones and stumbling blocks on the way to hacking Second Life.

  • Part 2: Digging into the documentation:
    Projects live and die by their documentation, so learn how the Second Life client stacks up in that department and what makes its approach so unusual.

While there's certainly plenty of industrial-strength translation software out there, my first notion was to use a simple command-line one. A command-line app that requires little setup is easy to configure and check, and likewise easy to incorporate into another program. I picked Linguaphile (see Resources). While it'd certainly be possible to incorporate one of the web-based translation services, or one of the heavier-duty translation packages, Linguaphile has the key advantage that it is very simple to configure, allowing me to focus on the Second Life viewer code, rather than on translation software. It also has the key advantage over some services that it is free software, available for immediate download, and its license is permissive.

Linguaphile

Linguaphile has no installation or build process; it's just a bundle of files and a Perl script which uses them. If you run it from the directory the archive is unpacked in, it just works. So the build took a total of about 0 seconds. This was a definite strength.

The Second Life documentation for the chat system is incomplete, as of this writing. The code is mostly found in the llviewermessage.cpp file, which handles messages coming in from the simulator, and in llchatbar.cpp, which handles outgoing messages.


Listing 1. Processing incoming messages (llviewermessage.cpp)
                
void process_chat_from_simulator(LLMessageSystem *msg, void **user_data)
{
[...]
   char mesg[DB_IM_MSG_BUF_SIZE];
   [...]

And sure enough, there's the message, which gets unpacked into a small buffer. For starters, just to confirm this, I modified the program to smash the message into all lowercase; it's easy to see whether a message has been changed, and I always worry about going deaf from all the people online who shout habitually. This is presented as a self-contained code block because nothing outside this block ever needs to see any local variables used along the way:


Listing 2. Smashing case
                
msg->getStringFast(_PREHASH_ChatData, _PREHASH_Message,
                             DB_CHAT_MSG_BUF_SIZE, mesg);
{
   char *s;
   /* lowercase in place; the cast keeps tolower() well-defined for bytes above 127 */
   for (s = mesg; *s; ++s)
      *s = tolower((unsigned char)*s);
}

The getStringFast call is from the existing code; I include it so you can see where the modification goes. After a quick rebuild (well, not so quick; I see why they use a distributed build system), I got to test this out. Sure enough, I logged in on the rebuilt client, and said something; my client, receiving the message back from the server, translated it to lower case. That test confirms that this is the right place to perform translations. Next up is the task of running a subprocess; unfortunately, this is not a task you can easily perform cross-platform. On UNIX® or UNIX-like systems (such as Linux® and Mac OS X), it's very simple. First, let's just look at the minimalist call to an external program.


Listing 3. Smashing case the hard way
                
{
   int rfd[2], wfd[2];
   int pid;
   pipe(wfd); // to child
   pipe(rfd); // from child
   if ((pid = fork())) {
      /* close child's side of file descriptors in parent */
      close(wfd[0]);
      close(rfd[1]);
      /* send message, close stream */
      write(wfd[1], mesg, strlen(mesg));
      close(wfd[1]);
      /* read response, close stream */
      read(rfd[0], mesg, DB_CHAT_MSG_BUF_SIZE);
      close(rfd[0]);
      /* wait for child process */
      waitpid(pid, NULL, 0);
   } else {
      /* close parent's side of file descriptors in child */
      close(wfd[1]);
      close(rfd[0]);
      /* close existing standard input and output */
      close(0);
      close(1);
      /* reassign pipe to standard input and output */
      dup2(wfd[0], 0);
      dup2(rfd[1], 1);
      execl("/bin/tr", "tr", "A-Z", "a-z", (char *)NULL);
      /* only reached if execl failed; don't fall back into the viewer code */
      _exit(127);
   }
}

This may seem a little obfuscated, but it makes perfect sense once you know what it does. Each call to pipe() makes a pair of file descriptors; data written to fd[1] can be read from fd[0]. Our program needs to do two things: first, it needs to write a message to the external utility program; second, it needs to read a response back. That requires a pair of pipes. I named them "wfd" (write file descriptor) and "rfd" (read file descriptor) respectively. In the parent program (which receives the child's process ID as the return from fork), the unused halves of the pipes are closed, the message is written, and the write descriptor is closed (this makes the child detect EOF after it reads the data sent). Meanwhile, the child closes the other halves of the pipes, then uses close and dup2 to map them onto standard input and standard output. The child executes an external program; in this case, "tr".

The tr utility reads until it reaches EOF (getting the message we sent), sends the data back converted, and exits, closing the file descriptor. The parent process reads everything available up through EOF back into the message buffer, calls waitpid() to reap the now-deceased child process, and continues.

This is just a proof of concept, but now, any program you can call that converts input to output can be used instead of tr. For instance, the aforementioned translating program.
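To make the "any program" idea concrete, here is a minimal sketch of Listing 3 wrapped up as a reusable function. The name filter_through and its return convention are my own assumptions, not anything in the viewer source. It adds the checks the proof of concept skips: it reads in a loop (a single read call is not guaranteed to return everything up to EOF), it always terminates the buffer, and it reports failure if the child could not be run. It assumes the message is small enough to fit in the pipe's buffer, which holds for short chat lines; a large message could deadlock with the child.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the viewer): feed mesg to the program in
 * argv on its standard input and replace mesg with the program's output.
 * Returns 0 on success, -1 on failure; on failure mesg is still a valid
 * string, but its contents are unspecified. */
int filter_through(char *const argv[], char *mesg, size_t size)
{
    int wfd[2], rfd[2], status, ok = 1;
    size_t inlen, len = 0;
    ssize_t n;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && mesg != NULL && size > 0);
    if (pipe(wfd) < 0)                      /* to child */
        return -1;
    if (pipe(rfd) < 0) {                    /* from child */
        close(wfd[0]); close(wfd[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(wfd[0]); close(wfd[1]);
        close(rfd[0]); close(rfd[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: map the pipes onto stdin/stdout, then drop the originals */
        dup2(wfd[0], 0);
        dup2(rfd[1], 1);
        close(wfd[0]); close(wfd[1]);
        close(rfd[0]); close(rfd[1]);
        execvp(argv[0], argv);
        _exit(127);                         /* exec failed */
    }
    /* parent: close the child's ends, send the message, signal EOF */
    close(wfd[0]);
    close(rfd[1]);
    inlen = strlen(mesg);
    if (write(wfd[1], mesg, inlen) != (ssize_t)inlen)
        ok = 0;
    close(wfd[1]);
    /* read until EOF or until the buffer is full, leaving room for '\0' */
    while (len < size - 1) {
        n = read(rfd[0], mesg + len, size - 1 - len);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0)
            ok = 0;
        if (n <= 0)
            break;
        len += (size_t)n;
    }
    mesg[len] = '\0';
    close(rfd[0]);
    /* reap the child and treat a non-zero exit as failure */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Called with an argv of { "tr", "A-Z", "a-z", NULL }, this should behave like Listing 3; swapping in a different argv is all it takes to try another filter.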

Actually using linguaphile

It would seem totally trivial to patch in linguaphile. It was close, but there were a couple of problems. One obvious problem is that the original code, for smashing case, simply assumes that the output message is the same length as the input message. This is not one bug, but two! The first bug (and I hope it's the one that leapt out at you immediately) is that a message which becomes longer could overrun the message buffer, allowing a carefully crafted message to smash the stack and potentially execute dangerous code. But what happens in the case where the translation is shorter? In early testing, using German to English, the German word "mich" got translated to "mech". Why? Linguaphile's translation ("me") was being written over the start of a buffer that still contained "mich", and nothing terminated the string after the new text; the result was "me" followed by the leftover "ch", or "mech".

The corrected read code is simple enough:


Listing 4. Terminating the buffer
                
len = read(rfd[0], mesg, DB_CHAT_MSG_BUF_SIZE - 1);
if (len < 0)
   len = 0;   /* read failed; treat it as an empty reply */
mesg[len] = '\0';

The "-1" is to avoid smashing another object with the terminating null byte; it would also work to just make the buffer a byte longer.
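The "mich"/"mech" effect is easy to reproduce without any pipes at all. The sketch below uses a made-up helper, store_reply, that copies bytes into a buffer the way read() delivers them (no terminator), with an optional flag for adding the terminating null byte afterward.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len bytes of reply into buf, as read() would,
 * and optionally terminate the result. At most size - 1 bytes are copied.
 * Returns the number of bytes stored. */
size_t store_reply(char *buf, size_t size, const char *reply, size_t len,
                   int terminate)
{
    assert(buf != NULL && reply != NULL && size > 0);
    if (len > size - 1)
        len = size - 1;
    memcpy(buf, reply, len);    /* like read(): no terminator is added */
    if (terminate)
        buf[len] = '\0';        /* the fix from Listing 4 */
    return len;
}
```

Starting from a buffer holding "mich", storing the two bytes "me" without the terminator leaves "mech"; with the terminator, the buffer reads "me".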

The code to call linguaphile is simplicity itself:


Listing 5. Calling linguaphile
                
chdir("/home/seebs/ic/linguaphile-0.2");
execl("/usr/bin/perl", "perl", "linguaphile.pl", "-q", NULL);

The chdir call is used so that linguaphile can find its distribution package materials; without these, it won't run. The "-q" option suppresses the initial message identifying which languages are being translated. In testing, I just set the default source language in the code. For a production environment, you'd obviously want to specify those as arguments, and provide some way to change the translation layer.

Linguaphile's a fairly simple word-for-word translation program; it just passes unknown words through without alteration or marking. However, it did the thing that was most useful for getting this in place; it worked as a command-line app out of the box. If you desperately need to talk to someone whose language you don't know, an application like this could be enough to get you stumbling through.


So what does that prove?

One of the frustrations in pitching open source is the difficulty of convincing people not yet familiar with it that it matters. I just took a major application consisting of thousands of lines of code, and got a proof of concept of a plug-in translator architecture working. It's pretty rough, but if I desperately needed to talk to someone in a foreign language, this might be enough to get a few concepts across. It also changes the nature of the problem; now that a way to translate incoming messages has been established, working on improving the translation is simple and nicely modular. The external program can be any program, although you'd want a responsive one; a web-based service might be problematic.

The version presented so far is far from production code. It runs only on UNIX platforms (Linux, OS X, FreeBSD; not Windows®). It's woefully inefficient, spawning an external application for every line of text it processes. There's no runtime user configuration, and it doesn't provide for any translation of outgoing text. It's a proof of concept.

Each of these issues can be addressed. The spawning overhead is harder to fix than you might think; the default behavior of handling input and output data in blocks makes it easy for the viewer to enter a deadlock waiting for linguaphile to send data back, while linguaphile is waiting for more data from the viewer. The hard part is that there's no trivial way for the viewer to know when linguaphile is done sending data, without waiting to see if more data comes along. Inserting an arbitrary delay is hardly ideal, either. A real solution to this would require, at the minimum, the addition of some kind of sentinel value (probably a newline) to the message, and then checking for the sentinel value while reading data from the pipe. Another solution would be to prefix each message with its length. Both of these are more complicated; the big appeal of the original version is that the read system call will always return once it hits EOF, which is generated automatically when the child program completes and terminates.
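As a rough sketch of the sentinel approach, the reading side might look something like the following. The function name read_reply_line is hypothetical. It reads one byte at a time so that it never consumes data belonging to the next reply, which is slow but simple. It only covers the viewer's half of the problem: a long-running translator would also have to flush its output after every line, and I am not assuming linguaphile does that as shipped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read one newline-terminated reply from fd into buf.
 * The newline is not stored; bytes that do not fit are dropped.
 * Returns the stored length, or -1 on error or EOF before a newline. */
ssize_t read_reply_line(int fd, char *buf, size_t size)
{
    size_t len = 0;
    ssize_t n;
    char c;

    assert(buf != NULL && size > 0);
    for (;;) {
        n = read(fd, &c, 1);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted; try again */
        if (n <= 0) {
            buf[len] = '\0';
            return -1;              /* error, or the child went away */
        }
        if (c == '\n')
            break;                  /* sentinel: the reply is complete */
        if (len < size - 1)
            buf[len++] = c;
    }
    buf[len] = '\0';
    return (ssize_t)len;
}
```

With this in place, the viewer could keep one translator process running and call read_reply_line after each message it writes, instead of relying on EOF from a freshly spawned child.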

Runtime user configuration might be a little easier to deal with. Since the translation program is already messing with incoming requests, it makes sense to add it to outgoing requests, and along the way add a feature of intercepting IRC-style "/" messages. By convention, assume that you always want to type and read in the same language. So, all you need is a pair of language settings; one for the language people will be talking to you in, one for the language you'll be talking in. I nominate /speak and /hear, which were (as of this writing) not already in use as commands in the existing emote system.


Communication goes both ways

The code I am about to present is ugly in a few ways. It works, but it's not a clean solution; the focus here was on minimal intrusion into the Second Life code, not on an elegant design. The goal was to add two-way communication to the Second Life chat system; that is, translate outgoing material into a language, and incoming material from it. The interface I designed was to add two "emotes", /speak and /hear. The names are a little ambiguous, so to be explicit: "/speak" sets the language you type and read in, and "/hear" sets the language that other people send to you, or receive from you. If you "/speak en" and "/hear de", you write "book" and the other party hears "buch"; if they say "buch", you hear "book".

My solution to keep this isolated was a single function with static local variables to hold the language type. The function is a close relative of the code previously presented in llviewermessage.cpp:


Listing 6. The translation function
                
void translate_in_place(const char *newhear, const char *newspeak, int out, char *mesg)
{
   static char hear[3] = "de", speak[3] = "en";
   int rfd[2], wfd[2];
   int pid;
   if (newspeak)
      strncpy(speak, newspeak, 2);
   if (newhear)
      strncpy(hear, newhear, 2);
   llinfos << "speaking " << speak << ", hearing " << hear << "." << llendl;
   if (!mesg) {
      return;
   }
   pipe(wfd); // to child
   pipe(rfd); // from child
[...]

      chdir("/home/seebs/ic/linguaphile-0.2");
      execl("/usr/bin/perl", "perl", "linguaphile.pl",
         "-s", out ? speak : hear,
         "-d", out ? hear : speak,
         "-q", (char *)NULL);
      /* only reached if execl failed; don't fall back into the viewer code */
      _exit(127);
   }
}
 

The most significant change to the actual translation logic is the addition of the -s (source) and -d (destination) flags to linguaphile. These are set based on the current values for the hear and speak variables, and whether the message is "outgoing". The rather awkward calling sequence lets you specify languages with one call, or a message to translate with another. It's ugly, but it doesn't pollute the global namespace. The assumption that all language codes are two letters is actually wrong, but many of them were, and it was good enough for testing. There's no real error-checking here; it might make sense in production to add return codes to translate_in_place and some kind of diagnostic output for the user. The essential algorithm is clearer without it, though. The code which calls this to set these variables goes in llchatbar.cpp, in the LLChatBar::sendChat() function:


Listing 7. Handling the /speak and /hear emotes
                
std::string utf8text = wstring_to_utf8str(text);

/* Check for translation hack */
if (!strncmp(utf8text.c_str(), "/speak ", 7)) {
   translate_in_place(NULL, utf8text.c_str() + 7, 0, NULL);
   utf8text.erase();
} else if (!strncmp(utf8text.c_str(), "/hear ", 6)) {
   translate_in_place(utf8text.c_str() + 6, NULL, 0, NULL);
   utf8text.erase();
}

This code is ugly beyond words, but, as long as useful arguments are given to it, it does what is intended; it modifies the "hear" and "speak" values in the translation function. Finally, a call to the translator goes later in this function. The utf8_revised_text variable holds the text as modified by the viewer's native "emote" system which detects strings like "/smoke" and turns them into visible actions.


Listing 8. Translating outgoing text
                
utf8_revised_text = utf8str_trim(utf8_revised_text);
{
   char   mesg[DB_CHAT_MSG_BUF_SIZE + 1];
   strncpy(mesg, utf8_revised_text.c_str(), DB_CHAT_MSG_BUF_SIZE);
   mesg[DB_CHAT_MSG_BUF_SIZE] = '\0';
   translate_in_place(NULL, NULL, 1, mesg);
   utf8_revised_text.assign(mesg);
}

With this code in place, the system does what you expect; on the default settings, if you use a word that linguaphile can translate, the word is translated. Other words are left alone. You might not realize it's working right away, as your translated words are translated back when the game echoes your statements! I did find a corner case, though: with the default English/German translation, "table" gets translated to "tisch", but "tisch" doesn't get translated back to "table".

The code's still rough; it needs error checking to indicate what languages are known or unknown, and of course, it wouldn't hurt to buff the translation engine. However, the essential goal, of creating something that could let you talk to someone who doesn't speak your native language, without any effort on their part, has been met. Don't use this in production; if you give it an invalid language code, it may well simply eliminate all incoming or outgoing text, or otherwise act up. There's no real handling of translation problems, and there's no error-checking. This is a proof of concept; don't rely on it in real life. (Or second life, for that matter.)
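As one possible direction for the missing error checking, the /speak and /hear handling could validate the language code before storing it. This is a sketch only. The names set_language and handle_lang_command, and the limit of 2 to 8 letters, are assumptions of mine rather than anything linguaphile or the viewer defines. It checks that the code looks plausible, not that linguaphile actually knows the language.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define LANG_CODE_MAX 8   /* assumed upper bound on a language code's length */

/* Hypothetical helper: store code in dst if it is 2..LANG_CODE_MAX letters,
 * optionally followed by spaces or tabs. dst must hold LANG_CODE_MAX + 1
 * bytes. Returns 0 on success; returns -1 and leaves dst alone otherwise. */
int set_language(char *dst, const char *code)
{
    size_t len = 0;

    assert(dst != NULL && code != NULL);
    while (isalpha((unsigned char)code[len]))
        len++;
    if (len < 2 || len > LANG_CODE_MAX)
        return -1;
    /* nothing but trailing whitespace may follow the code */
    if (code[len + strspn(code + len, " \t")] != '\0')
        return -1;
    memcpy(dst, code, len);
    dst[len] = '\0';
    return 0;
}

/* Returns 1 if text was a valid /speak or /hear command, -1 if it was one of
 * those commands with a bad language code, and 0 if it is ordinary chat. */
int handle_lang_command(const char *text, char *speak, char *hear)
{
    assert(text != NULL && speak != NULL && hear != NULL);
    if (strncmp(text, "/speak ", 7) == 0)
        return set_language(speak, text + 7) == 0 ? 1 : -1;
    if (strncmp(text, "/hear ", 6) == 0)
        return set_language(hear, text + 6) == 0 ? 1 : -1;
    return 0;
}
```

A return value of -1 gives the caller a place to show the user a diagnostic instead of silently passing a bad code along to the translator.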


Observations

Looking around the system and modifying it has been enlightening. The Second Life implementation seems to be fairly well organized, but not perfectly. For instance, the logic to handle users using "/me" in messages to indicate actions ("/me says hi" turns into "Yourname says hi") occurs in more than one place. Still, it's consistent and fairly easy to find. Names are well-chosen and code doesn't generally try to poke around in class internals.

That speaks to a good development model, and one well suited to an open source release. Modules are modular; that's a good thing. It's fairly easy to find the right place to make a given change, and the system provides a broad range of helpful tools, such as the llinfos stream. Years of playing around with other peoples' code have left me generally skeptical, but in this case, the experience was rewarding and fairly fun. Compared to other projects I've seen, the code is quite good; I find myself wondering if one of the biggest objections to open source isn't just that some companies don't feel their code would withstand careful scrutiny.


Resources

Learn

Get products and technologies

  • Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

  • With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.

