The rsync family

Synchronizing any two machines is easy when you use rsync

Synchronizing two machines (such as a laptop and a desktop computer) is easier and faster when you use rsync, which boasts an efficient algorithm and options for just about everything you might need. And if a command-line operation isn't your thing, several graphic tools make using rsync easier still.

Federico Kereki, Systems Engineer, Freelance

Federico Kereki is a Uruguayan systems engineer with more than 20 years of experience developing systems, doing consulting work, and teaching at universities. He currently works with a good jumble of acronyms: SOA, GWT, Ajax, PHP, and of course FLOSS! You can reach Federico at fkereki@gmail.com.


14 April 2009

If you work with both a laptop and a desktop computer, you know you have to synchronize the machines to keep them up to date. In addition, you probably want to run the synchronization not only at your home but also from a remote site; in my case, whenever I travel with my laptop, I make sure that whatever I do on it gets backed up to my desktop computer. (Losing your laptop and thereby losing all your work isn't nice at all!) Many solutions to this problem exist: This article introduces one such tool—rsync—and mentions several related tools, all of which provide easy synchronization procedures.

What is rsync?

The rsync utility is a file-transfer and synchronization program widely available for Linux® and UNIX® and even ported to Windows®. Its key feature is a very fast algorithm that sends only file differences over the data link, thus minimizing the total data flow between machines. (If you use the File Transfer Protocol [FTP] or utilities such as rcp or scp, complete files will be sent, even if just one byte has changed.) Of course, rsync isn't limited to existing files: It can also deal with files and directories that might be present only at one end of the link. Finally, communications can be optimized further by compressing data, so you can use the tool even without a broadband connection.

The status of rsync

The rsync utility was originally developed by Andrew Tridgell of Samba fame. The software is available under the GNU General Public License (GPL), and its current version is 3.0.5, dated December 2008.

Getting and installing rsync

You can get precompiled binary packages for most current Linux distributions, and this is the first thing you should check. I use Smart for package management with OpenSUSE, and sudo smart install rsync was all I needed to install rsync's latest version. If you are a die-hard compile-everything fan, you can get the source code (see Resources for a link) and install it with the classic configure, make, make install method; check the included README file for detailed instructions.

For secure communications, you will want to have Secure Shell (ssh) installed. (You could use remote shell [rsh], but it's nowhere near as secure.) OpenSSH, a free implementation of ssh, is available with virtually all distributions. You will also need to open the ssh port (22 by default) in your firewall so that your computers can connect with each other. All this configuration is standard: Check Resources for links to more information.

Running rsync as a daemon

There are two ways of running rsync: as a daemon or on demand. If you are only going to synchronize two computers, which method you choose won't make any noticeable difference. Running rsync as a daemon (by using the command rsync --daemon) helps only on a server against which several different users synchronize their own computers. Check man rsyncd.conf for the options you might want to specify, but note that for this specific usage (synchronizing a laptop and a desktop computer) a daemon is overkill and a needless complication.

Using rsync

So, let's start using rsync and directly synchronize your laptop with a remote server. To do so, you use code similar to that shown in Listing 1. You can also synchronize your remote server to your laptop (files will be sent over from the server to your laptop) or even two local directories, but not two remote servers.

Listing 1. Two versions of the same complete rsync command
rsync --compress --recursive --delete --links \
--times --perms --owner --group \
--verbose --progress --stats \
--rsh="ssh" \
--exclude "*bak" --exclude "*~" \
/my/path/at/the/laptop/* myserver:/some/path/at/the/server

rsync -zrltpogve "ssh" --progress --stats --delete \
--exclude "*bak" --exclude "*~" \
/my/path/at/the/laptop/* myserver:/some/path/at/the/server
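
Listing 1 pushes files from the laptop to the server. The other two directions mentioned earlier (server to laptop, and local to local) could be sketched as follows. This is only an illustration: the function names, the /some/local/backup path, and the RSYNC override variable are hypothetical, and the source paths use a trailing slash (one of the documented ways of saying "the contents of this directory").

```shell
# Hypothetical override: set RSYNC="echo rsync" to print each command
# instead of running it.
RSYNC=${RSYNC:-rsync}

# Laptop to server, the same direction as Listing 1.
push_to_server() {
    $RSYNC -az --rsh=ssh /my/path/at/the/laptop/ myserver:/some/path/at/the/server
}

# Server to laptop: swap the source and the destination.
pull_from_server() {
    $RSYNC -az --rsh=ssh myserver:/some/path/at/the/server/ /my/path/at/the/laptop
}

# Two local directories: no host name and no remote shell are needed.
sync_local() {
    $RSYNC -a /my/path/at/the/laptop/ /some/local/backup
}
```

Nothing runs until you call one of the functions, for example push_to_server.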

Unison: rsync alternative

For UNIX and Windows users, Unison is a valid alternative to rsync. It can synchronize any two machines, even ones with different operating systems, and uses an algorithm similar to rsync's for optimized communications. Though active development has largely stopped, the original developers still use and maintain the program. The current stable version is 2.27.57, but 2.31.4 (quite a jump!) is in beta. Because the program has been around since 1998, most distributions provide it in their repositories; in my case, sudo smart install unison was enough to get it installed. Note that you need the very same version of the program on all the machines you want to synchronize; Unison won't work if different versions are used. (That could be a hindrance if you synchronize several machines, but for simple laptop/desktop usage it should be no problem.) You can define profiles to save common, frequently used configurations.

Unison provides both a command-line and a graphic interface. To learn more about the program, run unison -doc tutorial; for more help, try unison -doc topics. There are about 70 command-line options; unison -help will get you started. The (somewhat spartan) user interface allows you to pick the source and destination directories. Unison works bi-directionally and can send or receive files as needed, so both machines end up with the same files. For changed files (same name, different content), you will have to decide which one to keep. An important detail: Unison doesn't actually compare file contents to detect changes (rsync can, with its --checksum option) but just looks at the file's inode number and modification time; if neither differs since the previous run, it concludes that the file hasn't been changed. (The information is kept in a hidden .unison directory.) Though this approach might report "false updates," it won't miss actual changes, and if you want to force a file to be copied, you can use touch to change its timestamp. Finally, you can use ssh for a safe connection or a plain socket connection for a simpler, but less secure, setup.

You might do well to study Unison and consider it as a valid alternative to rsync. Features and concepts are similar enough that you can feel at home with either of them. Though it might be said that rsync is more prevalent (and easier to use across different machines), for a simple setup of your own, Unison can be simple to install, configure, and use on a regular basis, so give it a go.

Note that the order of the options in Listing 1 is arbitrary, and most have a shorter version. First, --compress (alternative: -z) specifies that data will be compressed, saving bandwidth in the process. You should always include this option. (It can be argued that over a very high-speed data link, you might do without compression, but for most remote connection links, compression will help.) A complementary option, --compress-level=level can be used to specify different levels of compression; however, the standard compression level is typically acceptable.

The --recursive (-r) option makes rsync copy all directories recursively. All files within a directory, including possibly other directories and their own contents, will be copied. If you don't need this functionality, the --dirs option (-d) provides the opposite effect: rsync transfers the directories it encounters without recursing into them, so the contents of subdirectories are skipped.

By default, rsync copies needed files to the destination computer but won't delete extra files there. The --delete option removes those extra files, so the destination directory ends up exactly like the original one. Be careful, though: If you ever happen to sync an empty directory with a remote directory, you will delete everything in that remote directory!
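
Because --delete is destructive, it can help to preview a run first. The sketch below assumes rsync's --dry-run option (also -n), which reports what would be transferred or deleted without changing anything; the function names and the RSYNC override variable are hypothetical.

```shell
# Hypothetical override: set RSYNC="echo rsync" to print each command
# instead of running it.
RSYNC=${RSYNC:-rsync}

# Preview only: list what would be copied or deleted, but change nothing.
preview_sync() {
    $RSYNC -az --delete --dry-run --verbose "$1" "$2"
}

# The real run: same source and destination, without --dry-run.
run_sync() {
    $RSYNC -az --delete "$1" "$2"
}
```

A cautious routine would be to call preview_sync with your source and destination, read the list of deletions, and only then call run_sync with the same two arguments.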

If there are symlinks in your original directory, the --links option (also -l) recreates those symlinks in the destination directory. As an alternative, --copy-links or -L copies the item the symlink points to instead of the symlink itself. If you have symlinks that point outside the copied tree (a safety risk), you can use --copy-unsafe-links instead. The --safe-links option provides a safer method, ignoring such links.

The next four options—--times, --perms, --owner, and --group or -tpog—make rsync keep the original update timestamp, permission, owner, and group details, respectively. An easier way to specify all these options is by using --archive or -a, which also sets the --recursive and --links options.

The three following options (--verbose, --progress, and --stats) provide lots of information as to what rsync is doing. If you are not interested, just skip them, and rsync will be quiet unless an error pops up.

Although current rsync versions default to using ssh, the --rsh (or -e) option lets you force its usage. If you require extra parameters for ssh (say, if you have set up ssh to use a non-standard port), you can add them, as in --rsh "ssh -p 12345".
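
If the ssh port varies between servers, the extra parameter can be wrapped in a small helper. This is a minimal sketch: the function name and the RSYNC override variable are hypothetical, and 12345 is just the example port used in the text.

```shell
# Hypothetical override: set RSYNC="echo rsync" to print the command
# instead of running it.
RSYNC=${RSYNC:-rsync}

# Usage: sync_over_port PORT SOURCE DESTINATION
sync_over_port() {
    port=$1
    shift
    # The quotes keep "ssh -p PORT" together as one value for --rsh.
    $RSYNC -az --rsh="ssh -p $port" "$@"
}
```

For example, sync_over_port 12345 /my/path/at/the/laptop/ myserver:/some/path/at/the/server.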

The --exclude option (and its sibling, --include) lets you be more selective as to which files to synchronize. In this example, I excluded common backup files. Exclude and include files as desired to optimize what's sent over.
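
As an illustration of combining the two options, the sketch below copies only *.txt files. It relies on rsync checking include and exclude rules in the order given and applying the first one that matches; the function name and the RSYNC override variable are hypothetical.

```shell
# Hypothetical override: set RSYNC="echo rsync" to print the command
# instead of running it.
RSYNC=${RSYNC:-rsync}

# Usage: sync_text_only SOURCE DESTINATION
sync_text_only() {
    # "*/" keeps every directory in play so that rsync can descend into it,
    # "*.txt" keeps the wanted files, and the final "*" drops everything else.
    $RSYNC -az \
        --include "*/" \
        --include "*.txt" \
        --exclude "*" \
        "$1" "$2"
}
```

The quotes around the patterns matter: they stop the shell from expanding the wildcards before rsync sees them.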

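Finally, specify both the source and the destination paths, and you are done! Pay attention to how the source path ends, or the result might not be as desired: some/path copies the directory itself into the destination, whereas some/path/ and some/path/* copy its contents. The listings here use /*, but keep in mind that the shell (not rsync) expands the wildcard, so hidden files at the top level are not matched, and --delete does not remove extra files at the top level of the destination. If either point matters to you, use the trailing-slash form (some/path/) instead; the documentation explains the details.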
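
One difference between some/path/ and some/path/* can be checked without running rsync at all, because the shell, not rsync, expands the wildcard. The sketch below builds a throwaway directory (the file names are made up for the demonstration) and counts what /* matches under the shell's default settings.

```shell
# Build a throwaway directory with one visible and one hidden file.
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"

# What the shell hands to a command when the source is written as dir/*.
set -- "$demo_dir"/*
glob_count=$#

# What is really in the directory (ls -A includes dotfiles, omits . and ..).
real_count=$(ls -A "$demo_dir" | wc -l | tr -d ' ')

echo "matched by /*: $glob_count, actually present: $real_count"
# prints: matched by /*: 1, actually present: 2

rm -rf "$demo_dir"
```

With a source written as dir/*, rsync would never be told about .hidden; with dir/ it scans the directory itself and finds both files.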

You can shorten the command from Listing 1 by using the -a option (--archive), as shown in Listing 2. (For purists, the -a option can copy some extra elements, as the documentation explains, but only if you are running rsync as root on the server, which isn't a secure thing to do anyway.) There are far more options; check rsync --help and man rsync for a complete list.

Listing 2. A shorter, more silent version of the same command
rsync -zae "ssh" --delete --exclude "*bak" --exclude "*~" \
/my/path/at/the/laptop/* myserver:/some/path/at/the/server

Graphic alternatives

Using rsync without a password

If you use rsync as shown, you will have to enter your password for the remote machine every time. Doing so can certainly be a bother for common usage, but it's a show-stopper if you want to run rsync from a cron procedure, for example.

To allow password-less rsync sessions, you have to set up a public-private key pair. First, on your laptop, run ssh-keygen -t rsa. When asked for a passphrase, just leave it empty. This command creates two files in the (hidden) .ssh directory under your home directory: id_rsa (the private key) and id_rsa.pub (the public key).

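Then, log on to your server, and at your home directory on that machine, run mkdir .ssh; chmod 0700 .ssh. Go back to your laptop and finish by copying the recently created id_rsa.pub file as authorized_keys2 to the new .ssh directory: for example, from your home directory, run scp .ssh/id_rsa.pub your_server:.ssh/authorized_keys2. (If the server already has that file, append the key to it instead of overwriting it. Also, many current OpenSSH configurations read authorized_keys rather than authorized_keys2, so use that name if the key is not picked up.)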

You are done! From now on, you will be able to connect through ssh to your server (and run scp or rsync) without entering a password.
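
The steps in this section could be collected into a script along the following lines. Treat it as a sketch under stated assumptions: the function names and the 02:30 schedule are hypothetical, your_server stands for your own host, and the key is appended to authorized_keys2 (the file name used above) rather than copied over it.

```shell
# Usage: setup_passwordless_ssh your_server
# Nothing runs until you call the function.
setup_passwordless_ssh() {
    server=$1
    keys_file=.ssh/authorized_keys2   # name used in this article; many
                                      # current setups read authorized_keys

    # 1. On the laptop: create the key pair with an empty passphrase.
    mkdir -p "$HOME/.ssh" && chmod 0700 "$HOME/.ssh"
    ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"

    # 2. On the server: create .ssh with tight permissions
    #    (you are still asked for your password here).
    ssh "$server" 'mkdir -p .ssh && chmod 0700 .ssh'

    # 3. Append the public key on the server instead of overwriting the file.
    ssh "$server" "cat >> $keys_file" < "$HOME/.ssh/id_rsa.pub"
}

# Print a crontab line that runs the sync every night at 02:30.
# Usage: cron_entry SOURCE DESTINATION
cron_entry() {
    printf '30 2 * * * rsync -az --rsh=ssh %s %s\n' "$1" "$2"
}
```

After running setup_passwordless_ssh once, you could paste the output of cron_entry into your crontab (crontab -e) to schedule the unattended run.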

If you'd rather use a graphical user interface (GUI) instead of the command line, there are several possibilities. However, you should keep in mind that there's no "perfect alternative" and that you should do some thorough testing before committing to any program in particular. Some programs are in development (though they look interesting enough to include them in this review), and some are more advanced in their capabilities. (There also are some duds, which I include as a warning!)

GAdmin-Rsync

GAdmin-Rsync (shown in Figure 1) is part of the Gadmintools package, a set of GPL-licensed GUI tools for Linux systems administration. Its latest version is a surprisingly low 0.1.1 from January 2009, while the previous version was 0.1.0 from June 2008. Installation is quite simple: If you don't find a distribution-ready package, it's just a matter of downloading the source code and running a simple ./Autoinstall procedure.

Figure 1. Despite its low version number, GAdmin-Rsync promises good functionality, though its interface needs more development.

A small surprise was that the program requires the root password. Call me safety conscious, but I certainly don't like working as root unless I have to; mistakes are usually costlier for the root user!

The first time you use this tool, it asks for details about the backup you want to run. GAdmin-Rsync allows you to define several backups, so it's easier to re-run them. You need to specify the kind of backup (local to local, local to remote, or remote to local) and the appropriate directories and server data. But be careful here: I didn't find a way to edit the server parameters, so fixing them would require creating a new backup—not too user friendly. I also met another problem: The program wouldn't accept a password-less connection.

There are not many frills in GAdmin-Rsync. For example, you cannot just do a "dry run." In contrast, there's an easy way to specify cron jobs to be run at later times. Probably, this functionality reflects the "root-oriented" idea of the program: It's not for casual users but for systems administrators. (The Help feature agrees with this: It just says "Howto backup using GAdmin-Rsync: Visit http://www.gadmintools.org", which is just a notch above an "RTFM" comment!) How much you will like this program depends on your systems administrator bent, but it can be useful.

Grsync

Grsync (shown in Figure 2) is a GTK-based GUI for rsync, but it isn't limited to Gnome. Its latest version is 0.6.2, dated December 2008, which means that the program is still supported and in development. Among its most interesting features are:

  • Saving your settings as "sessions" so that you can easily re-run a backup procedure.
  • Allowing a "simulation" (dry run) before actually committing to the backup.
  • Executing extra commands before and after the backup job.
  • Including a command-line version, grsync-batch, that lets you run Grsync sessions from a cron-scheduled run, for example.

Figure 2. Grsync doesn't offer too many of the options of the underlying rsync command but is quite usable and stable.

At the home page (see Resources for a link), you will find only source code, which you can compile on your own if you have GTK and Autotools. However, you can find ready-made binaries for many distributions, including OpenSUSE, Mandriva, Red Hat (and Fedora and CentOS, as well), and more. Grsync is just a front end, so it doesn't include rsync: You will have to install that on your own first.

Not all rsync features are available, but for most users, the included options will be enough. If you need something, click the Advanced Options tab, and you will be able to add any option you require. Be careful with the syntax, though: If you make a mistake, Grsync won't complain, but rsync will, and you will get its error message when you try to execute the backup. Other than that, the package is quite usable and stable—probably the best of all the GUIs I reviewed.

QSync and TKsync

QSync is a Qt-based interface, but its development seems to have stopped at version 0.3, from December 2005. I won't recommend this tool: It requires its own rsync version, so it won't use your specific, up-to-date rsync package but rather the internal (certainly old) version of the command. I downloaded an OpenSUSE package, but it wouldn't run, and frankly, it didn't seem worthwhile trying a custom build for a seemingly abandoned package. The author himself admitted (in 2003) that "The syncing portion of QSync isn't quite right yet," and because there have been no updates since then, it stands to reason that this problem hasn't been solved.

Running a Google search for rsync GUIs might lead you to another project—TKsync—whose latest version (0.2.1) was released in 2004. Searching, however, failed to get the (apparently deleted) project page. So, it's fairly safe to call this project dead. If you happen to find an installation package, you'd probably be better off ignoring it.

Zynk

Even with Zynk being (obviously) at the beginning of its development cycle, the program looks promising enough to mention. Also, you might find versions of it for several distributions, and you should be aware of its (current) limitations. Finally, note that Zynk is a GTK+ application but can be run without Gnome; in particular, I ran my tests under the K Desktop Environment (KDE).

As to development status, Zynk is currently at version 0.0.2, dated February 2009, and the author himself warns, "There are hundreds of bugs at the moment! Only some parts of the software work as expected! USE AT YOUR OWN RISK!" On his estimate, the program is only about 10 percent done, though it looks like it is more complete than that, as Figure 3 shows.

Figure 3. Zynk is at the beginning of its development cycle but looks promising.

Zynk apparently provides for most (if not all) of the rsync options. (By the way, you need to have rsync installed beforehand.) At the bottom of the window, you can see the command that will be executed and its output.

Having run some tests, I must agree that the program needs more work. But unlike QSync, it seems like development is ongoing, so there's a reasonable chance that the program will actually become usable.

Conclusion

The rsync utility is a mandatory tool for your command-line work, and you need to learn how to use it for easy, safe, quick laptop-desktop synchronization. If a GUI is more your thing, Grsync seems the best option available today, as QSync is badly outdated and both GAdmin-Rsync and Zynk are at the beginning of their development cycles.

Resources

Learn

Get products and technologies

  • Read up on ssh, and install OpenSSH (a free version of ssh) for secure communications.
  • Download rsync and install it.
  • GAdmin-Rsync is a part of the Gadmintools package, a set of GUI tools for Linux systems administrators.
  • Grsync is still being updated and, though a bit limited, is quite usable.
  • Krsync is built upon Kommander, a visual scripting tool for Linux.
  • QSync is dated and probably abandoned; I'd suggest staying away from it.
  • Zynk looks interesting, though it's at an early stage and not ready yet.
  • Use Smart for package management.
