Version control is to programmers what the safety net is to a trapeze artist. Knowing the net is there to catch them if they fall, aerialists are free to fly. In the same way, version control enables you to take programming risks that you would never otherwise consider. If something goes wrong, you can always revert to a known good version of your code. You can experiment in a branch off the main trunk without interfering with other team members. When bugs are discovered in an older version of a shipped product, you can easily check out that specific version to confirm, fix, and generate a patch for the bug. Without version control, you would have to be much more cautious, move more slowly, and generally be less productive.
Subversion is a new open source version control system that supports database and file-system repositories, which can be accessed locally or over the network. As well as providing the usual diff, patch, tag, commit, revert, and branch functionality, Subversion adds the ability to track moves and deletes. Furthermore, it supports non-ASCII text and binary data, all of which makes it very useful not just for traditional programming tasks but also for Web development, book authoring, and other domains that have not traditionally adopted version control.
This article demonstrates the basics of using Subversion to track your projects so that you can take more risks and have more fun writing code.
I started out with the granddaddy of version control systems, sccs, while working as a grad student at the National Solar Observatory. Numerous version control systems are available today, and sccs has long since been superseded by more capable products such as Visual SourceSafe, BitKeeper, Perforce, and the open source CVS (see Resources).
Among open source programmers, CVS has become the de facto standard. Free CVS hosting at sites like Codehaus, SourceForge, Savannah, and the Java™ Community's java.net has made establishing a repository for an open source project relatively straightforward. A large third-party marketplace of add-ons has grown up around CVS with tools like TortoiseCVS, ViewCVS, and FishEye.
Compared to other version control systems, CVS is most notable for its non-locking repository, which allows multiple developers to check out the same file at the same time. CVS resolves conflicts at commit time, which prevents it from becoming a bottleneck to progress. The second notable feature of CVS is that it's a network repository. Programmers on many different systems can access the same repository over the public Internet.
While CVS has served the community well over the last 10 years, it is showing its age. First of all, it can really only handle ASCII files. Unicode text confuses it badly. Furthermore, CVS repositories are very hard to change. CVS has no concept of a "move" operation. It can only note that a file has been deleted in one place and another file created in a new location. Since it doesn't connect the two operations, it can easily lose track of the history of a file. When you set up a CVS repository, you have to be very careful about choosing the exact location of each file because from that point on you're pretty much stuck with it.
It has gradually become apparent that CVS is no longer adequate for modern development. In particular, while CVS serves the ASCII needs of old-fashioned C programmers, it really doesn't work for Web developers and other nontraditional users. When you start thinking about storing an entire Web site in CVS, moving files from one directory to another becomes critical. Thus a few years ago, many of the core CVS developers decided it was time to create a next-generation source code repository from scratch, using the lessons they had learned while working on and with CVS over the years. In early 2004, their labor bore fruit with Subversion 1.0.
Programmers (especially those who rely on version control) are a cautious bunch, and Subversion has been slow to catch on. Few programmers wanted to be on the bleeding edge, even if they were already bleeding a little from the sharp edges of CVS. Even after Subversion became reliable, it took a couple of years for all the third-party editors, IDEs, and documentation to catch up. Still, Subversion has continued to improve, and third-party tools such as BBEdit and Eclipse now have adequate-to-good Subversion support. Increasingly, new projects are choosing Subversion for their version control needs, and old projects are migrating. Most recently, the Apache Software Foundation has migrated to Subversion. Projects that have already made the shift include the Xerces XML parser, the Apache HTTP Server, and SpamAssassin.
You can use Subversion from the command line, but it's much more convenient if it's integrated with your IDE. IntelliJ IDEA 5.0 and later include built-in support for Subversion. NetBeans doesn't yet support Subversion, but work is under way to add support in future versions. For Eclipse, you'll need to install the Subclipse plug-in. I use Eclipse as my IDE in this article.
Subclipse installs like any other Eclipse plug-in:
- In Eclipse, go to Help | Software Updates | Find and Install....
- In the first panel of the wizard, click the Search for new features to install radio button and click Next.
- In the second panel of the wizard, click New Remote Site. Enter Subclipse for the name and http://subclipse.tigris.org/update_1.0.x for the URL. Then click Finish. This launches another small wizard. The next dialog should offer you one feature to install, Subclipse. Check it and click Next.
- Accept the license agreement and complete the installation.
The package is not currently digitally signed, but as long as you're downloading from the URL given in the third step, it should be safe.
After the package is installed, you must restart Eclipse before you can use Subversion. Once you've done this, you can use Subversion pretty much like you use CVS today, though there are a few differences.
Setting up a new Subversion repository, especially a network repository, is relatively complex. However, if you're joining an existing project, checking out the files the first time is easy. Go to the File menu and select New/Other.... These steps bring up the dialog shown in Figure 1:
Figure 1. Starting a new project from Subversion
Next, open the SVN folder, select Checkout Projects from SVN, and click Next to display the dialog shown in Figure 2:
Figure 2. Select/Create Location
The first time you check out from a repository, you need to select Create a new repository location and press Next. This brings you to the dialog in Figure 3:
Figure 3. Type a URL for the repository
Here you provide the URL of your repository. This is just an ordinary HTTP URL such as http://svn.apache.org/repos/asf/xerces/java/. When you click Next this time, Eclipse connects to the repository and looks for folders you can check out. Usually there are three: branches, tags, and trunk, as shown in Figure 4. Branches are for experiments. Tags normally identify older, already released versions of the software. Most of the time, however, you'll want to work on the main branch (what CVS calls the HEAD), so select trunk and click Next.
Figure 4. Choosing the revision to check out
You now have two options: Check out as a Project configured using the New Project Wizard or Check out as a project in the workspace, as shown in Figure 5. You can use whichever you prefer. You probably also want to give the project a name because the default is "trunk." Finally, click Finish.
Figure 5. Two ways to check out a project
Eclipse now downloads all the source files from the branch, trunk, or tag you've selected. If you've selected Check out as a Project configured using the New Project Wizard, you have to go through Eclipse's New Project wizard to set up the compiler level, project layout, and other options. If you did not use the New Project wizard, you need to set up the build path and other options manually, just as if you had created a project from a directory on your file system. Indeed, that's pretty much what you've done. All files are stored locally. For normal operations like building, running, and debugging, Eclipse doesn't care whether or not the files were checked out of version control.
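If you prefer the command line, the wizard's checkout step corresponds roughly to a single svn checkout command. The following is a minimal Python sketch, not part of Subclipse, that only assembles the argument list without running it. The helper name checkout_command and the project name xerces-java are hypothetical; the repository URL is the Xerces example used above.

```python
def checkout_command(repo_url, line="trunk", project_name=None):
    """Build (but do not run) the argument list for `svn checkout`.

    `line` is the folder inside the repository to check out,
    for example "trunk" or a path under "branches" or "tags".
    """
    url = repo_url.rstrip("/") + "/" + line.strip("/")
    args = ["svn", "checkout", url]
    if project_name:
        # Without a name, svn uses the last URL segment ("trunk").
        args.append(project_name)
    return args


if __name__ == "__main__":
    command = checkout_command(
        "http://svn.apache.org/repos/asf/xerces/java/",
        line="trunk",
        project_name="xerces-java",  # hypothetical local folder name
    )
    print(" ".join(command))
```

To perform the checkout for real, you could pass the returned list to subprocess.run on a machine where the svn client is installed.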
At this point, it's a good idea to do a quick sanity check to make sure you've set up the build path properly. If there are no obvious problems and you can run the unit tests, you're good to go.
If there is a problem, check the project properties to make sure the source path and classpath are set properly. It's not uncommon to have them off by one, either up or down, so you end up with classes that Eclipse thinks are named something like apache.xerces.parsers.SAXParser instead of org.apache.xerces.parsers.SAXParser. In multifolder projects, it's also common for Eclipse to mistakenly tag a data folder as a source folder or omit a source folder. If any of these glitches have occurred, you need to fix them before continuing.
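To see why a source folder that is off by one level produces the wrong class names, consider how a fully qualified name is derived from a file's path relative to the source root. The sketch below is only an illustration of that rule, not Eclipse's actual logic; the function name class_name and the directory layout xerces/src/... are assumptions made for the example.

```python
from pathlib import PurePosixPath


def class_name(source_root, java_file):
    """Derive a fully qualified class name from a path under a source root."""
    relative = PurePosixPath(java_file).relative_to(source_root)
    # Drop the ".java" suffix and join the remaining path parts with dots.
    return ".".join(relative.with_suffix("").parts)


# Hypothetical project layout, used only for illustration.
java_file = "xerces/src/org/apache/xerces/parsers/SAXParser.java"

print(class_name("xerces/src", java_file))      # correct source root
print(class_name("xerces/src/org", java_file))  # one level too deep
print(class_name("xerces", java_file))          # one level too shallow
```

Only the first call yields the name the compiler expects; the other two show the "off by one, either up or down" symptom.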
To check for errors, select Project | Properties and then find Java Build Path. The Source tab is where you can fix anything the wizard has done wrong. You may also need to add extra JAR archives that the project requires but that Eclipse didn't notice at checkout. You also do this under Java Build Path, but in the Libraries tab.
Expect to spend a little time on this. Eclipse rarely gets everything right the first time, and each project organizes its files and libraries at least a little differently.
You can now edit the files as usual. Make any changes you like. Run the unit tests. Optimize the code. Correct spelling errors in the comments. When you've finished some piece of work, use the context menu and select Team/Commit.... You are presented with the dialog shown in Figure 6 and asked to enter a commit comment:
Figure 6. Subclipse commit dialog
Likewise, if someone else has made changes in the repository that you want to apply to your copy, just select the file in the Package Explorer and choose Team/Update from the context menu. If you haven't changed the file yourself, this simply replaces your copy with the latest version from the repository.
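The Team/Update and Team/Commit... menu items correspond roughly to the svn update and svn commit commands. As a minimal sketch, the Python helpers below only build those command lines; the function names and the src path are hypothetical, and nothing is executed against a repository.

```python
import shlex


def update_command(paths=()):
    """Argument list for bringing the working copy up to date."""
    return ["svn", "update", *paths]


def commit_command(message, paths=()):
    """Argument list for committing local changes with a log message."""
    return ["svn", "commit", "-m", message, *paths]


if __name__ == "__main__":
    # Update first so the commit is made against the latest revision.
    for args in (
        update_command(),
        commit_command("Correct spelling errors in comments", ["src"]),
    ):
        print(shlex.join(args))  # shlex.join requires Python 3.8 or later
```

Updating before committing is a common habit because it surfaces other developers' changes before you publish your own.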
If both you and other developers have changed the same file, you may have to merge the files manually. For most simple changes, Subversion can figure out what to do without any human intervention. However, for large, complex, and conflicting changes, you may need to help it out by merging the changes manually.
Subclipse can help out here, but to be honest, I usually find it easier to simply make a scratch copy of my file in a separate window or tab, update my local copy from the repository so my changes are overwritten, and then re-enter them from the scratch copy. If the changes in the repository are fairly minor compared to what I've been doing, then I make the scratch copy from the repository and commit over that instead of updating. Then I reapply those changes. This sounds complicated, but it usually isn't, and it doesn't happen nearly as often as you'd expect. Even if you make a mistake and forget to reapply or misapply a change, there's always a complete history of all changes (including the ones you overwrote) to refer to. Nothing is ever completely lost.
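To make the idea of an automatic merge concrete, here is a deliberately simplified sketch of a three-way comparison. It is not Subversion's actual algorithm: it assumes every version has the same number of lines (edits in place only, no insertions or deletions), and the function name merge_lines is hypothetical. A line changed on only one side is accepted automatically; a line changed differently on both sides is reported as a conflict that a person must resolve.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge for versions with identical line counts.

    Returns (merged_lines, conflict_line_numbers).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this toy merge only handles in-place line edits")
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # both sides agree (changed identically or not at all)
            merged.append(m)
        elif m == b:      # only the repository copy changed this line
            merged.append(t)
        elif t == b:      # only my copy changed this line
            merged.append(m)
        else:             # both changed it differently: needs a human
            merged.append(m)  # keep my text as a placeholder
            conflicts.append(number)
    return merged, conflicts


base = ["int total = 0;", "int count = 0;", "return total;"]
mine = ["long total = 0;", "int count = 0;", "return total;"]
theirs = ["int total = 0;", "int count = 1;", "return total;"]

merged, conflicts = merge_lines(base, mine, theirs)
print(merged)
print(conflicts)
```

In this example the two edits touch different lines, so both are kept and the conflict list is empty. If both sides had edited the first line differently, line 1 would be reported as a conflict.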
If you aren't sure which files have changed, the Synchronize view shows you. You can also turn on label decorations to see which files have changed since the last commit or update. Go to Window | Preferences, select General/Appearance/Label Decorations, and check the SVN checkbox. (This really should be checked by default.)
If you don't have commit access to the repository you're working with, you need to make a patch and send it to the maintainers instead. Just select the files you want to compare and select Team/Create Patch... from the context menu. You can save the patch to a file or the clipboard as convenient. Then you can e-mail the patch to the maintainers or attach it to a bug report. The patch itself is in the same diff format that CVS uses.
Applying patches others send you is no more complex. Just select the file or project you want to patch. Select Team/Apply Patch... from the context menu and then choose the patch file using your platform's customary open file dialog box.
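For readers who have never looked inside a patch file, the sketch below generates a small one with Python's standard difflib module. It assumes the unified diff format, which is what the svn diff command produces by default; the file name Hello.java and its contents are made up for illustration, and a patch created by Subclipse will carry additional header lines.

```python
import difflib

# Hypothetical file contents before and after a one-line fix.
original = [
    "public class Hello {\n",
    "    // Prints a greating\n",
    "}\n",
]
modified = [
    "public class Hello {\n",
    "    // Prints a greeting\n",
    "}\n",
]

patch = "".join(
    difflib.unified_diff(
        original,
        modified,
        fromfile="Hello.java",
        tofile="Hello.java",
    )
)
print(patch)
```

Lines beginning with a minus sign are removed by the patch, lines beginning with a plus sign are added, and the unmarked lines are context that helps the patch tool locate the change.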
If you've made some changes and want to see how your copy differs from what's in the repository, just select Compare/Latest From Repository from the context menu. This works virtually the same as it does with CVS. Figure 7 shows the diff function at work:
Figure 7. Comparing two files in Subclipse
Deleting files is also easy. Just delete the file from Eclipse's Package Explorer and then commit the parent folder. Deleting directories is a little trickier. You can select a directory and delete it. All the files in the directory are immediately deleted. However, as soon as you do that, the directory itself and all its subdirectories pop right back up again. To really delete a folder, you need to select the "deleted" folder and commit it. The same applies when moving a file from one place to another.
After you've deleted a file or a folder, you can still get it back from the repository, even after you've committed the deletion. You never really lose anything permanently once it's in the repository, which is sometimes a problem. For instance, suppose you discover someone has accidentally checked in their entire home directory, including their Quicken data files and the complete archive of love letters from their significant other. You'd like to be able to obliterate the mistakenly committed files so no one can ever get them back. While this is an unusual operation (after all, the purpose of a version control system is to keep every revision of every file forever), sometimes it's necessary. Sadly, Subversion is missing this one critical feature.
Not having an obliterate command makes me nervous about using Subversion for externally visible repositories. CVS doesn't have an obliterate command either; but in CVS, it is possible to manually delete mistakenly committed files without damaging the repository.
For internal repositories, Subversion is a vast improvement over CVS. Once some kind of obliteration functionality is added, it should be suitable for external repositories as well. While third-party Subversion support in tools like Eclipse is not quite as widespread as support for CVS, this is changing rapidly. As a free, open-source option where a more powerful and mature option such as Rational® ClearCase is not available, Subversion provides a sound source code repository for new projects, as well as existing projects that don't have source code control. Existing projects that already have CVS repositories may wish to wait until all the tools they depend on fully support Subversion before switching.
- Rational ClearCase: IBM's software configuration management product for medium to large development teams. Download the trial version today.
- "Sharing code with the Eclipse Platform" (Pawel Leszek, developerWorks, March 2003): An overview of how the Eclipse platform handles source code
- "Create a blog from scratch with PHP and Subversion" (Tyler Anderson, developerWorks, February 2006): A tutorial introduction to Web development with PHP and Subversion.
- "Automate your team's build and unit-testing process" (Mark Wilkinson, developerWorks, October 2005): Create an automated system to build and test your source code using (among other things) CVS and Subversion.
- "Version Control using Subversion, 2nd Edition" (Mike Mason, The Pragmatic Programmers, May 2006): A complete introduction to using Subversion.
- "Version Control with Subversion" (Ben Collins-Sussman, Brian W. Fitzpatrick, C. Michael Pilato; O'Reilly Media): A free online book about Subversion.
- SourceForge: Offers free Subversion hosting for open source projects.
- The Java technology zone: Hundreds of articles about every aspect of Java
Get products and technologies
- Subversion: Source code and binaries for various platforms.
- Subclipse: The Eclipse Subversion plug-in.
- NetBeans Subversion project.
- TMate: The IntelliJ IDEA
- blogs: Get involved in the developerWorks community.
Elliotte Rusty Harold is originally from New Orleans, to which he returns periodically in search of a decent bowl of gumbo. However, he resides in the Prospect Heights neighborhood of Brooklyn with his wife Beth and cats Charm (named after the quark) and Marjorie (named after his mother-in-law). He's an adjunct professor of computer science at Polytechnic University, where he teaches Java and object-oriented programming. His Cafe au Lait Web site has become one of the most popular independent Java sites on the Internet, and his spin-off site, Cafe con Leche, has become one of the most popular XML sites. His books include Effective XML, Processing XML with Java, Java Network Programming, and Java I/O. He's currently working on the XOM API for processing XML, the Jaxen XPath engine, and the Jester test coverage tool.