Managing source code with Mercurial

A powerful, flexible system for managing project source code

Managing the source code for a software development project is only slightly less important than writing it in the first place. UNIX® and Linux® systems offer a rich selection of version control system (VCS) packages, each of which takes a slightly different approach to this common concern. This article focuses on the Mercurial source code management system, often referred to simply as hg after the name of its command. Mercurial provides a powerful, modern, and lightweight solution for source code control that makes it easy for developers to make and debug their changes to a software project while maintaining a stable, centralized source code repository that all project members can depend on.

William von Hagen, Systems Administrator, Writer, WordSmiths

William von Hagen has been a writer and UNIX systems administrator for more than 20 years and a Linux advocate since 1993. Bill is the author or co-author of books on subjects such as Ubuntu Linux, Xen Virtualization, the GNU Compiler Collection (GCC), SUSE Linux, Mac OS X, Linux file systems, and SGML. He has also written numerous articles for Linux and Mac OS X publications and Web sites. You can reach Bill at wvh@vonhagen.org.



12 April 2011


Source code management on UNIX and Linux systems

Identifying and tracking the changes made by multiple developers and merging them into a single, up-to-date codebase makes collaborative, multi-developer projects possible. VCS software, also referred to as revision control systems (RCS) or source code management (SCM) systems, enables multiple users to submit changes to the same files or project without one developer's changes accidentally overwriting another's.

Linux® and UNIX® systems are knee-deep in VCS software, ranging from dinosaurs such as the Revision Control System (RCS) and the Concurrent Versions System (CVS) to more modern systems such as Arch, Bazaar, Git, Subversion, and Mercurial. Like Git, Mercurial began life as an open source replacement for a commercial source code management system called BitKeeper, which was used to maintain and manage the source code for the Linux kernel. Since its inception, Mercurial has evolved into a popular VCS that is used by many open source and commercial projects. Projects using Mercurial include Mozilla, IcedTea, and the MoinMoin wiki. See Resources for links to these and many more examples.

VCS systems generally refer to each collection of source code in which changes can be made and tracked as a repository. How developers interact with a repository is the key difference between more traditional VCS systems such as CVS and Subversion, referred to as centralized VCS systems, and more flexible VCS systems such as Mercurial and Git, which are referred to as distributed VCS systems. Developers interact with centralized VCS systems using a client/server model, where changes to your local copy of the source code can only be pushed back to the central repository. Developers interact with distributed VCS systems using a peer-to-peer model, where any copy of the central repository is itself a repository to which changes can be committed and from which they can be shared with any other copy. Distributed VCS systems do not actually have the notion of a central, master repository, but one is almost always defined by policy so that a single repository exists for building, testing, and maintaining a master version of your software.

Why Mercurial?

Mercurial is a small, powerful distributed VCS system that is easy to get started with, while still providing the advanced commands that VCS power users may need (or want) to use. Mercurial's distributed nature makes it easy to work on projects locally, tracking and managing your changes via local commits and pushing those changes to remote repositories whenever necessary.

Among modern, distributed VCS systems, the closest VCS system to Mercurial is Git. Some differences between Mercurial and Git are the following:

  • Multiple, built-in undo operations: Mercurial's revert, backout, and rollback commands make it easy to return to previous versions of specific files or previous sets of committed changes. Git spreads the equivalent operations across commands such as revert, reset, and checkout, whose syntax is generally considered harder to learn.
  • Built-in web server: Mercurial provides a simple, integrated web server that makes it easy to host a repository quickly for others to pull from. Pushing requires either ignoring security or a more complex setup that supports Secure Sockets Layer (SSL).
  • History preservation during copy/move operations: Mercurial's copy and move commands both record the operation explicitly and preserve complete history information, while Git does not record copies or renames and instead infers them from file content when history is displayed.
  • Branches: Mercurial automatically shares all branches, while Git requires that each repository set up its own branches (either creating them locally or by mapping them to specific branches in a remote repository).
  • Global and local tags: Mercurial supports global tags that are shared between repositories, which make it easy to share information about specific points in code development without branching.
  • Native support on Windows platforms: Mercurial is written in Python, which is supported on Microsoft® Windows® systems. Mercurial is therefore available as a Windows executable (see Resources). Git on Windows is more complex—your choices are msysGit, using standard git under Cygwin, or using a web-based hosting system and repository.
  • Automatic repository packing: Git requires that you explicitly pack and garbage-collect its repositories, while Mercurial performs its equivalent operations automatically. However, Mercurial repositories tend to be larger than Git repositories for the same codebase.

Mercurial and Git fans are also happy to discuss the learning curve, merits, and usability of each VCS system's command set. Space prevents that discussion here, but a web search on that topic will provide lots of interesting reading material.

Creating and using Mercurial repositories

Mercurial provides two basic ways of creating a local repository for a project's source code: either by explicitly creating a repository or by cloning an existing, remote repository:

  • To create a local repository, use the hg init [REPO-NAME] command. Supplying the name of a repository when executing this command creates a directory for that repository in the specified location. Not supplying the name of a repository turns the current working directory into a repository. The latter is handy when creating a Mercurial repository for an existing codebase.
  • To clone an existing repository, use the hg clone REPO-NAME [LOCALNAME] command. Mercurial supports the Hypertext Transfer Protocol (HTTP) and Secure Shell (SSH) protocols for accessing remote repositories. Listing 1 shows an example hg command and the resulting output produced when cloning a repository via SSH.

    Listing 1. Cloning a Mercurial repository via SSH
    	$ hg clone ssh://codeserver//home/wvh/src/pop3check 
    	wvh@codeserver's password: 
    	destination directory: pop3check
    	requesting all changes
    	adding changesets
    	adding manifests
    	adding file changes
    	added 1 changesets with 12 changes to 12 files
    	updating to branch default
    	12 files updated, 0 files merged, 0 files removed, 0 files unresolved
    	remote: 1 changesets found

Note: To use the HTTP protocol to access Mercurial repositories, you must either start Mercurial's internal web server in that repository (hg serve -d) or use Mercurial's hgweb.cgi script to integrate Mercurial with an existing web server such as Apache. When cloning via HTTP, you will usually want to specify a name for your local repository.
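
As a rough sketch of the HTTP route described in this note, the following shell functions start the built-in web server and clone from it under a chosen local name. The function names, the hg-serve.pid file name, and the host and port in the usage line are illustrative assumptions; 8000 is the port that hg serve listens on by default.

```shell
# Serve the repository in the current directory over HTTP in the background.
# Usage: serve_repo [PORT]
# The PID file name is an arbitrary choice; it records which process to stop.
serve_repo() {
    hg serve -d -p "${1:-8000}" --pid-file hg-serve.pid
}

# Clone from a served repository into a named local directory.
# Usage: clone_http HOST PORT LOCALNAME
clone_http() {
    hg clone "http://$1:$2/" "$3"
}
```

For example, after serve_repo has been run inside the repository on codeserver, another user could run clone_http codeserver 8000 pop3check to obtain a local copy named pop3check.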

After you create or clone a repository and make that repository your working directory, you're ready to start working with the code that it contains, add new files, and so on.
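
For an existing codebase, the steps can be sketched as a single shell function that turns a source directory into a repository and records its current contents as the first changeset. The function name adopt_tree and the default commit message are assumptions made for illustration, and the commit step only succeeds once a username has been configured as described later in this article.

```shell
# Turn an existing source directory into a Mercurial repository in place.
# Usage: adopt_tree DIRECTORY [MESSAGE]
adopt_tree() {
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" &&
      hg init &&                            # no name given: use this directory
      hg add &&                             # no files given: track every new file
      hg commit -m "${2:-Initial import}" ) # record the first changeset
}
```

Each step runs only if the previous one succeeded, so a directory that does not exist, or that is already a repository, stops the sequence before anything is committed.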

Getting help in Mercurial

Mercurial's primary command is hg, which supports a set of sub-commands that are similar to those in other VCS systems. To see a list of the most common commands, execute the hg command with no arguments, which displays output similar to that shown in Listing 2.

Listing 2. Basic commands provided by Mercurial
    Mercurial Distributed SCM

    basic commands:

    add        add the specified files on the next commit
    annotate   show changeset information by line for each file
    clone      make a copy of an existing repository
    commit     commit the specified files or all outstanding changes
    diff       diff repository (or selected files)
    export     dump the header and diffs for one or more changesets
    forget     forget the specified files on the next commit
    init       create a new repository in the given directory
    log        show revision history of entire repository or files
    merge      merge working directory with another revision
    pull       pull changes from the specified source
    push       push changes to the specified destination
    remove     remove the specified files on the next commit
    serve      export the repository via HTTP
    status     show changed files in the working directory
    summary    summarize working directory state
    update     update working directory

    use "hg help" for the full list of commands or "hg -v" for details

This short list displays only basic Mercurial commands. To obtain a full list, execute the hg help command.

Tip: You can obtain detailed help on any Mercurial command by executing the hg help COMMAND command, replacing COMMAND with the name of any valid Mercurial command.

Checking repository status

Before checking in changes, you can use the hg status command to see any pending changes to the files in your repository. For example, after creating a new file and modifying an existing one, you see output like that shown in Listing 3.

Listing 3. Status output from Mercurial
    $ hg status
    M Makefile
    ? hgrc.example

In this case, Makefile is an existing file that has been modified (indicated by the letter M at the beginning of the line), while hgrc.example is a new file that isn't being tracked (indicated by the question mark ? at the beginning of the line).
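
The status letters can also be used as filters. The sketch below relies on the -m (modified), -a (added), -u (unknown), and -n (omit the status letter) options of hg status; the function names files_in_state and is_clean are hypothetical helpers, not Mercurial commands.

```shell
# List the names of the files in one state.
# Usage: files_in_state modified|added|unknown
files_in_state() {
    case "$1" in
        modified) hg status -n -m ;;
        added)    hg status -n -a ;;
        unknown)  hg status -n -u ;;
        *) echo "usage: files_in_state modified|added|unknown" >&2
           return 2 ;;
    esac
}

# Succeed only when hg status reports nothing at all.
is_clean() {
    [ -z "$(hg status)" ]
}
```

In the repository from Listing 3, files_in_state unknown would be expected to name hgrc.example, and is_clean would fail because changes are pending.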

Adding files to a repository

To add the hgrc.example file to the list of files that are being tracked in this repository, use the hg add command. Specifying one or more file names as arguments explicitly adds those files to the list of files that are being tracked by Mercurial. If you don't specify any files, all new files are added to the repository, as shown in Listing 4.

Listing 4. Adding a file to your repository
    $ hg add
    adding hgrc.example

Tip: To automatically add all new files and mark any files that have gone missing from the working directory for removal, you can use Mercurial's handy hg addremove command.

Checking the status of the repository shows that the new file has been added (indicated by the letter A at the beginning of the line), as shown in Listing 5.

Listing 5. Repository status after modifications
    $ hg status
    M Makefile
    A hgrc.example
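
The two ways of calling hg add, and the hg addremove shortcut from the tip, can be sketched as small shell functions that show the resulting status after each change. The function names track and track_and_prune are illustrative assumptions.

```shell
# Start tracking the named files, or every new file when none are named,
# then show the resulting status so the change can be reviewed.
# Usage: track [FILE...]
track() {
    hg add "$@" && hg status
}

# Track every new file and also schedule files that have disappeared
# from the working directory for removal.
track_and_prune() {
    hg addremove && hg status
}
```

Neither function changes the repository history; the additions and removals take effect at the next commit.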

Committing changes

Checking in changes is the most common operation in any VCS system. After making and testing your changes, you're ready to commit those changes to the local repository.

Before committing changes for the first time

If this is your first Mercurial project, you must provide some basic information so that Mercurial can identify the user who is committing those changes. If you do not do so, you'll see a message along the lines of abort: no username supplied... when you try to commit changes, and your changes will not be committed.

To add your user information, create a file called .hgrc in your home directory. This file is your personal Mercurial configuration file. You need to add at least the basic user information shown in Listing 6 to this file.

Listing 6. Mandatory information in a user's .hgrc file
    [ui]
    username = Firstname Lastname <user@domain.tld>

Replace Firstname and Lastname with your first and last names; replace user@domain.tld with your email address; save the modified file.

You can set default Mercurial configuration values that apply to all users (which should not include user-specific information) in the /etc/mercurial/hgrc file on Linux and UNIX systems and in the Mercurial.ini file on Microsoft Windows systems, where this file is located in the directory of the Mercurial installation.
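
One way to confirm that a username is in place before the first commit is to ask Mercurial for the configured value. This sketch assumes the hg showconfig command, which prints only the value when given a single section.name argument; the function name check_username is hypothetical, and the check covers configuration files only.

```shell
# Report whether a username is set in any Mercurial configuration file.
check_username() {
    # An unset value leaves the variable empty rather than aborting.
    name=$(hg showconfig ui.username 2>/dev/null) || name=""
    if [ -n "$name" ]; then
        echo "commits will be recorded as: $name"
    else
        echo "no username configured; add one to ~/.hgrc" >&2
        return 1
    fi
}
```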

The standard commit process

After creating or verifying your ~/.hgrc file, you can commit your changes using the hg commit command, identifying the specific files that you want to commit or committing all pending changes by not supplying an argument, as in the following example:

    $ hg commit
    Makefile
    hgrc.example
    committed changeset 1:3d7faeb12722

As shown in this example output, Mercurial refers to all changes that are associated with a single commit as a changeset.

When you commit changes, Mercurial starts your default editor to enable you to add a commit message. To avoid this, you can specify a commit message on the command line using the -m "Message.." option. To use a different editor, you can add an editor entry in the [ui] section of your ~/.hgrc file, following the editor keyword with the name of the editor that you want to use and any associated command-line options. For example, after adding an entry for using emacs in no-window mode as my default editor, my ~/.hgrc file looks like that shown in Listing 7.

Listing 7. Additional customization in a user's .hgrc file
    [ui]
    username = William von Hagen <wvh@vonhagen.org>
    editor = emacs -nw

Tip: To maximize the amount of information that Mercurial provides about its activities, you can add the verbose = True entry to the [ui] section of your Mercurial configuration file.
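
To illustrate the -m option described above, here is a minimal sketch of a helper that commits with a message supplied on the command line, optionally limited to specific files. The function name quick_commit is an assumption for this example.

```shell
# Commit with the message given on the command line, so no editor starts.
# Usage: quick_commit MESSAGE [FILE...]
quick_commit() {
    if [ -z "$1" ]; then
        echo "usage: quick_commit MESSAGE [FILE...]" >&2
        return 2
    fi
    message=$1
    shift
    # Any remaining arguments name the files to commit; none means all changes.
    hg commit -m "$message" "$@"
}
```

Calling quick_commit "Fix build" Makefile would commit only Makefile, while quick_commit "Fix build" would commit every pending change.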

Pushing changes to a remote repository

If you are using a clone of a remote repository, you will usually want to push your changes back to that repository after committing them to your local repository. To do so, use Mercurial's hg push command, as shown in Listing 8.

Listing 8. Pushing changes via SSH
    $ hg push
    wvh@codeserver's password: 
    pushing to ssh://codeserver//home/wvh/src/pop3check
    searching for changes
    1 changesets found
    remote: adding changesets
    remote: adding manifests
    remote: adding file changes
    remote: added 1 changesets with 2 changes to 2 files

Pulling changes from a remote repository

If you are using a clone of a remote repository and other users are also using that same repository, you will want to retrieve the changes that they have made and pushed to it. To do so, use Mercurial's hg pull command, as shown in Listing 9.

Listing 9. Pulling changes via SSH
    $ hg pull
    wvh@codeserver's password: 
    pulling from ssh://codeserver//home/wvh/src/pop3check
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 1 changesets with 0 changes to 0 files
    (run 'hg update' to get a working copy)
    remote: 1 changesets found

As shown in the output from this command, hg pull only brings the remote changesets into your local repository; it does not touch the files in your working directory. You must run the hg update command to apply those changes to your working directory. The hg update command reports how the working directory has been changed, as shown in Listing 10.

Listing 10. Updating your working directory to show changes
    $ hg update
    0 files updated, 0 files merged, 1 files removed, 0 files unresolved
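
The pull-then-update sequence can be sketched as one shell function. The function name fetch_then_update is an assumption; Mercurial also offers hg pull -u, which performs the update as part of the pull.

```shell
# Fetch new changesets and then apply them to the working directory.
# Usage: fetch_then_update [SOURCE]
fetch_then_update() {
    # hg update runs only if the pull succeeded.
    hg pull "$@" && hg update
}
```

Keeping the two steps separate leaves room to inspect what was pulled, for example with hg log, before the working directory changes.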

Undoing changes in Mercurial

Mercurial provides the following built-in commands that make it easy to undo committed changes:

  • hg backout CHANGESET: Creates a new changeset that reverses the effect of the specified changeset. Unless you specify the --merge option when executing this command, you have to merge that new changeset into your current revision before you can push it back to a remote repository.
  • hg revert: Returns files to their previous versions. Specify the names of one or more files to revert just those files, or specify the --all command-line option to revert all files.
  • hg rollback: Undoes the last Mercurial transaction, which is commonly a commit, pull from a remote repository, or a push to this repository. You can only undo a single transaction.

See the online help for all of these commands before attempting to use them!
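
As a hedged illustration of how the three commands above are typically invoked, the following shell functions wrap each one. The function names and the generated commit message are assumptions for this sketch, and the warning about reading the online help first applies to them as well.

```shell
# Discard uncommitted edits to the named files, or to every file when
# called without arguments.
# Usage: discard_edits [FILE...]
discard_edits() {
    if [ "$#" -gt 0 ]; then
        hg revert "$@"
    else
        hg revert --all
    fi
}

# Reverse an earlier changeset and merge the result in one step.
# Usage: undo_changeset CHANGESET
undo_changeset() {
    if [ -z "$1" ]; then
        echo "usage: undo_changeset CHANGESET" >&2
        return 2
    fi
    hg backout --merge -m "Back out changeset $1" "$1"
}

# Undo the most recent transaction (for example, the last commit).
undo_last_transaction() {
    hg rollback
}
```

Of the three, only undo_changeset adds to the repository history; discard_edits changes working files only, and undo_last_transaction removes the most recent transaction.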

Summary

Mercurial and other distributed source code management systems are the wave of the future. Mercurial is open source software, and pre-compiled versions of Mercurial are available for Linux, UNIX, Microsoft Windows, and Mac OS® X systems. This article highlighted how to use Mercurial to perform a number of common VCS tasks, showing how easy it is to get started using Mercurial. Beyond these basics, Mercurial provides many more commands and configuration options to help you manage your source code and customize your interaction with a Mercurial installation.

Resources

Get products and technologies

  • The Mercurial wiki's Download page provides links to compiled versions of Mercurial for all supported platforms.
  • TortoiseHg provides a shell extension and command-line Mercurial applications for Microsoft Windows systems.
  • The MercurialEclipse plug-in provides support for Mercurial within the Eclipse Integrated Development Environment.
  • Fog Creek Software's Kiln provides free trials and student/start-up versions of its online, Mercurial-based hosting service that is similar to online Git hosting services such as GitHub, repo.or.cz, and so on.
  • Bitbucket provides free hosting for open source projects in its online, Mercurial-based hosting service, as well as paid hosting plans for larger groups of developers.
  • Explore msysGit, which is Git for Windows.
