Configuring source control to manage Content Analytics Studio resources

Using a source control system allows multiple users to collaborate on the development of a UIMA pipeline and its associated linguistic resources.

Before you begin

If you installed Content Analytics Studio in the Program Files or Program Files (x86) directory of your computer, which is the default behavior of the installation program, you must run Content Analytics Studio as an administrator when you install the source control client. Right-click the Content Analytics Studio icon on your desktop and click Run as administrator.

About this task

The following source control systems were tested with Content Analytics Studio:
  • Subclipse 1.6.23 client from Tigris
  • Rational Team Concert™ 4.0.3

Using source control in Content Analytics Studio is similar to using source control in other Eclipse projects. The only difference is how databases are handled. The Content Analytics Studio databases store the various linguistic resources, such as dictionary entries and parsing rules. Because the databases are stored in multiple binary files, changes from different users cannot be directly compared and merged. Therefore, Content Analytics Studio maintains a copy of the data in CSV files.

When you modify Content Analytics Studio resources, the corresponding database is updated. After you close the database, any changes to the data are written to a CSV file. When you synchronize your project with the source control repository, any updates that are made by you or other users are shown as changes to the CSV file. Because CSV files are plain text, you can compare the changes and merge changes if necessary. When you update your project with changes from the source control repository, Content Analytics Studio detects whether any CSV files were changed and updates the database accordingly.
Important: Close all databases before you synchronize your project. If a database is open when you synchronize your project and changes need to be applied to that database, you receive the error "The database cannot be restored from the CSV data because it is open." To resolve this problem, close the database, right-click the database, and click Restore from CSV.
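
Because the CSV copies are plain text, any line-based comparison tool can show what changed between two versions. The following Python sketch illustrates the idea with the standard difflib module. The file name dictionary.csv and the columns surface_form and lemma are hypothetical and chosen only for illustration; the actual layout of the CSV files that Content Analytics Studio writes is not described here.

```python
import difflib
from typing import List


def diff_csv(old_text: str, new_text: str, name: str = "dictionary.csv") -> List[str]:
    """Return the unified-diff lines between two versions of a CSV file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=f"a/{name}", tofile=f"b/{name}"
        )
    )


# Hypothetical CSV contents: the column names are assumptions, not the
# real export format used by Content Analytics Studio.
base = "surface_form,lemma\ncar,car\ncars,car\n"
theirs = "surface_form,lemma\ncar,car\ncars,car\ntruck,truck\n"

if __name__ == "__main__":
    # Print the change that another user made to the CSV copy.
    print("".join(diff_csv(base, theirs)))
```

In this sketch, the row that another user added appears as a single line prefixed with a plus sign, which is the kind of change that a source control client can display and merge. The same comparison is not possible on the binary database files.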

Procedure

To configure source control in Content Analytics Studio:

  1. Click Window > Preferences. In the Preferences window, click General > Capabilities and select the Software Updates and Team check boxes.
    Tip: You can disable the CVS client by clicking Advanced and clearing the Team > CVS Support check box.
  2. For Rational Team Concert only: Download the Rational Team Concert client for Eclipse and extract the contents of the package into a temporary directory.
  3. Install the Eclipse client for your source control system. Click Help > Install New Software and click Add to specify the web address http://download.eclipse.org/releases/kepler/, which is the location of the software repository from which Eclipse obtains dependency features.
  4. Click Add again to specify the location of the installation package and the components to install.
    For your source control system, specify the following installation package location and components to install:
      • Subclipse: Specify the web address http://subclipse.tigris.org/update_1.6.x. Select all the components.
      • Rational® Team Concert: Specify the path to the directory in which you extracted the installation package, such as file:/C:/Install/RTC/403. Select Rational Team Concert Client (extend an Eclipse installation).
  5. Configure the source control client to connect to the source control repository. For instructions, see the documentation for your source control client.