Over the years, we have faced increasing demand for a highly automated software build-and-release process within large organizations. IBM® Rational® Build Forge® makes it possible to achieve this across all of the IBM Rational Software Delivery Platform applications. There is an ongoing debate about whether it is a good idea to store binary files in IBM® Rational® ClearCase® implementations. For example, see the discussion titled "How bad is [it] to store deployment units in ClearCase?" on the CM Crossroads Web site (see Resources).
This article explains certain circumstances where storing binary files in ClearCase can make the build-and-release process smoother. It also discusses merge problems with binary files and practical solutions to managing them in ClearCase environments.
Types of binary files in ClearCase implementations
There are a few cases in which we store binary files in ClearCase.
The first category is the common third-party libraries that most applications use. A lot of them are open-source binary releases in a Java™ archive (JAR) file format, such as Jakarta_Logging (see Figure 1).
Figure 1. Commonly used third-party files
You simply maintain different versions of these files under different baselines in ClearCase. Because no parallel development is required, you don't need to merge these components. And because these releases are quite limited in number, there is no storage issue either.
Build results for common code
Figure 2 shows a typical example of when a merge is required for binary files. In our organization, the official integration build process uses a binary dependency if an application has dependencies on other applications. That way, we don't build all dependencies from scratch for each build. This method can dramatically reduce the total build time.
In this example, the ReCApp application builds its interface to other applications that use it, checks in the build result in a /bin directory inside of the same source component, and then establishes a single baseline for both the source and binary files.
Figure 2. A typical example of when a merge is required
Tip:
Actually, it's not a good approach to mix the text files and binary files in the same versioned object base (VOB), but it turns out to be the easiest way for us to match the source code and binary build result. After you have done this, you may get a merge conflict when you deliver this component from stream to stream.
The largest portion of our binary files is stored in our release management VOBs, as Figure 3 shows.
Figure 3. Binary files stored in release management VOBs
These files are final deployment units generated by Build Forge and will be installed into various test and production environments. No merge is required, but they use a huge amount of disk space.
Merge issues, reasons, and solutions
By default, ClearCase does not merge binary file types, such as .jar and .zip files. Therefore, you need to stop in the middle of a normal delivery or rebase operation and fix such a file if a merge is required. Some developers expect ClearCase V7 to do a copy-merge for binary files, but it does not.
Here is how ClearCase software handles binary files by default during a delivery, as tested in Version 7:
When new files are in the source stream only
If the file under testing exists only in the source stream, not in the target stream, the files will be copied from the source to the target after delivery. This is a trivial merge, so there is no concern. Figures 4 through 8 illustrate this scenario.
Figure 4. Binary files exist only in the development stream before delivery
Figure 5. Binary files are not in the integration stream before delivery
Figure 6. Delivery from the development stream to the integration stream
Figure 7. Trivial merge
Figure 8. A new version created in the integration stream
Files under testing are copied over from the development stream to the integration stream, as Figure 9 shows.
Figure 9. Binary files have been copied to the integration stream after delivery
When new versions are in both the source and target streams
If you change the binary files on both the source and the target, then a merge is required during the delivery. Notice that the castor-0.9.5.3-xml.jar file is changed in Figure 10, for example.
Figure 10. Change the castor-0.9.5.3-xml.jar file in the development stream
ClearCase can't merge it automatically, so you will see the message in Figure 11.
Figure 11. Error message: "The element below cannot be merged automatically"
You will get the error messages shown in Figures 12 and 13 if you choose the first option of changing both the source and target files.
Figure 12. Diff Merge error message
Figure 13. Deliver from Stream error message
The reason is that ClearCase, by default, does not merge binary files. You can find more details in the IBM Technote titled "Compare utilities cleardiff and cleartool diff fail to note differences between binary files" (see the link in Resources).
You can't complete the delivery and merge until you fix the files. One way to do this is by manually merging the files: Check out the target file, copy the source file to the target file, check in the target file, and then create a hyperlink.
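The manual merge described above can be sketched as a short command sequence. The following Python snippet only builds the command lines; it does not run them. The function name and the example paths are hypothetical, the copy step assumes a UNIX dynamic view (where a version-extended path can be read like a normal file), and using the predefined Merge hyperlink type with cleartool mkhlink is one possible way to record the merge arrow, not necessarily the one the figures show.

```python
def manual_copy_merge_steps(target, source_version, target_version):
    """Return the commands for a manual copy-merge of one binary element.

    target          -- path of the element in the target (integration) view
    source_version  -- version-extended path of the version to copy from
    target_version  -- version-extended path of the version created by checkin
    All three values are supplied by the caller; nothing is looked up here.
    """
    return [
        ["cleartool", "checkout", "-nc", target],          # 1. check out the target
        ["cp", source_version, target],                    # 2. copy source over target
        ["cleartool", "checkin", "-nc", target],           # 3. check in the target
        ["cleartool", "mkhlink", "Merge",                  # 4. record the merge arrow
         source_version, target_version],
    ]


# Hypothetical paths, for illustration only.
steps = manual_copy_merge_steps(
    "lib/castor-0.9.5.3-xml.jar",
    "lib/castor-0.9.5.3-xml.jar@@/main/dev/2",
    "lib/castor-0.9.5.3-xml.jar@@/main/int/3",
)
for step in steps:
    print(" ".join(step))
```

Review the generated commands before running any of them, because the branch names and version numbers depend entirely on your own streams.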
Technical reasons for the merge constraint
This section digs into technical details about why ClearCase does not automatically merge binary files as it does text files. The key reason is that ClearCase, by default, does not include a merge utility for these binary files.
The Magic file: The purpose of the Magic file is to correctly assign a type to an element, based on the contents of a given file. The Magic file selects the type for elements when you add a new file to ClearCase source control. The default Magic file is named default.magic, and it is located here:
- On the IBM® AIX® platform: /opt/rational/clearcase/config/magic
- In the Microsoft® Windows system: C:\Program Files\Rational\ClearCase\config\magic
From the Magic file you can see, for example, that .zip files are stored as the type called file and that the .jar file type is compressed_file:
zip_archive archive file : -name "*.[zZ][iI][pP]" ;
java_archive compressed_file : -name "*.jar" ;
This example assumes that the VOB does not contain element types of zip_archive, archive, or java_archive. Otherwise, the files are stored as the first existing element type.
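To make the "first existing element type" rule concrete, here is a minimal Python sketch that mimics the selection for the two rules quoted above. It is an illustration of the idea only, not ClearCase's actual implementation: the function name is hypothetical, only -name patterns are modeled, and the sketch simply returns None when a rule matches but none of its types exist in the VOB.

```python
from fnmatch import fnmatchcase

# The two default.magic rules quoted above, as
# (candidate element types in priority order, file name pattern).
RULES = [
    (["zip_archive", "archive", "file"], "*.[zZ][iI][pP]"),
    (["java_archive", "compressed_file"], "*.jar"),
]


def pick_element_type(filename, vob_types, rules=RULES):
    """Return the first candidate type of the first matching rule
    that is defined in the VOB, or None if nothing applies."""
    for candidates, pattern in rules:
        if fnmatchcase(filename, pattern):
            for element_type in candidates:
                if element_type in vob_types:
                    return element_type
            return None  # rule matched, but no candidate type exists
    return None  # no rule matched


# A VOB that defines neither zip_archive, archive, nor java_archive.
plain_vob = {"file", "compressed_file", "text_file"}
print(pick_element_type("release.ZIP", plain_vob))
print(pick_element_type("castor-0.9.5.3-xml.jar", plain_vob))
```

If the VOB did define an archive element type, the same .zip file would be stored as archive instead, because that name comes earlier in the rule.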
File element types: When you know the exact element type for your binary files, you can run the cleartool desc -l eltype:compressed_file command to find out the Type Manager for that element type. The default Type Manager for compressed_file is z_whole_copy, as shown in Figure 14.
Figure 14. Screen output showing default Type Managers
You can also verify the element type of a particular file from each element's property information. File element types for .jar, .zip, .gz, and .dll files in our environment are shown in Figures 15 through 18 as examples.
Figure 15. File element type for .jar files
Figure 16. File element type for .zip files
Figure 17. File element type for .gz files
Figure 18. File element type for .dll files
Map file: The tools that perform diff and merge commands for certain Type Managers are defined in the map file of each ClearCase client. The default location in Windows is:
C:\Program Files\Rational\ClearCase\lib\mgrs\map
From this, you can see that the z_whole_copy Type Manager uses cleardiff.exe to perform merges, and cleardiff cannot actually compare anything other than text files:
whole_copy merge ..\..\bin\cleardiff.exe
z_whole_copy merge ..\..\bin\cleardiff.exe
That explains why ClearCase, by default, can't merge most binary files unless you install a customized merge utility for these Type Managers.
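As a rough illustration of how the map file ties a Type Manager to a merge program, the following Python sketch parses the two map entries quoted above and looks up the merge tool. The function names are hypothetical, and the sketch assumes that each entry consists of three whitespace-separated fields (Type Manager, operation, program), which is how the quoted lines read.

```python
# The two map file entries quoted above.
MAP_LINES = [
    r"whole_copy merge ..\..\bin\cleardiff.exe",
    r"z_whole_copy merge ..\..\bin\cleardiff.exe",
]


def parse_map(lines):
    """Build a {(type_manager, operation): program} lookup table."""
    table = {}
    for line in lines:
        parts = line.split(None, 2)  # manager, operation, rest of the line
        if len(parts) == 3:
            manager, operation, program = parts
            table[(manager, operation)] = program.strip()
    return table


def merge_tool(table, type_manager):
    """Return the program used for merges by this Type Manager, if any."""
    return table.get((type_manager, "merge"))


table = parse_map(MAP_LINES)
print(merge_tool(table, "z_whole_copy"))
```

Both binary-oriented Type Managers resolve to the same text-based cleardiff program, which is the root of the merge constraint.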
Note:
ClearCase software merges Microsoft® Word® files, IBM® Rational® Software Architect models, and IBM® Rational® XDE files, as well as directories.
To resolve the ClearCase merge constraint, you can introduce new types that can handle the merge conflict automatically. Depending on the actual requirement, you can either completely ignore these binary files during the merge or do a copy-merge action.
Never-Merge: The IBM Technote titled "Handling binary files in ClearCase" (see Resources) proposed a NEVER_MERGE binary files solution for ClearCase versions before Version 7. In our experience, that is the best solution for most environments. It can save a lot of disk space, too.
To create a new NEVER_MERGE element type, follow these steps:
- In Type Explorer, select the Playground_apps_sub PVOB or VOB (see Figure 19).
Figure 19. Select the PVOB or VOB in Type Explorer
- Click Create, and in the window that pops up, type NEVER_MERGE in the Name field.
Figure 20. Create a new element type named NEVER_MERGE
- Enter a description for this new type of element, then click the General tab, where the description should also show (Figure 21).
Figure 21. Description appears under General also
- Then, under the Type Manager tab, change the Merge type to Never consider elements of this type for merging, as Figure 22 shows.
Figure 22. Set the merge type for the element type
- Now that you have created a new element type called NEVER_MERGE, associate all ClearCase binary files with that new type.
- Next, edit the Magic file. To make binary files that are newly added to source control use this new element type, change the type of binary files in the default.magic file to NEVER_MERGE, as Listing 1 shows.
Listing 1. Commands to edit the Magic file
shlib library compressed_file : -name "*.[dD][lL][lL]" ;
#zip_archive archive file : -name "*.[zZ][iI][pP]" ;
zip_archive archive NEVER_MERGE : -name "*.[zZ][iI][pP]" ;
gtar_archive archive file : -name "*.gtar" ;
- Do the same for all of the other binary files that you do not want to merge.
Hereafter, if you add a new .zip file to source control (in the development stream while it's not in the integration stream), the type is automatically set to NEVER_MERGE, and no merge is required for this type of file, as Figure 23 shows.
Figure 23. Adding a new .zip file to the development stream
Thus, when you deliver it to the integration stream, it is not necessary to merge the file (Figure 24).
Figure 24. Delivery to the integration stream without merging
You will see two separate versions of the file. Because there is no merge arrow displayed onscreen, you know that the files have not been merged (Figure 25).
Figure 25. Two separate versions, no merge arrow
You will see a 0-byte file in the integration stream (Figure 26).
Figure 26. Empty file in the integration stream
Tip:
If you check new versions in to both the development and integration streams, as Figure 27 shows, then deliver a file without the new NEVER_MERGE type from the development stream to the integration stream, you will get a merge conflict error message.
Figure 27. Check new versions in to both streams
With the NEVER_MERGE type, however, no merge conflict is reported. As Figures 28 and 29 show, the two versions remain completely separate, as the lack of a merge arrow indicates, and no new version is created in the target stream. ClearCase software simply ignores this file element.
Figure 28. No prompt about a merge conflict appears
Figure 29. No merge took place
Using the MAGIC_PATH variable for easier updating
To avoid updating the Magic file on everyone's workstation, you can store the Magic file in a central location by using the MAGIC_PATH environment variable (Figure 30).
Figure 30. How to store the Magic file in a central location
Where to define the new element type: The new element type has to be available in every component VOB where the binary files exist. Suppose that you add the new element type to only one component VOB, such as the Playground_p VOB. Figure 31 shows what happens if you then try to add a new .zip file to another VOB, General_app_sub, that does not have an administrative link to the Playground_p VOB.
Figure 31. Error generated if you did not define the new type
This happens because you changed the type of .zip file to NEVER_MERGE in the Magic file. That change applies globally, to all VOBs, but the new NEVER_MERGE element type is not defined in the VOB where the file is going to be added into the source control. The best place to define this new type is either the top-level PVOB or your administrative VOB. You must define the Scope of the type as Global (Figure 32).
Figure 32. Define the scope of the new type as Global
Now, if you add a .zip file to another component VOB, such as Mortgage_apps_sub (with the administrative VOB linked to Intralink_pVOB), the type is automatically added to Mortgage_apps_sub, as Figure 33 shows.
Figure 33. Local copy of the newly defined type of Global
The following excerpt from ClearCase documentation explains how you can use an administrative VOB to store global types of elements.
Understanding the role of the administrative VOB
An administrative VOB stores global type definitions. VOBs that are joined to the administrative VOB with AdminVOB hyperlinks share the same type definitions without having to define them in each VOB. For example, you can define element types, attribute types, hyperlink types, and so on in an administrative VOB. Any VOB linked to that administrative VOB can then use those type definitions to make elements, attributes, and hyperlinks.
If you currently use an administrative VOB, you can associate it with your PVOB by creating an AdminVOB hyperlink between the PVOB and the administrative VOB. In Windows, the VOB Creation Wizard creates the AdminVOB hyperlink for you. On UNIX®, use the cleartool mkhlink command to create the AdminVOB hyperlinks between the VOBs that store the components' root directories and the administrative VOB, so that the components can use the administrative VOB's global type definitions.
If you do not currently use an administrative VOB, do not create one. When you create components, ClearCase makes AdminVOB hyperlinks between the VOBs that store the components' root directories and the PVOB, and the PVOB assumes the role of administrative VOB.
Merging user-defined files: ClearCase Version 7 introduced changes to merging binary files and other user-defined types of files. One of them is that if an element's merge type is defined as user, you can choose to do a copy-merge action during a delivery or when re-establishing the baseline (rebase).
The following information is an excerpt from the V7 Release Notes:
Type 1. New merge behavior type, -mergetype copy, for binary file element types
| Command | Option | Description |
|---|---|---|
| mkeltype | -mergetype copy | Create a copy merge element type. The findmerge operation attempts to merge elements of this type automatically by copying the from-version to the to-version (replacing the to-version with the from-version). |
Note:
This type of merge is not supported by the Rational ClearCase Remote Client.
User merge type options in deliver (and rebase) GUI
If a version that has been created as a result of a deliver (or a rebase) operation cannot be automatically merged with an existing version in the target stream (for a delivery or a rebase) and the element has a merge type of user, a window titled Deliver User Mergetype Element (or Rebase User Mergetype Element) is displayed. A field in the window shows the path for the element that cannot be merged automatically. Under Please choose one of the following options, the following options are shown:
- Copy the version from the source stream: Set this option to have the version in the source view copied to the target view and have the merge arrow created. The version is considered merged.
- Keep the version currently checked out on the target stream....
- Skip this element, merge later or manually....
- Back out from the current delivery (or Back out from the current rebase)....
IBM Technote 1123371, "Handling binary files in ClearCase," proposed a COPY merge solution for ClearCase Version 7 and later. In general, however, we believe that copy-merge is less useful than most people think. Take the scenario described previously under "Build results for common code," for example. It is problematic to use copy-merge to transfer the build result JAR files from stream to stream, because the source code is usually merged, not copy-merged. If you copy-merge the JAR files, the source code in the target stream will not match the build result JAR files, which basically makes the JAR files useless.
Another issue is that using copy-merge creates a new version on the target and makes the storage issue worse. (See the Storage considerations section that follows for more about that issue.)
Figures 34 and 35 demonstrate how to create a new element type of COPY with a merge type of copy in case that is appropriate in your environment.
- Click Create, and in the window that you see next, enter COPY in the Name field, and then type a description.
Figure 34. Create a new element type named COPY
- Go to the Type Manager tab, and under Merge type, select Always copy elements of this type.
- Click Apply, then OK.
Figure 35. Choose "Always copy…" for the merge type
- From here, follow the same steps that you did for creating the new NEVER_MERGE type.
Hereafter, a copy-merge action occurs during delivery whenever a merge is necessary, as Figures 36 through 40 illustrate.
Figure 36. Add a new .jar file into the development stream
Figure 37. Before delivery and merging
Figure 38. The "Delivery from stream, merges complete" display
Figure 39. Delivering to view display
Figure 40. A copy-merge display after delivery, showing merged files
Handling existing files: For those binary file elements that are already in the ClearCase environment (still with the original element type), you need to change their element types to either NEVER_MERGE or COPY. For example, this command looks for all .jar files and changes their element types to NEVER_MERGE:
cleartool find . -name '*.jar' -exec 'cleartool chtype "NEVER_MERGE" $CLEARCASE_XPN'
After you have made this change, all binary file elements will merge in the same way as files that you add to the source control.
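If several binary file types need the same conversion, the command above can be generated once per file name pattern. The Python sketch below only prints the command lines so that you can review them before running anything; the function name and the list of patterns are hypothetical, and the command text itself is copied from the example above.

```python
def chtype_commands(patterns, new_type="NEVER_MERGE"):
    """Return one 'cleartool find ... chtype' command line per pattern."""
    return [
        f"cleartool find . -name '{pattern}' "
        f"-exec 'cleartool chtype \"{new_type}\" $CLEARCASE_XPN'"
        for pattern in patterns
    ]


# Hypothetical list of binary file patterns to convert.
for command in chtype_commands(["*.jar", "*.zip", "*.gz"]):
    print(command)
```

Passing new_type="COPY" produces the equivalent commands for the copy-merge element type.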
Storage considerations
As CM Crossroads members discussed in the "How bad is [it] to store deployment units in ClearCase?" thread, the main problem with storing binary files in ClearCase is their relatively large size in comparison to normal text files. This section describes practical ways to purge outdated binary files.
Caution:
In general, never purge data stored in ClearCase VOBs. If you purge that data, you lose an important benefit of the CCM system: the ability to reconstruct any given configuration. Also, if you purge that data accidentally, you cannot undo that deletion.
The preferred approaches to handling ClearCase storage issues are these two:
- Store only files that are really required to reproduce a software release. These files include source code, build files, and build tools, but not the build results, such as JAR files. By following this approach, the disk storage requirement grows much more slowly.
- If you must store your build results, then enough disk capacity has to be available to store all data that will be created during the lifetime of the project. You can estimate the approximate space requirement when you start the project and add disk space as you need it later. There is no problem with this approach, because ClearCase software can handle almost any amount of data.
Note:
There has been no theoretical limitation on the VOB size since IBM released ClearCase Version 7. The purge method described later in this article is simply to remove data that you no longer need.
Unlike text files, where changes are stored as deltas in ClearCase, there is normally a full copy of each version of a binary file in the data container. This makes the data container grow much more quickly than it does for a normal text file.
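A back-of-the-envelope calculation shows how quickly the difference adds up. The Python sketch below compares the two storage models with made-up numbers: 50 versions of a 20 MB JAR file stored as whole copies versus 50 versions of a 200 KB text file where each change adds a 10 KB delta. All sizes are hypothetical, and the sketch ignores compression and any container overhead.

```python
def whole_copy_kb(version_sizes_kb):
    """Storage when every version is kept as a full copy."""
    return sum(version_sizes_kb)


def delta_kb(first_version_kb, delta_sizes_kb):
    """Storage when only the first version plus per-version deltas are kept."""
    return first_version_kb + sum(delta_sizes_kb)


# Hypothetical sizes: 50 versions of a 20 MB JAR, stored as whole copies.
jar_total_kb = whole_copy_kb([20 * 1024] * 50)

# Hypothetical sizes: a 200 KB text file plus 49 deltas of 10 KB each.
text_total_kb = delta_kb(200, [10] * 49)

print(jar_total_kb, "KB for the JAR versions")
print(text_total_kb, "KB for the text file versions")
```

With these assumed numbers, the JAR element needs roughly 1 GB while the text element stays below 1 MB, which is why binary build results dominate VOB growth.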
A large, ever-growing VOB affects many aspects of ClearCase performance because of the large amount of data that has to be transferred over the network, especially when you have to access it remotely. The large VOB dramatically increases the backup time, too. IBM Technotes 1127322 and 1124619 give other details on this topic (see Resources).
Because of these issues, be sure to prepare a good approach to prevent the VOB size from growing too big while you continually add large binary files to ClearCase repositories. The next section describes a time-oriented purging method for these binary files.
In our environment, we store some sharable components' intermediate build results, as well as the final deployment units, in our ClearCase setup. After investigating in depth which purging method to adopt, we decided to use a time-oriented approach. Our method includes using time-based UCM components, as well as removing older versions of each binary file that are basically no longer in use.
Because we run a pure UCM environment, we use components to store binary files, just as we store other text files. After much investigation, we found that the easiest way to purge binary files is simply to remove them when they are out-of-date by using the rmelem command.
Figure 41. Time-based components
For our deployment units, we name the component according to our release date. Therefore, when the release is out-of-date, we can simply go back to these components and delete (rmelem) the contents. This removes all versions, which thereby shrinks the storage space. (Every team will have its own definition of what's out-of-date, of course.)
Note:
Our Internet banking application has only one version in production, which is similar to how many other Web sites operate. We rarely roll back to an older release, but always go forward.
For those intermediate build results of sharable components, we cannot simply delete the element at any time, because there is always someone using some of the versions. The best solution that we have found is to remove some earlier and not-in-use versions, using the rmversion command.
To identify the versions that we can safely remove, we rely on our baseline promotion levels, which have a label attached to the associated version of each element. For example, we use the following command to find all baselines that are no longer supported:
cleartool lsbl -level UNSUPPORTED -component component:$comp@/VOBs/Intralink_pVOB
When we have found these unsupported baselines, we can find the individual versions associated with that label and remove them with these commands.
cleartool find $VOB_root/$VOB/$comp/bin -name '*.jar' -version "lbtype($bl)" -print
cleartool rmver -f -data $ver
Tip:
The rmver -data command safely removes the data and leaves most of the VOB metadata untouched.
Here's more about this command from the ClearCase documentation:
Data-only Deletion
Default: rmver deletes the version object in the VOB database along with the associated metadata and the corresponding data container in a source storage pool.
-data: Deletes only the data for the specified version, leaving the version object, its subbranches, and its associated metadata intact. In particular, this option preserves event records and enables continued access to the configuration record of a DO version. Caution: Using the -data option implicitly invokes the -xbranch, -xlabel, -xattr, and -xhlink options, as well. That is, the data container is deleted even if the version has a label, attribute, or hyperlink attached or has a branch sprouting from it.
After we have removed all identified versions, we run this command to reclaim the disk space and reduce the size of the VOB database:
reformatVOB -update /VOBstore/Intralink_VOBs/Intralink_Common_sub.VOB
Caution
You must use this approach carefully, because a version can be included in more than one baseline, and some of those other baselines could still be supported. In that case, the algorithm would delete data that still-supported baselines need. In our environment, however, a baseline is created after each build, so every baseline has its own corresponding set of JAR files. Rejecting a particular baseline therefore has the same meaning as rejecting the versions of all JAR files included in that baseline and component. That's why it is safe for us to use this method in our particular environment.
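For environments where a version can belong to several baselines, the risk described in this caution can be avoided by subtracting every version that a supported baseline still uses before removing anything. The Python sketch below shows that set logic only. The function name, baseline names, and version paths are hypothetical, and collecting the real baseline-to-version mapping (for example, from the lsbl and find commands shown earlier) is left out.

```python
def versions_safe_to_purge(baseline_versions, unsupported_baselines):
    """Return versions used only by unsupported baselines.

    baseline_versions     -- dict: baseline name -> set of version paths
    unsupported_baselines -- set of baseline names that are no longer supported
    """
    purge_candidates = set()
    still_needed = set()
    for baseline, versions in baseline_versions.items():
        if baseline in unsupported_baselines:
            purge_candidates |= set(versions)
        else:
            still_needed |= set(versions)
    # Never purge a version that a supported baseline still references.
    return purge_candidates - still_needed


# Hypothetical data: util.jar was not rebuilt for the second baseline.
baselines = {
    "BL_2007_01": {"common.jar@@/main/1", "util.jar@@/main/1"},
    "BL_2007_02": {"common.jar@@/main/2", "util.jar@@/main/1"},
    "BL_2007_03": {"common.jar@@/main/3", "util.jar@@/main/2"},
}
safe = versions_safe_to_purge(baselines, {"BL_2007_01"})
print(sorted(safe))
```

In this example only common.jar@@/main/1 is reported, because util.jar@@/main/1 is still part of a supported baseline.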
Check Resources for links to IBM Technotes mentioned in this article, downloads, discussion forums, and more useful information.
Learn
- Visit the Rational ClearCase area on developerWorks for articles and tutorials.
- Read IBM Technote 1123371: Handling binary files in ClearCase, which outlines a method to manage binary files that are stored in an IBM Rational ClearCase VOB in either a base ClearCase or UCM environment.
- Read IBM Technote 1240740: New merge type Copy feature with ClearCase Version 7 about the new IBM Rational ClearCase merge type behavior introduced in Version 7.0 that enables you to copy element versions during a merge.
- Read IBM Technote 1149498: Compare utilities cleardiff and cleartool diff fail to note differences between binary files, which explains why these ClearCase commands attempt to start the text_file manager when comparing binary files, and gives a solution.
- Read IBM Technote 1127322: About type managers and size limitations, which explains what the size limitation is for all ClearCase Type Managers on Microsoft Windows, Linux®, and UNIX.
- Read IBM Technote 1124619: About the 2 gigabyte limitation of VOB database files, which explains which ClearCase schema 54 VOB database files can and cannot grow over 2 GB when you are working on Microsoft Windows, UNIX, or Linux operating systems.
- Read "How bad is [it] to store deployment units in ClearCase?" A discussion thread on the CM Crossroads Web site (November 2006).
- Visit the Rational software area on developerWorks for technical resources and best practices for Rational Software Delivery Platform products.
- Enroll in RS523: Essentials of IBM Rational ClearCase UCM for Windows, V7.0. This live instructor-led online training course introduces the concepts of Rational ClearCase Unified Change Management (UCM), the Rational software development best practice that integrates artifact and activity management. Course materials, hands-on labs, and real-time interactions with the instructor and other students teach you to use Rational ClearCase UCM and Rational ClearQuest to perform common, day-to-day software development tasks, such as joining a project, checking source control files out and back in, merging your work with the work of others, updating your workspace, and working offline.
- Enroll in RS403: Configuration Management with IBM Rational ClearCase UCM, V7.0. Learn how to implement ClearCase unified change management (UCM) effectively in your Windows-based development environment. This two-day, live instructor-led training for an online audience includes hands-on labs and real-time interactions that teach you what information is critical to include in a configuration management plan. You will create and implement a usage model based on a simulated software development project.
- Enroll in RS603: Mastering IBM Rational ClearCase Administration for Windows, V7.0. Learn to perform the administrative tasks necessary to deploy and maintain an enterprise-wide ClearCase implementation, including setting up an environment, system and data maintenance, security, backup methods, licensing, and installation. This is live instructor-led training for an online audience, with hands-on labs and real-time interactions.
- Enroll in RS125: Essentials of IBM Rational ClearCase for Developers. Learn how to perform common day-to-day tasks in Rational ClearCase. The focus is on the concepts and skills that developers need to successfully manage source code changes in their development environments. This is self-directed, self-paced online learning with rich media content, including interactions, quizzes, and virtual labs. Rational Web-based training (WBT) courses are sold on a per-user basis. When you purchase a WBT course, you have access to the entire catalog of WBT courses for a year.
- Enroll in RS130: Essentials of IBM Rational ClearCase UCM for Developers. This course introduces the concepts of IBM Rational ClearCase Unified Change Management (UCM), a development best practice that integrates artifact and activity management. This is self-directed, self-paced online learning with rich media content, including interactions, quizzes, and virtual labs. Rational Web-based training (WBT) courses are sold on a per-user basis. When you purchase a WBT course, you have access to the entire catalog of WBT courses for a year.
- Subscribe to the developerWorks Rational zone newsletter to keep up with developerWorks Rational content. Every other week, you'll receive updates on the latest technical resources and best practices for the Rational Software Delivery Platform.
- Subscribe to The Rational Edge e-zine for articles on the concepts behind effective software development.
- Subscribe to the IBM developerWorks newsletter, a weekly update on the best of developerWorks tutorials, articles, downloads, community activities, webcasts and events.
- Browse the technology bookstore for books on these and other technical topics.
Get products and technologies
- Try Rational ClearCase V7.0. Evaluate Rational ClearCase V7.0 online without installing or configuring it on your own system. When you register for this online trial, you can access both Rational ClearCase and Rational ClearQuest installed into a combined environment. You can choose from the resources provided below for guidance in exploring the features and functions of Rational ClearCase or the combined environment.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- Participate in the Rational ClearCase forum on developerWorks. Post your ClearCase questions and comments, and share your thoughts, ideas, and solutions with other users. To participate by email rather than on the Web, subscribe by sending a note to cciug-subscribe@lists.ca.ibm.com.
- Get involved in developerWorks forums about IBM software.
- Check out developerWorks blogs and get involved in the developerWorks community.




