Level: Introductory Jim Whitehead, Chair, IETF WebDAV Working Group
05 Jan 2004 The success of geographically dispersed development teams working together over the Web depends on their ability to manage and control changes to their source code. Read about the software configuration management systems available today, including CVS, WebDAV, and Delta-V.
From CVS to WebDAV to Delta-V
Every day, developers with a shared software vision band together from
around the world to develop Open Source software. A similar trend occurs
in the corporate world: Large companies with physically dispersed divisions
create distributed teams to work together on software projects. Cross-organizational
projects also occur with greater frequency, such as a subcontractor working
closely with a primary systems-integration contractor on a large project.
These geographically dispersed teams share the same needs for distributed
source-code control. When it comes to working on the design documents,
test cases, specifications, and source code that comprise the project,
individual team members need to work on pieces in isolation, then integrate
those pieces with the modifications of their coworkers, without clobbering
anyone else's changes. Changes need to be tracked so that errors and exploratory
design changes can be undone easily. Tracking creates a group memory of
how files have changed over time -- valuable for later reconstruction of
detailed design rationales. Released and stable configurations of the project
are tracked so they can be regenerated quickly, and so that bug fixes can
be made to the appropriate release. These capabilities are all provided
by software configuration management (SCM) systems.
SCM systems use a library metaphor to control access to project documents
and source code. At first, the SCM repository holds all development files
in a "checked-in" state. To work on a file, one needs to check it out,
just like taking a book out of a library. Once changes are complete, the
file is checked back in, accompanied by brief comments describing the
changes. A checked-in file is immutable, and can't be changed again without
checking it out.
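The check-out/check-in cycle described above can be sketched as a small in-memory model. This is only an illustration: the `Repository` class and its method names are hypothetical, and the sketch assumes exclusive check-outs (one user per file at a time), which is the simplest reading of the library metaphor rather than the behavior of any particular SCM product.

```python
class Repository:
    """Toy in-memory SCM repository illustrating the library metaphor."""

    def __init__(self):
        self._revisions = {}    # path -> list of (content, comment) tuples
        self._checked_out = {}  # path -> user currently holding the file

    def add(self, path, content, comment="initial revision"):
        """Place a new file in the repository in the checked-in state."""
        self._revisions[path] = [(content, comment)]

    def check_out(self, path, user):
        """Hand the latest revision to one user; others must wait for check-in."""
        if path in self._checked_out:
            holder = self._checked_out[path]
            raise RuntimeError(f"{path} is already checked out by {holder}")
        self._checked_out[path] = user
        return self._revisions[path][-1][0]

    def check_in(self, path, user, content, comment):
        """Store a new immutable revision together with the author's comment."""
        if self._checked_out.get(path) != user:
            raise RuntimeError(f"{path} is not checked out by {user}")
        self._revisions[path].append((content, comment))
        del self._checked_out[path]
        return len(self._revisions[path])  # new revision number, 1-based

    def history(self, path):
        """Return the check-in comments, oldest first."""
        return [comment for _content, comment in self._revisions[path]]
```

A caller would `check_out` a file, edit the returned text locally, and `check_in` the result; any attempt to check in without holding the file raises an error, which is the sense in which a checked-in file is immutable.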
Once a change-tracking system is in place, it's possible to view previous
revisions of a file and see differences between revisions. Another typical
feature is viewing the change history of a file -- listing the modification
date for all revisions, the person who made the change, and the comments
he or she submitted with the change. It's also possible to discard some
revisions -- a useful capability if an exploratory change doesn't work
out as intended.
Revision tracking also makes configuration tracking possible. Since
any nontrivial software system is composed of multiple source objects,
which are described by multiple design and requirements documents, freezing
the state of an entire project requires knowing the exact version of each
file in the project so that a consistent snapshot can be made. SCM systems
provide this capability, allowing users to create baselines that can be
used for testing and release tracking. Since all checked-in revisions are
immutable, it's possible to revert to a previous project configuration,
a critical capability for supporting previously released software projects.
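As a rough sketch of the baseline idea, a configuration can be thought of as a frozen mapping from each file to one revision number. The function names and the `history` layout below are assumptions made for illustration, not the data model of any real SCM system.

```python
from types import MappingProxyType


def create_baseline(history):
    """Record the latest revision number (1-based) of every file.

    `history` maps a path to its list of checked-in contents, oldest first.
    The result is read-only, mirroring the immutability of a baseline.
    """
    return MappingProxyType({path: len(revs) for path, revs in history.items()})


def materialize(history, baseline):
    """Rebuild the project exactly as it was when the baseline was taken."""
    return {path: history[path][rev - 1] for path, rev in baseline.items()}
```

Because every checked-in revision is kept, later check-ins and newly added files do not disturb an earlier baseline: `materialize` simply looks up the recorded revision of each recorded file.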
Remote Configuration Management with CVS
Today, the distributed configuration management system of choice for Open Source
developers is the Concurrent Versions System (CVS). Currently in use by the Apache
HTTP Server and Netscape Communicator Web-browser (Mozilla.org) Open Source
projects, CVS has many advantages for distributed teamwork. Since CVS is itself
an Open Source project, it's freely and widely available. In addition to providing
typical versioning and configuration-management features, CVS also offers excellent
work isolation for team members, and the CVS client/server protocol allows this
teamwork to occur remotely. The cvsweb utility allows CVS version histories,
old revisions, and differences between revisions to be browsed in a read-only
manner on the Web. CVS front ends have been developed for UNIX, PC, and Mac
systems, allowing developers from all platforms to participate in a project
(see the sidebar, "Giving CVS a Facelift"). Since many Open Source projects use
CVS, there is a large and growing pool of developers who know CVS and understand
how to use it for teamwork. In conjunction with an e-mail mailing list, a Web site giving
project overview and documentation, and a bug reporting and tracking system,
CVS is a key coordination infrastructure for performing collaborative teamwork
via the Internet.
Jim Jagielski's article from Web Techniques on the Apache
development process highlights how CVS is used on a successful Open
Source development project. Using the CVS update-edit-commit work cycle,
Apache developers are able to work on source code on their local machines,
thereby isolating themselves from the changes made by other developers.
When local changes are complete, they are merged with the intervening modifications
of other developers, and then committed to the central development server.
CVS isn't the only configuration-management tool that supports remote development
teams. Commercial SCM systems frequently provide this capability, examples
being Rational ClearCase MultiSite, Merant PVCS Replicator, and the Continuus/DCM
distributed change management product. Other Open Source tools also offer
distribution support, a notable one being the Distributed Versioning System (DVS)
available at the University of Colorado. These
systems are just the tip of the iceberg. The Configuration
Management Yellow Pages has an exhaustive listing of existing commercial
and Open Source systems. (For a list of Web-based resources related to this
article, see the section "Related Resources" at the bottom of the article.)
Today: Remote Web Authoring with WebDAV
Exciting new work that's just starting in the Internet Engineering Task Force
(IETF) promises to make it easier to perform remote collaborative project work
over the Web. The new effort is called Delta-V,
and its goal is to provide versioning and configuration management capabilities
for the Web by extending the Web's core protocol, HTTP. Using Delta-V,
collaborative teams will be able to edit the source code, documents, Web
pages, and binary graphics in a project, then record important revisions
and manage project configurations -- all in-place on the Web. The Delta-V
activity builds upon the Web Distributed Authoring and Versioning
(WebDAV) protocol, an IETF standard that has
extended HTTP with operations for remote collaborative authoring on the
Web. Delta-V extends HTTP and WebDAV with versioning, isolation of individual
changes from collaborators' changes, and SCM capabilities.
The WebDAV protocol, the foundation on which Delta-V is built, extends
the Web to make authoring of Web resources as easy as browsing them. Unlike
CVS, which downloads files to a local hard drive to retain compatibility
with existing applications, WebDAV lets Web resources be edited directly
on a Web server. Applications must be modified to interact with the Web
server using the WebDAV protocol. Though WebDAV is still in the early stages
of adoption, Internet Explorer 5 and the Office 2000 suite of applications
have already integrated WebDAV support via a feature called Web Folders,
providing remote authoring for Word, Excel, and PowerPoint documents directly
on the Web (see the sidebar, "Web Folders and WebDAV"). Additionally,
WebDAV Explorer provides a file-system explorer interface for a WebDAV server.
There are many existing WebDAV servers, including the mod_dav module
for the Apache server, Microsoft Internet Information Server
(IIS) 5, Glyphica PortalWare, Xythos Storage Server, DataChannel RIO, Intraspect
Knowledge Server, Digital Creations Zope, CyberTeams WebSite Director lite,
and the freely available WebRFM. The IBM DAV4J server, available from AlphaWorks,
also provides a Java client API for WebDAV.
WebDAV features are designed to accommodate existing tools, making it
straightforward to integrate WebDAV-based remote authoring into them. WebDAV's
namespace operations provide the ability to create and list collections,
and to copy and move Web resources, thus supporting the needs of "File...
Open" and "File... Save" user-interface dialog boxes. Locking of entire
Web resources provides overwrite protection for all types of Web resources
(HTML pages, GIF images, word processing documents, and source-code text
files), and in fact, one of WebDAV's design principles is to provide equal
support for all Web-resource types. WebDAV also provides support for storing
and retrieving metadata, in the form of attribute-value pairs called properties,
associated with a resource. The name of a WebDAV property is a URL, used
in this case as a property identifier, not as a locator, and a property
value is well-formed XML, gaining XML's advantages for representing structured
data and for internationalizing string values.
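To make the property model more concrete, the following sketch builds the XML body of a WebDAV PROPPATCH request that sets one property. In the published WebDAV specification (RFC 2518), a property name is written as an XML element qualified by a namespace URI, which is how the identifier role described above shows up on the wire. The `PROJECT_NS` namespace, the `reviewer` property, and the `build_proppatch` helper are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"                                # namespace used by the WebDAV spec
PROJECT_NS = "http://example.com/ns/project/"  # hypothetical namespace for this sketch

ET.register_namespace("D", DAV_NS)             # cosmetic: serialize DAV: elements as D:...


def build_proppatch(properties):
    """Return a PROPPATCH body that sets the given properties.

    `properties` maps (namespace URI, local name) pairs to string values.
    The namespace URI identifies the property; nothing is fetched from it.
    """
    update = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(update, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for (namespace, local_name), value in properties.items():
        ET.SubElement(prop_el, f"{{{namespace}}}{local_name}").text = value
    return ET.tostring(update, encoding="unicode")


body = build_proppatch({(PROJECT_NS, "reviewer"): "J. Smith"})
```

A client would send `body` as the entity of a PROPPATCH request to the resource's URL; the sketch stops short of any network call so that it stays self-contained.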
Early Web-authoring tools encountered the "lost update problem," which
occurs when two or more simultaneous authors of a Web page clobber each
other's work with successive saves to the same URL without first merging
their changes. Although HTTP 1.1 has support for detecting lost updates
through unique identifiers associated with the document state, no support
is provided for preventing lost updates in the first place. To solve this
problem, WebDAV uses long-duration, whole-resource locking as its concurrency
control mechanism. The WebDAV protocol provides a write lock, but no read
lock capability. On the Web, by default a resource is readable, although
it may be protected by access control. HTTP therefore doesn't require
that a Web browser obtain a lock in order to read a resource, as traditional
database locking would; retrofitting the Web with this capability was
neither feasible nor desirable. Web servers implement the
write operation PUT by saving the contents of the resource in a temporary
buffer until the entire new resource has been transmitted, then using internal
concurrency control to block read access while the new value is quickly
updated. So the traditional database problem of reading a value in an inconsistent
state is avoided. Another traditional database issue, deadlock, is also
avoided with WebDAV locks. Since locks are granted via a protocol request,
with a given request either granted or denied, there's no blocking, and
hence no possibility of deadlock.
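The grant-or-deny behavior that rules out deadlock can be modeled in a few lines. This is a toy lock table, not a WebDAV server: the `LockTable` class and its methods are hypothetical, lock timeouts are left out, and reads never consult the table at all, reflecting the absence of read locks.

```python
import uuid


class LockTable:
    """Toy model of exclusive write locks that are granted or denied, never queued."""

    def __init__(self):
        self._locks = {}  # url -> lock token of the current holder

    def lock(self, url):
        """Return a new lock token if granted, or None if the resource is locked."""
        if url in self._locks:
            return None  # denied immediately; the caller is never blocked
        token = f"opaquelocktoken:{uuid.uuid4()}"
        self._locks[url] = token
        return token

    def may_write(self, url, token=None):
        """A write is allowed if the resource is unlocked or the token matches."""
        held = self._locks.get(url)
        return held is None or held == token

    def unlock(self, url, token):
        """Release the lock if the caller presents the matching token."""
        if self._locks.get(url) == token:
            del self._locks[url]
            return True
        return False
```

Since `lock` returns at once in both cases, no request ever waits on another request's lock, so a cycle of waiting clients cannot form.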
WebDAV servers have used differing strategies to implement the features
in the protocol -- the major difference is the underlying repository chosen
by the server to store properties and resources. Microsoft's IIS 5 server
uses the Windows 2000 file system as its repository, and provides an extremely
tight integration between file system services and WebDAV services. When
a file is locked via WebDAV, it is also locked in the file system, and
hence a local user cannot clobber a file locked by a remote user. IIS 5
also uses Windows 2000 user and access-control lists to determine whether
a WebDAV user has access to a particular file; there is no separate Web
access-control mechanism used by IIS 5. In contrast, the mod_dav Apache
module also uses a file system repository, but requires that the Apache
server own all WebDAV authorable files, thus effectively preventing local
access to the files. This avoids the need to assume root privileges under
UNIX to change the ownership of files -- a security risk -- and lets mod_dav
create users that don't have local system accounts, only WebDAV authoring
privileges. Restricting local file access prevents another potential problem:
Since mod_dav stores properties in a separate database, moving or deleting
a file without telling mod_dav results in "ghost" property entries for
a resource that no longer exists.
Other WebDAV servers store their information in databases instead of
the file system. The Glyphica PortalWare server has created a content management
system that sits on top of the Versant object-oriented database system.
All documents that are submitted to PortalWare are indexed for full-text
searching, and have properties associated with them in the database. The
Xythos Storage Server uses a relational database for storage, instead of
an object-oriented one. The Xythos server uses standard SQL via JDBC to
interface with its database, which, combined with the cross-platform support
of databases like Oracle, Sybase, and Informix, lets the Xythos server
run cross-platform, and on a variety of databases. Both servers gain several
typical database advantages, including transaction support that's useful
in implementing WebDAV methods, and good recovery from disasters like power
outages and disk failures.
The Future: Web-Based Delta-V
While WebDAV's remote-authoring features are useful for performing remote
collaborative authoring, they highlight the need for versioning support
to preserve the history of work. The work on Delta-V is intended to fill
this role, adding versioning support to WebDAV. Work on Delta-V is ongoing,
so details of the protocol may change as the standardization work continues,
but there's increasing convergence on its features and benefits. Figure 1 provides a high-level architecture diagram showing several applications
using Delta-V.
Figure 1: This diagram shows three versioning-capable remote-authoring tools communicating via
HTTP, using WebDAV and Delta-V extensions, with a Delta-V capable server.
Work is progressing rapidly, driven by working group participants with
a deep background in SCM, document management, software environments, and
Web portal systems. These participants come from the leading companies
in these areas: IBM, Microsoft, Novell, Rational, Merant, DataChannel,
Object Technology International, and Dynamic Diagrams, with university
participation from U.C. Irvine.
The Delta-V protocol addresses several shortcomings in CVS. The primary
advantage of Delta-V is its tight integration with the Web. Using CVS to
manage a Web site requires understanding how the file structure managed
by CVS maps into URLs served by HTTP, a difficult concept for many users.
With Delta-V, Web resources are edited in-place, at a specific URL, and
no mapping of filenames to URLs is necessary. Furthermore, the Web-native
Delta-V protocol can handle the different types of Web resources better
than a file-oriented system like CVS. By versioning Web resources, Delta-V
allows HTML links to old revisions of Web pages, creating a sort of time
machine for the Web. Linking to a specific revision can often preserve
the semantic meaning of a link, such as when linking to a Web-log site
that changes frequently, where the linked-to information may be gone in
a week. If the site used Delta-V to version its content, these old revisions
would still be accessible.
The Delta-V protocol has several unique features. Delta-V assumes that
most editing will take place directly on Web resources, which differs from
CVS in that there's no local replica. Isolation from the changes of other
team members is provided by "workspaces," which provide each collaborator
with his or her own view on the resources being edited. Unlike the local
replicas that provide isolation in CVS, workspaces isolate collaborators
as they work on the remote Web server. Overwrite conflicts are avoided
because a resource can be checked out by multiple people simultaneously,
and each check out creates a separate working resource. Each collaborator
actively working on a resource has a separate virtual working area, identified
by his or her workspace, and modifications are made first in a workspace,
then merged with the changes of other collaborators.
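The workspace idea can be illustrated with a small model in which every check-out creates its own working resource. This is a sketch under stated assumptions: the `VersionedResource` class, its method names, and the rule that a stale working resource must be merged before check-in are all illustrative choices, not details taken from the Delta-V protocol, which was still being drafted.

```python
class VersionedResource:
    """Toy resource that several workspaces can check out at the same time."""

    def __init__(self, content):
        self.revisions = [content]  # checked-in revisions, oldest first
        self._working = {}          # workspace name -> [base revision number, content]

    def check_out(self, workspace):
        """Create a separate working resource based on the latest revision."""
        self._working[workspace] = [len(self.revisions), self.revisions[-1]]

    def read(self, workspace):
        return self._working[workspace][1]

    def write(self, workspace, content):
        """Edits are visible only inside the workspace that made them."""
        self._working[workspace][1] = content

    def needs_merge(self, workspace):
        """True if a newer revision was checked in since this check-out."""
        return self._working[workspace][0] != len(self.revisions)

    def mark_merged(self, workspace, merged_content):
        """Record that the working content now incorporates the latest revision."""
        self._working[workspace] = [len(self.revisions), merged_content]

    def check_in(self, workspace):
        if self.needs_merge(workspace):
            raise RuntimeError("merge with the newer revision before checking in")
        _base, content = self._working.pop(workspace)
        self.revisions.append(content)
        return len(self.revisions)
```

Two collaborators can check out the same resource, edit independently, and check in one after the other; the second is asked to merge first, so neither overwrites the other's work.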
Another drawback of CVS is its client/server protocol, which is tightly
coupled to CVS's repository. Unlike CVS, HTTP and WebDAV have a proven
track record of mapping to multiple types of server back-end stores, such
as databases, document management systems, and file systems. Delta-V provides
a cross-platform integration layer, thus bringing the benefits of remote
Web collaboration support to a diverse set of existing back-end repositories
that do not currently provide Web authoring or versioning support. Judging
by the participants in the working group, the Delta-V protocol will be
mapped to SCM systems, document management systems, and content management
systems, all of which employ a database to provide their features. This
makes the Delta-V protocol a more powerful data integration technology
than the CVS client/server protocol, which maps only to the CVS repository.
Delta-V provides versioning of collections, a feature not supported
by CVS. When a collection is versioned, collections and their contents
follow the check-out/edit/check-in model. When a collection is checked
in, its membership is frozen, and can't be changed until the collection
is checked out again. Making a new file or deleting an existing file requires
the parent collection to be checked out. When all collections in a project
are versioned, it's possible to record permanently the membership of each
collection for each moment in time, thus making configuration management
support possible. Once both collections and their contents are versioned,
it's possible to explicitly pick a single revision of each collection and
file (often the most recent revision), creating a snapshot of the entire
project.
CVS doesn't provide full versioned collection support, leading to odd
glitches. As an example, consider renaming a file from A to B. Using CVS,
this requires three steps: copying file A's contents into the new location
at B; using cvs add to put B into the CVS repository; and using cvs remove
to delete file A. If the collection containing B were reverted
to a previous state when A was present but B had not yet been added, the
collection would contain both A and B. Since CVS doesn't store previous
revisions of collections, it doesn't know when B was added, and so can't
revert the collection correctly. Because Delta-V versions collections,
it can avoid this problem. Renaming the file in Delta-V would involve checking
out the collection to make it editable, moving the file from A to B, and
then checking in the collection. If the collection is reverted to the original
version, just before the initial check out, it will contain A, but not
B, and similarly the following revisions will contain B, but not A. Versioned
collections thus provide the foundation for rigorous configuration management.
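The rename example can be traced with a toy versioned collection whose membership is frozen at each check-in. The `VersionedCollection` class and its methods are hypothetical and model only membership, not file contents.

```python
class VersionedCollection:
    """Toy collection whose membership is recorded at every check-in."""

    def __init__(self, members):
        self._revisions = [frozenset(members)]  # revision 1
        self._working = None                    # editable copy while checked out

    def check_out(self):
        """Make the membership editable, starting from the latest revision."""
        self._working = set(self._revisions[-1])

    def move(self, old_name, new_name):
        """Rename a member; only allowed while the collection is checked out."""
        if self._working is None:
            raise RuntimeError("check out the collection before changing its membership")
        self._working.remove(old_name)
        self._working.add(new_name)

    def check_in(self):
        """Freeze the current membership as a new revision and return its number."""
        if self._working is None:
            raise RuntimeError("collection is not checked out")
        self._revisions.append(frozenset(self._working))
        self._working = None
        return len(self._revisions)

    def members_at(self, revision):
        """Return the membership recorded for a 1-based revision number."""
        return self._revisions[revision - 1]


src = VersionedCollection({"A", "Makefile"})
src.check_out()
src.move("A", "B")
src.check_in()
```

After these calls, `src.members_at(1)` contains A but not B, and `src.members_at(2)` contains B but not A, which is the behavior the paragraph above attributes to versioned collections.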
Since Delta-V assumes work will take place directly on a Web server,
rather than on a local replica, existing WebDAV editing tools, like Office
2000, that are not versioning-aware need to be accommodated. Delta-V can
automatically record, as separate revisions, changes to a document made
by a versioning-unaware client. Delta-V also divides its functionality
into two layers: a simple versioning layer, and a more complex SCM layer.
Since authoring clients (word processors, text editors, spreadsheets, and
so on) typically work on a single file at a time, they are only expected
to use the basic versioning layer to support a check-out/edit/check-in
style of work. The typical authoring client is not expected to provide
a user interface for operations like creating and reverting configurations,
since a configuration spans an entire project, far greater than their single-file
editing scope. A separate SCM control panel application will make use of
the features in the SCM layer. This control panel will operate at a collection
and project level, providing the capability to create a project configuration
or revert to a previous configuration. It will complement the single-file
focus of the authoring tools with project-wide capabilities. A full-featured
programming environment will be a third class of Delta-V application, one
that uses both the versioning and configuration capabilities of Delta-V,
providing support for editing individual source-code files, as well as
project-level SCM support.
Despite their differences, Delta-V and CVS have much to offer each other.
Though Delta-V has been designed for collaborators to work directly on
a Web server, it's technically feasible to use the protocol to create local
replicas, as in CVS. In fact, though it has not been attempted, it appears
to be possible to replace the CVS client/server protocol with Delta-V, and
an existing WebDAV client called sitecopy provides a glimpse of how this
could be done. The sitecopy utility allows a local file-system directory
to be replicated to a remote WebDAV server, so a Web site can be created
locally using file-system based authoring tools, then published remotely
using the WebDAV protocol. In its remote replication support, sitecopy
is similar to the CVS update operation. Though sitecopy and WebDAV don't
support versioning, it's not a far stretch to imagine adding bidirectional
synchronization, conflict flagging, and versioning operations to sitecopy,
thus creating a system that has many of the capabilities of CVS. But why
recreate the CVS user interface? It's far better to integrate the Delta-V
protocol into CVS, retaining the benefits of CVS without requiring developers
to learn a new system. Since Delta-V can map to multiple back-end repositories,
Delta-V would allow the CVS style of work to be used against multiple repositories,
not just with CVS.
The Delta-V protocol opens up several intriguing possibilities for building
software systems. These possibilities vary based on where the source code,
compiler, and object files are located -- on the remote Delta-V server
or on the local machine. If they're all on the local machine, then the
build process is very CVS-like, with source code replicated to the local
machine before the compiler begins operation, yielding object files that
reside locally. But if the source code, compiler, and object files are
held remotely, a client would initiate a build by sending a build request
to a remote compile server, giving the URL of a makefile and a workspace;
the compile server would then store the object files in the same
version-controlled URL hierarchy as
the source code. In this scheme, a different compile server could compile
each platform variant. While the compiler wouldn't typically be placed
on the same machine as the Delta-V server -- so compiles don't adversely
affect server performance -- it would be reasonable to place the compile
server on the same local storage area network as the Delta-V server. Many
interesting configurations are possible for build management using Delta-V,
undoubtedly an area where implementations will innovate on different strategies.
With a proven track record based on successful use on a wide range of
Open Source projects, CVS is a low-cost, high-value system available today.
Looking to the future, the Delta-V protocol melds versioning and SCM with
the Web, adding powerful team collaborative work facilities, with the potential
for a value-adding integration with CVS. Whether you're looking at the
state of things today, or the promise of the future, the implication of
these two technologies is clear: It's easier than ever before to assemble
a virtual team for remote collaborative project work.
Related Resources
CVS
CVS Version Control for Web Site Projects
durak.org/cvsWebsites
Cyclic Software's CVS Information page
www.cyclic.com/cvs/info.html
WebDAV
Digital Creations Zope
www.zope.org
Glyphica PortalWare
www.glyphica.com
sitecopy
www.lyra.org/sitecopy
WebDAV Resources and Apache mod_dav
www.Webdav.org
Xythos Storage Server
www.xythos.com
DataChannel RIO
www.datachannel.com
IBM DAV4J
www.alphaworks.ibm.com/tech/DAV4J
IETF WebDAV Working Group
www.ics.uci.edu/pub/ietf/Webdav
Intraspect Knowledge Server
www.intraspect.com
Microsoft IIS
www.microsoft.com/ntserver/Web/default.asp
WebSite Director lite
www.cyberteams.com/products/wsdlite/wsdlite-overview.html
Delta-V
IETF Delta-V Working Group
www.ics.uci.edu/pub/ietf/deltav
Other distributed CM tools
ClearCase MultiSite
www.rational.com/products/cc_multisite/index.jtmpl
Configuration Management Yellow Pages
www.cs.colorado.edu/~andre/configuration_management.html
Continuus/DCM distributed change management
www.cs.colorado.edu/serl/cm/dvs.html
Merant PVCS Replicator
www.merant.com/pvcs/products/replicator/index.asp
About the author
Jim Whitehead is the Chair of the IETF WebDAV Working Group, and an active participant in the Delta-V
Working Group. He is also a Ph.D. student in the Department of Information
and Computer Science at the University of California, Irvine. His professional
experience includes a position at Raytheon, where he designed firmware
in C and Ada for the German civilian air traffic control system (DERD)
and for a prototype Microwave Airplane Landing System.