Separating the pieces from the whole
At its most basic level, a software "module" consists of an interface and one or more implementations of that interface. The interface defines the module’s relationship with the outside world.
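As a minimal sketch of this idea in Java, the example below separates a contract from two interchangeable implementations. All names (Greeter, PlainGreeter, LoudGreeter, GreeterDemo) are hypothetical and chosen only for illustration; the point is that the client code depends on the interface alone.

```java
import java.util.Locale;

// Hypothetical contract: the only thing clients are meant to depend on.
interface Greeter {
    String greet(String name);
}

// One implementation of the contract.
final class PlainGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// A second implementation that can be swapped in without changing the client.
final class LoudGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return ("Hello, " + name + "!").toUpperCase(Locale.ROOT);
    }
}

final class GreeterDemo {
    // The client is written against the interface, not an implementation.
    static String welcome(Greeter greeter, String name) {
        return greeter.greet(name);
    }

    public static void main(String[] args) {
        System.out.println(welcome(new PlainGreeter(), "Ada")); // Hello, Ada
        System.out.println(welcome(new LoudGreeter(), "Ada"));  // HELLO, ADA!
    }
}
```

Because welcome() accepts any Greeter, either implementation can be supplied without touching the client.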
Outside of software, modularization is very common. One great example is shipping. The introduction of the intermodal shipping container in the 1950s had a significant impact on the cost of shipping goods around the world. The container is a standard size with a standard design that is stacked in a standard way. This meant that ships, trains, and lorries could be designed to accommodate a standard-sized container without caring what went inside, dealing primarily with just the external interface of the container. This significantly cut the time involved in loading and unloading ships, enabling them to be used more efficiently. What went inside the container and how the contents were arranged was the responsibility of the people who filled it.
Within IT there are also excellent examples of hardware modularity. To upgrade the graphics card in your computer, you consider your motherboard’s interface (for example, PCI Express) and your monitor’s input (VGA or DVI), and then choose a card that supports both of those interfaces. The monitor and motherboard place no constraints on how the card is built, such as which chipset it uses; only the interfaces it connects to are important. A graphics card manufacturer can then safely design how the card works, or change the way it works, and concentrate its efforts on building a high-quality graphics card without worrying about what lies on the other side of the interface.
In software, though, things are not often designed in that way, mainly because good modularity is difficult to achieve.
Let’s start by laying out some of the key considerations for good modularity. These characteristics are important because they help make code more independent, flexible, and resilient:
- Contract: What is the contract between this component and its clients? How do they access it, and what behaviour can they expect? A contract that is not explicit gets broken: it is common in software for developers to write code that depends on some undocumented behaviour of a function, such as expecting an undocumented side effect to occur.
- Visibility: Any module is likely to have a large number of useful internal classes, methods, and utilities that exist to provide the module interface (or contract) but that the module does not consider to be external. Often when using a vendor-provided module, it can be difficult to tell what is internal and what is external. The result is that if a client uses an internal class, the first it learns of the problem is a compile failure, or a run time failure because a method signature changed, when the module is updated.
- Co-existence: Any system can consist of a set of modules where each module has dependencies on other modules. These modules need to be able to run together without causing problems for each other. As a simple example of this, the JDK contains two List classes; the first is part of AWT (Abstract Window Toolkit) and the second is part of the Java Collections API. These do not interfere with each other because they are in different packages. Although packages do go to some lengths to permit co-existence, it is still possible for an occasional problem to occur, the biggest one being versioning of classes.
- Replaceability: Having defined a contract or interface, you need to be able to replace the module without breaking the system. Think back to the earlier graphics card example: it is possible for you to replace (or upgrade) your graphics card without having to worry about breaking your entire system.
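The co-existence example above can be sketched in a few lines of Java. The class name ListCoexistence is hypothetical, and the sketch assumes a standard JDK that includes the AWT classes. It only refers to the two List types by their fully qualified names; it does not create any AWT components.

```java
import java.util.ArrayList;

final class ListCoexistence {
    // Returns the fully qualified names of the two JDK types
    // that share the simple name "List".
    static String[] listTypeNames() {
        return new String[] {
            java.awt.List.class.getName(),
            java.util.List.class.getName()
        };
    }

    public static void main(String[] args) {
        // Fully qualifying the type removes any ambiguity between the two.
        java.util.List<String> cargo = new ArrayList<>();
        cargo.add("strawberries");
        cargo.add("cars");
        System.out.println(cargo.size() + " items in a java.util.List");

        for (String name : listTypeNames()) {
            System.out.println(name);
        }
    }
}
```

The package name is what keeps the two types apart. Nothing in this scheme distinguishes two versions of the same class in the same package, which is the versioning problem mentioned above.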
Applying modularity considerations
To validate that the modularity considerations above are reasonable, let’s apply them to the shipping container example and see how they hold up:
- Contract: A container is a standard size and has feet that can slot into a hole on the roof of another container. This enables containers to be stacked one on top of another. Ships, lorries, cranes, and trains built to accommodate these standards can accept containers without knowing who built the container or what is inside.
- Visibility: A crane, ship, or lorry built to accept a container doesn’t care what is inside each one, and the contents of each container are irrelevant to all the others. If you have a container full of strawberries you don’t need to worry about putting a container full of cars on top (although perhaps a bad idea for other reasons, this isn’t really a modularity concern).
- Co-existence: One container doesn’t need to know the characteristics of other containers. Whether they are painted differently or built by different manufacturers doesn’t matter.
- Replaceability: One container can be easily swapped out and replaced without affecting other containers or its carrier.
As you can see, shipping containers meet these modularity considerations quite well.
If you apply these considerations to Java, you can quickly see that Java by itself does not provide the best environment for modularity:
- Contract: While Javadoc provides a great way to document the API, the default just describes the classes, interfaces, and methods. To get the real benefit of the contract, however, you need a developer to write good Javadoc comments. All too often, APIs end up with insufficient or incorrect documentation (if any), which doesn’t compare well with autocomplete features in modern IDEs. To be fair, this isn’t a Java problem; it is the responsibility of the project team to maintain good development practices.
- Visibility: Java has two main visibility schemes. The first is the access modifier, which marks fields, methods, and classes as public, protected, private, or default (package-private). This modifier is enforced strongly by the compiler and weakly at run time; that is, the runtime will expose the internals via reflection if asked. The second is that anything on the classpath is accessible, meaning the classes that make up a module’s internals are visible to the caller.
- Co-existence: Looking again at the classpath, you cannot have two modules rely on a common library when they require different versions of that library that are not binary compatible. You would need to upgrade everything in step, and while this might be easy for a simple application, it could involve some testing and QA challenges for a complex application.
- Replaceability: There are two aspects of replaceability: cold (JVM restart allowed) and hot (no JVM restart). Cold replaceability is achievable, but you need to know which JAR files on the classpath must be removed and replaced. This might sound simple, but it can be quite complex to get right. Java does not provide hot replaceability.
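The weak run time enforcement described under Visibility can be illustrated with a short sketch. The Account and ReflectionDemo classes are hypothetical, and the sketch assumes both are loaded from the classpath, as in a traditional Java application. It uses the standard java.lang.reflect API to read and overwrite a private field that the compiler would refuse to expose directly.

```java
import java.lang.reflect.Field;

// Hypothetical class whose balance field is an internal detail.
final class Account {
    private int balance = 100;

    public int balance() {
        return balance;
    }
}

final class ReflectionDemo {
    // Looks up the private field and asks the runtime to skip access checks.
    private static Field balanceField() {
        try {
            Field field = Account.class.getDeclaredField("balance");
            field.setAccessible(true);
            return field;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the private field from outside the class.
    static int readPrivateBalance(Account account) {
        try {
            return balanceField().getInt(account);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Overwrites the private field from outside the class.
    static void overwritePrivateBalance(Account account, int value) {
        try {
            balanceField().setInt(account, value);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account account = new Account();
        System.out.println(readPrivateBalance(account)); // 100
        overwritePrivateBalance(account, -5);
        System.out.println(account.balance());           // -5
    }
}
```

Writing account.balance = -5 directly would not compile, yet the same change succeeds at run time through reflection.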
Although Java does not satisfy these requirements on its own, it does provide tools that enable modularity to be built on top of it. Anyone using a Java EE application server will know that these modularity considerations can all be achieved within Java: each EAR file is isolated from the others, and you can have multiple EARs running with different levels of code. But the Java EE view is that each of these EARs is an independent application -- not a module -- so while it displays some modularity behaviours, it does so at a coarse-grained level.
So, given that Java does not in itself provide a good modularity solution, what options do you have? There are two complementary systems that are commonly used: dependency management and module systems.
The first step to adopting modularity is usually to make use of a build time dependency management system. One of the best known of these is Maven, in which you define your module with a name and version, and define the content by placing Java source in it. Other modules can then depend on your module by name and version. Maven also makes it easy to define modules that aggregate other modules. Aggregation enables you to have an API module and an implementation module; modules that depend on yours would build against the API module, and run against the aggregate of API and implementation.
Dependency management systems like Maven and others define the dependencies between modules. This often provides enough modularity for many people, but compare these provisions to the modularity considerations to see how much value is really provided:
- Contract: A dependency management system defines the identity of the module as a part of the contract. This is an artificial construct, however, and it places significant limitations on what the provider can do. For example, if you refactor a module into two, you will break clients that depend on it being one; they will get part of the required API, but not all of it. The code itself does not change, but the module metadata needs to. There are ways to work around this -- you can produce aggregate modules -- but these are workarounds at best, and they expose more than might be necessary. Going back to the graphics card analogy, this would mean that the graphics card could see not just PCI Express, which it needs, but also the motherboard interfaces to the CPU, printer, RAM, and so on, which it does not.
- Visibility: Using dependency management, visibility is addressed at build time, but not at run time. This might be enough, but getting access to the internals is simply a matter of adding a dependency on the implementation module. The control over a module’s visibility lies with the user, not with the provider; there is no run time enforcement because normal Java visibility comes into effect at run time.
- Co-existence: This is not supported at run time -- normal Java classloading is used there -- but you can have two modules building against different versions of a dependency with no problem.
- Replaceability: This is limited to changing the dependency to depend on something different. However, this generally only works when going through a rebuild.
While dependency management starts down the road toward a good modularity solution, it only goes part of the way.
Among the several attempts to design a module system for Java, the best known is OSGi. OSGi defines a module (called a bundle) that declares the packages it needs and the packages it provides. Built upon this are a module lifecycle and a service model. The service model uses a service registry where objects can be advertised using the interface they provide. The service registry is dynamic, so services can come and go. Users of services can be notified when a service is removed and switch to using a different service.
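To make the idea of a dynamic service registry concrete, here is a toy sketch in plain Java. It is not the OSGi API: ToyServiceRegistry, Translator, and RegistryDemo are hypothetical names, and a real OSGi framework adds lifecycle management, service properties, and change notifications that are omitted here. The sketch only shows services being advertised under an interface, withdrawn, and a client falling back to another provider.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical service interface under which providers are advertised.
interface Translator {
    String translate(String word);
}

// Toy registry for illustration only; this is NOT the OSGi service registry.
final class ToyServiceRegistry {
    private final Map<Class<?>, List<Object>> services = new HashMap<>();

    // Advertises a service object under the interface it provides.
    synchronized <T> void register(Class<T> type, T service) {
        services.computeIfAbsent(type, key -> new ArrayList<>()).add(service);
    }

    // Withdraws a previously registered service.
    synchronized <T> void unregister(Class<T> type, T service) {
        List<Object> providers = services.get(type);
        if (providers != null) {
            providers.remove(service);
        }
    }

    // Returns the oldest remaining provider of the interface, if any.
    synchronized <T> Optional<T> lookup(Class<T> type) {
        List<Object> providers = services.get(type);
        if (providers == null || providers.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(type.cast(providers.get(0)));
    }
}

final class RegistryDemo {
    // Registers two providers, removes the first, and looks up before and after.
    static String demo() {
        ToyServiceRegistry registry = new ToyServiceRegistry();
        Translator first = word -> "fr:" + word;
        Translator second = word -> "de:" + word;
        registry.register(Translator.class, first);
        registry.register(Translator.class, second);

        String before = registry.lookup(Translator.class)
                .map(t -> t.translate("hello")).orElse("none");
        registry.unregister(Translator.class, first);
        String after = registry.lookup(Translator.class)
                .map(t -> t.translate("hello")).orElse("none");
        return before + " -> " + after;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // fr:hello -> de:hello
    }
}
```

The client never names a concrete provider. It asks for the interface each time, so it picks up the second provider once the first has gone away.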
So how does OSGi compare with respect to the modularity considerations?
- Contract: OSGi provides a good match to the contract. A module declares the capabilities it provides to client modules and the capabilities it requires from other modules in terms of packages. You do not declare dependencies on the modules, but on the packages that you use. The modularity system can then match you with what you need irrespective of which module provides it. The modularity system will not expose you to things you have not declared as a requirement.
- Visibility: OSGi has two ways of ensuring visibility. First, modules only see packages on which they declare a dependency. Second, a module declares which packages it exposes, so packages that are not exposed are private. This means it is very clear which packages are private and OSGi enforces this.
- Co-existence: This is supported through package versioning. Each module can define the versions of the packages it provides and requires. This enables two modules that use different versions of a package to exist in the same system.
- Replaceability: OSGi permits a module to be shut down and replaced at run time. The effect this has on the system varies depending on the nature of the change and how the application is written. Some changes will cause significant perturbation and others will be minor. Updates that change an application’s view of the classes it has are more disruptive than replacing a service registered in the service registry.
As you can see, OSGi provides a good fit for the modularity considerations.
After reviewing the considerations that are critical for good Java modularity, it is clear that OSGi provides perhaps the best fit. The first wave of modularization in Java began with large software products like IDEs, followed by application servers. The considerations presented in this article were among the reasons that OSGi was selected as the mechanism to modularize IBM® WebSphere® Application Server. In addition, several large Java projects, such as Eclipse, GlassFish, Apache Geronimo 3, and Apache ServiceMix, are all based on OSGi.
The next wave of adoption will be large applications that could benefit from the same modularity capabilities as the application servers that run them. It was from this realization that the OSGi Enterprise Expert Group was formed to ensure that enterprise Java applications could leverage the benefits of OSGi. The OSGi Alliance released the first version of the OSGi Service Platform Enterprise Specification (version 4.2) in March 2010, and IBM released the WebSphere Application Server Feature Pack for OSGi Applications and JPA 2.0 later that same year.