Bobby Woolf: WebSphere SOA and JEE in Practice
What's the difference between interoperability and integration?
I found this question in "Outstanding questions regarding BPEL and ESB" on the Richard Brown's Gendal World blog. Richard Brown is an ISSW coworker of mine in the UK and, from what I hear, quite smart. (Then again, he says he reads my blog. But maybe he's still smart anyway!) His blog is really good, BTW. Richard gets the question from James McGovern in a post with the same title, "Outstanding questions regarding BPEL and ESB."
So, what's the difference? Wikipedia says "Interoperability: the capability of different programs to exchange data via a common set of business procedures, and to read and write the same file formats and use the same protocols" and "Integration allows data from one device or software to be read or manipulated by another, resulting in ease of use." Yuck, those aren't much help.
To me, interoperability means that two (or more) systems work together unchanged even though they weren't necessarily designed to work together. Integration means that you've written some custom code to connect two (or more) systems together. So integrating two systems which are already interoperable is trivial; you just configure them to know about each other. Integrating non-interoperable systems takes more work.
The beauty of interoperability is that two systems developed completely independently can still work together. Magic? No, standards (or at least specifications, open or otherwise); see Open Standards in Everyday Life. Consider a Web services consumer that wants to invoke a particular WSDL, and a provider that implements the same WSDL; they'll work together, even if they were implemented independently. Why? Because they agree on the same WSDL (which may have come from a third party) and a protocol (such as SOAP over HTTP) discovered in the binding. How does the consumer discover the provider? Some registry, perhaps one that implements UDDI (which sucks, BTW). So SOAP, HTTP, WSDL, UDDI--all that good WS-I stuff--make Web services interoperable.
Another example I like is the "X/Open Distributed Transaction Processing (DTP) model" (aka the XA spec); see "Configuring and using XA distributed transactions in WebSphere Studio." With it, a transaction manager by one vendor can use resource managers by other vendors. Even though they weren't all written for each other, they still work together because they follow the same spec. They're interoperable.
Now consider two systems that weren't designed to be interoperable, or perhaps interoperable but with different specs. This requires integration. The integration code--could be Java, Message Broker, etc.; I co-authored a whole book on this--takes the interface one system expects and converts it to the one the other system provides. This is why WPS has stuff like Interface Maps and Business Object Maps.
So, you want interoperable systems; integrating them is simple. Otherwise, you have to integrate them yourself.
I'm documenting tricks in RAD 6. In Part 1, I explained how Activation Spec is replacing Listener Port for configuring MDBs, and gave some J2EE-spec-based guidance on which approach to use when. Here is some additional advice that is more specific to WAS 6.
In the WAS 6 docs, "Creating a new listener port" recommends that you should upgrade your EJB 2.0 MDBs that use Listener Ports to EJB 2.1 MDBs that use JCA adapters (and therefore Activation Specs). It strikes me that this advice is a little bit empty because your EJB 2.0 MDBs are using a JMS provider, so you can't update to Activation Specs until your messaging system vendor updates their JMS implementation to use JCA. Nevertheless, the advice still holds that if you can update your MDBs to 2.1 and Activation Specs, you should do so and quit configuring your WAS servers with Listener Ports.
Also in the WAS 6 docs, "Administering support for message-driven beans" explains what type of configuration to use with the supported JMS provider types:
It goes on to say that Activation Specs must be used with any MDBs that use JCA adapters. So, the JMS API implementation for (V6) Default messaging is apparently implemented using JCA and therefore requires Activation Specs. The JMS impls for the other providers apparently do not use JCA and therefore require Listener Ports. If and when those other providers are upgraded to JCA, then you'll be able to use Activation Specs.
Which brings to mind another question:
If a JMS provider is implemented using JCA, then you can configure the MDB connection as JCA using an Activation Spec, or as JMS using a Listener Port. Which should you use?
Just for fun, I implemented an EJB 2.1 MDB in RAD 6 to use (V6) Default messaging, but instead of using an Activation Spec, I configured its connection using a Listener Port (which specified a connection factory and destination that were configured in the default messaging provider). When I tried to run this configuration, the server started with no errors. But when I deployed the app and the server tried to start the EJB jar, I got this
WMSG0063E: Unable to start message-driven bean (MDB) MyExampleMDB against listener port MyExampleQueuePort. It is not valid to specify a default messaging Java Message Service (JMS) resource for a listener port; the MDB must be redeployed against a default messaging JMS activation specification.
So when configuring an MDB that listens for messages from the default messaging provider, there's nothing to stop you from using a Listener Port instead of an Activation Spec. But WAS will refuse to run the app. So I guess that's how you'll know when you need to convert your Listener Ports to Activation Specs.
This presents a bit of a dilemma, though. When you're developing an MDB, you're not supposed to know where the messages/events are really coming from. The configuration of the Activation Spec or Listener Port effectively hides this from you and makes it part of the deployment configuration. You just have to know that there'll be a resource in the deployment server with the expected resource name. But now you also have to know whether or not the source will be accessed through a JCA adapter, and therefore whether to configure it with a Listener Port or an Activation Spec. This makes your code, or at least its configuration, a bit less flexible.
For example, if you design your MDB to use WMQ, then you'll need to specify a Listener Port that the deployer will provide to map your MDB to its WMQ queue. But then if the deployer decides to use WAS 6's Default messaging instead, now he'll provide an Activation Spec to map your MDB to its queue. But your MDB isn't configured to use an Activation Spec, it's configured to use a Listener Port. The deployer will need to go modify the
Which is the simpler approach for implementing web services, REST or SOAP/WSDL? I've been thinking about how they compare.
Supporters of the REST approach claim that it's simpler than SOAP. (By SOAP, we really mean SOAP and WSDL, since SOAP is just a data format, not a service.) Amazon Web Services implements both, and there are claims that 85% of the AWS traffic is REST (therefore only 15% SOAP). I have no idea if these numbers are accurate, or how they change over time, but let's suppose they're accurate. This is the main REST drumbeat, that REST is simpler than SOAP.
So is REST simpler? Hmmm, from who's perspective?
REST leads to lots of little URLs--one per remotely-accessible object, plus URLs for listing objects and creating new instances. For example,
Thanks to all the URLs in REST, each URL's behavior is pretty darn simple. There are only four operations: GET, PUT, POST, and DELETE (although it appears that POST can be overloaded to do lots of things depending on the request document; see What is REST?). So assuming the operations' behavior is truly as intuitively obvious as it's supposed to be, once you've got the URL, invoking the behavior should be simple. A SOAP/WSDL service has fewer URLs, basicly one per port type, but lots of operations, as listed in the WSDL.
SOAP and REST seem to agree on XML as the data format of choice. But a SOAP/WSDL format requires a particular XML schema for the SOAP body (a schema for the request and another for the reply). A REST service, in theory, can support multiple formats; the client and provider negotiate which format to use for a particular invocation. Will these negotiations really make REST simpler? Or will a typical REST service maintain simplicity by only supporting a single format?
These are points upon which learned men can disagree. Do you prefer lots of URLs and a small, constant set of fine-grained operations (REST)? Or fewer URLs, each hosting a larger, application/service-specific set of potentially course-grained operations (SOAP/WSDL)? Do you want to support one set of XML data formats (SOAP/WSDL) or potentially negotiate the data format every time (REST)? Which do you think is simpler?
Some other thoughts along these lines:
What is a composite service?
There doesn't seem to be a whole lot of agreement (yet) in the industry, or perhaps even within IBM, on what this term means. But here's my take, at least.
In the tradition of the composite pattern, a composite service is a service whose implementation calls other services. This is as opposed to an atomic service, whose implementation is self-contained and does not invoke any other services.
A composite service acts as both a service provider of the (composite) service and as a service consumer of its child services. The composite can be considered to be aggregating together the child services into a bigger service. A composite service is one kind of what I call a service coordinator, a coordinator that also itself is a service.
In "Service-oriented modeling and architecture," IBM's Ali Arsanjani shows composite and atomic services in the Services layer of an SOA:
The layers of a SOA
You cannot tell from a service's API whether it's composite. In fact, two providers may implement the same service, one in a composite fashion and the other in an atomic fashion. A provider implemented one way might be reimplemented the other way; this change makes no difference to the consumers because the interface is still the same.
Service component architecture (SCA) supports composite service components. Recall that an SCA has a service interface and service references. Those references give the implementation the ability to call the other services referenced. They also serve as a declaration that this component needs these services, much like a Java class importing other Java classes. An SCA with no references is atomic; one with references is composite.
Service Component Architecture overview
A business process is a kind of composite service. A business process can be invoked as a service, a request to perform its functionality. The process contains a sequence of activities implemented as services, so therefore is a service that calls other services. That's a composite service.
In my mind, a business process is a special case of composite service. Usually, a service executes what might be described as synchronously. (The execution can be invoked synchronously or asynchronously.) A service has a return value (or exception); it is returned at the end of execution. The consumer usually waits (but not necessarily with a blocking thread) while the service executes, and does not continue until the service completes and returns its result.
What makes business process a special case is that a caller usually doesn't wait while the process executes. A caller invoking a business process (synchronously or asynchronously) usually waits only while the process starts, to confirm that it starts successfully. But once the process starts, it runs on separate thread(s) "in the background" while the caller proceeds with other work.
Because of this difference, it seems to me, although a business process is just another service and another way to implement a service, a caller usually knows whether or not the service it's invoking is a business process. If it's not, the caller cannot tell whether the service is composite or atomic, but if it waits for a return value, then the service is most likely not a business process.
The latest version of WebSphere Application Server, WAS 6, has a new feature called "Service Integration Bus" (WAS benefits talks about "a new pure-Java JMS engine"). The SIB is implemented as a group of messaging engines running in application servers (usually one-to-one engine-to-server) in a cell. As a service in WAS 6, SIB is a complete JMS v1.1 provider implementation. (Not just the API; a working messaging system.) The JMS provider is a pure Java implementation that runs completely within the application server's JVM process. (For persistant messaging, WAS also requires a JDBC database such as DB2.) Thus JMS messaging is built into WAS and easily available to any J2EE application deployed in WAS.
Why Service Integration Bus? IBM's software customers over the past few years have divided into two overlapping but still distinct markets with different needs:
For a customer that finds itself in both groups--you have lots of WAS apps communicating, but you also need to communicate with other non-WAS apps--you will still need full WebSphere MQ. Embedded Messaging and Service Integration Bus only support WAS apps, so if any of the apps are not WAS apps, you need full WMQ. WAS 6 has a feature called MQ Link for connecting SIB and WMQ.
So here's the basic breakdown of WebSphere JMS options:
Feb 23, 2005
Building an Enterprise Service Bus with WebSphere Application Server V6 -- Part 1 -- The first in a series of articles on the SIB in WAS 6. Also see my blog posting, IBM Info on ESBs.
Some posts by my fellow blogger Bob Sutor have made me think more about what an ESB (Enterprise Service Bus) is. A specific comment that caught my attention, in ESB*?, is "if you use WebSphere MQ and other WebSphere brokers or integration servers, you have an ESB today."
Well, yes and no. I think there are two closely related but distinct issues:
If you want an ESB today, it's a build-your-own affair. In terms of IBM products, you use:
(WBI-MB runs on top of WMQ.) This provides the basic means for applications to invoke each others' services. Each service is represented by a pair of Request-Reply queues. Any application that wants to invoke a particular service places an appropriate request message on the request queue for that service, and then listens for the response message on the service's reply queue. Any application that provides a particular service listens for requests on the service's request queue, performs the service, and then sends the response on the service's reply queue.
The thing is, this is really a Message Bus, not an ESB. Enterprises with sophisticated use of messaging have been doing this for years, and it's still where we are today for the most part. So today, I don't think it's fair to say that we can really do an ESB, but we can do a Message Bus, and that's a pretty good start.
So, what's the difference between a Message Bus and an ESB? I think there are two key differences in the way clients (applications that invoke services) will interact with an ESB that is better than what they can do with a Message Bus today:
An ESB can do lots of other things, and for that matter a Message Bus can too. (A lot of this is broadly referred to as "mediation.") But how will the client experience differ? And how will you code your clients to work differently? An ESB's services are self-describing and discoverable; a Message Bus' are not.
Why is this not practical yet? Standards. Today WSDL describes the format of the SOAP request and response messages and the HTTP URL for invoking the service. WSDL need to be expanded with a standard way to specify a pair of request-reply queues instead of an HTTP URL. Given the URL/address for an ESB, there needs to be a simple and standard way to access its directory service, which must implement a standard for querying an ESB for its services and how to invoke them.
These standards are coming, but we don't have them yet. Which is why today, while we strive for an Enterprise Service Bus, we're still at the stage where what we can build is a Message Bus. But don't let that hold you up; a Message Bus is still a very good place to start, and you can start doing that today. (For more help, see "Understand and implement the message bus pattern" by James Snell.)
I kinda thought everybody knew this already, but it's come up a couple of times in the past few weeks, so maybe this is a question that still needs answering: What's the best way to serialize an object, binary or XML?
First, what's the difference?
Serialization usually means binary serialization, a Java facility for writing an object to a byte stream. The object's class must implement
Java objects can also be serialized to XML, a text (character) format. O/X mapping libraries like JDOM (JSR 102), dom4j, Transformation API For XML (TrAX), JSR 173: Streaming API for XML (StAX), etc. can make this easier. JAXB was supposed to facilitate declarative tools for mapping XML schemas to Java object structures, but it doesn't ever seem to have gotten off the ground. (See JAX-WS Improves JAX-RPC with Better O/X Mapping.) What has stuck better is SAAJ, an object model for SOAP messages, whose implementation usually contains something JAXB-like. (Keep in mind that SOAP is just one kind of XML, a specific XML schema.)
In theory, XML serialization can be built right into Java the way binary serialization is. This is why binary serialization is implemented as a specific set of classes separate from the rest of Java; so that you can substitute another set of classes that implement another serialization scheme. Instead of
So which should you use, binary or XML serialization?
Binary serialization is much more efficient than XML and easier to get working. Writing out an object in binary form is a little faster than text/XML form. Binary form is much more compact and so saves memory and bandwidth. And binary form is much faster and easier to parse than XML.
So why would anyone use XML instead of binary serialization?
Flexibility. Binary serialization requires the serializer and the deserializer be Java (but see the comments for details), and both Java program(s) must have the classes in their classpaths for the objects being serialized. There are also class version issues. XML is not Java-specific and so works with any language. Its text data can more easily be displayed and read by people.
I like to say this: If you find that your apps are too efficient and don't burden your hardware enough, use more XML. There is no code so inefficient that it can't be made even more inefficient using XML. As a keynote speaker said at OOPSLA a couple of years ago (I think it was Alfred Spector): "We just never thought that the programming community would be so accepting of a format as inefficient as XML."
There are ways to make XML more efficient. Don't include whitespace. Use short element names and shallow namespace trees with short names. There's even a move afoot for "binary XML" (practically an oxymoron); see Better Web Services Performance.
So bottom line, if all your apps writing and reading your data are implemented in Java, use binary serialization. Compared to XML, it's easier to get working and has better performance. If and when you need interoperability with other languages or better support for people reading the serialized data, then go through the extra effort to implement the code for XML marshalling and demarshalling.
A few references for more detail:
The IBM Sequoia supercomputer will be faster than the fastest 500 supercomputers, combined.
Thanks to my friend Brian for pointing this out.
How should you set up WebSphere Process Server in your production environment?
I've talked about WebSphere Process Server (WPS) and the latest version 6.2. The simplest way to install a WPS runtime is a single server on a single node in a single cell, which is perfectly adequate for unit testing (and in fact is the topology for the WPS test server in WebSphere Integration Developer (WID)). However, this is not the best set-up for production. In production, you usually want a basic amount of high availability (HA) so that your users still get service even when part of your infrastructure goes down.
We have a recommended "golden topology" for WPS that, among other things, provides a pretty good level of HA. It's documented in the InfoCenter in "Tutorial: Building clustered topologies in WebSphere Process Server" and is summarized in this picture:
This topology is also explained in the article "Building clustered topologies in WebSphere Process Server V6.1" by my ISSW colleague Michele Chilanti. (In fact, I think Michele's article is the inspiration for the InfoCenter tutorial.)
Let's take a quick look at what's in the golden topology. Two details to observe are that the cell contains two nodes and three clusters. Why is that?
So when deciding how to install WPS in production, the golden topology is at least a good start.
So what's this cloud computing thing all about? Sounds like SOA and ESBs to me.
David Chappell, frequent industry commentator and author of books like Understanding .NET (not to be confused with David Chappell, Sonic MQ and Oracle guy and author of Enterprise Service Bus, nor with Dave Chappelle, the guy with the self-titled TV show), has a new and rather interesting paper, "A Short Introduction to Cloud Platforms." There's a discussion of it, David Chappell: Introduction To Cloud Computing, on InfoQ.
I personally get cloud computing confused with grid computing. According to Wikipedia (chronicler of wikiality), grid computing (part of the onetime future of computing) is a cluster of resources that act together like one big resource, such that you don't care where in the grid your functionality gets performed. This sounds like, for example, a J2EE application deployed to a WAS ND cluster; the user doesn't know nor care which cluster member is performing his work. Cloud computing, says Wikipedia, occurs on the Internet (or some other type of network, I suppose) such that you don't even know where it's occurring. When you perform a search using Google, Amazon, Travelocity, etc., where is your search executing? Silicon Valley, New York City, or Bangalore--it doesn't matter. In fact, users in NYC are probably hitting different servers than those in Bangalore; those servers are running in a cloud. The data centers in Silicon Valley, New York City, and Bangalore should each be running a grid.
"What cloud computing really means" (InfoWorld) (part of Inside the emerging world of cloud computing) doesn't really answer its own question. Instead, it covers all the bases, saying cloud computing can mean: Software as a service (SaaS), utility computing, Web services in the cloud, platform as a service, managed service providers (MSPs), service commerce platforms, and Internet integration. Gee, clear as mud. (At least they didn't say it's Web 2.0 (which I say is MVC for the Web).)
Likewise, "Guide To Cloud Computing" (Information Week) doesn't really say what it is. But Amazon, Google, Salesforce, etc. are all doing it. An example that a lot of journalists are talking about is Amazon Web Services (AWS), which essentially lets you outsource computing jobs to them. Need some data crunched? Give it to Amazon and they'll get it done. Of course, there's a lot of constraints in how you package up your functionality to be performed, you need to have a lot of flexibility on when it gets done exactly, and you may need to worry about the security (esp. privacy) of your data.
Of course, I should also mention that IBM does cloud computing as well. See:
The Africa press release even has an IBM definition of cloud computing:
Cloud computing enables the delivery of personal and business services from remote, centralized servers (the "cloud") that share computing resources and bandwidth -- to any device, anywhere.
Back to David's paper. He divides an application platform into three parts (see Fig. 2): Foundation, such as the operating system, and I'd include middleware like a J2EE application server; Infrastructure Services, other capabilities and middleware that the app uses for persistence, security, messaging, etc.; and Application Services, which perform business functionality and ideally are wrapped up as SOA business services. The upshot (see Fig. 3) is that cloud computing makes infrastructure and application services available outside the enterprise, in the cloud. Cloud computing also enables the app itself to run in the cloud, so you just deploy your app to the cloud and access it from anywhere (again, like a world-wide WAS ND cluster).
To me, this approach isn't that astonishing; I guess someone just had to give it a name. I (and many others, I think) look at SOA as being an app that works as (what I call) a service coordinator consuming services, namely service providers. The key is that the providers for any given service may be inside the enterprise (what David calls on-premises) or may be outside the enterprise (what David calls the cloud). In fact, a single service may have both internal and external providers, and it seems to me that the cloud should include both, so that the app consuming the service doesn't need to know whether the provider is inside or outside the enterprise (or both). I think an important part of solving this problem, making services available to consumers without having to know where the providers are, is the enterprise service bus. This is one of the main points of my articles "Why do developers need an Enterprise Service Bus?" and "Simplify integration architectures with an Enterprise Service Bus" (the latter with James Snell).
So cloud computing is functionality being performed wherever is convenient, where the client application doesn't know nor care where the functionality actually lives. A great approach to make this happen, and to prepare for more of it in the future than may be practical for you today, is to use SOA and ESBs.