Apache Tuscany is an open-source runtime for SOA applications using SCA and SDO.
The Apache Tuscany project is an open source project which "provides runtime capabilities for applications built using a Service Oriented Architecture (SOA)." It "provides capabilities which follow the Service Component Architecture specification and the Service Data Objects specification, which together define a simpler, business-oriented approach to the creation of applications and solutions which use a SOA." It currently provides support for running service components developed in either Java or C++. The project proposal states this rationale: "Tuscany provides multiple language implementations of the Service Component Architecture (SCA) specifications and related technologies such as SDO."
According to "Apache Tuscany Project to Simplify SOA Development" in eWeek, "Most of the committers to the Tuscany project are from IBM and BEA Systems Inc." The champion and an initial committer is Geir Magnusson Jr. of the Apache Harmony project and the Gluecode product that became WAS CE.
SDO is currently implemented in WAS and WebLogic. In WAS, it's one way for our JavaServer Faces (JSF) screens to access data, and the way messages appear in the mediation framework in the Service Integration Bus. SCA is currently implemented in WPS, along with the Business Objects component that is an extension of SDO. There are currently efforts to make SCA and SDO open standards supported by a variety of vendors. Tuscany is an important step towards making that happen.
Bobby Woolf: WebSphere SOA and JEE in Practice
From archive: March 2006
Maureen Dowd has asked, "Are Men Necessary?" And one of my readers seems to be asking: Are transactions necessary?
In The Message Consumer Rollback Pattern, I implied that a message should usually be consumed from a queue in a transaction. In this case, one advantage is that if the processing of a valid message fails, it can be rolled back onto the queue (instead of onto a standby queue). In the comments, Euxx (Eugene Kuleshov?) argues against this in his own blog posting, Sometimes transactions are not your friends. His main motivation against using transactions seems to be everyone's favorite whipping boy, performance.
The point of using transactions to consume messages is to ensure that messages are not lost, that all of the processing of a message completes successfully before permanently removing the message from the channel. So the questions are: What's the risk of losing messages? And what're the consequences?
Transactions definitely add overhead, as do persistence, security, and a host of other quality-of-service capabilities which help ensure a system functions correctly even in the presence of adverse conditions (such as crashing when a task is half-completed). You don't need security (or locks on your doors) if you know that no one will ever try to do something they're not supposed to, and you don't need transactions if you know that all your tasks, once begun, will always complete successfully. Transactions, like these other QoS capabilities, are insurance against failure; every success pays a small tax which more than pays for itself if and when failure occurs.
As with any insurance, you have to ask yourself: Is it worth the cost? Namely: How likely or often will failure occur, how negative can its impact be, and is the overhead on the cases that turn out to be successful worth it? If you knew which cases were likely to fail, those are the only ones that would need the overhead. (That's why not all Internet traffic is SSL encrypted, only traffic containing private data.) So it's never as simple as saying that transactions (or persistence, security, etc.) add too much overhead and that systems run faster without them, so don't use them. Of course removing overhead improves performance, and you'll look like a genius, right up until a problem occurs which would've made the overhead worth it.
I've been talking about WebSphere Business Process Management (WBPM) and showing some pictures that help explain what's going on. I now have some better pictures publicly available, so let's review those.
In WebSphere Process Server: A Russian Doll, I showed how WPS contains WESB, which contains WAS. Here's a better picture of that:
WebSphere Application Server, ESB, and Process Server
In WebSphere Process Server Components, I showed the component architecture in WPS. That picture changed a little when we released WESB, which WPS is now built on. Here's the updated picture:
WebSphere integration product family
Actually, I disagree with this picture a little bit. I think we should show the Mediation Flows box as its own row above SOA Core and below Supporting Services. Then you could draw a line between Mediation Flows and Supporting Services and say that everything below the line is in WESB, whereas the stuff above the line is what WPS adds. In other words, WESB has Mediation Flows, but doesn't have anything else in the Supporting Services row; I wish the picture showed that better.
Way back in December, in Service Component Architecture, I showed what an SCA looks like and how it can be implemented. Here's an update:
Service Component Architecture overview
I like the way this picture shows both the Interface and the Reference(s). Other versions of this diagram I've seen show Selector and Interface Maps separated from the rest of the Implementation list; this is because those latter two don't really implement anything, they're adapters for connecting to an existing implementation. Too bad this version doesn't show that detail.
So, I hope these updated diagrams help you understand what I've been talking about a little better. Enjoy.
The developerWorks Architecture Zone has a new article, "developerWorks bloggers on Architecture."
The subtitle is "Code is not your friend, and other words of wisdom from developerWorks bloggers." The article compiles blog postings from several of IBM's leading bloggers concerning architectural issues.
For example, Marc Colan discusses ESB compared with point to point. Marc is great at making new technology understandable in simple terms. Here he does a nice job of enumerating the issues that lead you to needing an ESB. Reminds me of my ESB for Developers Article (which includes pictures).
The article also includes one of my favorite blog postings by Don Ferguson, SOA product complexity. It does a nice job of explaining succinctly different kinds of services and how the different products apply.
And (of course) there's one of my postings, Interoperability or integration?, where I try to define the difference between the two. (Originally Interoperability vs. Integration and More on Interoperability vs. Integration.) 'Course, guess this discussion isn't new to you if you've been keeping up with this blog.
So, good stuff. Go check it out.
A significant section of developerWorks which is easy to overlook is the Technical events and webcasts section.
For example, just the sheer number of Webcasts we have is amazing. Here are a few that caught my eye:
There are also several Technical briefings listed. These are half- or full-day mini-conferences presented in selected cities. Some interesting ones are:
So, there seems to be something for everyone. Check it out.
A reader asks, "How does an ESB address business process?"
He was reading one of the articles I've written, "Why do developers need an Enterprise Service Bus?" (Which, BTW, he describes as "exactly the article I was looking for." And no, I didn't have to pay him anything. Although come to think of it, he is buttering me up so that he can ask me a question.)
Bob Sutor addressed this topic briefly in his wishful predictions for 2005:
Some people feel that one of the capabilities of an ESB should be to execute business processes. Others feel that business process is outside of an ESB's duties, but that a business process can and should use an ESB to invoke its activities implemented as services.
So does workflow run inside an ESB or does it use an ESB? I fall into the latter camp, that a workflow uses an ESB, but runs in something separate, a workflow engine.
An area of confusion is that workflow is similar to mediation flow (aka message flow, which runs inside the ESB itself). The latter is an ESB's ability to intercept a message in transit and mess with it by transforming its contents and/or routing it to a new destination.
Message flows are a lot like non-interruptible workflows (aka microflows) in that they have no human interaction activities and tend to run in a single transaction with no persistence. A difference is that a message flow is focused on manipulating a message, whose transmission kicks off the flow; whereas a workflow (interruptible or not) is more of a service that can be invoked whenever needed, via messaging or otherwise. Message flows (and non-interruptible workflows) tend to be stateless; they don't wait for events or timers. For this reason, mediations are much better suited for implementing, say, splitters than aggregators, because the latter require state to store the messages before merging them.
So the generally accepted answer is that ESBs implement mediation flows but not business processes. The latter run outside of an ESB in a process engine, but implement their activities as service consumers that use ESBs to invoke the service providers.
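To make the statefulness distinction concrete, here's a minimal Java sketch of a splitter. The class and message format are hypothetical (not a WebSphere API): the point is that a splitter derives its output entirely from the incoming message, so it can run statelessly inside a mediation flow, whereas an aggregator would have to hold earlier messages somewhere while waiting for the rest to arrive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a stateless splitter suitable for a mediation flow.
// The message format ("orderId:item1,item2,...") is invented for illustration.
public class OrderSplitter {

    // Split a composite order message into one message per line item.
    // Stateless: the output depends only on the one input message,
    // so no storage is needed between invocations.
    public static List<String> split(String orderMessage) {
        String[] parts = orderMessage.split(":", 2);
        String orderId = parts[0];
        List<String> itemMessages = new ArrayList<>();
        for (String item : parts[1].split(",")) {
            itemMessages.add(orderId + ":" + item);
        }
        return itemMessages;
    }
}
```

An aggregator doing the reverse would need to keep the partial set of item messages around (state) until the last one shows up, which is why it fits better in a process engine than in a mediation.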
A phrase IBM is throwing around a fair bit is "information as a service." What does that mean?
Information as a service (IaaS?) is part of our service-oriented architecture (SOA) approach. In our SOA Reference Architecture diagram, this is the Information Services block.
The idea is that in an SOA, data should be accessed through services rather than directly from the databases that store it, so that consumers are insulated from where and how the data is actually stored.
A good example is the way many companies store the info for a single customer in several different databases, because different apps want their own customer attributes and because they all want local access to the customer from the local database they manage. Such spread out, duplicated data makes it a real challenge to create, delete, or update a customer. So one approach is to define a customer management service, with operations like create, delete, and update. Expose that service on an ESB, and you can invoke it from anywhere, it'll run anywhere it's hosted, and it'll do whatever steps are necessary (whether that's two steps or 200) to update the databases. If tomorrow you add another database, modify the service to update that database as well and the service consumers never need to know the difference.
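As a sketch of what such a customer management service might look like (the class and method names are hypothetical, and plain in-memory maps stand in for the separate customer databases), the consumer sees one interface while the implementation fans out to however many stores exist behind it:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "information as a service": consumers call one
// customer service; the service hides how many data stores sit behind it.
public class CustomerService {
    // Two maps stand in for the separate databases different apps maintain.
    private final Map<String, String> billingDb = new HashMap<>();
    private final Map<String, String> shippingDb = new HashMap<>();

    public void createCustomer(String id, String name) {
        // The service does whatever steps are necessary, whether
        // that's two or 200; consumers never see the fan-out.
        billingDb.put(id, name);
        shippingDb.put(id, name);
    }

    public void updateCustomer(String id, String name) {
        billingDb.put(id, name);
        shippingDb.put(id, name);
    }

    public void deleteCustomer(String id) {
        billingDb.remove(id);
        shippingDb.remove(id);
    }

    public String lookup(String id) {
        return billingDb.get(id);
    }
}
```

Adding a third database tomorrow means changing only the method bodies; every service consumer keeps calling the same operations.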
Some sources for more info:
To learn some interesting stuff about intellectual property, check out Lawrence Lessig.
A reader pointed me to his blog, which has a really interesting premise.
The blog is Leadership by Numbers, authored by Jack Dausman. Thanks to Jack for pointing me/us all to it in his comment on Social Network Analysis. I don't know Jack and have only looked at his blog a little bit, but I love the premise:
As kids, we made art with paint-by-number kits. Simply matching the outline numbers with an oil paint gave us the illusion of mastery. Today, I'm in IT management in Washington, DC: consulting, systems administration, development & training. And, I still see a lot of paint-by-number projects which try to be the real thing. IT leadership is about reading the numbers, then going outside the lines and taking risks.
Now that is an excellent analogy for how I feel about heavyweight software methodologies! All too often, people are following all the steps but totally missing the point of producing good software. One colleague recently told me, "Sub-average programmers aren't successful with agile methodologies." I responded, "As compared to what? Sub-average programmers certainly aren't successful with heavyweight methodologies! They're not successful with anything; that's what makes them sub-average." Paint-by-numbers doesn't make you an artist, and methodologies by themselves don't make you good at developing software.
For more thoughts along these lines, check out how complicated use cases have gotten.
I've given my blog a new name. Same old blog, but new and improved name.
So hopefully the new name isn't just slick marketing, but a more accurate description of the blog's evolving mission.
I've talked about Social Network Analysis, finding connections between things, sometimes seemingly unrelated. Well, here's a trail a colleague has found for getting to my blog.
So there we go: From a UCLA law professor to a list of Fortune 500 blogs to IBM's blogs to developerWorks' blogs to mine. Go figure. What would the social networking guys say?!
Alphaworks has a new tool, the IBM Service Component Architecture Explorer Tool (SCA Explorer).
I've talked about service component architecture (SCA) for developing SOA applications. The spec is part of WebSphere Process Server and part of Apache Tuscany.
So, why not take it for a spin?
What happens when messaging systems go bad? Specifically, when a consumer can receive and read a message, but cannot do what it says to do? Here's the answer in a pattern-ish sort of form.
Here's the pattern:
A messaging consumer (either event-driven or polling) receives a message and successfully parses it. But when the consumer tries to process the contents--such as performing an action, storing some data, or reacting to an event--the processing fails; probably because the app doing the processing uses a database and the database is down, inaccessible, overloaded, etc. So the app gets an error that the processing cannot be done.
When a consumer cannot process a valid message, what should it do with the message?
The app could throw the message away, but that'd be bad. It could try to process the message again, but that'll probably fail again. (The definition of insanity is ...) It could put the message on an invalid message channel, but that's misleading because there's nothing wrong with the message; the problem is with the consumer's app that can't successfully process the message.
This is what transactions are for. Hopefully the consumer is a transactional consumer.
The consumer should be transactional and should roll back the transaction, thereby putting the message back on the queue.
This way, the message remains on the queue as if it were never consumed in the first place. It waits on the queue until it's consumed again, and hopefully when that retry happens, the app will succeed in processing the message this time. If the consumer is a message-driven bean (MDB), transactions are part of the EJB component model, so the rollback basically happens for free (because the EJB container does it for you).
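Here's a minimal in-memory sketch of the pattern. It's a plain-Java stand-in, not real JMS (with an MDB, the container performs the rollback for you when `onMessage` throws): a commit removes the message for good, while a rollback leaves it on the queue for a retry.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Simplified stand-in for a transactional message consumer.
// An in-memory deque plays the role of the queue.
public class RollbackConsumer {
    private final Deque<String> queue = new ArrayDeque<>();

    public void send(String msg) { queue.addLast(msg); }
    public int depth() { return queue.size(); }

    // Consume one message "transactionally": remove it only if the
    // processor succeeds; otherwise leave it in place, as if it were
    // never consumed, so a later retry can pick it up again.
    public boolean consume(Predicate<String> processor) {
        String msg = queue.peekFirst();
        if (msg == null) return false;
        if (processor.test(msg)) {
            queue.removeFirst();   // commit: message is gone for good
            return true;
        }
        return false;              // rollback: message stays on the queue
    }
}
```

A failed processing attempt (say, the database is down) leaves the queue depth unchanged; once the resource recovers, the same message is consumed again and succeeds.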
Moral of this pattern: Use transactions, they're your friends. For more info, see "Configuring and using XA distributed transactions in WebSphere Studio," one of the articles I've written (also listed in my author's spotlight).
So how do you tell if something is wrong with a message? Here's a pattern.
A messaging consumer receives a message.
How can a consumer make sure it has received a valid message?
A consumer can only process a message if the message is valid, meaning that the message's format and contents fit the consumer's expectations. Otherwise, the consumer won't be able to process the message and make sense out of it. If the message contents are supposed to conform to a particular XML schema, the contents had better be valid for the schema. If the message is supposed to contain a message/request ID to use as a reply's correlation identifier, the message had better contain that ID.
Part of the value of detecting invalid messages is so that they can be put on an invalid message channel, kind of an error log for invalid messages. This gets the bad messages off the main queues and on to a side queue where error handling code can try to do something with them.
Some consumers can assume that all messages are valid, that only valid messages are put on the queue. But that's often not a safe assumption.
For data validation, the consumer could try to commit the data to the database and see if that fails; but if the commit does fail, then the transaction is ruined. The consumer should be a transactional consumer so that messages are not lost, but if the transaction fails, then it can only be rolled back, after which the message will just be consumed again and fail again. To put the message on the invalid message channel, the transaction must be able to commit successfully.
So invalid messages need to be detected, and the detection must happen without invalidating the transaction.
Use a message validator, which parses the message format and checks the data, making sure that the message is valid.
A message validator can be a separate object--you pass in the message, it answers whether or not the message is valid. Or the validator can be the message processing code which errors out if a problem is encountered--for example, parsing XML data using a validating parser. In any event, the validation must be contained in the consumer's transactional context, so that once the consumer detects an invalid message, it can move the message to an invalid message channel and commit successfully.
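Here's a hypothetical sketch of a validator as a separate object (the class name, message format, and validation rule are invented for illustration): the consumer checks the message inside its transactional context, routes invalid messages to the invalid message channel, and only hands valid ones on to the processing code, so the transaction can commit either way.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the message validator pattern.
// A deque stands in for the invalid message channel.
public class ValidatingConsumer {
    final Deque<String> invalidMessageChannel = new ArrayDeque<>();

    // Validation rule for this example: the message must carry a
    // request ID ("<id>|<body>") so a reply can set its correlation
    // identifier. Real validators might run a validating XML parser.
    static boolean isValid(String msg) {
        int bar = msg.indexOf('|');
        return bar > 0 && msg.length() > bar + 1;
    }

    // Returns the message body if the message is valid; otherwise moves
    // the message to the invalid message channel and returns null.
    // Either way, the consumer's transaction can commit successfully,
    // so a bad message never bounces on the input queue forever.
    String accept(String msg) {
        if (!isValid(msg)) {
            invalidMessageChannel.addLast(msg);
            return null;
        }
        return msg.substring(msg.indexOf('|') + 1);
    }
}
```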
If the message is valid, but processing it fails, then the cause must be a problem with the resources being used to process the message. As long as the message is valid, save the message to retry it again when the resource is fixed by using message consumer rollback to put the message back on the queue.
I've worked with a couple of clients who use standby queues. This is supposed to solve problems, but I think it really creates more.
The problem is the same as described in The Message Consumer Rollback Pattern: A consumer successfully receives a valid message but cannot process it because a resource it needs is down.
We know the message is valid because it passes the message validator.
Options: Throw away the message, retry, put it on an invalid message queue. All bad.
If only there were someplace to queue up these messages that cannot be processed, so that they can be processed later.
Therefore, use a standby queue. When a message cannot be processed, move it to the standby queue.
This seems great. The message is out of the way but not lost. Processing can continue on other messages on the queue.
But wait: If some of the messages fail, won't they all fail? The queue should be a datatype channel, so the messages should all be doing the same sort of thing, thus all requiring the unavailable resource; so why keep reading from it? Let's say it's not a datatype channel, that some of the messages need database A and others need database B. Then you need two standby queues, one for messages that need database A and one for B messages; so that when one database is available again, you'll know which messages to retry. With two standby queues, A and B, you've now created datatype channels, so you should have just done that in the first place.
Anyway, once you get the messages on the standby queue, how do you get them off again? You don't want to do it until the resource is available again, but how do you know when that is? When it is, what do you do with the messages? Move them from the standby queue back to the input queue? How?
Standby queues also make transactions very difficult. When you move a message from one queue to another, you need to do so in a single transaction so that the message cannot be lost. But if the transaction tries to use the resource and fails (which is the whole premise of this (anti)pattern), then the transaction is shot, so it can't be used to commit the message onto the standby channel. So separate transactions need to be used to read the message off the input queue and update the resource, which is a lousy transaction model that can easily duplicate or lose data.
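A tiny simulation shows the failure mode of the split-transaction model (the queue and resource are in-memory stand-ins, and the two "transactions" are simply the two steps): the message is committed off the input queue before the resource update commits, so a failure in between drops the message entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of why separate transactions around a standby queue can lose data.
public class SplitTransactionDemo {

    // Returns {remaining queue depth, processed count} after consuming one
    // message in "transaction 1" and updating the resource in "transaction 2".
    static int[] run(boolean resourceAvailable) {
        Deque<String> inputQueue = new ArrayDeque<>();
        Map<String, String> resource = new HashMap<>();
        inputQueue.add("order-1");

        // Transaction 1: consume the message and commit immediately.
        String msg = inputQueue.removeFirst();

        // Transaction 2: try to update the resource... which may be down.
        if (resourceAvailable) {
            resource.put(msg, "processed");
        }
        // If the update never happened, the message is nonetheless
        // off the queue: it has been lost.
        return new int[] { inputQueue.size(), resource.size() };
    }
}
```

With a single transaction spanning both steps (the rollback pattern), the failure case instead leaves the message safely on the input queue.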
So standby queues are bad. I wouldn't use them.
What should you do instead? Use The Message Consumer Rollback Pattern. If the message is valid, but you can't process it right now, don't bother putting it on a dead letter queue or an invalid message queue or a standby queue, just roll back the transaction and the message will roll back onto the input queue. It's the best way to put things back the way they were and wait until the resource is available again.