An ACID transaction is one that guarantees reliability, even when success is not possible. The ACID properties (from the docs for CICS, the mother of all transaction managers) are:
- Atomic - Enables changes to be grouped and performed as if they're a single operation--all or nothing.
- Consistent - A transaction begins and ends with valid data, even if the data is invalid during the transaction. This means that UIDs are indeed unique. For a relational database, this means that it must maintain referential integrity.
- Isolated - Each transaction executes as if it's the only one; it is independent of any other concurrent transactions. A transaction's intermediate state is not visible to operations outside of the transaction.
- Durable - Changes made by the transaction are persisted and cannot be lost, even if the resource subsequently fails.
A transaction can complete in one of two ways:
- commit - Changes made during the transaction are made permanent and cannot be undone.
- rollback - Changes made during the transaction are undone or discarded and the resource is returned to the state it held when the transaction began.
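The commit and rollback outcomes above can be sketched with any transactional resource. Here is a minimal illustration using Python's built-in sqlite3 module; the table and the overdraft rule are hypothetical, chosen only to show atomicity and consistency together:

```python
import sqlite3

# In-memory database; DML statements join an implicit transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 100)")
conn.commit()

# Atomic transfer: both updates succeed together or neither is applied.
try:
    conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
    conn.execute("UPDATE account SET balance = balance + 150 WHERE id = 2")
    # Consistency rule for this example: no negative balances allowed.
    (overdrawn,) = conn.execute(
        "SELECT COUNT(*) FROM account WHERE balance < 0").fetchone()
    if overdrawn:
        raise ValueError("transfer would overdraw an account")
    conn.commit()      # commit: changes become permanent
except ValueError:
    conn.rollback()    # rollback: both updates are discarded

balances = dict(conn.execute("SELECT id, balance FROM account"))
print(balances)  # {1: 100, 2: 100} -- the failed transfer left no trace
```

The rollback returns the database to exactly the state it held when the transaction began: all or nothing.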
You may also be interested in my discussion: Are Transactions Necessary?
Peter Deutsch asserts the Eight Fallacies of Distributed Computing:
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn't change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
Fallacies of Distributed Computing Explained (PDF) describes them in greater detail.
Most programmers initially learned their craft developing programs that run in a single process (and therefore on a single machine). This may have changed somewhat in the past decade, but we still start out learning to write simple programs, and even today we develop (though not deploy) complex systems in simplified environments.
The jump from single-process programs to distributed computing is difficult, as Deutsch observed. It requires a significant change in mindset. We get used to the idea that every line of code runs very fast and is immediately followed by execution of the next line of code. We know each line of code takes a little time to run and programs can fail, but generally we get used to programs being reliable stacks of code that can perform meaningful units of work very quickly. But once the program architecture becomes distributed, with parts running in separate processes invoking each other remotely across network connections, these assumptions about the simplicity of program execution get exposed.
I think fallacy #7, Transport cost is zero, nicely summarizes this dilemma. We're used to computational overhead being close to zero; when distributed computing changes that, it messes up a lot of our assumptions. This fallacy overlaps with #1: Reliability, #2: Latency, and #3: Bandwidth. They are why programs designed to run in a single process often perform poorly when arbitrarily split across tiers.
Patterns have evolved to address this issue of transport cost. A Session Facade is used to make remote EJB clients less chatty, since every invocation across the network adds overhead. It's a major theme underlying the Enterprise Integration Patterns, especially the first few chapters of the book. Network latency, along with marshalling/demarshalling of data, is a significant issue for RPC/RMI programming. Those concerns, along with asynchronous service invocation, are significant issues for messaging.
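The cost of chattiness is easy to simulate. The sketch below is my own toy model, not any IBM API: it charges a fixed latency per remote call and compares one round trip per item against a single coarse-grained call, the same trade-off a Session Facade exploits:

```python
PER_CALL_LATENCY_MS = 50  # assumed fixed network round-trip cost

def remote_call(items):
    """Model a remote invocation: one latency charge per call,
    regardless of how much data the call carries."""
    return PER_CALL_LATENCY_MS

def chatty(items):
    # Fine-grained client: one network round trip per item.
    return sum(remote_call([item]) for item in items)

def facade(items):
    # Coarse-grained facade: one round trip for the whole batch.
    return remote_call(items)

items = list(range(100))
print(chatty(items))  # 5000 ms of pure latency
print(facade(items))  # 50 ms
```

In a single process both styles cost about the same; split across a network, the chatty version pays the round-trip tax a hundred times over.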
Systems were easier to design and develop when they were assumed to run in a single process. Distributed computing makes that a lot more complicated. But what is a problem is also an opportunity.
IBM Support Assistant comes in two editions, Workbench and Lite. Which should you use?
I've talked about IBM Support Assistant (ISA), IBM Software's tool for gathering diagnostic data on your installation of our products and sending that data to IBM Support for analysis, usually as detail of a problem management report (PMR). (See "Submitting diagnostic information to IBM Technical Support for problem determination.") ISA has a pluggable architecture of collectors. Each collector is for a different IBM product; often a product has a suite of collectors, each designed to diagnose a different part of the product. You can also design your own collectors.
The question remains: Install ISA Workbench or Lite? The download page has a quick comparison. Both have the same plug-in collector architecture. Often that's all you need to gather data for IBM and send it to them. Frequently you don't need the extra bells and whistles in Workbench. When in doubt, try Lite first; it's probably all you need. Lite is smaller to download, easier to install, and does the basics Workbench is usually used for anyway.
WMQ v7.0.1 and WMB v7 have a new feature for standby processes that makes the products more highly available.
The feature in WebSphere MQ, introduced in v7.0.1, is called multi-instance queue managers. The corresponding feature in WebSphere Message Broker, introduced in v7, is called multi-instance brokers. In both cases, the queue manager or broker runs in two processes, one active and the other on standby. If the active one fails, the product automatically fails over to the standby, with virtually no service interruption. Note that any resources the processes use, such as a database, must have their own high availability capabilities.
Multi-instance queue manager
In prior versions, to make WMQ or WMB highly available, one had to use hardware clustering (such as PowerHA (formerly known as HACMP) or Veritas). Hardware clustering may still be the gold standard for HA, but for environments that don't quite need the gold standard, software clustering via multi-instances may be good enough.
This new design reminds me of how the service integration bus in WebSphere Application Server works. An SIB bus is a collection of messaging engines managed by the WAS HA manager. By default, it runs two copies of each messaging engine, an active and a standby. If (parts of) the cluster lose communication with the messaging engine, the HA manager switches them to use the standby. Only one copy of the messaging engine can be active and only that one can maintain a lock on the external storage for the persistent messages. Multi-instance queue managers work in much the same way.
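The active/standby pattern described above can be sketched as a race for an exclusive lock on shared storage: whichever instance holds the lock is active, and the standby takes over only when a failure releases it. This is a toy model with names of my own invention, not the WMQ implementation; an in-process lock stands in for the lock a multi-instance queue manager takes on its networked storage:

```python
import threading

class SharedStore:
    """Stand-in for the networked storage both instances can reach."""
    def __init__(self):
        self._lock = threading.Lock()

    def try_acquire(self):
        # Non-blocking: only one instance can hold the lock at a time.
        return self._lock.acquire(blocking=False)

    def release(self):
        self._lock.release()

class Instance:
    def __init__(self, name, store):
        self.name, self.store = name, store
        self.active = False

    def start(self):
        # Becomes active only if it wins the lock; otherwise stands by.
        self.active = self.store.try_acquire()

    def fail(self):
        if self.active:
            self.store.release()
            self.active = False

store = SharedStore()
a, b = Instance("A", store), Instance("B", store)
a.start(); b.start()
print(a.active, b.active)  # True False -- A is active, B stands by
a.fail()                   # active instance crashes, lock is freed
b.start()                  # standby retries and takes over
print(a.active, b.active)  # False True
```

The lock is what prevents a "split brain": at no point can both instances believe they own the persistent messages.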
The new Redbook High Availability in WebSphere Messaging Solutions contains a section on multi-instance queue managers and brokers. To quote:
WebSphere MQ V7.0.1 introduced the ability to configure multi-instance queue managers. This capability allows you to configure multiple instances of a queue manager on separate machines, providing a highly available active-standby solution.
WebSphere Message Broker V7 builds on this feature to add the concept of multi-instance brokers. This feature works in the same way by hosting the broker's configuration on network storage and making the brokers highly available.
So if you needed a reason to upgrade to WMQ and WMB 7, now you have it.
Thanks to my colleague Guy Hochstetler, who made me aware of this new feature.
We've updated the Recommended Reading List for JEE and WebSphere Application Server.
I've talked about our Recommended Reading Lists, compilations of what we think are the best articles for getting started in a topic area. We update them from time to time to add the latest articles.
We have just updated the Java EE (formerly J2EE) reading list: "Recommended reading list: Java EE and WebSphere Application Server." Thanks to ISSW's Sree Ratnasinghe for making the updates. Check out the list; I think you'll find it helpful.
I find that customers often confuse the role of an environment with its quality of service.
I previously discussed Data Center Environments, specifically the typical environment roles of Dev, Test, Stage, and Prod. These separate environments keep code under development (Dev) away from code the enterprise's users use to do their work (Prod). They also create a reliable, controlled environment for performing testing (Test) and for practicing installation and migration procedures (Stage).
These are environment roles, which describe who should be able to access an environment, what it is used for, and therefore what it should and shouldn't contain. For example, only the Prod environment should be able to access and change real customer data. Stage may contain a separate copy of the production data. Dev and Test shouldn't even have a copy of the production data, which is probably confidential and should be protected, but instead should contain a representative set of fake data. (Use a data obfuscator to produce test data from a set of production data.)
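A data obfuscator can be as simple as replacing confidential fields with deterministic fakes, so the shape of the production data survives but the sensitive values do not. A minimal sketch, with illustrative field names of my own choosing:

```python
import hashlib

def obfuscate(record, confidential_fields):
    """Replace confidential values with stable fakes: the same input
    always maps to the same fake, preserving joins and uniqueness."""
    fake = dict(record)
    for field in confidential_fields:
        digest = hashlib.sha256(str(record[field]).encode()).hexdigest()[:8]
        fake[field] = f"{field}_{digest}"
    return fake

prod_row = {"customer_id": 42, "name": "Alice Smith", "balance": 100}
test_row = obfuscate(prod_row, ["name"])
print(test_row["name"])     # stable fake, e.g. "name_a1b2c3d4"
print(test_row["balance"])  # non-confidential fields pass through: 100
```

Because the fake values are deterministic, records that matched across production tables still match across the obfuscated test data.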
The role of an environment is often confused with the quality of service (QoS) an environment should support to meet its requirements. One common example is availability. The applications running in Prod are typically assumed to need to be available 24x7 (aka always). Test and Stage are understood to be unreliable; they may be taken down or crash at any time as testing needs dictate. The Dev environment is typically assumed to be fairly reliable, but with the understanding that outages are acceptable.
These assumptions about the availability of different environments can become a problem for repository products
like Rational Asset Manager (RAM) and WebSphere Services Registry and Repository (WSRR). Dev environments are typically not managed for reliability, yet products like RAM and WSRR (used in development to manage SOA governance) need to be reliably available. This is likewise true for the source code management system, but somehow the reliability requirements of RAM and WSRR are seen as being much more complex.
Long story short, customers often decide to install RAM and WSRR in their Prod environment simply because that has people prepared to manage WebSphere Application Server (WAS) servers (which is what RAM and WSRR run in) and make those WAS servers highly available. This, in my mind, is kind of crazy. RAM and WSRR store development artifacts, which are not used by production applications any more than source code is, and so should not be stored in Prod.
Rather than installing RAM and WSRR in Prod just because it's already set up for high availability, I think the far better approach is to set up a couple of WAS servers in Dev for reasonable (maybe high) availability, install RAM and WSRR there, and assign personnel (who perhaps normally work in Prod) to manage those servers in Dev.
I'd be interested to hear from customers using RAM and/or WSRR: What environment do you have them installed in?
One issue I see customers get confused on is the purpose of separate but equivalent software environments.
An enterprise should divide its IT servers and software into multiple separate and fairly independent environments. The number and purpose of these can vary somewhat, but a typical separation is these four environments:
- Dev -- The development environment used to implement and compile software. Typically used for unit testing as well.
- Test -- Used to perform functional testing and otherwise make sure that the software from development meets requirements. Scalability testing can be performed here if the hardware is robust and representative of Prod; if it's a shell, scalability results may well be misleading.
- Stage -- A representative mirror of Prod, a place to test installation and migration procedures and perhaps the best place for scalability testing. Can also be used as an alternative/backup for Prod. New applications can be deployed by installing them in (part of) Stage, testing that, then swapping it for (part of) Prod.
- Prod -- The production environment used to execute applications so that users (internal and external) may use them.
The users in the enterprise really only care about Prod. Dev, Test, and Stage are only used by IT. "The Ideal WebSphere Development Environment
" is an old but good article which explains environments and how to use them in greater detail.
These environments are actually four roles that an environment can play. An environment in the Dev role needs development tools and test data, but probably doesn't need monitoring. The environment in the Prod role is the only one that should store confidential customer data, and it should have monitoring to verify that it's running properly.
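The role descriptions above amount to a capability matrix. A small sketch (the role names are from this post; the specific capability flags are my own illustrative choices):

```python
ROLES = {
    # Each role grants only the capabilities that role requires.
    "Dev":   {"dev_tools": True,  "test_data": True,  "customer_data": False, "monitoring": False},
    "Test":  {"dev_tools": False, "test_data": True,  "customer_data": False, "monitoring": False},
    "Stage": {"dev_tools": False, "test_data": False, "customer_data": False, "monitoring": False},
    "Prod":  {"dev_tools": False, "test_data": False, "customer_data": True,  "monitoring": True},
}

def environments_with(capability):
    """List the roles that grant a given capability."""
    return [role for role, caps in ROLES.items() if caps[capability]]

print(environments_with("customer_data"))  # ['Prod']
print(environments_with("test_data"))      # ['Dev', 'Test']
```

Writing the roles down this way makes the access question an explicit lookup rather than tribal knowledge.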
The role of an environment is independent of its quality of service (QoS) requirements, a topic I'll discuss in my next posting.
developerWorks has designated me as a Master Author.
I've talked about the IBM developerWorks Author Achievement Recognition Program. When they updated the list of recognized authors back in July, I was named a Master Author, which basically means I've published more than just about anyone else in the past five years (except for Roland Barcia).
In response to: Silver Lining: You can start now with Cloud Computing!!
Tendai, you make some good points here, but I'd like to play devil's advocate for a minute.
How is an SOA Billing Service which is provisioned in a cloud any different from the same SOA Billing Service that's deployed in some sort of non-cloud? Either way, isn't it just an SOA service which performs billing functionality? If the cloud makes it something more, how so?
Doug Tidwell says that cloud computing with SOA is much more useful than without SOA, and I agree. SOA and cloud computing are a killer combo.
I believe the solution your post suggests is actually two distinct parts:
- SOA -- Each department deployed the same billing application, each requiring substantial middleware and hardware. Streamline this by having the departments share a billing service which can be deployed once on a single (clustered) set of middleware and hardware.
- cloud -- An SOA reference architecture to which SOA services, such as this billing service, can easily be deployed. The reference architecture should be a grid which dynamically adjusts capacity for the billing service as needed. (I notice that my four-year-old link to grid computing still works but connects to a page now titled "IBM Cloud Computing.")
So, a doubter might ask: How much is your example about cloud computing and how much is it about SOA? I'd like to see a blog posting which addresses that distinction. Thanks.
There's a new release of the Eclipse platform available.
The Eclipse Foundation has released a new version of the Eclipse platform, Eclipse Galileo. Technically, they call it a "release train" built on the platform, which means it doesn't change the platform but instead adds a whole bunch of stuff they think people are going to want. The stuff doesn't necessarily work together, but at least it's all grouped into one downloadable package.
For more info, see "Eclipse Galileo Release Now Available." Also see "An Eclipse Galileo flyby" on developerWorks.
The new release contains something for everyone:
- Java: JDT - Java development tools
- Workflow: Java Workflow Tooling
- Ajax: Rich Ajax Platform
- SOA: Swordfish, SOA Tools, and SCA Tools
- Code generation: M2T JET (Java Emitter Templates) - aka JET2 (see The Design Patterns Toolkit)
- Component architecture: Equinox (see Eclipse Equinox)
Check out the full list of projects. The article says it also includes the Eclipse IDE for Java EE Developers.
In response to: Sample application using WPF - part 1
Sami, thanks for making available these videos showing how to build a simple demo application using WebSphere Portlet Factory (WPF).
What's with all these blog postings that begin "Re: ..."?
Some of our blog postings are starting to look like e-mail replies, with titles that start "Re:". For example, I have posts today called Re: Cloud - SOA = Zero, Re: NEW RELEASE: IBM Support Assistant v4.1, and Re: SOA for Dummies 2nd IBM Limited Edition Mini eBook. What are those?
These "Re:" posts are a new feature of the My developerWorks blogs. When you're a blogger on MydW and you comment on someone else's MydW blog, you also have an option to cross-post your comment as a posting on your own blog. So when I commented on Doug Tidwell's post, Cloud - SOA = Zero, my comment also appeared on my blog as Re: Cloud - SOA = Zero. Also, where my comment appears on Doug's blog posting, the comment has a Trackback link which connects to the posting of this comment on my blog. And the posting of the comment on my blog has a header with a link to the original post on his blog. It's all interconnected.
So if you're interested in what I have to say about stuff--and presumably you are since you're reading my blog--you can also easily find comments I've made on other people's blogs and easily jump to those original blog postings I commented on. And these links can daisy chain, with a comment on a comment on a posting, showing a conversation between two authors or several.
This is going to be a good way for me to show you postings on other blogs that I think you'll be interested in. For example, I commented on postings about SOA for Dummies
and IBM Support Assistant for two reasons:
- To let readers of those blogs know that I've posted info about those topics in the past on my blog
- To let you, the readers of my blog, know that another blogger has new information on a topic I've discussed in the past
It's easy for me and hopefully helpful to you, the readers. It's a win-win for both blogs, and a win for you the reader as well. (Or as Michael Scott would say, a win/win/win.)
To my fellow MydW bloggers: Let the comments rip!