Time to look back,
... look around,
... and look forward.
I've been heavily involved in work with the W3C over the past 8+ years, as a chair, as a spec editor, and as a working group member. What they've built, and the open processes they use to build it, still impresses me.
On a more personal (set of) note(s), much of what I've spent the last few years laboring on has matured:
For me, this means taking a new job within IBM. I'll be involved with clouds on the mainframe: z/VM's OpenStack enablement for starters. This blog will continue, but as you might expect with a new day job its emphasis will evolve.
First, the answer; then, in HHGTTG style (42!), the question.
The question was: how do all these wonderful things I'm hearing about relate? How do I wrap my head around all this?
By way of background: Innovate is the annual conference that focuses on IBM's Rational brand - developer tooling. In the Exhibitor hall, there was a whole section of pedestals on IBM DevOps Services (until recently known as JazzHub); there was also a set of other pedestals in the Cloud section representing other capabilities that IBM offers: BlueMix, SoftLayer, Service Engage, and some others that simply don't relate to this view. The IBM Cloud Marketplace is the umbrella through which you can find them all, what's above and what's not (like BPaaS).
What the picture above says, and some additional details:
Clients choose how to consume what they need; there's no "one size fits all" answer. The good news is you get freedom of choice; the bad news is you have the responsibility of choosing (we'll help, if you want). Typically this choice is based on figuring out the subset of options that meet non-technical constraints, and then comparing lifecycle costs.
By the way: if you're happy where you are without any of this (aka remaining "on-premise"), that's fine.
If you want to try something out, cloud delivery options give you a way to kick the tires quickly and with minimal investment of your time or effort. If you prefer to deploy on-premise for production, that's fine too.
If you want to explore these options, the Big Question is: what do you want to pay someone else to do, in order to avoid doing it yourself? More on that in the next picture.
DevOps Services gives you a way - a delivery option - to construct applications using Rational tools delivered via the cloud.
Advantage: you're always on the latest level of development tooling code that you get from the cloud, since it's continuously delivered.
It includes a one-button "deploy into BlueMix" option. You can also deploy to other targets.
Your code still lives wherever it lives now, and you can use any tools (cloud or not) as a result. If you're starting something new, it gives you an option to host the repository, but it will just as happily use an external repo like GitHub.
BlueMix gives you a way - a delivery option - to host running applications, for testing and for production.
You can avoid Dependency Hell by binding to other BlueMix services.
You can deploy your app as a BlueMix service for others to re-use - and to pay you for.
Service Engage actually is implemented as a BlueMix app - we're using it for production ourselves.
Service Engage gives you a way - a delivery option - to get systems management from the cloud.
Advantage: you're always on the latest level of systems management code that you get from the cloud, since it's continuously delivered. We also add offerings on an ongoing basis... in February we had 3, now it's 5, and it's only going to grow over time. These offerings correspond to what you might know as Tivoli products - certainly more people at the peds recognized them that way; Tivoli changed its name to Cloud and Smarter Infrastructure last year, if you missed that.
If you want to try something out, cloud delivery options give you a way to kick the tires quickly and with minimal investment of your time or effort. Each offering includes live demo and free trial options.
Your existing applications and data still live wherever they live now.
I must have talked through that picture 20 times over the 3-ish ped-days I was there. After the pattern emerged, we drew the whole picture out on the SoftLayer ped's whiteboard, which made it easier to see/refer to. I guarantee you it was neither as neatly done as the version above, nor as complete as the text points above allow.
The next picture focuses on that Big Question - what do you want to pay someone else to do, in order to avoid doing it yourself? - just in case you're not familiar with which pieces are "in the box" at these levels.
Attribution: I lifted the second picture from a presentation the BlueMix folks did.
I learned a new word a few weeks ago: exegesis (waves in thanks to EW); I have a respectable vocabulary, so not something that happens every day. It dovetails with something that I've been meaning to write about for a while though...follow your nose. You can think of exegesis as an extreme form of following your nose, one that borders on religious fervor and potentially gets hung up on precision in prose that might or might not actually exist.
Follow your nose, in standards circles like the W3C, is a term of art that answers the question: "how do I know X is true?" Since "you indirectly commit to specs when you go online" (Tim Berners-Lee, 2002), it's a polite way of saying "read the specifications in the references sections"...recursively. Don't let the people who refer to RFCs (specs in the IETF) by number scare you, either. Everyone there started knowing none of them at some point.
A performance monitoring example that works for any linked data
Back in October 2012 (yes, "been meaning to" for a while now, ahem), I was working on the OSLC Performance Monitoring spec that ITM 6.3 eventually implemented. One of the developers asked: "Where is the dbpedia Percentage class defined in human-readable format?" I had suggested re-using terms she was unfamiliar with, a common enough occurrence given that vocabulary re-use is a Best Practice in Linked Data. When I responded, I laid out the process step by step because you can do this, in general, for any Linked Data. It's particularly easy for the case of linked data because most if not all URIs are HTTP URIs that serve a useful representation.
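A minimal sketch of the first step of that process, in Python: dereference the term's own HTTP URI and ask for a representation you can read. The term URI below is a placeholder (the question above does not give the actual URI), and the helper names are made up for illustration. The request is only built here; fetch() sends it and needs network access.

```python
import urllib.request

# Hypothetical term URI; substitute the URI of the term you are curious about.
TERM_URI = "http://example.org/vocab/Percentage"

def build_lookup(term_uri, human_readable=True):
    """Build (but do not send) a GET for the term's own URI."""
    if human_readable:
        accept = "text/html"
    else:
        accept = "text/turtle, application/rdf+xml;q=0.9"
    return urllib.request.Request(term_uri, headers={"Accept": accept})

def fetch(term_uri, human_readable=True):
    """Send the request; return (declared media type, body bytes)."""
    with urllib.request.urlopen(build_lookup(term_uri, human_readable)) as resp:
        return resp.headers.get_content_type(), resp.read()

req = build_lookup(TERM_URI)
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Following your nose from there means reading what comes back and repeating the same lookup on any URIs it links to.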
A reconciliation example on deciding whether or not to re-use existing vocabulary terms
Six months later, the question was: "I'm wondering if we could use the CS property "hostid" to store the unique system ID? Do you think that is a valid approach or should we better define a custom property?", in the context of a services engagement. I also worked on the OSLC Reconciliation spec, so it went like this:
Look at the definition of the property itself, which is found at its namespace-qualified name. That said: "A globally unique ID assigned to their machines by some manufacturers (e.g. Sun Solaris)". Good match on uniqueness, bad match on "assigned by whom". Debatable match on clarity; creating a custom one for this client is the gold standard there. There's a "see also" link...follow that.
Look at the Reconciliation spec's usage to see if there are any other constraints. Nothing new there.
Look at guidance for Reconciliation spec implementers ...same reason. Some new transformation rules there. You're not a product, so do you need to follow them or not? Record it as a question and proceed for now.
Ask (your favorite experts) where else to look.
In this specific case, since the identifier was assigned by the client, we decided to define a new term in the client's namespace, to avoid ambiguity. Topic for another day.
Speaking as a recovering exegesist
FYN turns into exegesis when you start hanging vastly different interpretations on a turn of phrase that's not visibly different than the surrounding text. As a spec editor and reader, I have a gift (curse?) for seeing how a single set of words can be interpreted several ways. For example, reading "ought to" as a potential requirement for compliance.
I started observing a cultural distinction between IETF and W3C specs recently, while working on LDP. The IETF specs I was spending a fair amount of time referring to were at times maddeningly vague on issues like extension; if you look at the HTTP Accept syntax, it permits extension parameters (A Good Thing, generally speaking). Ditto media type registrations' generic syntax. When looking at application/json's registration however, which defines no parameters, are other specs allowed to define them? I could not find a definitive answer, even FYNing. In W3C specs I observed many cases where "remaining silent" was eschewed, in favor of an explicit MAY clause ("other specs MAY define blah", etc.). When I contacted some RFC authors to clarify the intent, (some) others were aghast that I should even consider their answers relevant - for them, The Written Word was all that mattered, even if TWW was obviously incomplete.
In some cases, I found myself asking questions like "RFC blah says this new header is defined for successful requests. I could really use that behavior on a redirect; is that considered a 'successful' request?" Coming from an IETF-simulated mindset, I'd guess they'd answer "no, it's not limiting; success doesn't only mean 2xx status codes"; I think my W3C friends would be more split on that.
I've also seen (and perpetrated in some cases) absolutely unreadable linguistic convolutions in normative (MUST, SHOULD, etc) text in order to specify only what's needed, and no more. When TimBL talked to LDP about his comments on the LDP Last Call spec a few months ago, this was on my mind. He said (paraphrased from memory): "I prefer a writing style that just tells the client/server what to do." I think that is meaningfully different from the attitude "If it's convoluted to make it Just Right, better that than leaving anything open to interpretation." So I find I'm drifting somewhat back toward what I think of, for now, as the "looser IETF style".
Modified by JohnArwe
Well-meaning, smart developers sometimes do the darndest things; usually in the name of efficiency, or optimization, or taking some idea that works in-the-small and pushing it waaaaaay past the assumptions that made it work there. By and large I don't blame them: it's not like anyone is really trained for creating loosely coupled protocols - in college it's all about cranking out the next project by the next deadline, then moving on. The resulting code is not something they have to live with for anything like the "long term", let alone update or extend to match new requirements. We should hardly be surprised if measurements drive behavior.
Web Architecture has a few things to say about URIs, which are used to link to things:
They have two "facets": identification and interaction (what we commonly call location). This is true, by the way, even of HTTP URIs.
There is no reliable mapping from URI to media type (URIs ending in .html need not have an HTML representation).
They are allocated (in the case of HTTP URI) via a hierarchy of delegated authorities.
There's a lot more, of course - worth a read or 3 if you actually program on the Web.
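To make the second point concrete, here is a small Python sketch contrasting a guess based on how the URI ends with the media type the server actually declares. The function names, the sample URI, and the sample header value are illustrative assumptions, not taken from any spec.

```python
from email.message import Message

def guess_from_suffix(uri):
    """Brittle: infers the media type from how the URI happens to end."""
    return "text/html" if uri.endswith(".html") else None

def declared_media_type(content_type_header):
    """Robust: uses what the server says in its Content-Type response header."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()

uri = "http://example.org/report.html"
# A server is free to answer this URI with something other than HTML.
header = "text/turtle; charset=utf-8"

print(guess_from_suffix(uri))
print(declared_media_type(header))
```

The suffix-based guess and the declared type disagree here, and only the declared type reflects what the client actually received.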
Practical example: Jazz for Service Management Registry Services. Products in the Cloud and Smarter Infrastructure (formerly Tivoli) area use one of those services, the Provider Registry (PR), to locate (obtain links to) specific services at run time. In order to make a simplified discovery process possible, we came up with the process documented in the publications: basically, offer a well-known default URL that a client could override at configuration time. Just like we use DHCP for IP addresses, use an "always on" service like DNS to grease the skids; simple enough. If the client is able to get a DNS entry in place, life is simpler for them, but it has to be optional - some organizations outsource their network management, and getting a new DNS entry can be like dealing with your local tax authority. The DNS approach also helps insulate admins from later changes; if PR has to move, update the DNS entry and Bob's your uncle.
If you look at the PR pubs, the well-known default value is https://oslc-registry/oslc/pr/collection, and the next sentence says: Allow the customer to change the default value.
Example: how some people go off the rails
They think "change the default" means being able to change oslc-registry only... the hostname and port portion of the URI. They realize from experience that hostnames and port values are configured and often under someone else's control anyway, but surely the /oslc/pr/collection portion would never change - that's fixed in code!
Well, no. In a loosely coupled system you want the entire URL to be completely opaque to clients. In other words, only two parties should ever be looking at the contents of a URI: its owner (the software that serves it), and humans (not code!) engaged in debugging efforts. As far as every other component is concerned, it should be treated like a sequence of bits ... the only sensible operation is comparison, you have no idea what the bits mean. Much the same as a digital signature. Technically, you're entitled to parse any URI according to the generic URI syntax, but that doesn't give you enough to achieve an end in itself.
Why? For one thing, only a subset of /oslc/pr/collection is actually managed by PR as a Web application; that first path segment, oslc, is a deployment choice. oslc might be the default, but admins of the Web container (WebSphere, in PR's case) can change that. The HTTP delegation of authority concept, which is really a social contract, does not stop at the hostname level.
For another thing, the "single well-known URI" model implicitly assumes that there's only one copy of PR running, at least from a logical point of view (that single DNS could in theory front a set of load-balanced back ends with a shared database, even though we don't support that today). While that's going to be true today in most cases, simply because of where everyone is from an adoption curve point of view, there's no reason to expect that to remain true for everyone over time. It might continue to be true for small and mid-size enterprises, but there's every reason to expect (based on past experiences) that larger enterprises will want to divvy things up internally in some cases - by geography, by business unit, etc.
When you build code, it's important that you know what assumptions you're baking into it, and that you choose them consciously. If you're dependent on someone else's actions, you're getting into tightly-coupled territory. If you're writing code that peeks inside URIs you don't own (what some call "cracking open" the URI), it's broken, full stop; maybe not now/today, but it's just a matter of "when", not "if".
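A sketch of the contrast in Python. Only the default URL comes from the PR pubs quoted earlier; the function names, the configuration keys, and the alternate host and path are hypothetical.

```python
DEFAULT_PR_URL = "https://oslc-registry/oslc/pr/collection"

def registry_url_brittle(config):
    """Off the rails: only the host is configurable; the path is baked in."""
    host = config.get("registry_host", "oslc-registry")
    return "https://" + host + "/oslc/pr/collection"

def registry_url_opaque(config):
    """Loosely coupled: the entire URL is a single configurable value."""
    return config.get("registry_url", DEFAULT_PR_URL)

def same_resource(uri_a, uri_b):
    """The only operation a non-owner performs on a URI: comparison."""
    return uri_a == uri_b

# An admin has deployed PR under a different host AND a different first path segment.
actual = "https://registry.example.org/integration/pr/collection"

brittle = registry_url_brittle({"registry_host": "registry.example.org"})
opaque = registry_url_opaque({"registry_url": actual})

print(same_resource(brittle, actual))
print(same_resource(opaque, actual))
```

The brittle version cannot reach the relocated registry no matter what the admin configures, because part of the URL was assumed rather than supplied.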
Linked data != Object oriented
At least in the sense that a simple, direct mapping misses out on much of the promise of Linked Data. Take a "simplest possible" case: A is related to B. The object-oriented solution is to create a class for A's that contains a B-object, which ends up being a pointer or object handle to an object of B's class.
When you put A and B on the Web, each has a URL. http://example.org/A and http://example.org/B will suffice. Somewhere in the representation of A, you expect to find a link (URL) to B. Simple. Direct. But not quite what I started out with.
Humans recognize that "A is related to B" also means "B is related to A". The OO fix is simple; add the reverse link, sometimes called a back link (or change to use an association class). That maps to HTTP just as easily as the first case. But now there's a management issue: if someone updates the link from A to B, what happens (if anything) to the reverse link?
Often the application goal is "coherency" of those links, so the answer is that B's back-link should be automatically updated; but then there's a different management issue, access control. What if the user that updated A does not have the authority to update B?
Linked Data gives you more flexibility. In an OO system, if you have a reference to A you can look at A's properties. In a Linked Data implementation, if you ask for the representation of B, you'll usually get back RDF triples of the form < B , property , value >, but that is not guaranteed. In our case, you get back those and you get back < A , is related to, B >.
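To make that concrete, here is a sketch in Python that uses plain tuples as triples. The property URI and the in-memory store are assumptions for illustration; a real server would return RDF in some serialization.

```python
A = "http://example.org/A"
B = "http://example.org/B"
RELATED_TO = "http://example.org/vocab#relatedTo"  # hypothetical property
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# One set of triples; the link between A and B is stated exactly once.
TRIPLES = {
    (A, RELATED_TO, B),
    (A, LABEL, "Resource A"),
    (B, LABEL, "Resource B"),
}

def representation_of(resource):
    """Triples a server might return for a resource: those where it is the
    subject, plus those where it is the object (the 'reverse' direction)."""
    return {t for t in TRIPLES if t[0] == resource or t[2] == resource}

rep_b = representation_of(B)
# B's representation carries the A-to-B link without any back link stored on B.
print((A, RELATED_TO, B) in rep_b)
```

Because the relationship is stored once, there is no second copy to keep coherent and nobody needs write access to B in order to update A's link.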
This simple example violates most developers' intuition. It shows up most often when they want to provide information about some other resource that they link to. For a concrete example, think about owned assets.
Each owner (be that a person, an organization, whatever) has a list of assets that it owns. In linked data terms, there's an owning resource that has an "owns" link to each asset it owns. An IT department links to its servers and PCs:
dept-IT owns http://.../SN339487-14404
In order to display that data more usefully in a UI, a consumer of this data wants to know something about the type of the linked-to resource - is it a server or a PC? The typical thinking is to add this to the data.
dept-IT ownedType server
server would be a URL too (actually all 3 pieces of information would be URLs, but that's for a separate entry). The problem with the solution above is two-fold. First, it's needlessly indirect - a well-known and very commonly used term that many pieces of existing code understand, rdf:type, exists to describe the type of something. Second, there's no grouping. If the IT department owns 2 assets, there's no way to tell which ownedType goes with which owns link. It's possible to introduce the grouping, but that causes yet more issues.
Linked data takes exactly the same approach as we would take in natural language. Just as we'd say (in English at least) "The IT dept owns asset [id], and asset [id] is a server", in linked data you'd simply "say"
dept-IT owns http://.../SN339487-14404
http://.../SN339487-14404 rdf:type server
The fact that the UI wants to retrieve this information when it asks for the IT department's representation does not change how the data is structured.
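A sketch of how a consumer could use data shaped this way, in Python with tuples standing in for triples. Every URI except rdf:type is hypothetical, including the asset identifiers, which are placeholders rather than the serial number used above.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URIs for illustration only.
DEPT_IT = "http://example.org/dept-IT"
OWNS = "http://example.org/vocab#owns"
SERVER = "http://example.org/vocab#Server"
PC = "http://example.org/vocab#PC"
ASSET_1 = "http://example.org/assets/1"
ASSET_2 = "http://example.org/assets/2"

TRIPLES = [
    (DEPT_IT, OWNS, ASSET_1),
    (DEPT_IT, OWNS, ASSET_2),
    (ASSET_1, RDF_TYPE, SERVER),  # the type is stated about the asset itself
    (ASSET_2, RDF_TYPE, PC),
]

def owned_assets_with_types(owner):
    """Map each asset the owner links to onto the set of its rdf:type values."""
    assets = [o for s, p, o in TRIPLES if s == owner and p == OWNS]
    return {
        a: {o for s, p, o in TRIPLES if s == a and p == RDF_TYPE}
        for a in assets
    }

for asset, types in owned_assets_with_types(DEPT_IT).items():
    print(asset, sorted(types))
```

With two owned assets there is no ambiguity about which type goes with which link, because each type triple names the asset it describes.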
Update on the W3C Linked Data Platform
When last I wrote about LDP, there were 2 weeks left in the Last Call review period. It's not uncommon at W3C for a Last Call review to cause the community to seriously look at the content for the first time, and LDP was no exception. While the number of commenters was small, we made up for it in stature - amongst them was Tim Berners-Lee himself. He's clearly got some uses for LDP in mind, and we've had two follow-up calls to talk through his issues. The working group has resolved most of them, so the work has shifted more towards the editors (including myself). The editor's draft is close now to reflecting the working group's intent, and by this time next week I'm betting we'll be completely caught up.
An experiment: the practical side of using Linked Data
An important part of my job in Cloud and Smarter Infrastructure (nee Tivoli) is helping our cross-product integrations walk the right path when using Linked Data, which is part of our architecture. We want the benefits that Linked Data offers: open integrations that "keep on ticking" in the face of change, so they're extensible without introducing risk and so any investments in them are protected over time. This is new territory for most developers, however. I think the mistakes and misconceptions I see (and correct) during our reviews would be useful to others, so I'll be starting a series Real Soon on this blog. They'll be lower level (more concrete/geeky) than is customary, but I think that's important to make it relevant for people writing code. They also apply more widely than "just" Jazz for Service Management, so I'm putting them here rather than on that blog. Stay tuned.
One of my day jobs is to contribute to the development of standards. At the moment, I'm co-editing a specification in the World Wide Web Consortium (known as W3C in many circles) ... the people who bring you HTML, XML, and other interesting kettles of fish.... called the Linked Data Platform.
If you already know what REST and Linked Data are (extra credit: OSLC too), you can jump right in (details on the Service Management Connect blog). What LDP focuses on that's new is the write operations; the common view of linked data to date is largely read-only.
If you need more background, there's an earlier entry with a reasonably concise version of that.
I'm a raving lunatic perfectionist when it comes to writing documents, so I won't even pretend that the Last Call draft is where I'd really like it to be. Suffice to say, before I left for my summer walkabout I made sure that the normative parts (stuff you can't just fix later) hung together. There's certainly room for more/better examples, and no doubt places where mere mortals' parsers will throw an exception - and that's exactly what the review process is there to help deal with, in addition to the normative content that people will depend on when it comes to asserting compliance.
Read with a skeptical eye.
A few weeks ago someone approached me with questions on something (I think it was OSLC Automation, but the pattern holds widely). Naturally (for me), I asked what he'd already read so I'd understand what he (in theory) already knew and what I'd have to explain in simpler terms. He started his answer with: "Specifications are too dry to read." Now I realize not everyone's a spec reader, or even learns best through reading anything. Some people are kinesthetic learners, some verbal, some visual. C'est la vie. But the comment stuck with me.
In my spare time, I occasionally carve out a time slice for pleasure reading; what I call pleasure reading scares many people I talk to (A Brief History of Time and so on). At some point since his comment I read the "Figure versus Ground" topic in Gödel, Escher, Bach.
Fast forward a bit, and I've been editing the W3C Linked Data Platform specification as we get ready to issue a Last Call working draft (see the Jazz for Service Management blog for why I think LDP is a Good Thing for the Web and for open standards). As I was drafting some sections, my brain kept coming back to his comment like a song you can't get out of your head.
A bit of background: Escher is famous for his drawings; one set of them pokes at figure vs ground (aka positive vs negative space) directly, whereas others use it to construct higher order effects (optical illusions/contradictions). He's not the only one to make use of figure vs ground though ... most people are probably familiar with the Rubin vase, which demonstrates the concept nicely: is it one vase or two faces? It depends how you look at it.
The (ha! like it's just one!) problem in writing specifications is you're simultaneously writing for different audiences. On one hand, you're writing for implementers (servers, in Web specs); they want to know what behaviors they have to code in each single interaction, what they can skip, and they want as little else (chaff, distraction, etc.) as possible. On the other hand, you're writing for adopters/users (Web clients), who are more interested in stringing together a sequence of interactions to accomplish some purpose of interest to them; they want examples, background, informative text that shows them what some of those useful interaction sequences look like. Satisfy the adopters/users, and you've got a spec big enough to scare off some implementers without even reading it - too big and scary looking (parceling it up into multiple documents helps to a degree). Satisfy the implementers, and no one knows how to use it. Argh, I hate it when that happens. And yet, I could argue it's just a symptom.
I think the underlying problem is that we have to use prose to describe a picture (the old "a picture is worth a thousand words" maxim, demonstrated ad nauseam). What specs attempt to do is to divide the universe into two sets (one or more sets of two, more often): they define some conformance or compliance criteria, and they're trying to divide the universe into "compliant Xs" and "non-compliant Xs" ... in other words, they're trying to outline the vase - with words instead of a visual line. Imagine trying to describe the Rubin vase above using only words, no picture. You can try short-cuts like "take the starting rectangle, and subtract a strip at top and bottom whose height is xyz% of the total height." They're not very satisfying; describing what something is by stating what it isn't is cognitively harder. When it comes to that curve under the lip, good luck using prose (without reference to an image by analogy) for that.
What compounds the problem is the figure versus ground issue. Absent any way to draw the outline of that vase with words, we end up with not only positive and negative space, but "line space" in between. By the way, in open specifications we're not describing just one vase either, we're describing a class ("compliant vases"). Ow my head.
Back to the emergence of line space... the visual analog of the problem is to draw the outline of [the class of compliant] vase[s] using a 3-inch (8 cm)-wide paintbrush instead of a fine-line pen. The line itself is big enough that it now constitutes another space unto itself; not all of which falls within [compliant] vase[s], so you add more words (dab the line space while holding the brush sideways). It's like learning integration (the mathematical kind: using progressively thinner rectangles to measure/estimate the area under a curve) all over again.
I always liked calculus; maybe that's why I'm still editing specs instead of giving up.