Well, I have always looked at Workplace Forms as an eco-friendly product line since, after all, it does dramatically cut down on the use of actual paper, which cuts down on the cutting down of actual trees. But, this post is actually quite a bit more technical than that. In this case, we're going to talk a little about the the *Workplace Forms Designer 2.7* (part of the 2.7 GA release this week) and the three wizards it includes to help form authors with creating dynamic tables and the components and formulae that make them dynamic. The focus, of course, will be the new wizard in 2.7.
The first *Table Wizard* helps you to create a repeating set of user interface controls over some data. You get a chance to set up nice presentational features like column headers, column dividers and row highlighting, and you also get Add and Delete buttons for the table. The second *Row Operation Wizard* allows the form author to express a formula over two columns of a table, with the result being placed into a third column. The formula is defined on the underlying data, of course, via an XForms calculate. The wizard is ideal for setting up that "line total = quantity * unit price" formula that shows up in every purchase order. It could also be used as the starting point for setting up the XForms constructs for a more complex calculation.
Although we've made improvements to these wizards in 2.7 (and will do so again in the future), the new kid on the block is the *Column Summation Wizard*, which is ideal for setting up a summation over a column of data, e.g. for taking the subtotal of a purchase order. The XPath expression written by this wizard is very interesting and relates to the title of this post. All you do as the form author is gesture at the user interface object in the column to be summed and at the user interface object where the result will go. Here is a sample of the underlying XPath calculate that results:
This formula requests the summation of all the line "total" values on each row of a purchase order form, and if the result is not a number (NaN) for some reason such as bogus user input, the formula produces an empty string result rather than the summation value.
This is helpful in and of itself because XPath, XML Schema and other W3C technologies thend to regard an empty string as not a number. So if you try to add or multiply a data node that has no value yet (like the quantity of items desired), then NaN starts to percolate through your form unless you take special measures.
But the really special sauce is the XPath predicate shown here:
[.!='']. An XPath predicate is used to filter out nodes that have some undesirable property. In this case, we want to get rid of the empties and only sum up the data nodes that have a non-empty value. The dot means "this node", so the expression in square brackets says "this node must be not equal to the empty string".
Suppose for a moment that we didn't add this extra tidbit to ignore the empties. Suppose a user has a partially filled out purchase order form with three or four rows of data leading to a non-empty column total. If the user then adds one or more empty rows to add more data, then at that moment the column total formula would update itself to calculate the total over rows that included empties, which would result in a NaN, which the 'if' part of the expression converts to an empty string result. So the user would get treated to this odd experience of having the column total disappear each time he tries to add another row to the purchase order. The extra predicate created by the *Column Summation Wizard* says to ignore the empties so that the user will have a chance to fill in a row of data before it contributes to the column summation.
Well, no blogs from me next week, so seems a good idea to knock off another one this week...
XML namespaces rec (pardon the pun) states that an attribute which is namespace unqualified
is 'local' to an element and is uniquely identified by a combination of the attribute local name and the type and namespace URI of the containing element.
The word identify
really should be used more sparingly, as here is a case where its misuse has caused years of confusion and acrimony in the XML community. An identifier is something that established the identity of something else. You cannot have two things associated with the same identifier unless they are identical things.
I am often frustrated by seasoned W3C folks who say that "this depends on your definition of identify" and honestly
believe this is a defense of the confusion in the namespaces rec. This is like saying ot me, "Well you're right unless you have a definition for identify that doesn't identify things, which is what we
did at the W3C."
To see the problem, you have to look earlier in the spec where a namespace n1 is associated with a URI and then the following code appears:
<good a="1" n1:a="2" />
The problem is that the element 'good' is in the same namespace as n1. So now you have local attribute a
that is essentially given meaning by an element that is in the same namespaces as the 'global' attribute n1:a
. Yes the two attributes have different values.
From this we have to infer that, although a local attribute is given meaning based on the containing element and its namespace, a global attribute with the same local name and namespace qualified into the same namespace can actually mean something totally different.
In other words, we have two attributes with the same local name, contained by the same element, and given meaning by the same namespace URI, but they are not identical. This is the local attribute not
being 'identified' (in my sense) by local name and containing element type and namespace.
The technically subtle W3Cer will tell you that there was no reason to spell out using words in the normative part of the spec the fact that the two attributes are different things because the spec says that local attributes are in a different partition than global ones. Problem is, this partitioning info is in a non-normative part of the spec, and particularly in the same part that has the language about how local attributes are 'identified' by local name and containing element type and namespace.
Anyway, the upshot is that an XML vocabulary does not need to but is allowed to say that local and global attributes with the same local name can mean different things. If they're supposed to mean the same thing, then the XML language has to define a precedence rule for what happens if the two attributes differ in value. Here's an example:
<data xmlns="http://example.org" xmlns:ex="http://example.org">
<price currency="USD" ex:currency="EUR">10</price>
Question: Is the price in USD or Euros?
Answer: Depends on who designed the language.
Second answer: Don't do that.
Interestingly, the XHTML working group came up with a fascinating example where it is legitimate and sensible to have a global and local attribute with the same local name but completely different meanings. It has to do with the next version of XML events. After hours of discussion, the decision was that they weren't going to do that (second answer above). Not because it's illegal, but because it's too subtle for most XML people.
Well, the XHTML group may change their minds, but even if they don't, the example is really worth understanding because it actually makes sense why you'd want to have the two attributes mean something different. Stay tuned, I'll tell you all about it when I get back...[Read More
The annual World Wide Web Conference is being held next week in Banff, Canada. It's about an hour's flight from Victoria, where I live, so of course I'm going!
I served on the program committee of the XML Web Data track this year. Competition to get into the conference is quite stiff, so recommending a paper for acceptance is a low probability event. Of the nine papers I reviewed, most were really good, and a couple that I had hoped would make it to the program will indeed appear. I am pleased to be chairing a session on Thursday May 10 that includes one such paper (I did not receive the other two for review).
I will also be presenting in the W3C Track on Friday at 1:30 as part of an Architectural Integration session organized and chaired by Steven Pemberton. I will be presenting on The W3C Rich Web Application Backplane, a forward-looking view of the possibilities for integration and composition of web applications leveraging W3C formats and APIs.
Finally, I'll be presenting in the Dev Track on Saturday at 1:30. My talk will demonstrate a schema-initiated drag-and-drop design experience for XForms using the Lotus Forms Designer.
Hope to see you there!
The XML Schema language is a good language for capturing the syntactic constraints of an XML vocabulary. But let's face it, it is really designed more for describing data. It's just not powerful enough to be used for making normative contributions to W3C Recommendations like, say, XForms.
This is not to say that a disagreement between the recommendation and the schema won't sometimes be resolved in favor of what the schema says, but there is a reasonably popular view that the schema is also normative. Actually, the recommendation is normative. In fact, the recommendation is typically provided in two or three formats (sliced up version, single HTML file, diff marked version), and only one of those is considered normative. Even more to the point, English is, for better or worse, the only language in which the normative recommendation is expressed. Any other versions are also considered to be informative.
Quite apart from what anyone tells you about the normativeness of a schema, you can determine that the schema associated with a recommendation is informative, not normative, if it does not appear in TR space. This is a subdirectory of the W3 website named 'tr' and used to publish all technical recommendations of the W3C. In the case of XForms, the schema lives in the working group space, not TR space, so it is informative.
However, the location of the schema isn't really the deciding factor for me. Even if a schema did appear in TR space, it still would only be informative in my opinion because there are lots of language constraints that you just cannot express in schema but which are expressed in the recommendation. A number of great examples of this can be found in XForms, two of which are explained below.
Perhaps the easiest is the use of XPath expressions in XForms. In the XForms schema, a number of the attributes like
ref have values that must be XPath expressions. The schema declaration for those attributes declares a type of
xforms:XPathExpression. But here is what the XForms schema defines for the XPathExpression type:
<xsd:simpleType name="XPathExpression"> <xsd:restriction base="xsd:string"/> </xsd:simpleType>
Shock of shocks, it's a no-restriction restriction from the base type of
xsd:string. You can just see two spec writers hands getting tossed way up in the air on this one and being caught, if only tentatively, by wrists not anxious to end up with carpal tunnel syndrome from the attempt to express the XPath BNF rules as an XML schema restriction. (Those with formal language training can groan a little louder than the rest).
The point is that schema just can't touch this. But, the rules of XPath are clearly and normatively defined in the XPath recommendation, and the XForms recommendation dutifully cites XPath in its list of normative references. That's why the recommendation text is normative and the schema isn't.
Not to put too fine a point on this, but you might claim that the XPath example is a perverse one. Well, it's not the only one. For example, a number of XForms elements like
select1 have a required single node binding. For the two elements I just mentioned, the single node binding attaches the user interface control to a node of instance data so that user input can be placed into the data model of the form. A single node binding can be expressed using either a
ref attribute or a
bind attribute. Unfortunately, XML schema doesn't have a way to express the requirement that one of two attributes must appear. So, in the schema for XForms, we say that both
bind are optional even though that's not very accurate. Separately, each is optional, but that's only the half of the story that XML schema can tell. The full story appears in the normative text of the XForms Recommendation.
The purpose of XForms is to express the core XML data processing asset used in sophisticated data collection scenarios.
In fact, it would be better if XForms were called the XML data processing language (XDP or XDPL) because XML is about standardizing data and about 80% of business transactions are based on filling out some kind of form to collect the transactional data.
An XForm contains one or more XML data instances. An instance is an arbitrarily structured XML data document that is typically an instance of some XML schema that expresses the static validation rules for a target namespace.
One can write an XForm without an XML schema by just expressing the XML data in an instance. This is because XForms provides other channels of data validity checking that can be easier to work with when only simple data type validation is needed. For example, you can use an XForms
type declaration to associate an
xsd:date or similar data type to an XML data node without writing an XML schema for your XForm.
But XForms validity checking is also dynamic, in recognition of the fact that validity of some values can be based on other values or the aggregation of other values. For example, in an interlibrary article request, the upper bound page number in the journal must not be less than the lower bound page number. Or, the user is only authorized to make a purchase order with less than $10,000 total value.
The latter example is important because it leads to the conclusion that we not only need a way of testing data values relative to other data values, but also that we need a way of calculating data values that are then used in validity tests.
From there, it is not a big leap to conclude that generalized XML data processing requires some way to indicate dynamically whether further changes should be allowed to certain pieces of data or whether certain parts of the data are still applicable to the transaction based on other data values. A good example is a mortgage preapproval form that can handle both single and joint applications. The co-applicant data is only relevant if the user selects the joint application mode.
XForms allows the form author to express formulae for these aspects of data, which are called model item properties, or just MIPs. Not too surprisingly, the names of these MIPs are
Of course, there is no point in representing data, calculating values over data, and validating data if you have no way to change the data. XForms allows simple content data values to be changed, but it also allows insertion and deletion of larger blocks of data that contain internal structure because this is essentially what's needed to add or delete a row from a table.
Most importantly, XForms offers form controls that expose data to the surrounding application context. If the data changes, the form controls change. This includes not only exposing a changed simple content data value, but if a set of form controls are associated with a repeated sequence of structured data, and the number of data nodes in the sequence changes, then form controls are created or destroyed as needed to respond to the change of data.
XForms is all about thinking of the data first and driving outward to how that data gets exposed to applications. Perhaps the most prevalent of such applications are for presenting the data to a human user, though even human users have highly varied capabilities. For example, the desktop user and the PDA user have very different visual capabilities. Of course, this argument extends easily to meeting the far greater accessibility needs of the sight-impaired.
For this reason, the XForms form controls represent what I've often called an intent-based user interface. It's kind of neat to see the term popping up more frequently now. It gets to the heart of the matter: XForms does not provide a presentation layer. XForms relies for presentation on a host language like XFDL (in Workplace Forms) or XHTML (in web browsers). I am certainly hoping that VoiceXML will come to the conclusion that they should soak up the benefits of XForms rather than reinventing all of this stuff over again (partly because almost everybody underestimates how much work goes into it until it's too late; but maybe they will prove to be wiser than the rest).
I sometimes get asked whether XForms will next extend itself to standardizing the actual presentation layer. Clearly, from above the answer is no. XForms standardizes the core XML data processing asset, and more work will go into doing a better job of that. The key issue we want is to address interoperability and reusability across applications and user contexts of the data processing behaviors that are fundamental to completing a transaction.
A lot of talks about XForms are a bit technical in nature because the people who manufacture XForms processors tend to be technical people who understand the business value of XForms in terms like "model-view-controller" architectures, a superset of AJAX, and software engineering benefits like abstraction. But the C-level executive cares not about these things. It is important to connect them to what the C-level executive does care about.
The C-level exec is about efficiency, flexibility and accountability.
The CEO wants
- efficiency via reduced operating costs and decreased time to close deals
- flexibility to react to new business processes and changes of business partners
- accountability for control of expenditures and compliance with regulations
The CIO wants to achieve
- efficiency through end-to-end business process integration and ability to leverage business objects as IT assets
- flexibility through a malleable system architecture that can be rapidly reconfigured with replaceable, reusable components
- accountability through transaction record auditability
We need to speak about how XForms helps the organization to achieve better results along these metrics. This is where the global picture of what XForms does against the backdrop of the classic 3-tier architecture comes in handy.
The C-level exec is not up at night worrying about the server tier, where databases, content managers and workflow engines live. These may be expensive systems, but they are designed to be robust and highly scalable systems whose metrics relative to the size of the anticipated user base of a system are easier to quantify. In other words, they are low risk numbers that are easy to budget for.
The C-level exec is not as concerned about the client tier, where the browser and OS live, again because the metrics are easy and stable.
The C-level exec has the most uncertainty and risk at the middle tier. There are two parts to it. There is the part that uses APIs to talk to DBs, CMs and workflow engines to implement custom application logic as needed. This part is not as scary because we have reliable, robust, scalable, stable APIs, and because the components we're talking to are those reliable, robust, scalable, stable, expensive systems sold by companies with important people's ties to yank when there is a support problem. Then there's the scary, assembly-language-of-the-web-part of the middle tier. This is the part that has to juggle a dynamic, multistep end-user experience on the anything-goes client side, where the web browsers are free and you get what you pay for. The upfront cost of the thin client is low, but the time to market with new offerings is the most highly affected because a lot of unpredictable, time-consuming work lives here.
- It allows you to standardize and consolidate the end-user experience into a single business object that represents the overall transaction.
- It allows that business object to access the web services of an SOA as a natural part of the process of going from empty transaction data to completed transaction data.
- It fully represents the transaction and therefore can function as an integral part of the records needed for auditability.
As a final thought, it should be clear from the exponential scales of complexity and cost that the diagram above argues that XForms-based systems have business value through complexity containment, and that said value is rightly reflected in software product cost.
A rich client platform (RCP) is a framework of widgets, widget containers and other processing capabilities that facilitate sophisticated application development and deployment. One important aspect of RCPs is that they tend to be highly cross-platform because platform support is provided by the framework.
In both cases, you get sophisticated capabilities and cross-platform support. In the case of RCPs, you get these benefits by learning a framework, deploying the platform to the client-side and developing applications targeted for the platform. The one negative here is that you are invested heavily in the platform, so if some aspect of the platform is a showstopper (such as large download size), then you won't be able to deliver the application. In the case of RIAs, you get the benefits by learning a framework (browser HTML/JS/AJAX + J2EE), deploying the platform to the server side and developing applications targeted for the platform. The one negative here is that you are heavily invested in the platform, so if some aspect of the platform is a showstopper (such as no offline operation of the application), then you won't be able to deliver the application.
XForms solves the RIA/RCP conundrum by providing a framework for expressing the core data processing asset of an XML-based application independently of the deployment paradigm. There are XForms implementations of both types, RCP and RIA. In particular, the IBM Lotus Forms Webform Server is a product that converts an XForms-based Lotus Form into a rich internet application (RIA) automatically. The IBM Lotus Forms Viewer is a rich client platform (RCP) that provides the run-time for XForms-based Lotus Forms. The application developer is free to think about the application without thinking so much about the pecularities of how it will be deployed.
We're just finishing up the XForms face-to-face meeting in Amsterdam. Given my prior post and the approach of XForms 1.1 toward last call status, it seemed a good idea to talk about the improvements to XForms 1.1 that make the repeat construct easier to work with.
It's about being able to say more exactly what you mean, really. For example, on my first night in Amsterdam, my hotel room became quite cool, but it wasn't clear how to turn up the heat. Apparently I'm just not old school enough to relate to a radiator, so I called down to the front desk. I was informed that in order to turn the heat up in my room I had to "squeeze the knob". On the radiator was implied. Despite being pretty good with a pun, it seems I still needed a friend to help me fully appreciate how important my complaint about the advice truly was, especially in Amsterdam. In fine propeller head form, I complained that the proper advice was to "turn" the knob. On the radiator was again implied. Oblivious to any possible alternate interpretations, I proceeded through the explanation that while squeezing the knob was a necessary component of turning it, it was also an implicit part of the process which created the friction necessary to ultimately turn up the heat. From the radiator was implied.
Anyway, the point is that a lot of trouble can be avoided when one is able to say exactly what one means to say. In the case of XForms 1.1 authors, there are two new attributes on the insert action that make it a lot easier to manage repeat constructs. In XForms 1.0, we add to the container element of a sequence an extra subelement to act as the prototypical data. We must add the prototype as a child of the container element because the insert action in XForms 1.0 can only get the prototype from the last node of the sequence of items over which it operates. This limitation forces you to adjust the repeat to omit the last node, and it forces you to add application logic to remove the last node when the data is submitted or alternatively before it is processed on the server side.
In XForms 1.1, there is a new origin attribute on the insert action. This attribute allows you to give an XPath that says where the prototype for insertion is located. This allows it to be placed in another instance rather than being stored in the container element of the sequence, which eliminates the necessity of adjusting the repeat and of removing the prototype later (it's already in a separate instance).
The second issue that we solved in XForms 1.0 by putting the prototype data at the end of the container element is that we avoided the problem of the container element becoming empty. It is easy to understand the need for an empty shopping cart, but if the data container becomes empty, then the insert nodeset attribute produces empty nodeset, so you don't know where to insert the copy of the prototype node. To be clear, if nodeset="/cart/item" returns empty nodeset, it's like walking off a cliff because you don't know that the XPath expression visited the cart element right before it found zero elements named item within it. In order to insert into an empty repeat, the parent of the elements being repeated must be non-empty in XForms 1.0.
In XForms 1.1, we solved this by adding a context attribute to insert. The context attribute set the context for evaluating the nodeset attribute. The expected use of this attribute is to choose the parent or container element of the nodeset. So, when the nodeset resolves to empty nodeset, the context node is used as the container item into which the prototype node is inserted. So here's the shopping cart example in XForms 1.1.
<!-- Initial instance data contains the prototypical node as the last element -->
<xf:instance id="liveData" xmlns=""> <cart> </cart></xf:instance>
<xf:instance id="protoCart" xmlns=""> <cart> <item> <name/> ... </cart></xf:instance>...
<!-- repeat operates over the live data -->
<xf:repeat nodeset="item" id="repeatCart"> ...</xf:repeat>
<!-- Add new row after any current row, but do it in a way that can also handle zero rows. -->
<xf:trigger> <xf:label>Insert <xf:insert ev:event="DOMActivate" context="/cart" nodeset="item" at="index('repeat-cart')" position="after" origin="instance('protoCart')/item"/></xf:trigger>
<!-- Delete a row from the repeat. -->
<xf:trigger> <xf:label>Delete <xf:delete ev:event="DOMActivate" context="/cart" nodeset="item" at="index('repeat-cart')"/></xf:trigger>
To complete the example from XForms 1.0 in XForms 1.1, we should now look at an example in which you want a repeat that stays at one row. If the user hits delete for that row, then the row stays, but the data is cleared from it.
<xf:trigger> <xf:label>Delete <xf:action ev:event="DOMActivate"> <xf:delete context="/cart" nodeset="item" at="index('repeat-cart')"/> <xf:insert context="/cart" origin="instance('protoCart')/item" if="not(item)"/> </xf:action></xf:trigger>
In the final insert, we see the appearance of the new if attribute to more precisely communicate that the action is conditionally run. In XForms 1.0, you have to use an XPath predicate to produce an empty nodeset, which is a bit like saying "squeeze the knob". It gets the job done, but it ain't pretty.
The final insert also shows off one of the new things decided at this face to face meeting of the XForms team. In the latest working draft, the nodeset binding is listed as required, but we are changing that to optional in order to make writing inserts like the final one above look more natural. It just identifies the container element as the destination of the insert, the origin as the source of the insert, and a condition that says when to do the insert. And now my room is too hot!
Modified by John M. Boyer
Machine learning today is every bit as calculated, as simulated, as is machine intelligence. It is easier to use machine intelligence to highlight how much greater human cognition is, which is why I've been using a machine intelligence algorithm over the last several entries. However, the conclusion drawn so far is that, while machine intelligence is only simulated, it is still quite effective and valuable as an aid to human insight and decision making. Machine learning offers another leap forward in the effectiveness and hence value of machine intelligence, so let's see what that is.
Machine learning occurs when the machine intelligence is developed or adapted in response to data from the domain in which the machine intelligence operates. The James Blog entry only does this degenerately, at a very coarse grain level, so it doesn't really count except as a way to begin giving you the idea. The James Blog entry plays a game with you, and if he loses, he adapts by increasing his lookahead level so that his minimax method will play more effectively against you next time. In some sense, he learned that you were a better player. However, this is only a single integer of configurability with only a few settings of adjustment that controls only one aspect of the machine intelligence algorithm's operation. To be considered machine learning, a method must typically have a more profound impact on the operation of the algorithm, with much more adaptation and configurability based on many instances of input data. An example will clarify the more fine grain nature of machine learning.
The easiest example of which I can think is a predictive analytic algorithm called linear regression. Let's say you'd like to be able to predict or approximate the purchase price of a person's new car based on their age. Perhaps you want to do this so that you can figure out what automobile advertisements are most appropriate to show the person. Now, as soon as you hear this example, your human cognition kicks in and you rattle off several other likely variables that would impact the most likely amount of money a person is willing to spend on a car, such as their income level, debt level, nuclear familial factors, etc. This analytic technique is typically called multiple linear regression (MLR) exactly because we humans most often dream up many more than two variables that we want to simultaneously consider. Like most machine learning techniques, MLR does not learn of new factors to consider by itself. It only considers those factors that a human has programmed it to consider. When they are well chosen, additional variables typically do make an MLR model more effective, but for the purpose of discussing the concept of machine learning, the simple two-variable example suffices since your mind will have no problem generalizing the concept.
Suppose you have records of many prior car purchases, including a wide and nicely distributed selection of prices of the cars and ages of their buyers. This is referred to as "training data". If you plotted the training data, it might look something like the blue points in the image below. Let purchase price be on the vertical Y axis since it is the "dependent" variable that we want to predict, and let age be on the X-axis since it is a predictor, or "independent" variable. MLR uses a standard formula to compute a "line of best fit" through the given data points, again like the one shown in red in the picture.
A line has a formula that looks like this: Y=C1X1+C0, where C1 is a constant that governs the slant (slope) of the line, and C0 is a constant that governs how high or low the line is (C0 happens to be the point where the line meets the Y-axis, and the line slopes up or down from there). If we had more variables, then MLR would just compute more constants to go with each of them. For example, if we wanted to use two variable predictors of a dependent variable, then we'd be using MLR to create a line of the form Y=C2X2+C1X1+C0.
Technically, MLR computes the constants like C1 and C0 of the line Y=C1X1+C0 in such a way that the line minimizes the sum of the squares of the vertical (Y) distances between each data point and the line. For each point, we take its distance from the line as an amount of "error" in the prediction. We square it because that gets rid of the negative sign (and, less importantly, magnifies the error resulting from being further from the line). We sum the squares of the errors to get a total measure of the error produced by the line, and the line is computed so as to minimize that total error.
Once the constants have been computed, it is a trivial matter to use the MLR model as a predictor. You simply plug the known values of the predictor variables into the formula to compute the predicted Y-value. In the car buying example, X1 is the age of a potential buyer, and so you multiply that by the C1 constant, then add C0 to obtain the Y-value, which is the predicted value of the car.
In this way, hopefully you can see that the MLR "learns" the values of the constants like C1 and C0 from the given data points. Furthermore, the actual algorithm that produces the machine intelligence only computes the result of a simple linear equation, so hopefully you can also see that the predictive power comes mainly from the constants, which were "learned" from the data. In the case of the minimax method, most of the machine intelligence came from the algorithm, but with MLR-- as with most machine learning-- the machine intelligence is for the most part an emergent property of the training data.
Lastly, it's worth noting that there are a lot of "best practices" around using MLR. However, these are orthogonal to topic of this post. Suffice it to say that just like the minimax method has a very limited domain in which it is effective as a machine intelligence, MLR also has a limited domain. For example, the predictor variables (the X's) do need to be linearly related to the dependent variable in reality. However, within the limited domain of its linearly related data, MLR is quite effective and an excellent example of a simple machine learning technique that produces machine intelligence within that domain.
A Lotus Form is a document currently expressed in an XML vocabulary called XFDL (html version, pdf version).
This is significant enough, in and of itself, to be set off in a paragraph. XML is a de facto industry standard from the W3C, and this means that widely deployed, interoperable, industry standard software toolkits can be used to introspect and manipulate the content of XFDL, i.e. Lotus Forms documents, thus mitigating issues of vendor lock-in.
But the story gets better...
The XFDL format internally employs W3C standard XForms, so without further reference to any vendor-specific documents, the standard indicates where in the XFDL document to look for the data content created by an end-user who fills in the Lotus Forms document. Anyone with access to a Java reference manual could write code to prepopulate data into an XFDL (Lotus Forms) document before it is delivered to an end-user, and they can write code to extract data from the document when it is returned to the server for processing.
But the story gets even better...
XFDL incorporates the W3C XML Signatures standard, so widely available industry standard tools are available from multiple vendors for validating the security of digital signatures in XFDL documents. In addition to the Apache XML Security implementation, it is notable that Java itself now natively contains support for XML Signatures (JSR 105).
These standards mean that the entire server-side lifecycle of the Lotus Form can be achieved without being locked into using any particular vendor's API. Any application server, any portal server, any server-side environment that can receive HTTP POSTs and process XML in the POST data can be used in combination with Lotus Forms.
And if you ever hear a vendor try to play up high availability of a client-side browser plugin for processing their favorite file format, ask them "Plugin?? What plugin??" With Lotus Forms, you don't need a browser plugin at all because the Lotus Forms Web Form Server converts the XFDL into dynamic HTML that can be processed directly by the browser with no plugins at all. Best of all, when the user finishes interacting with the Lotus Form, the resulting XFDL document is delivered to the application server endpoint, where it can be processed as XML using the standard APIs.
As promised, here are the links to my W3C Track talk at WWW2007 on the Rich Web Application Backplane (html, click here for the ppt file). Also, note that I heavily annotated my slides so you can read more about what I said than just the points on the slides. Let me know your thoughts on the content...
At the specification level, the W3C Forms working group is working on modularization of XForms as one means of promoting incremental adoption. This is following the simple business axiom that "embrace and extend" is economically preferable to "rip and replace". If you have existing web applications, and you want to start incorporating selected features of XForms to get at certain benefits, it helps to be able to get at just those features without having to change your entire application over to XForms in one development cycle. Modularization supports agile development.
The Forms working group is now also working on XForms 1.2, even as XForms 1.1 continues toward full W3C Recommendation status. In XForms 1.2, the second approach the Forms working group is using to promote incremental adoption is to stream-line web application authoring through emphasis on attribute decoration and exploitation of UI element nesting as a means of expressing the simpler classes of MVC applications. The attributes are attached directly to UI elements, but they imply a proper XForms MVC architecture. In other words, the XForms processor can read the attributes and auto-generate a proper model for you. To get an idea of what the stream-lined vocabulary might look like, and what canonical XForms it might correspond to, look here.
The approach of providing stream-lined syntax is simpatico with modularization for two reasons. First, it means we can provide the stream-lined vocabulary and processing model as a module separately from modules that provide the components of the core XForms engine (e.g. the instance data module, the submission module, and so forth). Second, the stream-lined syntax approach means that authors can start with attribute decoration and UI nesting, and then scale up in more complex functional areas as the need arises by adopting the core XForms modules appropriate to the problem at hand.
The stream-lined syntax approach is also interesting because techniques such as attribute decoration, UI nesting, and progressive elaboration of a document are techniques familiar to AJAX developers and made popular by AJAX libraries such as Dojo. In fact, I am really coming to the conclusion that AJAX libraries are a viable alternative to browser-maker adoption as a means of delivering the most web-centric components of the W3C technology stack. It should not be necessary for Microsoft, Mozilla, Apple, Opera and other browser makers to build a W3C technology into their browsers before it becomes widely adopted and deployed. More importantly, this means that the W3C, not the browser makers, defines the web and its standards.
We've been quite busy today in the XForms working group. Our work has been focused on modifications of the functionality of XForms submission for XForms 1.1. Tomorrow we will address any details that come up as the pieces of the spec are put together for the next working draft, but here is an overview:
- We're adding context information to the xforms-submit-done event so that an XForm can access the http return code.
- We're adding support for the DELETE method, which is more reasonable can be done now that the return code will be available. One practical result of this is that you will be able to write an XForms that speaks ATOM.
- We're allowing run-time modification of the submission URL. We'll be adding a child element called
resource to the submission element, and it will be able to use a single-node binding or value attribute to construct a URL that includes data from an XForms instance.
- We're allowing the ability to set content headers for the submission so that, among other things, an XForm can speak WebDAV. We'll be adding a child element called
header that will take a name attribute as well as single-node binding, value attribute, or content for the value of the header. You will be able to put as many headers as you want. The main detail to be fleshed out is out to say that user agents/XForms processors may choose to ignore some of the header settings.
- We're updating the context information available to xforms-submit-error so that XForms authors will be able to determine whether the submission failed due to a validation error versus an error resolving the URL.
Tomorrow we will be discussing aspects of the XForms type model item property. I'm anxiously awaiting the outcome because, frankly, I've been waiting a few weeks now to tell you about these aspects, but I need the group resolutions to occur before posting my comments. I think it will turn out well, though!
Someone recently asked me to explain why the XForms way of binding form controls to XML data with XPath was better than, say, using XSLT templates to match data with XPaths and output an XHTML form. Seemed a fair question.
An XSLT describes a transformation step that happens entirely before the user gets involved, so all the limitations of current XHTML forms still applies to the result of the transformation.
In comparison, XForms defines an intelligent, interactive layer that governs the fill experience for the data. Applying an XSLT to data does not, in and of itself, describes what happens as the user interacts with the data to create new content, so the author of the XSLT is left to encode the old script and HTML devices into the XSLT.
XForms standardizes the following:
- schema-based validation as you fill out the form
- dynamic validation constraints applied as you fill out the form
- calculated dependent values (like tax and total values on a purchase order)
- accessibility text
- access to the web services to let an SOA enrich the fill experience
- preservation of data hierarchy in the UI
- access to template-based UI repetition responsive to data mutations
- user interface switching (presenting panels of controls conditionally or in sequence)
- dynamic user interface properties (readonly, visibility/display)
- XML data submissions to the Web 2.0 server tier
Not only does XForms standardize these things, but it standardizes how they work together.
To further drive home the point, note that XSLT transformation could actually be used as a sympathetic technology with XForms. You could, for example, write an XSLT that transforms an XML Schema into data+XForms rather than data+DHTML. Then, you'd still get all of the above value adds because XForms is a standardized interactivity layer that describes what happens after the XSLT generates the initial rendition.
XForms is an important standard for encoding the core XML data processing asset of a forms application for multiple reasons.
For one, it fills the gap of schema languages, which are predominantly focused on describing what constitutes correct data that should drive server-side transactions. XForms codifies what it takes to get from an empty initial instance of a schema to a completed correct instance of a schema.
Another reason is that XForms provides an efficient language for doing the above, one that is based on such well-known and time-tested techniques as the Model-View-Controller design pattern, declarative formulas based on the spreadsheet algorithm (the second killer app of computing), and asynchronous event-based scripting. Frankly, the scripting capability is very effective for two reasons. First, each command is fine-tuned to the data by all the declarative constructs, so each line of script can do the work of 10 to 100 lines of purely imperative code. Second, the scripting in XForms actions are quite focused on mutating the XML data, so you don't get the baggage of a general purpose language that can create arbitrarily complex data structures that exist only in the memory heap of the run-time processor.
On the one hand, one could conclude that all of these language efficiency and simplicity aspects of XForms are beneficial to application authors who write XForms. But on the other hand, it turns out that the crown jewel is that the language efficiency and simplicity of XForms reverberate into the design tools that can all authors to write the application for a schema without much exposure to pointy brackets, much less coding.
To illustrate the point, I've created a 12 minute video of an XForms design experience on developerWorks to help show you more about what can be done with XForms based on a wizard-driven, point-and-click visual design experience. This video is similar to what I presented in the developer track at WWW 2007 in Banff. Please check it out and let me know what you think.