Throughout the history of mathematics, new ideas have tended to be labeled in a demeaning way. The first few that come to mind are negative numbers, irrational numbers, and imaginary numbers. It's always as if the person who invents a new concept to solve a problem is somehow losing his mind because all the sane people use numbers that are positive, rational and real!
The same thing seems to be happening in the web world. I've recently heard the assertion that it is illegal to use XML constructs to upgrade the HTML language because it is illegal to send content over the wire as anything but tag soup if the content-type is text/html. On the one hand, I could be losing my mind, but on the other, maybe the assertion doesn't hold up to scrutiny. The argument seems to be that the user agent would have to commit the sin of content-sniffing in order to determine whether or not the text/html content can be fed to an XML parser. See how the label 'content-sniffing' just oozes negative connotations?!
Well, all of those clever mathematicians of old won out in the end because, quite frankly, their ideas worked. And so it will be with content-sniffing.
To be honest, I don't see the logical difference between testing a document for XML well-formedness versus checking a
In fact, all features that are placed into IBM Workplace Forms are activated by version. So, if you write a form using, say, version 6.0 of the XFDL language, the form continues to work in the same way in the latest version of the product. Changes to the way features work, and even most bug fixes, are implemented in a version-specific manner so that forms generally don't change behavior as the user agent is updated.
In essence, the Workplace Forms viewer products do a form of content-sniffing by activating certain subprocessors based on some of the content in the document. The point is that the document is still of content type application/vnd.xfdl, but we don't assume that a document processor must be dumb except for the information it derives from HTTP content negotiation. Perhaps this comes from our more document-centric focus. After all, you don't get HTTP content negotiation when loading a form previously saved to the local disk drive!
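To make the idea of version-activated subprocessors concrete, here is a minimal sketch in Python. It is not how Workplace Forms is actually implemented: the `version` attribute on the root element, the handler names, and the default of "6.0" are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def process_v6(root):
    # Hypothetical handler: behavior frozen as it was in version 6.0.
    return "processed with 6.0 rules"


def process_v7(root):
    # Hypothetical handler: newer behavior, only for documents that ask for it.
    return "processed with 7.0 rules"


# Map each declared version to the subprocessor that implements it.
PROCESSORS = {"6.0": process_v6, "7.0": process_v7}


def process(document_text):
    """Pick a subprocessor by looking inside the document itself."""
    root = ET.fromstring(document_text)
    # Assumption: the version is declared in a 'version' attribute on the
    # root element; documents without one are treated as the oldest version.
    version = root.get("version", "6.0")
    handler = PROCESSORS.get(version)
    if handler is None:
        raise ValueError(f"unsupported version: {version}")
    return handler(root)
```

Nothing here depends on how the document arrived. A form loaded from disk and one fetched over HTTP go through the same dispatch, and an old form keeps its old behavior when a new handler is added to the table.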
The same kind of thing needs to happen with HTML processors. The html tag could bear a boolean attribute that indicates whether or not the content is well-formed XML. Alternatively, a version tag could be added whose value would imply whether XML was used. This way, new features of HTML could be added to the XML-based variant, and user agents could be
But either way, serving well-formed XHTML with the content type text/html is not only valid, it is something people do all the time. Why? Because it works.
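For what it's worth, the kind of sniffing I'm talking about is not exotic. Here is a rough sketch in Python, using only the standard library, of a processor that tries an XML parser on text/html content first and falls back to a lenient tag-soup parser if the content is not well-formed. The function and class names are mine, and no real browser is claimed to work this way.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient fallback: records start tags without requiring well-formedness."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_text_html(body):
    """Return (mode, tag names) for content served as text/html."""
    try:
        # The sniff: if the body parses as XML, treat it as XML.
        root = ET.fromstring(body)
        return ("xml", [element.tag for element in root.iter()])
    except ET.ParseError:
        # Not well-formed, so handle it as tag soup instead.
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return ("tag-soup", collector.tags)
```

Given `<html><body><p>hi</p></body></html>` this takes the XML path, and given `<html><body><p>hi<br></body>` it quietly falls back to the tag-soup path. The content type is text/html in both cases.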