On Saturday I attended BarCampRDU 2007, and here's a quick round-up.
First, thanks to the organizers and sponsors; it was a fun, geeky way to spend a Saturday, and the amenities were fantastic, especially considering the price of admission. I especially enjoyed the catered coffee bar and lunch from Neomonde.
Here's a list of some of the sessions that were run:
- Distributed SCM - Bazaar, GIT, etc
- Camping in 10 (Ruby)
- Linux virtualization
- Bug House, a chess variant
- Ruby and X-10
- Extending SketchUp with Ruby
- Geeks for Good
- Startup Weekend
- What's stopping you from doing something?
- Open Services
- Digital Identity Management
- FaceBook application development
Lots o' pictures here, and lots o' blog entries here.
Some of the sessions I went to:
Camping in 10
Very cool; I really need to dive deeper into Ruby. Completely random factoid: of the 15 laptops in the room for this session, 12 were Macs.
FaceBook application development
Quite illuminating. I had thought all those 'user generated' FaceBook apps were actually hosted by/on FaceBook, but they aren't. Which makes me even more leery of what those applications are doing with the data we're giving them. Hmmm.
Open Services
Whereas Open Source / Free Software has defined freedoms surrounding source code, the next battleground on the freedom front is the services we use on the web. What does it mean for something to be an "Open Service"? Not sure why Luis thought he "trivialized the discussion"; he absolutely did not. Lots of really interesting things to think about here, especially as we find our data getting mashed up and otherwise reused in ways we never expected.
Other notes ...
Rather than take my laptop into the unconference, I decided to try living with just my Nintendo DS Lite with the Opera browser. I've surfed a bit here and there with it, but not for an entire day. While it's quite capable at handling the degraded 'mobile' sites available for Google Reader, GMail, and Twitter, for instance, it's rather painful for websites designed for desktops. Most folks probably won't tolerate the slow, small device, especially compared to the iPhone. But having worked on embedded devices for years, I can easily overlook the limitations. It's also a lot cheaper than the iPhone.
All in all, it worked out pretty well; I checked my mail, checked Twitter, checked blogs, and posted a few twitters. It was a bit painful pecking out messages on a tiny touch screen with a stylus, but worth it to not have to lug my MacBook around in a backpack. I just had to lug around my man purse.
Lastly, one thing I already knew, but had reinforced, was that all the kewl kids wear bright yellow / orange shirts from Threadless.
A few blog posts reverberating in my mind:
I just ran across a post by Joe Gregorio today, where he's comparing WS-* RPC vs. REST, specifically talking about the fact that WS-* RPC only uses the POST request method of HTTP:
"That POST of a generic media type gives no indication if the request is safe, or idempotent, nor is there any indication of the cachability of the response."
I also just read this morning a post by Leonard Richardson on Amazon's new FPS service, commenting on the 'REST'y-ness of the service:
"its 'REST' interface is about as RESTful as Flickr's and del.icio.us's 'REST' interfaces"
Note: the Flickr and del.icio.us interfaces aren't considered terribly RESTy. :-)
Joe notes a handful of things that "just using POST in HTTP" breaks. Leonard notes that FPS isn't truly RESTy, but is in fact just as non-RESTy as two other fairly popular services.
And I'm left wondering: do we really need to do everything in pure REST?
Perhaps we can identify a small set of characteristics that get us most of the benefits of REST, that would be easier to implement than going 'full' REST. And just focus on those. Because apparently, it's hard to do full REST. Or to be even more pessimistic, perhaps there are disadvantages to using REST. Why would Amazon and Yahoo not use REST?
Most of the obvious advantages with REST revolve around the GET verb. Make sure you only put safe, idempotent actions behind the logic of your GET processing. Use ETags, or even better, cache directives, in your HTTP responses on the GETs to allow clients to intelligently cache things.
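Here's a minimal sketch of that kind of GET handling, as a Java servlet; the hash-based ETag and the WidgetServlet resource are just assumptions for illustration, not a recipe:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical read-only resource: GET is safe and idempotent,
// and the If-None-Match cache validator is honored.
public class WidgetServlet extends HttpServlet {

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String body = loadWidgetAsJson();  // no side effects on GET!
        String etag = "\"" + Integer.toHexString(body.hashCode()) + "\"";

        if (etag.equals(req.getHeader("If-None-Match"))) {
            // Client already has this version cached; skip sending the body.
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
            return;
        }

        resp.setHeader("ETag", etag);
        resp.setContentType("application/json");
        resp.getWriter().write(body);
    }

    // Stand-in for wherever the real data comes from.
    private String loadWidgetAsJson() {
        return "{\"name\": \"widget\"}";
    }
}
```

A client that remembers the ETag it last saw can send it back in If-None-Match and skip the download entirely when nothing has changed.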
What else? Are there really other benefits? If there are, why are people still not jumping on the RESTy bandwagon?
Here's my thought on the lack of adoption of REST: it's complicated for anything but the GETs. Namely, the mutating verbs POST, PUT, and DELETE. Not technically, but semantically. Check the rest-discuss mailing list sometime. I've been having conversations with folks here recently regarding some of the semantics of 'updates' as well; ask 5 people their thoughts, and you'll get 10 different answers.
Martin Nally had told me a few months ago that he thought people should be able to do 90-95% of their services in a RESTy style, and for those interfaces that didn't fit the REST mold, you could do 'em in a non-RESTy style (ie, some kind of RPC-ish non-GET mutation operation), but it should cost you $1000. I think I'm ready to lower the price to $100.
Updated on 2007/08/03 to fix a syntax error in the image element.
Dan Jemiolo: What kind of client generation are you looking for?
I suppose I must not have gotten around to telling Dan my horror stories of using WSDL in the early, early days of Jazz. The low point was when it once took me two working days to get the code working again, after we made some slight changes to the WSDL. Of course, we were doing some evil, evil things, like parsing Java code with a JDT jar in an ant task, and replacing generated code from the WSDL generation process. But still, code generation of this ilk leaves a bad taste in my mouth.
Also see Dare's issues with bananas.
The best code generation is no code generation.
And that's what we changed in Jazz. Because we already had complete control over the data typing story (EMF), we had no problem generating the XML and JSON we wanted, completely dynamically, by reflecting over the data used in our services. But we had to do something about the service methods themselves.
So we rolled our own client and server side stack for this.
We kept the notion of defining the operations in a web service in a Java interface, because this makes a lot of sense to do in Java. We can reflect over it to look at the methods and signatures. On the server, you can write a class to implement the interface, and that's your service implementation. The low-level server interface (ie, Servlet for Java) can figure out what service to invoke, and then call the service implementation reflectively. And on the client, you can use Proxy and friends to build an object which implements the interface by making the HTTP request on your behalf.
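To make the client half concrete, here's a minimal sketch using Proxy; the ItemService interface and the doHttpCall() helper are hypothetical stand-ins, since the real Jazz wire protocol isn't shown here:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical service interface, shared by client and server.
interface ItemService {
    List<String> getItems(String ownerId);
}

public class ServiceProxyFactory {

    // Build a client-side implementation of any service interface;
    // each method call becomes an HTTP request, made reflectively.
    @SuppressWarnings("unchecked")
    public static <T> T createClient(final Class<T> serviceInterface) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                // Marshal the method name and args, make the HTTP request,
                // and unmarshal the response to the method's return type.
                return doHttpCall(serviceInterface, method, args);
            }
        };
        return (T) Proxy.newProxyInstance(serviceInterface.getClassLoader(),
            new Class<?>[] { serviceInterface }, handler);
    }

    // Stand-in for the actual wire protocol.
    static Object doHttpCall(Class<?> svc, Method method, Object[] args) {
        throw new UnsupportedOperationException("wire protocol goes here");
    }
}
```

A client then just calls ServiceProxyFactory.createClient(ItemService.class) and invokes methods on the result as if it were local.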
(Quick aside to note that Jazz services are largely RPC styled, though there are some that are more RESTy flavored - stay tuned; they've caught the REST bug. I think the 'client library' or invocation style is largely independent of the architectural style, so I think everything I'm saying here completely holds for REST as well as RPC, and everything in between.)
By having the client and server work completely reflectively, all we had to do was make sure the data classes and service interfaces were the same (or at least compatible) between the client and server. Understand, "all we had to do" can be a problem; but at least we didn't have the generated code to deal with as well, nor did we have a separate build step for it that can be an agility killer.
It goes without saying that you can get by with a lot less configuration in such a situation. Reflection over Configuration. Don't Repeat Yourself.
Looking back on this, I think this was a great trade-off in the time expended to build the stacks. For instance, we were able to tweak the system in various ways that would have been impossible to do with a code-gen built system. I suspect this is going to be the case for any medium- to large-scaled system built using a number of services. You can either lock yourself into a system under which you have very limited control, and spend your time working around it and fighting it, or you can write code customized to your needs and spend time tweaking as you need.
Let's get back to your original question, but tweak it a bit: What should we do to make it easier for people to build clients?
- Provide machine-readable descriptions of the data flowing over the wire. Not just the HTTP content, but query string parameters where appropriate.
If you can reflect on your service interfaces and data dynamically in the server, then you can generate all this meta-data reflectively as well.
Dan Jemiolo posted today about his restdoc tool, which produces Gregorio Tables from comments in REST service implementations designed to be run in Project Zero. He also posted to the Project Zero forum, and included some screen shots of the output of his tool, here.
Here is what's cool about this:
I love it when I can keep artifact information like this with my source code; it's easy to keep in sync.
It's genuinely useful information.
I'm tired of the "we don't need no stinkin' tools" attitude of some of the RESTafarians. Baloney. I think we can certainly live without overengineered tools like WSDL, but having small tools can help.
Now, for some rocks:
As I mentioned in a comment on Dan's blog, it would probably be useful to have this information available at runtime on the server; the server could actually validate what it's doing. And I'm not talking about the server reading its restdoc information back in; I'm talking about the server having that information available to it, at runtime, obtained reflectively. The advantage is that you know the information is never stale. No more, "When was the last time I ran restdoc again?". In Java, you would presumably use annotations to do this.
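Something like the following sketch; the @HttpOp annotation is made up for illustration (it's not from restdoc or Project Zero), but it shows the shape of the idea:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Made-up annotation describing the HTTP binding of a service method.
@Retention(RetentionPolicy.RUNTIME)
@interface HttpOp {
    String method();   // GET, POST, ...
    String uri();      // URI template
    String format();   // media type of the representation
}

interface EmployeeService {
    @HttpOp(method = "GET", uri = "/employees/{id}", format = "application/json")
    String getEmployee(String id);
}

public class RestDocDump {
    // The same metadata restdoc scrapes from comments, available at
    // runtime, obtained reflectively, so it can never be stale.
    public static void main(String[] args) {
        for (Method m : EmployeeService.class.getMethods()) {
            HttpOp op = m.getAnnotation(HttpOp.class);
            if (op != null) {
                System.out.println(op.method() + " " + op.uri() + " -> " + op.format());
            }
        }
    }
}
```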
While Joe's tables included a non-precise description of the HTTP content for the method / uri rows (eg, "Employee Format"), restdoc only includes the 'format' (eg, JSON). I'd prefer to see more precise typing, but to start with, something as vague as Joe's would be good.
That HTML table looks like it may be too wide to be viewed on an iPhone. The good news is, I don't have to worry about that problem.
Finally, not to knock Dan at all, but I really have to wonder what's going on if the best we can do is describe our services in human readable tables of text. Really? I mean, can't these multi-core, multi-gigahertz computers of ours help out any more than by rendering some HTML to a display that we read while hand-coding our HTTP invocations?
That just isn't right.
I went to the first erlounge RDU meet-up tonight to find out more about Erlang. The meet-up was arranged by Kevin Smith, who provided a short presentation.
I had done a little boning up on Erlang last night, so the presentation wasn't completely foreign to me, and I was able to squeeze a few questions in while the S5 presentation took forever to switch slides. Some of those questions:
Q: Does Erlang compile to binary, or use bytecodes, or ???
A: It compiles to bytecodes.
Q: What's the deployment story for web server code?
A: Run yaws, a web server written in Erlang. Presumably, you'd be able to proxy to this via Apache, like other stand-alone servers (on a shared host, for instance; TextDrive doesn't currently have it installed; I'm asking about it now).
After doing a bit of reading last night, I came away a little less enthusiastic than when I started. Not sure why. The meet-up re-invigorated me a bit. It's probably time to buy the book. Even if I don't actually do anything with Erlang, it's always nice to see what's going on with other languages, in hopes of maybe transferring some of those ideas, as appropriate, into other work you're doing.
On the negative side, it sounds like the error reporting (compilation and runtime) is fairly nasty. And the doc is largely man pages. And there's no real Unicode support. And it's compiled.
It was a good omen to see my old young buddy, and fellow Building-50x-ite, Chris Grindstaff at the meet-up. Chris has been yammering on about Ruby for years. Coincidentally, the erlounge RDU meet-up group is an off-shoot of the Raleigh-area Ruby Brigade. Hmmm.
Two projects in IBM have recently decloaked, both of which I've had the pleasure of working on: Jazz and Project Zero. Both projects have come under some attack by folks as being "not open source". For instance, see Mark Pilgrim's typically humorous response to Project Zero.
I'm not interested in talking about that specific aspect of the projects. I'm an open source commie from way back, but the fact of the matter is that IBM contributes a lot of open source to various communities, and at the same time produces commercial, proprietary software.
What I'm quite happy about with both products is the transparency they provide into IBM's product development process. The term "transparent" I've borrowed from Stephen O'Grady, who used it to describe the Jazz development process. I thought I'd give a quick rundown on the transparency I've experienced in my 20+ years of software development at IBM.
From 1985 till about 1995 I worked on projects internal to IBM, and so didn't really have a need to deal with external customer feedback on the products I was working on. We did have a 'newsgroup'-like system called "IBMPC Conferencing" that was a fantastic resource for IBMers. Not a whole lot of people used it (relatively speaking), which was unfortunate, but that also kept the wheat/chaff percentage quite high. It did serve as a way to communicate with our internal customers, though.
The IBMPC conferencing system expanded sometime in the early 1990's to allow customers access to a restricted set of 'newsgroups' called CFORUMs. This was quite convenient for IBM developers, since it used the same news system we already knew how to use, and customers also had access to it somehow. Sometime later, IBM created a real NNTP server named news.software.ibm.com, which served the same purpose as CFORUMs, but used the more 'standard' NNTP protocol.
By this time, I was working on VisualAge Smalltalk, and we created at least one newsgroup for it on the NNTP server. For other projects I was involved with later, we also created newsgroups on the server.
So, we've provided some kind of newsgroup-y access to product groups for a while. However, we were limited in what we could actually discuss on the newsgroups. While we would frequently accept bug reports from users posted to the newsgroups, we couldn't really provide bug tracking numbers, since there was no way for users to access our bug tracking systems. We did it anyway, to at least give our users a shorter handle by which to reference the bugs. For new feature requests, or requests for when the next release would be made available, you'd typically see a response of "We cannot provide information regarding future versions of the product". Which was insanely frustrating for us developers. But we bit our tongues and pasted that response into posts, a lot.
So now, roll forward to the mid-to-late 2000's, with Jazz and Project Zero. We have 'newsgroup'-y access like we have had for a while. But we also have access to the bug tracking systems. And source code management systems. And a general notion of talking more openly about future release content and dates (I hope!).
It certainly seems to me that, over time, we've become more transparent with our development processes for our commercial products. I think this is great for customers, who will have more direct access to the product development teams and the product development process itself. It's also great for the development groups within IBM, as these open processes are a great equalizer: both new hires and CTOs have the opportunity to converse equally with customers.
I'm a firm believer in this sort of transparency being a benefit to everyone, and I'm looking forward to this becoming the rule rather than the exception for more IBM products.
Joe Gregorio answered some questions about WADL in his post "Do we need WADL?". Also note that Leonard Richardson has chimed in recently on the WADL issue. And I of course have some different thoughts. :-)
Quotes from Joe in bold italic, mine in plain.
If I describe an Atom Syndication Feed in WADL, how close will the generated code be to a feed reader like Bloglines? Or even to a library like Abdera? If I write a really good WADL for (X)HTML how close will the generated code be to a web browser? The point is that generated code stubs are so far from a completed consumer of a web service that the utility seems questionable.
I don't think anyone is expecting a magic machine to appear that eats WADL and generates applications out the other side. At best you're going to get some code stubs. Which is what WSDL does today. And it functionally works. I'm not saying it's nice, or pretty; just that it functionally works. It is easier than writing all the SOAP glorp yourself, so I would say such a scheme has a lot of value.
Code generation of data bindings from XML Schema, though, seems fraught with problems. You can either design a nice document, in which case the resulting code will be 'ugly', or you can design nice objects and your XML schema will be 'ugly'. That's why I'm interested in JSON; perhaps we can have nice objects AND nice serialized formats!
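For example, here's a sketch using Jackson as the JSON binding library (any similar library would do; the Employee class is just an example):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

// A plain, 'nice' Java object ...
public class Employee {
    public String name = "Pat";
    public int yearsOfService = 20;

    public static void main(String[] args) throws Exception {
        // ... serializes to an equally nice JSON document:
        //   {"name":"Pat","yearsOfService":20}
        System.out.println(new ObjectMapper().writeValueAsString(new Employee()));
    }
}
```

No schema, no generated code; the object and the document have the same shape.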
Yes, people want to describe interfaces, and those descriptions are brittle. If I download a WADL and compile my client today it will break tomorrow when you change the service. If, instead, you use hypertext, link following and request construction based on the hypertext and client state, then the interface won't break when you change servers, or URI structures.
I think of interfaces described here in the same way as Java interfaces. Namely, it's an external description of the system that people interact with. The guts can change, the interface can remain the same. One of the nice features of Java, if you're in the 'binding contracts' business (ie, you use Java). So, no, just because the service changes, does not imply that the client breaks.
But even beyond that, there's no reason someone can't deploy a new interface for a service and leave older interfaces still working. Some people create new versions of their service interfaces every few weeks and support each version for about a year.
Code gen is brittle, and I generally dislike it. But some languages don't require any code-gen, like PHP's SOAP support. Just give it a WSDL, and it provides an interface to make calls against, as long as you can figure out what the methods and data are. Even for Java, there are minimizations that can be made; for instance, using dynamic Proxys against generated Java interfaces could leave you with a story of just having to generate Java interface 'stubs', the rest being all handled dynamically.
And of course it would be useful to mention non code-gen uses; even if WADL were totally useless as a code gen device, it might still be handy to have as a documentation format for someone's services. It could also be used to provide validation for the client and server.
... you can't expect me to believe that if you had a carefully crafted WADL you could hand it to WADL2Java and out would pop flickrfs.
Again, of course we're not expecting fully formed apps or filesystems to pop out of a schema-2-language grinder (though I'm curious about what it would mean to plug APP into FUSE). But perhaps something like the "API Kits" listed here? Absolutely! That's what I'm talking about!
Q: You don't expect everything to be built with APP, do you?
Paraphrasing Joe's answer: Not everything, but a lot.
I'm leery of the blog-based legacy of APP. For instance, the second-class nature of binary resources. Also, attributes on collections and entries such as categories, title, etc. A lot of human readable, textual attributes. The kind of stuff you see in ... what is that word again ... oh yeah ... hypermedia.
I'm leery of APP, but I'm hopeful. It's especially nice to think of someone creating all the infrastructure for the CRUD-like interfaces, especially being able to handle HTTP cache validators (I hope). In the end though, you still need to describe the non-Atom data you are transferring over APP.
On Saturday, I attended DCampSouth at the rather bizarrely shaped, but very cool, School of Communication Arts (aka "The Digital Circus") outside Raleigh.
Lots of people will tell you that the best part of conferences in general is the informal, impromptu conversations that happen in the hallways. Unconferences like DCampSouth are all-day, just slightly more structured, hallway conversations.
Big thank you to Jackson Fox and the rest of the crew who put this together.
I'm looking forward to attending the Ruby Hoedown 2007 later this summer, and the next BarCamp RDU this summer or fall.
Finally we have a programmable persistence engine for our browsers. Thank you, Googleplex.
Here's what I want next: more scripting languages. The obvious choices are Python and Ruby. Whatever happened to this? Slingshot would still have a slight advantage over such a contraption, as Slingshot has some additional desktop integration features that browsers don't currently have. It also has the advantage that it's not running in an application shell designed for browsing the entire web; there's no time machine (back button).
There's also Adobe Flex/Apollo to consider, since they will also have an embedded database available. On the language front with Flex, Adobe recently made an ActionScript Virtual Machine 2 (AVM2) Overview available. How long before someone ports some languages to that VM? Especially since dynamic languages like Python and Ruby are a fairly natural fit for the AVM2 engine (compared to the JVM, anyway), and the AVM2 engine will likely be the most widely deployed VM in the near future (it's included in Flash 9).
The one thing I've been most excited about, given the rash of new client products available, is that we've finally got a new "browser war" on our hands. Competition is fantastic; it's going to be a wild next couple of years.
So if you're writing (or generating) contract/interface-level code which can't late-bind to all resources, everywhere, you're not doing REST ...
Is this "We don't need no stinkin' contracts!" meme a reaction to the non-web-friendly WS-* world, what with its overly complex and verbose schemas? Because I think there's plenty of room for some people to apply contracts to parts of the web. I certainly don't believe the entire web can be fully described using some all-encompassing schema language; but small pieces? Sure.
I guess what I don't understand is how you are supposed to describe your services to someone without some kind of meta-data describing the service. Every 'web api' I've ever seen has human readable text describing the URIs, the data expected as input, and the data expected as output. (Admittedly, most of these 'web api's violate best practice HTTP principles somehow, but I think that's not an issue here; they could all be refactored to be better HTTP citizens.) That human readable text is a contract; an interface. In English. Which is terrible. I'd rather have a machine readable version of that contract, so I can generate something human readable from it. And perhaps a validator. And some client stubs. Maybe some test cases. Diagnostic tools. Etc.
What is the alternative to describing your services? How is anyone going to write code to use these services, if they don't know where to send requests, what verbs to use, what data to send, and what kind of data to expect? Instead of Flickr producing a description of their web services like this, they're simply supposed to say "Flickr is now fully REST-enabled. Start here, and have fun!" ??
As with data modelling, I don't feel like there is a single answer to what schema or contract language should be used. I'm not initially sold on WADL (it seems too verbose), and certainly wouldn't use it if there was something else, better, for whatever project I was working on. The shape of the schema language isn't important, as long as it works for you.
So I guess contract-driven HTTP interfaces aren't REST. But this is an area I'm interested in; what name should I use, so I can avoid being labelled as "not doing REST" while I'm optimizing my use of the web by being a good HTTP citizen?
Count me as someone who wants some typing in the REST world, based on the arguments made in the post by Aristotle Pagaltzis last week.
We're talking about contracts here. Contracts need to be formalized, somehow. English is not the best language to use, especially since we have much more precise languages available to us.
My thoughts here are really just an extension of my thoughts on data serialization. Services are just the next level of thing that needs to be meta-described.
Several folks have pointed out WADL (Web Application Description Language) as a potential answer, but it has at least one hole: it doesn't have a way of describing non-XML data used as input or output. For example, JSON. It certainly is simpler and more direct than WSDL, so it does have that going for it.
All in all, good thoughts all around, but we have more work to do, more proof to provide. And by more work, I don't mean getting a handful of experts in a smoky back room mandating what the formats are going to be. In fact, I'm not so sure we need a single 'format'. If you're creating some kind of machine-readable schema to describe your data and your services, you're way ahead of the game.
In any case, don't wait for WADL to be finished before starting to build out schema for your services. Use WADL if you can, or use something else (hopefully simpler) if it's more appropriate for you.
Additional thoughts on Aristotle's post from Tim Bray, Stefan Tilkov and Mike Herrick.
The Redmonkers have been starting to publish video interviews on the web along with their usual audio interviews. Coté seems to be doing most (all?) of the work, and you can catch these as he releases them on his blog.
I like to see people experimenting with new technology, and 'upgrading' from audio to video sounds like a fun experiment (pardon the pun). But it doesn't work for me.
There really isn't that much 'extra' in a video interview, over just the audio. You get to see faces. You get to see some body language. Maybe a picture or two.
Watching an interview means I have to have two senses trained on the media: eyes and ears. You're killing my continuous partial attention!
I can't listen to it on my video-challenged iPod.
The reason I don't have a video-capable iPod is that the situations in which I listen to my iPod don't lend themselves to allowing me to watch something on the device as well: driving, mowing the lawn, washing the dishes, etc.
I fully admit to being an old fuddy-duddy; even Rush Limbaugh does video of his show. Good luck guys, and, if you can, also make the audio portion of your videos available as an MP3. I'm not alone in this wish.
But let me change the direction here. Let's look at an environment that is high on video interaction, and absolutely bereft of audio interaction: your programming environment. Your IDE, be it Eclipse, NetBeans, IntelliJ, Visual Studio, XCode, Emacs, or a text editor and a command-line. How many of these programs use audio to help you develop code? None. Well, I might be lying; I'm not familiar with all of these environments, but I don't recall any of them making use of audio like they do visuals.
When we're programming on all cylinders, we're in 'the zone'. Continuous full attention. Eyes reading and scanning, fingers typing and clicking and moving mice. Where's the audio? It ain't there.
Nathan Harrington has a number of articles up at developerWorks, such as "Monitor your Linux computer with machine-generated music", which discuss ways developers can use audio in their computing environment.
This is good stuff, and we need more of it.
I would be remiss in not pointing out here that audio feedback like this is only useful to those of us lucky enough to have decent audio hardware and software in our heads. Those of us without such luck wouldn't be able to take advantage of audio feedback. On the other hand, folks who lack decent video hardware and software in their heads would most likely appreciate more emphasis on a sense they are more dependent on.
The most obvious use case for audio in a development environment is with debugging. There are a lot of cases while debugging when you just want to know if you hit a certain point in your program. The typical way you'd do this is to set a breakpoint, and when the location is hit, the debugger stops, you see where you are, and hit continue. Breaking your attention and demanding your input. What if, instead, you could set an audio breakpoint, that would play a sound when the location was hit? So your attention wasn't broken. And you didn't have to press the Continue button to proceed.
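Here's a sketch of the idea in Java. Most debuggers let a conditional breakpoint evaluate arbitrary code, so a helper like this, used as the breakpoint condition, gives you a crude audio breakpoint today (the class and method names are made up):

```java
import java.awt.Toolkit;

// Sketch: use AudioBreakpoint.ping() as the *condition* of a
// conditional breakpoint. It plays a sound, then returns false,
// so the debugger never actually suspends the program.
public class AudioBreakpoint {
    public static boolean ping() {
        Toolkit.getDefaultToolkit().beep();
        return false;
    }
}
```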
With regard to audio debugging, I know this has been experimented with many times in the past. I've done it as well, a decade ago, when I was using a programming environment that I was able to easily reprogram: Smalltalk.
But audio usage in development environments is not yet mainstream. There's lots of research to be done here:
- What are the best sound palettes to use: audio clips, MIDI tones, short MIDI sequences, percussion vs tones?
- How should we take advantage of other audio aspects like volume and pitch and three dimensional positioning, especially regarding continuously changing quantities like memory or I/O usage by an application?
- How do we deal with 'focus' if I'm also listening to Radio Paradise while I'm working?
- Does text-to-speech make sense in some cases?
- How do we arrange multiple audio feedback to be presented to us in a way that's not unpleasing to listen to? Not just a cacophony of random sounds.
- Beyond debugging, where else can we make use of audio feedback?
- How might audio be integrated into diagnostic tools like DTrace?
panacea: A remedy for all diseases, evils, or difficulties; a cure-all.
In my post "That Darned Cat! - 1", I complained about Twitter performance, and peeked at some of their HTTP headers, noticing they didn't seem to respect ETag or Last-Modified header cache validator tests.
Since posting, Twitter performance is back on track. I haven't checked, but I'm guessing they didn'tadd ETag support. :-)
A number of people seemed to read into my post that ETags are a cause of Twitter's performance problems. I'd be the first to admit that such a proposition is a bit of a stretch. ETags are no panacea, and in fact you'll obviously have to write more code to handle them correctly. Harder even, if you're using some kind of high level framework for your app. This isn't easy stuff.
And in general, my 20+ years of programming have taught me that your first guess at where the performance problems in your code are is dead wrong. You really need to break out some diagnostic tools, or write some, to figure out where your problems are. Since I don't have the Twitter code, I'm of course at a complete loss to guess where their problems are, when they have them.
ETag and Last-Modified processing is something you ought to do, if you can afford it, because it does allow for some optimization in your client / server transactions. To be clear, the optimization is that the server doesn't have to send the content it would have sent to the client, as the client has indicated it already has that 'version' of it cached. There is still a round-trip to the server involved. If you're looking for an absolute killer optimization though, you should be looking at Expires and Cache-Control headers. See Mark Nottingham's recent post "Expires vs. max-age" for some additional information, along with the link to his caching tutorial.
Expires and friends are killer, because they allow the ultimate in client / server transaction optimization; the transaction is optimized away completely. The client can check the expiration data, and determine that the data it has cached has not 'expired', and thus it doesn't need to ask the server for it at all. Unfortunately, many applications won't be able to use these headers, if their data is designed to change rapidly; eg, Twitter.
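On the server side, that can be as simple as this servlet sketch (the resource and the one-year lifetime are just assumptions for illustration):

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical resource that's stable for a year; with these headers,
// well-behaved clients won't even make the request again.
public class StableResourceServlet extends HttpServlet {
    private static final long ONE_YEAR_SECONDS = 365L * 24 * 60 * 60;

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setHeader("Cache-Control", "max-age=" + ONE_YEAR_SECONDS);
        resp.setDateHeader("Expires",
            System.currentTimeMillis() + ONE_YEAR_SECONDS * 1000);
        resp.setContentType("text/css");
        resp.getWriter().write("/* stable stylesheet */");
    }
}
```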
Sam Ruby also blogged about another great example of Expires headers. How often does the Google maps image data really change?
Here's another great example, applicable to our new web 2.0-ey future. Yahoo! is hosting their YUI library for any application to use directly, without copying the toolkit to their own web site. Let's peek at the headers from one of their files:
$ curl -I http://yui.yahooapis.com/2.2.2/build/reset/reset-min.css
HTTP/1.1 200 OK
Last-Modified: Wed, 18 Apr 2007 17:35:45 GMT
X-Yahoo-Compressed: true
Content-Type: text/css
Cache-Control: max-age=313977290
Expires: Tue, 02 May 2017 04:08:44 GMT
Date: Mon, 21 May 2007 04:13:54 GMT
Connection: keep-alive
Good stuff to know, and take advantage of if you can.
For more specific information about our essential protocol, HTTP 1.1, see RFC 2616. It's available in multiple formats here: http://www.faqs.org/rfcs/rfc2616.html.
In "Lesson learned", my colleague Robert Berry recounts 'losing' a blog post he was editing. Not the first time I've heard this recently. I thought I'd document my process of creating blog posts, in case it's of any use to anyone. Because I don't lose blog posts.
My secret: I use files.
Although many blogging systems let you edit your blog posts 'online', and even let you save them as drafts, I don't actually go into my blogging system to enter a blog post until it's complete. The process is:
- Create a new blog entry by going into the Documents/blog-posts folder in my home directory of my primary computer, and creating a new file, the name of which will be the title of the blog post. The 'extension' of the file is .html.
- Edit the blog post in HTML, in a plain old text editor.
- While editing, at some point, double click on the file in my file system browser (Explorer, PathFinder, Nautilus, etc) to preview it in a web browser.
- Churn on the edit / proof-read-in-a-web-browser cycle, for hours or days.
- Ready to post? First, check all links.
- Surf over to the blog editing site, enter the body of the post into the text editor via the clipboard, set the title, categories / links, etc.
- Preview the post on the blog editing site. Press the "Publish" button.
- Move the file with the blog post from Documents/blog-posts to Documents/blog-posts/posted.
HTML TextAreas are an extremely poor replacement for a decent text editor. Using HTML is handy, since some (most?) blogging systems will accept it as input, and you can preview it yourself with your favorite web browser. Saving the files, even after you've finished posting, is a convenient backup mechanism, should you ever lose your entire blog.
Besides these obvious advantages, I noticed some behaviours of other blogging systems that I really didn't like, when saving drafts of posts 'online':
- On one system I used, the title saved with the first draft was used as the slug of the blog URL. Even if I later changed the title, the slug remained some abbreviated version of the first saved title. Ick.
- On one system I used, tags I saved with a post ended up showing up in the global list of tags on the blog. Even if there weren't any published posts that had used that tag. Ick.
I should note that I also have a directory Documents/blog-posts/unused for posts which I've started, and decided not to post. The "island of misfit blog posts", as it were, but "unused" was shorter.
There you have it! Since you religiously back up the files on your primary computer, you'll have no concern about ever losing a blog post again!
Some more thoughts on Twitter performance, as a followup to "That Darned Cat! - 1".
Twitter supports three different communication mediums:
- SMS text messaging
- A handful of IM (Instant Messaging) services
- HTTP - which can be further subdivided into the web page access, the Twitter API, and RSS and Atom feeds
I'm not going to talk about the first two, since I'm not familiar with the technical details of how they work. Other than to notice that I don't see how Twitter can be generating any direct revenue off of HTTP (no ads on the web pages, even), whereas they could certainly be generating revenue off of the SMS traffic they drive to whoever hosts their SMS service. IM? Dunno.
It would appear, or at least I guess, that most of the folks I follow on Twitter are using HTTP, rather than the other communication mediums. Maybe I'm in a microcosm here, but I'm guessing there are a lot of people who only use the HTTP medium. And there's no money to be made there.
So we've got a web site getting absolutely pounded, that's generating no direct revenue for the traffic it's handling. And it's become a bottleneck. What might we do?
Distribute the load.
Here's a thought on how this might work. Instead of people posting messages to Twitter, have them post to their own site, just like a blog. HTTP-based Twitter clients could then feed off of the personal sites, instead of going through the Twitter.com bottleneck.
This sounds suspiciously like blogging, no? Well, it is a lot like blogging. Twitter itself is a lot like blogging to begin with. Only the posts have to be at most 140 bytes. So let's start thinking about it in that light, and see what tools and techniques we can bring from that world.
For instance, my Twitter 'friends' are nothing but a feed aggregator, like Planet Planet or Venus. Only the software to do this would be a lot easier.
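To give a feel for how little software, here's a sketch of a toy aggregator using just the JDK's XML support; the feed URLs are hypothetical, and a real one would also sort entries by date and handle errors:

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Collect the <item> titles from a few RSS feeds; a 'friends timeline'
// is not much more than this, plus sorting by date.
public class MicroPlanet {
    public static void main(String[] args) throws Exception {
        String[] feeds = {
            "http://example.com/pat/updates.rss",   // hypothetical feed URLs
            "http://example.com/josh/updates.rss",
        };
        List<String> timeline = new ArrayList<String>();
        for (String feed : feeds) {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(new URL(feed).openStream());
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                timeline.add(item.getElementsByTagName("title")
                    .item(0).getTextContent());
            }
        }
        for (String entry : timeline) {
            System.out.println(entry);
        }
    }
}
```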
Josh notes: Hmm, but doesn't RSS/Atom already give us everything we need for a twitter protocol minus the SMS (and text limit)? Indeed. Twitter already does support both RSS and Atom (I didn't see explicit Atom links, but find an RSS link @ Twitter, and replace the .rss URL suffix with .atom). They aren't things of beauty, but it's something to start with. While you can already use blogging tools to follow Twitter, I'm not sure that makes sense for most people. However, reusing the data formats probably makes a lot of sense.
So, why would Twitter ever want to do something like this? I already mentioned they don't seem to be making any direct revenue off the HTTP traffic, so off-loading some of that is simply going to lower their network bill. They could concentrate instead on providing some kind of value, such as contact management and discovery. Index the TwitterSphere, instead of owning and bottlenecking it. And of course continue to handle SMS and IM traffic, if that happens to bring in some cash.
In the end, I'm not sure any one company can completely 'own' a protocol like this forever. Either they simply won't be able to afford to (the expense of running it, combined with a lack of revenue), or something better will come along to replace it.
If you love something, set it free.
There are other ideas. In "Twitter Premium?", Dave Winer suggests building Twitter "peers". This sounds like distributing Twitter from one central site to a small number of sites. I don't think that's good enough. Things will scale better with millions of sites.