As I mentioned in a previous post I am now doing all of my JRS development on a MacBook Pro, using the Rational Team Concert beta 2 client. I hadn't given any thought to the stand-alone server until the work item appeared. I always run the server from within Team Concert as I am working, not from the command line. Anyway, a little shell script hacking later and at least the JRS server.shutdown scripts now work on Mac OS for all the other Mac developers out there.
Looks like Jazz is now for everyone (well, not that progressive modern stuff, but who doesn't like Louis Armstrong?). Seriously, the Jazz web site is now accepting registrations from anyone, lifting the restriction that you be an IBM Rational partner or customer.
Everyone is now welcome to join Jazz.net. A special thank-you to all our Rational customers and partners, the university researchers and students, and everyone else who was part of the Jazz.net early pilot program.
Looking for JRS? Currently our builds aren't in the build list, but you can see our work items on the site.
In a previous post on JCR I mentioned that JRS had consciously avoided the development of a client-side Java API. In fact there is no requirement for application clients to be developed in Java at all. One of the concerns we saw with previous Rational products was the complexity of the API and its proprietary nature, which made interoperability, integration and extension an expensive and complex proposition.
For Java client applications we really don't expect the use of the JDK and have tested with both Apache HttpClient and Abdera (for feed/entry creation and parsing). These seem to be the preferred libraries the application teams want to, and probably should, use.
So at least for us in JRS, if not for the rest of IBM, "J" stands for the Jazz Project and not Java.
In an email response to Jazz REST Services, Bill De hOra asked about the relationship between JRS and JSR-170 (Java Content Repository, JCR). He noted the following language in the IBM description of JCR:
"Every node has one and only one primary node type. A primary node type defines the characteristics of the node, such as the properties and child nodes that the node is allowed to have. In addition to the primary node type, a node may also have one or more mixin types. A mixin type acts a lot like a decorator, providing extra characteristics to a node. A JCR implementation, in particular, can provide three predefined mixin types..."
Well, it seems to me there are a few specific areas of difference between JRS and JSR-170.
Firstly, the strictly hierarchical model of JSR-170 is interesting, but we decided not to be so restrictive and simply to allow the URL-space to be open for users to choose naming schemes: hierarchical, flat or non-contiguous. We did have some pressure from initial consumers to make everything completely Atom-based, so that you had to build a hierarchy of folders. When this led to having to create intermediate nodes we went back to a non-contiguous scheme where the client application chooses the storage scheme most appropriate to them.
Secondly, a driving principle for the work was to ensure the repository itself was as open as possible, with no client language or platform assumptions and so no Java client-side API: everything is documented in terms of the HTTP/APP operations. This again is a departure for us; not only do we tend to assume we're building Java clients and Java APIs, but we advantage Java to the point of making it impossible in some cases to use anything else.

The last major difference with the JSR is the fact that the nodes are "typed", which we decided to avoid in terms of having the server know about the resources (except for the distinction between "simple" resource and feed, analogous to the JSR "unstructured" and "folder"). We also decided that properties should not be attached to a resource as in WebDAV; instead we extract properties from resources, which is where the indexers come in. An indexer is a client-written description of how to extract properties from an XML resource (using XPath) so that specific property forms can be indexed in an efficient way.
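To make the indexer idea concrete, here is a hedged sketch of what such a description boils down to; the property names, paths and the extract_properties helper are purely illustrative and not the actual JRS indexer format:

```python
import xml.etree.ElementTree as ET

# Hypothetical indexer: a mapping from property names to XPath expressions,
# applied to an XML resource to pull out the values worth indexing.
indexer = {
    'title':  './/title',
    'author': './/author',
}

def extract_properties(xml_text, indexer):
    root = ET.fromstring(xml_text)
    props = {}
    for name, path in indexer.items():
        element = root.find(path)  # ElementTree supports a simple XPath subset
        if element is not None:
            props[name] = element.text
    return props

doc = '<doc><title>JRS notes</title><author>skj</author></doc>'
print(extract_properties(doc, indexer))
```

The point is only that the client, not the server, decides which properties matter and how to find them in the resource.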
Well, the last few months have been very busy and really fun - I am writing code for real! I have been seconded to work on the new Jazz REST Services (JRS) project**. JRS is a technology incubator project within The Jazz Project and provides a RESTful, resource-neutral store which I'll talk about in subsequent posts.
This post then is about using Jazz, rather than developing for it, which has been a really positive experience. I've used a whole bunch of source control and configuration management systems over the years: RCS, PVCS, PCMS, CVS, SVN, ClearCase and ClearCase/ClearQuest UCM. They seem to fall into one of two broad categories, file-based or work-item-based; that is, they either deal in checking files and folders in and out, or they track work against work items and you commit the item to check in all the associated change sets. PCMS (way back when) was work-item based, UCM is, and now Jazz is as well; however, the level of integration and ease of use in Jazz is really a huge leap forward from any of those.
The workflow (creating a defect or task, making changes and associating them to the item) is as easy as you think it should be, and the collaboration features to share changes in-flight with team members, request validation of work and so on have been simple enough that even a small team like ours has used them daily. If you have seen any of the demos of Jazz so far you'll have seen Eclipse and Java, lots of Java :-) Well, I can say that this is pretty much the out-of-the-box configuration; however, it works just as well with PyDev and our Python test client projects.
So, to the last part of the title: yep, all my Jazz dev is done on my nice shiny new MacBook Pro. The Jazz client has always been provided in a Mac OS X package and has worked perfectly all the way through the project. And, of course, the screen envy from my ThinkPad-using colleagues is always nice.
** the link will, at least for now, require sign-on, but that should be removed in the next week or so.
So, I promised a post on the Jazz REST Services work, and here it is. So first of all what is JRS?
JRS implements a RESTful repository following the architecture and style of the web. The repository is resource-neutral: you don't have to pre-define resource types, and you certainly don't have to have the repository understand them ahead of use. This is a departure from the way we tend to build tools today, where both client and server know the set and types of "things" during development and this set is not user-extensible. This leads to all sorts of difficult issues in tool integration, for us as well as for customers and partners. We also have problems extending these tool "models" or resource types, as we tend to develop the models with a closed-world assumption, thinking that we can analyze the problem entirely and produce one single model for a given tool domain. Well, we can all imagine how well that works out, and some of us have to live with the consequences.
So our proposition is that rather than producing large, monolithic models and closed tools we develop with a much more fine-grained approach and move from a file-system approach to a repository approach. It is also important that the repository know as little as possible about the resource types, which means the usual approach of moving resource-specific operations to the server should be discouraged as well. The upshot is a server that we're already building some interesting sample projects and product prototypes upon, with a very cool set of features:
Wondering about the duck? You'll see this fellow a lot on our pages; he's an Indian Running Duck (image courtesy of the Indian Running Duck Association) and the mascot for JRS. Why a duck? Well, the three of us in Raleigh who formed the core of the development team got to know each other pretty well locked up in a series of secret hideouts around the IBM campus. Two of us have flocks of runners, and chickens, but you simply can't use a chicken as a mascot, can you!
Via /. I read this great article on ars technica titled MIT startup raises multicore bar with new 64-core CPU. More interesting is this quote from the article:
"Tell me if this sounds familiar: a grid of processor "tiles" arranged in a mesh network, where each tile houses a general purpose processor, cache, and a non-blocking router that the tile uses to communicate with the other tiles on the chip."
Makes that Intel Core Duo in my ThinkPad seem pretty tame now, doesn't it? But seriously, the question has already been raised on Slashdot: how do we program this, and efficiently? The company is Tilera, a small player, but maybe the first of many?
I was checking up on Anant Jhingran's blog and noticed that he mentioned Erlang; going back up to the top, I realized he was discussing a post from Sam Ruby. Sam discusses a set of "long bets", although some seem like pretty short bets to me. It also seems as if Sam's posting caused enough of a stir to prompt an apologia posting :-) I agree with Sam's comments on Erlang, and while the world probably doesn't *need* another programming language, I do think we can learn from the set of languages out there, and in the area of concurrency there are lessons we need to learn.
I also try to separate out language, VM and library as Sam does, so I do like the .NET CLR as a VM, mainly because a lot of thought about the kinds of languages planned to run on it was done before implementation; this means adding Python (IronPython), Ruby (IronRuby) and others has been interesting to watch. As for languages, I think many of my preferences are known, but as a collector of programming languages I am probably not an objective source.
One comment on Sam's apology mentioned Stackless Python, something I had been planning to write a post on myself. The difference I see is that stackless is an enabling feature of the VM; we will still need the language-level primitives for message send/receive I see in Erlang, and then library support for managing local and remote distribution.
I have a copy of the new book Beautiful Code: Leading Programmers Explain How They Think (you can also check out the O'Reilly Beautiful Code Home). My concern is that beauty, depending on how you define it in this context, does not seem to me to be the way to measure or judge code. Now, some people seem to define beauty in terms of the readability of code, and that is important for those who follow in your footsteps. Some define it in terms of the simplicity and compactness of an algorithm and implementation, and again that seems valuable in that a smaller implementation tends to be more understandable (fills fewer pages in the brain). But those who become enamoured with the elegance, symmetry or "beauty" of code should remember the words of Donald Norman "".
My personal preference is to see well-laid-out code, readability, simplicity and openness as great tools in the service of safe code. I would be much happier to judge the value of my code on what the test team think of it than on the adulation of other programmers (even though that is nice). Code that doesn't come back to haunt you, that's beautiful. So what else can we include in the list of tools for developing safe code? Well, Bryan Cantrill discusses the book here, but more interestingly here, where he argues that programming language choice plays a part in beautiful (and, by my extension, safe) code. This is an area which tends to bring about some heated, even passionate, discussion, but I believe that language choice really does make a difference in both the ease with which concise and clear code can be written and the ability to develop safe code.
To this end, one area where I think many programmers struggle is the development of parallel code, and with the widespread availability of multi-core machines (it's hard to buy a PC these days which isn't a Duo) it's a skill more of us will need when our jobs include the performance of applications. This is certainly part of the discussion in a new book on the language Erlang, a language which includes simple, compact and elegant parallel primitives. I spent some time working in Ada, which has a good set of parallel abstractions, and while Ada has many problems it is interesting that few of the popular languages today provide much in the way of parallel primitives beyond Thread classes and synchronized keywords. I'm not sure that Erlang is going to be any more successful than Ada outside of its current niche, but it is now a fully open-sourced project and does seem to be generating quite a bit of buzz. The nice thing about Erlang is that it combines a good functional language, single assignment and a high-level set of parallel primitives in an elegant (dare I say beautiful?) manner. Whether Erlang takes off or not, I do think we'll have to work out a way to keep our code beautiful when it is split into numerous components running in parallel across different cores, processors, blades or machines.
Another lesson I have learned is to distrust beauty. It seems that infatuation with a design inevitably leads to heartbreak, as overlooked ugly realities intrude. Love is blind, but computers aren’t. A long term relationship – maintaining a system for years – teaches one to appreciate more domestic virtues, such as straightforwardness and conventionality. Beauty is an idealistic fantasy: what really matters is the quality of the never ending conversation between programmer and code, as each learns from and adapts to the other. Beauty is not a sufficient basis for a happy marriage.
Django is cool - and to be really clear if you think I mean Django Reinhardt then yes I agree he is very cool, or perhaps you think I mean Pearl Django and yes they are pretty darn cool too, but if you thought instantly of the Python "The Web framework for perfectionists with deadlines" then we're on the same page (though that means we probably both need a life).
As part of the team here we tend to develop prototypes to prove out certain technical risks, and right now my favorite platform for these throw-away projects has become Django (although for more control over low-level details Twisted is great, if a bit more work). For web applications Django has so much in the box that it's very easy and remarkably quick to get going; however, what we were trying to do was a little different, so one of the things we had to do was add a few pieces to the Django framework itself, which turned out to be a lot less work than we thought. Specifically we needed two new capabilities not included in the current Django (0.96): a model field holding a UUID value (optionally auto-generated), and validation that a field value is a legal regular expression.
The first was easy: we simply subclassed the standard Django CharField model field class, fixed its length at 36 characters and used the uuid module to generate a value if the 'auto' property is set. Note that the uuid module is included in Python 2.5 but not 2.4 or earlier, so you'll need to download it from Ka-Ping Yee. We also ensured that if 'auto' is set then any such property is not editable in the Django admin UI (this logic was taken from the current implementation of auto properties in Django itself). The code below shows the content of a module used in a number of places in the project; specifically, the class UuidField is used by our model classes.
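The original module isn't reproduced here, so the following is a self-contained sketch of the approach; the minimal CharField is a stand-in for Django's real model field class, and the details of our actual UuidField may differ:

```python
import uuid

# Stand-in for django.db.models.CharField, just enough to show the
# subclassing approach described above; illustrative only.
class CharField(object):
    def __init__(self, maxlength=None, editable=True, default=None):
        self.maxlength = maxlength
        self.editable = editable
        self.default = default

    def get_default(self):
        return self.default

class UuidField(CharField):
    def __init__(self, auto=False, **kwargs):
        self.auto = auto
        kwargs['maxlength'] = 36          # fixed at the length of a UUID string
        if auto:
            kwargs['editable'] = False    # auto fields are hidden in the admin UI
        CharField.__init__(self, **kwargs)

    def get_default(self):
        if self.auto:
            return str(uuid.uuid4())      # generate a fresh value when 'auto' is set
        return CharField.get_default(self)
```

A model class would then declare something like `id = UuidField(auto=True)` and pick up a generated value by default.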
The second was also relatively easy, though it took a little longer to find some code to crib from. The result is also shown below in the isValidRegularExpression function. The approach is pretty simple (simplistic?): pass the field value through the regular expression compile function and, if that throws an exception, assume the value is not a legal expression. This seems to work well enough for our purposes.
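Since the function itself didn't survive into this post, here is a sketch of that compile-and-catch approach; the real validator plugs into Django's validation machinery, while this stand-alone version simply returns a boolean:

```python
import re

# Sketch of the approach described above: run the value through re.compile
# and treat any re.error as "not a legal regular expression".
def isValidRegularExpression(field_data):
    try:
        re.compile(field_data)
        return True
    except re.error:
        return False
```

For example, `isValidRegularExpression('a+b*')` holds while `isValidRegularExpression('(unclosed')` does not.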
There are a few more Django tweaks as well as some tips/tricks we found that hopefully I can post over the next week or so.
Python has a decent set of libraries for XML processing (SAX, DOM and ElementTree), but unfortunately some of the more advanced processing tends to be supported only in add-on packages such as PyXML or 4suite. One particular capability I needed in the last few weeks was a reasonably complete XPath implementation, and since I was already using the standard library DOM for manipulating documents I'd rather not change to either PyXML or 4suite. Then I got to thinking: there is another issue lurking here, the same issue that 3GL programmers have faced with SQL, the mixing of two languages in an application. Specifically, my logic is written in Python with the syntax and semantics of the language front-and-center; however, when I want to query my XML resources I have to use an alternative language, and I cannot code in that language directly; rather I have to construct an expression in a string and submit it to a single "evaluate" method.
The question really was: can I use the data/programming model of XPath directly from within my Python code? It seemed simple enough, and I started to sketch out what this would look like, taking some simple but reasonable example queries. The result was a very simple xpath library with two classes, XPathNode and XPathSequence, that both contain methods corresponding to the axis navigation and standard functions defined by XPath; as an example, the mapping from XPath to the methods on XPathNode/XPathSequence is shown in the following table. This allows the programmer not only to "think" in XPath but to avoid context-switching between languages, and also to make use of editors with syntax highlighting, command completion and so forth for their XPath code.
In the same way the functions root, element, node, comment and text as well as the features name, local-name and namespace-uri are all present as methods on XPathNode.
This results in a relatively consistent API that can be used instead of having to construct a string representing XPath in your Python code, evaluating it and parsing the results. For example, consider the following example code.
from xpath import minidom
from xpath.predicates import *
This seems pretty simple, and while the API is not complete yet and has some issues (asking for an attribute or element named "*:something" or "this|that" doesn't work yet), I have also started on a simple parser to convert an XPath string into a set of commands that can be executed; in other words, compile an XPath string so that it can be used over and over. One key to making this work efficiently is the use of Python partial functions; these are used extensively inside the code as well as in the API for dealing with predicates. The following example illustrates this.
from xpath import minidom
from xpath.predicates import *
The evaluate() method on XPathNode and XPathSequence takes a partial function (a Python callable that has had some of its parameters already bound), which is then invoked within the evaluate() method with the context node as an additional parameter.
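To illustrate just that calling convention, here is a minimal stand-in (this XPathNode is not the real class, and name_is is an invented predicate) showing a predicate built with functools.partial and handed to evaluate():

```python
import functools

# Minimal stand-in for the XPathNode described above, showing only the
# evaluate() convention: the predicate arrives with some arguments already
# bound, and evaluate() supplies the context node as the final argument.
class XPathNode(object):
    def __init__(self, name):
        self.name = name

    def evaluate(self, predicate):
        return predicate(self)  # invoke with the context node appended

def name_is(expected, node):
    return node.name == expected

pred = functools.partial(name_is, 'chapter')  # bind everything but the node
print(XPathNode('chapter').evaluate(pred))    # True
print(XPathNode('section').evaluate(pred))    # False
```

The same partials can then be reused against any number of context nodes, which is what makes the compiled-XPath idea efficient.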
Hopefully I will get a chance over the next few weeks to clean up and post the code - at least get it to pass a decent set of unit tests.
Here's a little fun aside for you: a problem I sat down to with three adults, a 10-year-old and a 7-year-old. The problem concerns a happy chap by the name of Eric the Sheep, which I will summarize below (do visit the site though; there are some really interesting problems for kids).
Eric the sheep is lining up to be shorn before the hot summer ahead. There are fifty sheep in front of him. Eric can't be bothered waiting in the queue properly, so he decides to sneak towards the front. Every time Eric passes two sheep, one sheep from the front of the line is taken in to be shorn. How many sheep will be shorn before Eric?
Well, once we were close to the answer, and being a geek at heart, out came the ThinkPad and Python for a quick solution check ... so enjoy this.
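The original script isn't reproduced here, but a quick simulation, under the reading that one sheep goes in to be shorn each time Eric sneaks past two, might look like this:

```python
def sheep_shorn_before_eric(in_front):
    """Count the sheep shorn before Eric reaches the front of the queue."""
    shorn = 0
    while in_front > 0:
        in_front -= 1                  # the front sheep goes in to be shorn
        shorn += 1
        in_front -= min(2, in_front)   # meanwhile Eric sneaks past two sheep
    return shorn

print(sheep_shorn_before_eric(50))     # 17, under this interpretation
```

The fun with the kids is spotting the pattern: each round removes three sheep from in front of Eric, so the answer grows by one for every three sheep in the starting queue.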
I wouldn't normally use this blog for a commercial, and I remember being told by an IBM Distinguished Engineer that he rarely passed on book recommendations because they can be very personal and some people may not like his choice (didn't stop him giving me his recommendation on that occasion). But having just finished reading Dreaming in Code by Scott Rosenberg (subtitled Two Dozen Programmers, Three Years, 4,732 Bugs, And One Quest For Transcendent Software) I have to say it is an excellent read in part because of the care and detail that obviously went into the research.
The book follows the development of Chandler which I had been following as an interesting and ambitious Python application. The book not only looks at the particular issues faced by the Chandler team but also how this relates to perennial problems faced in software development. It was particularly interesting for me as the reason this blog has been a little light in recent days is due to the start up of a project here in IBM Rational which I will hopefully be able to talk about in the next few weeks - at least in general. Watching real time slowly stretch out into software time has been frustrating but inevitable I suppose.
P.S. the recommendation given to me was for Lean Software Development: An Agile Toolkit by Mary and Tom Poppendieck.
OK more Python, some code and a rant -- it all happens here.
Rant On First let me get this out of my system: Python has a great standard library, and it's great seeing packages extended and matured, and new versions come out that provide additional capability or performance. The Cheeseshop and Sourceforge also host many great Python projects. So why do we see so many of these packages and frameworks re-create standard features such as logging? For example, I've been using both Twisted and Django lately and both include their own custom logging, which seems to add nothing over the capabilities of the standard library logging package. I know most of us are used to print statements (or System.out.println for my Java colleagues), and it's a slippery slope: you create debug() methods, then move them to their own package, and soon you have your own logger. Now, when we have particular needs we sometimes have to roll our own solution, but hopefully those requirements get fed back into the main project for the community to benefit from. Rant Off
In the case of logging, eventually we deploy systems and these logs are important: we collect them, analyze them and correlate events across systems to determine faults. It's this thinking that led to the Common Base Event format, which IBM and others brought to OASIS and which has now become the WSDM Event Format, or WEF. The purpose is to provide a common, XML-based event record format that can be used to log events for later correlation and analysis. An example WEF event is shown below, generated by an extension package for Python's standard logging. This package (available as loggingx-0.1.1.zip) uses the standard extension mechanisms of the logging library, in this case providing a custom event formatter object (WEFFormatter) that can convert Python event records into the XML shown below. There are also libraries for other languages; notably, in Java you can use the Apache MUSE project.
<ManagementEvent ReportTime="2007-02-14T18:12:45Z"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://docs.oasis-open.org/wsdm/muws1-2.xsd">
  <EventId>urn:uuid:e0bf22a6-bc80-11db-bfac-0014a4dafba5</EventId>
  <SourceComponent processId="5676" locationType="path" threadId="2344"
      component="logtest" componentType="Module"
      application="d:\Python25\python.exe"
      location="d:\my projects\eclipse3.2m3\logging\src\test\logtest.py"
      componentIdType="Module">
  </SourceComponent>
  <ReporterComponent executionEnvironment="win32" componentIdType="Module"
      subComponent="WEFFormatter" component="logging" componentType="Module"
      location="skjohnt60p" locationType="Hostname">
  </ReporterComponent>
  <Situation>
    <SituationCategory>
      <ReportSituation>
        <Log></Log>
      </ReportSituation>
    </SituationCategory>
    <SituationTime>2007-02-14T18:12:45Z</SituationTime>
    <Priority>50</Priority>
    <Severity>4</Severity>
    <Message>Oops, that didnt really work now did it.</Message>
  </Situation>
  <extendedContent>
    <extendedDataElements type="xsd:int" name="Line">
      <values>36</values>
    </extendedDataElements>
    <extendedDataElements type="xsd:string" name="Exception">
      <values>integer division or modulo by zero</values>
      <children type="xsd:string" name="Traceback">
        <values>File: D:\My Projects\eclipse3.2m3\logging\src\test\logtest.py, Line: 34</values>
      </children>
      <children type="xsd:string" name="Type">
        <values>ZeroDivisionError</values>
      </children>
    </extendedDataElements>
  </extendedContent>
</ManagementEvent>
The following snippet shows how to connect this custom formatter to the logger, setting the log level to DEBUG so we should see lots of messages as we go. If you provide a 'format' value on the call to basicConfig() it will be ignored by the WEFFormatter, which doesn't support any configuration of the output format.
import logging # standard library!
from loggingx.wef import WEFFormatter
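Since the loggingx package itself isn't reproduced here, the runnable sketch below wires in a stand-in formatter the same way any logging.Formatter subclass (WEFFormatter included) would be attached; StandInWEFFormatter is purely illustrative:

```python
import logging

# Stand-in for loggingx.wef.WEFFormatter: a custom Formatter that renders a
# record as XML-ish text. The real WEFFormatter emits full WEF events.
class StandInWEFFormatter(logging.Formatter):
    def format(self, record):
        return '<Message>%s</Message>' % record.getMessage()

logging.basicConfig(level=logging.DEBUG)         # DEBUG so every message shows
for handler in logging.getLogger('').handlers:
    handler.setFormatter(StandInWEFFormatter())  # replace the default format

logging.getLogger('logtest').error('Oops, that didnt really work now did it.')
```

The key point is that everything goes through the standard library's extension seam: basicConfig() sets up the root handler, and setFormatter() swaps in the custom output.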
The example event above was actually reporting a Python exception, using the exception() method on the logger object. This method includes exception information in the event record before passing it to the formatter object, and our WEFFormatter uses the extendedContent tag to encompass details about the exception (the traceback information and the type of the exception) in the formatted event. The test code which generated the XML above is shown in the snippet below.
try:
    1/0
except Exception, e:
    logger.exception("Oops, that didnt really work now did it.")
However, gobs and gobs of XML are not necessarily the most effective or efficient native format for log records (and there are many applications that have well-known existing log formats, such as Apache), and so the loggingx package also includes a module for logging events to a SQL database using the Python standard DBAPI. This module allows multiple applications to log to the same database and therefore provides an efficient method for log analysis across applications. In this case we simply created a custom log Handler object, called DatabaseHandler, which can take a set of initialization parameters to choose the database API and provide connection details. Below is a snippet showing how to use the default configuration (SQLite3, provided with Python 2.5) and how we add this handler to the logger. Note that this handler will ignore any provided formatter as it has to do its own 'formatting', that is, extract data from the log record and insert it into a SQL row.
handler = DatabaseHandler()
logger = logging.getLogger('')
logger.addHandler(handler)
However, for analysis applications expecting to use WEF (such as those provided in the IBM Autonomic Tool Kit), an additional module in the package provides a function to extract records from a log database and format them according to any provided log formatter object. The snippet below shows how the run_extract() function works, taking an existing, opened connection object, any log formatter object and the identifier of the first log record to extract. In this manner we could extract log records not only into WEF events but perhaps into Apache-style log records or more.
from loggingx.db.extract import run_extract
from loggingx.wef import WEFFormatter
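For readers without the loggingx package, here is a self-contained sketch of the same design using only the standard library; SQLiteHandler and extract are stand-in names for the package's DatabaseHandler and run_extract, and the schema is invented for illustration:

```python
import logging
import sqlite3

# Stand-in for the DatabaseHandler described above: a logging.Handler that
# extracts fields from each record and inserts a SQL row. Any attached
# Formatter is ignored, as the post notes.
class SQLiteHandler(logging.Handler):
    def __init__(self, connection):
        logging.Handler.__init__(self)
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS log "
            "(id INTEGER PRIMARY KEY, name TEXT, level INTEGER, message TEXT)")

    def emit(self, record):
        self.connection.execute(
            "INSERT INTO log (name, level, message) VALUES (?, ?, ?)",
            (record.name, record.levelno, record.getMessage()))
        self.connection.commit()

def extract(connection, formatter, first_id=0):
    # Sketch of the run_extract() idea: re-read rows starting at first_id,
    # rebuild a LogRecord and hand it to whatever formatter is supplied.
    results = []
    rows = connection.execute(
        "SELECT name, level, message FROM log WHERE id >= ?", (first_id,))
    for name, level, message in rows:
        record = logging.LogRecord(name, level, 'log.db', 0, message, None, None)
        results.append(formatter.format(record))
    return results

conn = sqlite3.connect(':memory:')
logger = logging.getLogger('demo')
logger.setLevel(logging.DEBUG)
logger.addHandler(SQLiteHandler(conn))
logger.warning('disk almost full')
print(extract(conn, logging.Formatter('%(levelname)s %(message)s')))
```

Because extraction takes an arbitrary formatter, the same stored rows could be rendered as WEF XML, Apache-style lines, or anything else.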
Hopefully over time we will see more and more projects using the standard logging package and, if they have special needs, extending it themselves and contributing those changes back to the community.
In a recent project I needed a more fun demo than the usual "hello world" and having just used my copy of Microsoft Streets and Trips with the bundled GPS I got to wondering...
So, how hard would it be to write some Python to read the GPS device? Well, first you have to work out that a GPS device conforms to a well-known standard published by the National Marine Electronics Association (I found a good reference here). Then you have to realize that while the GPS device is connected on a USB cable it acts like an old-fashioned serial port, so I grabbed a copy of the very good PySerial package and started experimenting. The record format is pretty simple, so I wrote some decoders so that the GPSDevice class (you can download the GPS module here) returns Python dictionaries for the most common (and interesting) record types.
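To give a flavour of what those decoders do, here is an illustrative sketch of decoding an NMEA GLL sentence into a dictionary; decode_gll and the dictionary keys are assumptions for illustration, not necessarily the GPS module's actual API:

```python
# Illustrative NMEA decoder sketch. A GLL sentence looks like
# "$GPGLL,4916.45,N,12311.12,W,225444,A*1D":
# latitude, N/S, longitude, E/W, UTC time, status, checksum.
def decode_gll(sentence):
    body = sentence.lstrip('$').split('*')[0]  # drop the '$' and the checksum
    fields = body.split(',')
    return {
        'sentence':  fields[0][2:],            # strip the 'GP' talker prefix
        'latitude':  fields[1] + fields[2],
        'longitude': fields[3] + fields[4],
        'time':      fields[5],
    }

print(decode_gll('$GPGLL,4916.45,N,12311.12,W,225444,A*1D'))
```

Each sentence type (GLL, RMC, GGA and so on) just needs its own small field mapping along these lines.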
The following example shows how the code works, nothing dramatic but certainly a little more interesting to have some dynamic information in the demo I was putting together. One last task was formatting dates and longitude/latitude values correctly, but that's all done now as well.
# GPSDevice comes from the GPS module mentioned above
import sys

port = 6
gps = GPSDevice(port)
gps.open()
for record in gps.read_all():
    if 'forever' in sys.argv:
        print record
    elif record['sentence'] == 'GLL':
        print 'I am hanging out at long %s, lat %s at %s' % (
            record['longitude'], record['latitude'], record['time'])
        break
Oh, if anyone has some need then it wouldn't be hard to put together more record decoders.