Starting with SPSS version 14, we put a lot of effort into adding several programming languages to SPSS. We've kept this up now through four SPSS versions. What's the point? After all, SPSS already had a rich command language familiar to hordes of users, and it also had SaxBasic scripting. And why did we come up with something so unfamiliar to SPSS users? I'll sketch some of the major motivations here. Tell us what you think about what we have done by commenting on this (and future) posts.
First, though, rest assured that SPSS syntax - even the ugly macro language - and Basic scripting are not going away. In fact, in version 16, where Python scripting (as opposed to Python programs) was introduced, a huge amount of work went into reimplementing the Basic/COM scripting interfaces in the new architecture. If my memory is correct, there were 310 APIs (application programming interfaces) that had to be reimplemented, in addition to creating all the new Python ones.
In designing programmability back in version 14, we started down the road of enhancing the SPSS command language to add more programming features. The SPSS language lacked many important characteristics of modern programming languages, and its style was not what younger programmers expected. It worked well for statistical procedures and data transformations, but many useful things were difficult or impossible to accomplish using it. Some of these gaps could be worked around using SaxBasic, but since SaxBasic was a front-end scripting language originally intended mainly as a way to manipulate objects in the Viewer, that was never a great solution.
It was hard to write jobs, other than transformation programs, that were truly general rather than building in a lot of assumptions about the input, such as the variable names. And it was very hard for a job stream to react to results or characteristics of the data and apply logic to decide what to do next. You might want, for example, to open an arbitrary dataset, inspect the metadata such as variable measurement levels or look for patterns in the variable names, and carry out some analyses automatically based on that. Or you might want to inspect the output from a procedure such as REGRESSION and take some action if the fit is unsatisfactory, or report outlier cases to some agent for review.
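To make the first scenario concrete, here is a minimal sketch in Python of choosing analyses from variable measurement levels. It is an illustration under stated assumptions, not the way any particular job was written: the helper name build_syntax and the sample variables are hypothetical, and the metadata is passed in as a plain list so the sketch runs outside SPSS. Inside a BEGIN PROGRAM block the names and levels would instead come from the spss module (for example spss.GetVariableName and spss.GetVariableMeasurementLevel), and the generated commands would be run with spss.Submit.

```python
# Hypothetical sketch: pick a procedure for each variable based on its
# measurement level, then generate ordinary SPSS syntax for it.
# Assumption: levels are the strings "scale", "ordinal", "nominal".


def build_syntax(variables):
    """Return a list of SPSS commands for (name, measurement_level) pairs."""
    scale = [name for name, level in variables if level == "scale"]
    categorical = [
        name for name, level in variables if level in ("nominal", "ordinal")
    ]
    commands = []
    if scale:
        # Summary statistics make sense for scale variables.
        commands.append("DESCRIPTIVES VARIABLES=" + " ".join(scale) + ".")
    if categorical:
        # Frequency tables make sense for categorical variables.
        commands.append("FREQUENCIES VARIABLES=" + " ".join(categorical) + ".")
    return commands


if __name__ == "__main__":
    # Made-up metadata standing in for what a dataset would report.
    sample = [
        ("age", "scale"),
        ("region", "nominal"),
        ("income", "scale"),
        ("rating", "ordinal"),
    ]
    for command in build_syntax(sample):
        print(command)
```

The point of the design is that nothing in the job names a specific variable, so the same code can be pointed at any dataset.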
After working on the SPSS command language for a while, we decided that this was the wrong approach. There were a lot of good, portable, and embeddable programming languages around already. We realized that by adopting one (or more) of those, we could make a lot more progress and offer a lot more functionality in a modern style. And it had the extra advantage that huge libraries of useful code written in those languages could be used immediately within SPSS.
We therefore gave up on extending the traditional syntax and put our effort into embedding these languages. That meant both accepting such code in the SPSS input stream and creating a set of APIs that allowed that code to communicate with and control SPSS. Hence BEGIN PROGRAM and END PROGRAM and all that followed.
The result of all this, IMO, is the greatest leap forward that the SPSS product has taken in the last 25 years. The great thing about it is that when a user asks, "Can I do ...?", the answer is almost always "Yes!" even if it isn't something we had already thought of. The downside is having to learn a programming language whose conventions and structures are wildly different from traditional SPSS syntax. Furthermore, lots of SPSS users are really not programmers and don't want to be, so they struggle to use all this new power. The extension command mechanism introduced in version 16 and extended in 17 was designed to solve that problem.
Next time I'll write about why we picked the languages we did.