On February 25, 2005, IBM and Zend Technologies announced a strategic partnership to collaborate on the development and support of the PHP environment. Under this collaboration, Zend made 'Zend Core for IBM' available. Beyond this initial announcement, IBM and Zend are working together to shape the future direction of PHP technology. There are two goals:
- The first is to bring the simplicity inherent in the 'do-it-yourself' infrastructure and PHP development to enterprise customers.
- The second goal is to add enterprise-class support to PHP in a non-intrusive way -- without adding complexity for those who don't need the support, and without losing sight of what has made PHP so successful in the first place.
With the PHP 5 release, Zend took an important step towards serving the needs of enterprise customers. This article provides a high-level overview of some of the important new PHP 5 features and introduces some of the technologies we are investigating. Watch this space -- both the roadmap and this article will evolve quickly.
The evolution of the Web
The Web has evolved into a robust platform for building software applications and businesses, and it continues to evolve quickly. To cope with this rapid change, we are focusing on two key trends: leveraging do-it-yourself infrastructure and application building so that relatively simple projects stay low-cost, fast and simple; and bringing the benefits of Service-Oriented Architectures to scripting languages.
One trend that is becoming more apparent is shown in Figure 1 -- the growth of 'do-it-yourself' (DIY) Web infrastructure and applications. This growth in DIY-IT has been facilitated by the availability of mature and robust open source Web infrastructure, and increasingly powerful (yet relatively easy-to-use) scripting languages. These DIY applications tend to:
- Show up in small and medium sized organizations and in Enterprise departments
- Be designed to solve a specific task, often situational and activity-based, and usually focused on accessing and aggregating valuable content with relatively little programming
- Be built to be "just good enough", often by non-programmers, with little up-front emphasis on the "X-abilities" (reliability, scalability, availability, maintainability, ...)
- Focus on time-to-value with a shorter, more iterative development lifecycle measured in minutes, days or weeks, not months or years
- Evolve continuously and quickly as the needs of the group change.
Figure 1. Evolution of dynamic content
The applications are focused on leveraging three categories of content, as shown in Figure 2:
- Content-at-rest - databases, flat files, XML, ...
- Active content - Web pages, forms, scripting, wikis, blogs, ...
- Content-in-flight - RSS feeds, Web services, replication, ...
Figure 2. Do-it-yourself IT
The evolution of the Web application is further fueled by the extreme popularity of blogs and wikis, which are creating new ways for people to interact with each other and with the Web, truly enabling the read/write Web. These technologies are evolving to enable the next wave of DIY-IT by combining the flexibility of user-oriented information architecture provided by active content (such as wikis) with that of content-in-flight (such as Web services) to provide an easy-to-use integration platform for creating the new style of content-centric applications.
Innovation is driving the creation of new platforms that provide simple, feature-rich environments for editing and contributing content. New capabilities for publishing and handling content are growing out of the professional Web site community, targeting a broader, less technically sophisticated audience. Content is handled with a simple WYSIWYG editor that makes editing accessible to anyone. Many of these platforms also provide mechanisms to interact directly with the content provider through discussion tabs, history tabs, version tabs and mailing list tabs.
Service-Oriented Architecture
Another trend in the evolution of the Web that will reinforce DIY-IT is the adoption of Service-Oriented Architectures (SOA). Application design is rebalancing the ratio of code to content using SOA; that is, a huge amount of SOA-delivered content is available and is growing. Today many businesses have IT infrastructures that consist of a complex array of heterogeneous platforms, technologies and applications. Added to this are programming models that are too complex and too fragile. Application programmers are confronted with too many technology choices, too many concepts and interfaces to learn, and too many ways of doing similar things differently. It is imperative that we drastically simplify the programming model. IBM sees SOA as key to interoperability and flexibility for supporting end-to-end integration across the business, among business partners and across runtimes.
SOA is simply an architectural style for building applications that promotes the loose coupling between components so that you can easily reuse them. SOA applications are built from services. A service is a self-contained software module that performs a specific task and has a published contract/interface that is both platform and programming language independent. This allows service users to be oblivious to the technical details of the service's implementation. Service operations are usually invoked using messages - a request message and a response - rather than through APIs or file formats. This feature of having a neutral interface definition that is not strongly tied to a particular implementation is known as loose coupling between services. By contrast, tight-coupling means that the interfaces between the different components of an application are tightly interrelated in function and form, thus making them brittle when any form of change is required to parts or the whole application. Loosely-coupled systems offer several key advantages. They can:
- Adapt applications to evolving technologies
- Leverage existing investments in legacy applications by wrapping them as services
- Integrate applications with other systems both within and external to the enterprise
- Quickly and more easily build new business processes by assembling existing services.
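To make the idea of a published contract concrete, here is a minimal in-process sketch. It is not a networked service, and every name in it (`QuoteService`, `legacy_lookup_price`, and so on) is hypothetical, invented for illustration. It only shows how a consumer written against a contract keeps working when the implementation behind it changes, and how an existing routine can be wrapped as a service.

```php
<?php
// Hypothetical contract: consumers depend only on this interface.
interface QuoteService
{
    public function getQuote($symbol);
}

// A pre-existing "legacy" routine we want to reuse without rewriting it.
function legacy_lookup_price($ticker)
{
    $prices = array('IBM' => 91.5, 'ZEND' => 12.25);
    return isset($prices[$ticker]) ? $prices[$ticker] : null;
}

// Wraps the legacy routine behind the service contract.
class LegacyQuoteService implements QuoteService
{
    public function getQuote($symbol)
    {
        $symbol = strtoupper($symbol);
        return array('symbol' => $symbol, 'price' => legacy_lookup_price($symbol));
    }
}

// A second, unrelated implementation of the same contract.
class FixedQuoteService implements QuoteService
{
    public function getQuote($symbol)
    {
        return array('symbol' => strtoupper($symbol), 'price' => 1.0);
    }
}

// The consumer is written once, against the contract only.
function describeQuote(QuoteService $service, $symbol)
{
    $quote = $service->getQuote($symbol);
    return $quote['symbol'] . ' @ ' . number_format($quote['price'], 2);
}

echo describeQuote(new LegacyQuoteService(), 'ibm'), "\n";
echo describeQuote(new FixedQuoteService(), 'ibm'), "\n";
```

Swapping one implementation for the other requires no change to `describeQuote()`, which is the property that loose coupling is meant to preserve across real service boundaries.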
An evolving open source platform
An open source platform
One of the driving forces behind the emergence of DIY-IT has been the success of the LAMP open-source stack. The LAMP stack consists of the following components:
- Linux - provides the core operating system
- Apache - provides the Web server
- MySQL - the predominant database server
- PHP, Perl and Python - a set of server-side scripting languages
When these components are used together, they provide a low-cost solution stack for building robust, scalable, database-driven dynamic Web applications. There are many variations on this set of technologies that include support for other operating systems, such as Windows and Mac OS X, and databases, such as PostgreSQL and DB2.
The communities around the technology in the LAMP stack also play a vital role in the growing popularity of the solutions. The LAMP community is one of the most vibrant among the open source communities. The community offers an unprecedented level of support and provides a large body of single-click install and auto-configure solutions and libraries, while maintaining its focus on ease of use and affordability. The combination of solid technology components and an active community has enabled LAMP to emerge as a de facto Web application platform.
An important component of the LAMP stack is PHP, one of the most popular server-side scripting languages in the world. By some estimates it is used by more than 40 percent of Web developers and is available on approximately 70 percent of the UNIX-based Apache Web servers. It has demonstrated phenomenal growth over the last four years and is now used on over 15 million Web sites. Major businesses have adopted PHP, for example: Lufthansa for its e-ticketing system, Electronic Arts for Sim City Online, Boeing for a payload measure system, and Orange for its WAP portal.
PHP was designed from the outset for Web development. With easy access to form variables, built-in HTML templates, and integrated database access, PHP has proven to be a rapid programming language for Web developers. In addition, the vibrant PHP community has created an impressive amount of additional functionality through code samples (see http://hotscripts.com) and PHP extensions (see http://pear.php.net and http://pecl.php.net) and documentation. PHP 5, released in July 2004, introduced enhanced support for Web Services, XML and object orientation.
Web Services support
The PHP community has long understood the value of sharing and reusing code. The thousands of scripts and programs available for reuse on Hotscripts.com are proof of that. Web services take code reuse and integration to the next level -- they provide a standardized way for creating and using reusable services based on open standards over the web. Web services are simply a SOA with the following additional constraints:
- The interfaces must be based on Internet protocols such as HTTP, FTP, and SMTP.
- The messages must be in XML, except for binary attachments.
SOAP Web Services have the additional constraints:
- Except for binary attachments, the message must be carried by SOAP (a simple XML-based protocol specification that defines a uniform way of passing information between applications).
- The description of the service must be in WSDL (an XML document that describes what a Web service can do, where it resides, and how to invoke it).
In PHP 4, Web Services support was provided by libraries written in PHP, such as PEAR::SOAP. PHP 5 introduced a high-performance, reliable SOAP-based Web Services stack implemented as a native extension written in C. It supports a subset of the SOAP 1.1, SOAP 1.2 and WSDL 1.1 specifications and can be used to create Document Literal, RPC Literal, and RPC Encoded Web Services. The Web Services stack can be used to transfer complex structures of data. However, care should be used when transferring data types not supported in PHP.
PHP 5 supports creating Web Services either from a WSDL file or manually, without a WSDL file, using the SOAP API. When using a WSDL file, PHP maps the types passed in the Web Service call to the types defined in the WSDL file to create the SOAP message. The Web Services stack also handles the new object-oriented functionality in PHP 5: complex data structures can be represented as classes, and the stack automatically maps the class attributes to the data types defined in the WSDL file.
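The non-WSDL mode can be sketched with the SOAP extension's `SoapServer` and `SoapClient` classes. This is only an illustration and assumes the SOAP extension is enabled. To stay self-contained, the client subclass below (`InProcessSoapClient`, a hypothetical name) hands each request straight to a server object in the same process instead of sending it over HTTP; the `greet` function, the `urn:example:greeter` namespace, and the placeholder location URL are likewise assumptions made for the example.

```php
<?php
// Service implementation: a plain PHP function (hypothetical example).
function greet($name)
{
    return 'Hello, ' . $name;
}

// Hypothetical client that short-circuits the HTTP transport so the
// sketch runs without a Web server. A real client would omit this subclass.
class InProcessSoapClient extends SoapClient
{
    private $localServer;

    public function __construct(array $options)
    {
        // Passing null as the first argument selects non-WSDL mode.
        parent::__construct(null, $options);
        $this->localServer = new SoapServer(null, array('uri' => $options['uri']));
        $this->localServer->addFunction('greet');
    }

    #[\ReturnTypeWillChange]
    public function __doRequest($request, $location, $action, $version, $oneWay = false)
    {
        // $request is the SOAP envelope (XML) built by the client.
        // Capture the response envelope that the server prints.
        ob_start();
        $this->localServer->handle($request);
        return ob_get_clean();
    }
}

$client = new InProcessSoapClient(array(
    'location' => 'http://localhost/unused', // required in non-WSDL mode, never contacted here
    'uri'      => 'urn:example:greeter',     // target namespace of the service
));

// The call is serialized to a SOAP request and the SOAP response is decoded.
$result = $client->greet('world');
echo $result, "\n";
```

In a real deployment the server half would live in its own script behind a Web server and call `handle()` with no argument, and the client would be a plain `SoapClient` pointed at that script's URL or at a WSDL file.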
There are a number of possible enhancements to the Web Services stack in PHP 5. There are already plans for supporting HTTP digest authentication and sending HTTP Cookies in the SOAP envelope. Other possible enhancements are the following:
- Sending of SOAP attachments
- Support for Web Service Security
- Better support for handling SOAP headers
- Creation of WSDL. In PHP 5, the WSDL has to be created by a tool like IBM's Rational® Application Developer
- A Web Services explorer
The WS-I profiles provide implementation guidelines for Web Services specifications to provide interoperability between the different Web Services vendors. WS-I has finalized the Basic Profile, the Attachments Profile, and the Simple SOAP Binding Profile. Work is underway on a Basic Security Profile. IBM and Zend believe that support for the WS-I profiles is important and will be working on supporting WS-I conformance in the future.
XML is a markup language designed to describe structured data. It is becoming a common way to both store data and exchange it between applications and companies. As a result, many applications being created today need to be able to create or consume XML data. In PHP 4, it was somewhat painful to process XML documents that had any degree of complexity; doing so required proficiency in awkward, strongly object-oriented XML parsing APIs such as the Document Object Model (DOM). Fortunately, PHP 5 added SimpleXML, which allows developers to process XML documents using a more natural mechanism. SimpleXML is built on the dynamic nature of PHP and simply presents an XML document as a data structure of anonymous PHP objects with properties and text content. It excels at the common tasks of attribute and content handling, while leaving the more complicated XML manipulation to other extensions. Additionally, in PHP 5, all XML extensions were rewritten to provide stable, robust and standards-conforming XML implementations; for example, the DOM methods were renamed to conform to the DOM specification. A Java-like XMLReader class will also be part of 5.1.
Web application design will continue to rebalance the ratio of code to content, unifying the various content types into a consistent data access model focused on simplicity of manipulation of data from disparate sources. New abstractions such as Service Data Objects will be used to hide the complexities of interacting with data sources such as relational databases. The unifying layer will be built using existing data source capabilities. Content-at-rest will continue to be the heart of data-centric Web applications.
PHP 4 contains more than a dozen different database-specific extensions optimized for a particular database's functionality and performance. Each extension claims a different functional namespace -- for example, the mysql_connect() and ibase_connect() functions return database connections for MySQL and Firebird/Interbase, respectively -- and each extension implements its own API with sometimes minor but confusing variations from other database extensions. While this proliferation of database-specific extensions is good for developers who only need to access a single database, it potentially "locks in" an application and requires a significant investment in code refactoring if the application needs to support another database.
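As a brief illustration of the SimpleXML style described above, the following sketch reads attributes and element content from a small document and then changes one value. The catalog document and its element names are invented for the example.

```php
<?php
// A small, made-up XML document.
$xml = <<<XML
<catalog>
  <book id="b1">
    <title>PHP 5 Basics</title>
    <price>29.95</price>
  </book>
  <book id="b2">
    <title>Web Services in Practice</title>
    <price>39.50</price>
  </book>
</catalog>
XML;

$catalog = simplexml_load_string($xml);

$summary = array();
foreach ($catalog->book as $book) {
    // Attributes use array syntax; child elements use property syntax.
    // Casting to string extracts the text content.
    $summary[(string) $book['id']] = (string) $book->title;
}

// Content can be changed in place and serialized back with asXML().
$catalog->book[0]->price = '24.95';

print_r($summary);
echo $catalog->book[0]->price, "\n";
```

No parser callbacks or node-traversal calls are needed for this kind of access, which is the convenience SimpleXML is designed to provide for common tasks.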
A core extension since PHP 3, the Unified ODBC extension is the ideological ancestor of a fast, light database access abstraction for PHP. Written in C for maximum speed and conforming to a subset of the ODBC specification for portability, the Unified ODBC extension can be compiled against ODBC driver managers or native database libraries that implement an ODBC-compatible interface. Compiling Unified ODBC against an ODBC driver manager enables it to access any database that offers an ODBC driver, at the slight cost of imposing an additional layer of overhead in database communication. Compiling Unified ODBC directly against native database libraries, such as the DB2 Call Level Interface (CLI), avoids the overhead of the ODBC driver manager layer at the cost of restricting you to a subset of accessible databases.
For many years, PHP developers relied exclusively on the Unified ODBC extension to access DB2 databases. In 2001 IBM recognized the demand for PHP by officially supporting access to DB2 Universal Database Version 7.2 for Linux, UNIX, and Windows from PHP applications through the Unified ODBC extension. The extension provides a solid subset of database access features that are ideally suited for PHP 4 and continue to be available in PHP 5. However, the burden of supporting many different databases with conflicting interpretations of the ODBC specification means that Unified ODBC cannot provide PHP developers access to the full functionality of DB2 or Cloudscape databases. Limitations of Unified ODBC include difficulties working with large objects (BLOB and CLOB data types) and the inability to return OUT or INOUT parameters from stored procedures. Fortunately, many applications do not require these advanced features.
PHP data objects (PDO)
PHP 5 introduced true object-oriented support for PHP applications at the same time that the PHP community realized the proliferating set of database-specific extensions was hampering the portability of database-driven applications: any application aspiring to be portable had to implement its own de facto portability layer on top of the database-specific drivers. Additionally, the quality of the DB extensions in PHP varied from driver to driver. Some drivers were very stable and well maintained, while others were poorly implemented. Recognizing this problem, and the opportunity represented by an object-oriented solution, the PHP community defined a common, object-oriented interface for data access called PDO and developed database-specific drivers that implement the PDO interface. PDO will enable all DB extensions to be of high quality because the majority of the code will be shared in the base PDO class, and therefore bug fixes will propagate to all DB extensions. The design philosophy for PDO is as follows:
- PDO defines a lightweight, consistent, object-oriented interface for data access: a PDO object that represents a database connection, a PDOStatement object that represents an SQL statement and any results returned by the database, and methods and properties for each object that satisfy the most common data access requirements across databases.
- Each database driver that implements the PDO interface has the option of exposing additional database-specific features as regular extension functions.
- While PDO implements basic compatibility features, like ensuring AUTOCOMMIT is on for every database by default, or enabling bind-by-name and bind-by-position for every database driver, it will never emulate advanced database functionality like sequences or stored procedures that are not natively available for a given database.
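The two objects and the two binding styles described above can be sketched as follows. The example uses the PDO API as it later shipped, which may differ in detail from the beta discussed here, and it assumes the SQLite PDO driver is available so that it can run against an in-memory database; the `staff` table and its contents are invented for illustration.

```php
<?php
// A PDO object represents the connection; the DSN selects the driver.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)');

// Bind-by-position: "?" placeholders are filled in order.
$insert = $db->prepare('INSERT INTO staff (name, dept) VALUES (?, ?)');
$insert->execute(array('Ada', 'R&D'));
$insert->execute(array('Grace', 'Ops'));

// Bind-by-name: ":dept" is filled from the matching array key.
// The PDOStatement object carries both the statement and its results.
$select = $db->prepare('SELECT name FROM staff WHERE dept = :dept');
$select->execute(array(':dept' => 'R&D'));
$names = $select->fetchAll(PDO::FETCH_COLUMN);

print_r($names);
```

Pointing the same code at another database should, in principle, only require a different DSN and credentials in the constructor, provided the SQL itself is portable.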
With the PDO beta release in February 2005, PDO drivers were already available for IBM Cloudscape and DB2 Universal Database (through PDO_ODBC), Firebird/Interbase, Microsoft® SQL Server, MySQL, ODBC, Oracle, PostgreSQL, and SQLite. Compiling the PDO_ODBC driver against the DB2 libraries offers Cloudscape and DB2 UDB users full support for stored procedures and large objects with excellent performance out of the box.
While the current focus on PDO is to stabilize the core set of data access objects and methods in time for the PHP 5.1 release, and a secondary goal is to increase the number of databases that are accessible via PDO drivers (for example, providing a means of accessing Informix® Dynamic Server databases), future iterations of PDO may define common administrative tasks for databases (such as backup, restore, stop, and start).
DB2 and Cloudscape extension for PHP
The DB2 and Cloudscape extension for PHP is a new database-specific extension that IBM will contribute shortly to the PHP Extension Community Library (PECL) under the Apache License Version 2.0. Given the effort to define a common database interface for PHP 5 through PDO, and the basic support offered by Unified ODBC in PHP 4, this might seem like an odd action. However, IBM developed and contributed this extension to satisfy the growing demand for an extension that can fully exploit DB2 and Cloudscape features in PHP 4. The majority of production PHP servers today still run PHP 4, and that balance is not expected to shift to PHP 5 until sometime in 2006. Rather than overloading the Unified ODBC extension with even more #ifdefs for specific DB2 requirements, IBM made the decision to write a new extension from scratch. And because IBM is contributing the code to the open-source community, the community has the opportunity to reuse any of that code in PDO_ODBC or other drivers. The API for the DB2 and Cloudscape extension is largely compatible with the Unified ODBC extension, and the new extension is officially supported by IBM in both PHP 4 and PHP 5 environments, so PHP developers have a solid, fully-functional platform on which to build applications that need to run on PHP 4.
Service Data Objects for PHP
Today, developers are faced with using many different APIs for data access, depending on the data source and platform. PHP is no exception, with a number of database APIs and variations on Data Object patterns available. Service Data Objects (SDOs) are designed to simplify and unify the way in which applications handle data. Using SDO, application programmers can uniformly access and manipulate data from heterogeneous data sources, including relational databases, XML data sources, Web services, and enterprise information systems. For more information about the goals and architecture of SDO, see the white paper "Next-Generation Data Programming: Service Data Objects."
The task of connecting applications to data sources is performed by data mediator services. Data mediators provide the glue layer between disparate data sources and a unified data access API used by the client application. Client applications query a data mediator service to obtain data. The data is represented as simple data graphs. A data graph is a collection of tree-structured or graph-structured data objects. Client applications manipulate the data using a consistent API irrespective of where the data originated. To commit changes to the data source, the client application simply requests the data be updated. The data mediator handles the complexities of interacting with the back-end data source to apply the changes. This architecture allows applications to deal principally with data graphs and data objects. The combination of SDOs and data mediator services provides a powerful, yet simple programming model, which greatly improves developer productivity when working with data.
We believe SDO could be broadened to support PHP, and would be well-suited to the kinds of database-centric Web applications typically developed with this technology. A PHP-centric implementation of SDO would offer some exciting design opportunities. Exploiting the dynamic and weak typing of PHP would greatly simplify the SDO APIs. PHP class overloading and metadata could be used to 'simulate' APIs that require code generation in Java SDO. A PHP version of SDO would be interoperable with WebSphere® servers, simplifying data sharing between the PHP and WebSphere environments by providing equivalent data models.
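The sketch below illustrates the general shape of this programming model. It is not the SDO API: `SketchDataObject` and `ArrayMediator` are hypothetical classes written for this illustration, and a PHP array stands in for the back-end data source. The data object uses PHP 5 property overloading (`__get` and `__set`) to record changes, and the mediator applies those changes when asked.

```php
<?php
// Hypothetical data object: property access is overloaded so that
// every modification is recorded in a change log.
class SketchDataObject
{
    private $values;
    private $changes = array();

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function __get($name)
    {
        return $this->values[$name];
    }

    public function __set($name, $value)
    {
        if ($this->values[$name] !== $value) {
            $this->changes[$name] = $value;
            $this->values[$name] = $value;
        }
    }

    public function getChanges()
    {
        return $this->changes;
    }

    public function clearChanges()
    {
        $this->changes = array();
    }
}

// Hypothetical mediator: the only code that touches the data source,
// which here is just an array keyed by row id.
class ArrayMediator
{
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function fetch($id)
    {
        return new SketchDataObject(array('id' => $id) + $this->rows[$id]);
    }

    public function applyChanges(SketchDataObject $object)
    {
        foreach ($object->getChanges() as $name => $value) {
            $this->rows[$object->id][$name] = $value;
        }
        $object->clearChanges();
    }

    public function rawRow($id)
    {
        return $this->rows[$id];
    }
}

$mediator = new ArrayMediator(array(7 => array('name' => 'Ada', 'dept' => 'R&D')));

$employee = $mediator->fetch(7);      // client obtains a data object
$employee->dept = 'Ops';              // client edits it like a plain object
$pending = $employee->getChanges();   // the change is tracked, not yet stored
$mediator->applyChanges($employee);   // mediator writes it to the data source

print_r($mediator->rawRow(7));
```

The client code never sees how the data is stored, so replacing the array with a relational database or an XML file would only require a different mediator.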
PHP security has become increasingly important. As PHP gains in popularity, PHP developers need to pay more attention to security issues which are, unfortunately, a daily concern in the general Web development community.
There are many security considerations, and it's important for PHP developers to think about security from the start of any project, and to design security into every Web application. Wisely using PHP compile-time and runtime configuration options and adhering to proper coding practices can take you a long way toward securing your PHP Web application.
Many attacks are based on common mistakes in configuring the hosting environment. One such error is putting the PHP interpreter in the CGI directory. Another is allowing scripts to be run directly from a URL rather than only by redirect from a Web server. If PHP is run as an Apache module, you can make use of Apache authorization. Database security and filesystem security also come into play. But there are still other forms of attack to protect against: session fixation, shared hosting exploits, SQL injection, data filtering exploits, and form spoofing.
Visit the Security section of the PHP manual, which is a good source of information. You can also look to the Zend Developer Zone for good general programming information, as well as security-specific topics. The PHP development team is working on a standard input filtering extension which will provide a common framework for developers to validate their input data, and therefore promote more secure development practices.
In this article we discussed the evolution of dynamic content delivered by Web applications and some of the elements fueling the shift to situational and activity-based Web applications. The do-it-yourself IT infrastructure provided by platforms like Zend Core for IBM is combining with innovations around content publishing and management to create a new segment of programmers. Providing content through a Service-Oriented Architecture and a unified data access API will continue to make it easier for all customers to build their own dynamic Web applications.
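Of the attacks listed, SQL injection is the easiest to demonstrate in a few lines. The sketch below assumes the SQLite PDO driver is available and uses an invented `users` table; it runs the same hostile input through a query built by string concatenation and through a prepared statement with a bound parameter.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (name TEXT, secret TEXT)');
$db->exec("INSERT INTO users VALUES ('alice', 'a-secret')");
$db->exec("INSERT INTO users VALUES ('bob', 'b-secret')");

// Hostile input that tries to widen the WHERE clause.
$input = "nobody' OR '1'='1";

// Unsafe: the input becomes part of the SQL text, so the condition
// turns into: name = 'nobody' OR '1'='1', which matches every row.
$unsafeSql = "SELECT secret FROM users WHERE name = '" . $input . "'";
$leaked = $db->query($unsafeSql)->fetchAll(PDO::FETCH_COLUMN);

// Safer: the input is bound as data and is only ever compared
// against the name column, so it matches nothing.
$stmt = $db->prepare('SELECT secret FROM users WHERE name = ?');
$stmt->execute(array($input));
$safe = $stmt->fetchAll(PDO::FETCH_COLUMN);

echo count($leaked), ' rows via concatenation, ',
     count($safe), " rows via bound parameter\n";
```

Bound parameters address only this one class of attack; the configuration and session issues mentioned above need their own countermeasures.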
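The kind of validation such a framework is meant to encourage can be sketched with the `filter_var()` function found in later PHP releases (5.2 and above). Whether this matches the extension under development at the time of writing is an assumption; the `cleanInput()` helper, the field names, and the fallback values are hypothetical.

```php
<?php
// Hypothetical helper: validate each field and fall back to a safe
// default instead of trusting raw request data.
function cleanInput(array $request)
{
    $page = filter_var(
        $request['page'],
        FILTER_VALIDATE_INT,
        array('options' => array('min_range' => 1, 'max_range' => 500))
    );
    $email = filter_var($request['email'], FILTER_VALIDATE_EMAIL);

    return array(
        'page'    => $page === false ? 1 : $page,
        'email'   => $email === false ? null : $email,
        // Escape free text at output time so markup is displayed, not executed.
        'comment' => htmlspecialchars($request['comment'], ENT_QUOTES, 'UTF-8'),
    );
}

// Made-up request data standing in for $_GET or $_POST.
$clean = cleanInput(array(
    'page'    => '3',
    'email'   => 'dev@example.com',
    'comment' => '<script>alert(1)</script>',
));

print_r($clean);
```

Validating on the way in and escaping on the way out are separate steps: the first rejects data that does not fit the expected type or range, and the second keeps accepted text from being interpreted as HTML.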
- The article PHP Development with DB2 and Cloudscape: Unified ODBC (developerWorks, February 2005) provides detailed information on connecting to DB2 and Cloudscape using the Unified ODBC Driver for PHP.
- Download the Next-Generation Data Programming: Service Data Objects whitepaper for more information on the SDO specification.
- Browse all of the PHP columns on developerWorks.
- Visit Zend Core for IBM to register and download the solution.
- Visit the Zend Developer Zone for more information regarding general PHP programming.
- Find general PHP security information at the php.net site.
- Visit the DB2 and Cloudscape open source development page for more information about open source development on IBM Information Management products.