In the first part of this article, I introduced you to YQL, the Yahoo! Query Language, which offers a unified, SQL-like interface to retrieve data from multiple web service APIs. I demonstrated how to access the YQL service from a PHP application and offered some examples of how YQL lets you make use of SQL-type constructs to filter, sort, and link query results.
In this second and concluding article, I dive deeper into the world of YQL and demonstrate how you can use YQL from within a PHP application to add, edit, and delete data on third-party web services. I also show you how to use YQL to search through and retrieve data from HTML pages as well as structured XML formats such as RSS and Atom.
As you saw in the previous article, YQL lets you extract data from third-party web services with SELECT statements. SELECT, however, isn't the only SQL statement supported by YQL; you can also use INSERT, DELETE, and UPDATE queries to manipulate data on third-party services using the same SQL syntax.
To illustrate how this works, consider Listing 1, which uses an INSERT query to add a blog post to a user's blog on Wordpress.com. To run this example, you must replace the BLOGNAME, BLOGUSER, and BLOGPASS variables in the listing with valid Wordpress.com account credentials; you can obtain these by signing up for a free Wordpress.com blog (see Resources for a link).
Listing 1. Adding data with a YQL INSERT query
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// add new post to Wordpress blog using POST
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->env('store://datatables.org/alltableswithkeys');
$client->q("INSERT INTO wordpress.post
(title, description, blogurl, username, password) VALUES
('Hello world', 'This is my first blog post with YQL...woohoo!',
'http://BLOGNAME.wordpress.com', 'BLOGUSER', 'BLOGPASS')");
$result = $client->post();
echo 'Entry posted with ID: ' . $result->results->postid;
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
exit;
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
exit;
}
?>
|
As Listing 1 illustrates, performing an INSERT query with YQL is identical to performing an INSERT query on any SQL-compliant RDBMS. You must specify the table name to which data is to be inserted, followed by a list of keys and values. YQL uses the table definition to convert the query data into an XML-encoded packet that is submitted, through POST, to the Wordpress.com web service. The response is returned to the client as a standard YQL results document, containing the ID of the newly inserted post.
Figure 1 illustrates the newly added post on the Wordpress.com blog.
Figure 1. A blog post added with YQL
Listing 2 offers a more interactive version of Listing 1, allowing the user to directly enter content into a web form and converting that content into a Wordpress.com blog post.
Listing 2. Interactively adding blog posts with YQL
<html>
<head></head>
<body>
<h2>Post Blog Entry</h2>
<form method="post" action="<?php echo htmlentities($_SERVER['PHP_SELF']); ?>">
<div>
Title: <br/>
<input type="text" name="title" size="40" />
</div>
<div>
Body: <br/>
<textarea name="body" rows="5" cols="30"></textarea>
</div>
<div>
Blog URL: <br/>
http:// <input type="text" name="prefix" size="40" /> .wordpress.com
</div>
<div>
Username: <br/>
<input type="text" name="user" size="40" />
</div>
<div>
Password: <br/>
<input type="text" name="pass" size="40" />
</div>
<input type="submit" name="submit" value="Post" />
</form>
<?php
if (isset($_POST['submit'])) {
// validate input (omitted for brevity)
$title = $_POST['title'];
$body = $_POST['body'];
$blog = 'http://' . $_POST['prefix'] . '.wordpress.com';
$user = $_POST['user'];
$pass = $_POST['pass'];
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->env('store://datatables.org/alltableswithkeys');
$client->q(
"INSERT INTO wordpress.post (title, description, blogurl, username, password)
VALUES ('$title', '$body', '$blog', '$user', '$pass')");
$result = $client->post();
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
exit;
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
exit;
}
// iterate over query result set
echo 'Entry posted with ID: ' . $result->results->postid;
}
?>
</body>
</html>
|
Figure 2 illustrates the web form displayed to the user with some input.
Figure 2. A web form to interactively add a blog post to Wordpress.com
Figure 3 displays the blog post on Wordpress.com after the form is submitted.
Figure 3. The newly added blog post on Wordpress.com
Updating and deleting data with YQL
Just as you can INSERT, you also can UPDATE and DELETE. Consider Listing 3, which uses an UPDATE query to update the newly created post from Listing 1.
Listing 3. Modifying data with a YQL UPDATE query
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// update post on Wordpress blog
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->env('store://datatables.org/alltableswithkeys');
$client->q(
"UPDATE wordpress.post SET title='Updated blog post',
description='Look, ma, I updated my blog post via YQL', publish='true'
WHERE blogurl = 'http://BLOGNAME.wordpress.com' AND
username = 'BLOGUSER' AND password = 'BLOGPASS' AND postid = '5'");
$result = $client->post();
if ($result->methodResponse->params->param->value->boolean == 1) {
echo 'Post successfully updated';
} else {
throw new Exception ('Could not delete post');
}
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
exit;
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
exit;
}
?>
|
As with any database UPDATE query, you must pass the YQL query a list of fields to be updated, together with their new values, and a WHERE clause that restricts the update to a set of records. In Listing 3, the WHERE clause contains the user credentials as well as the post ID to be updated.
Figure 4 illustrates the updated post on the Wordpress.com blog.
Figure 4. A blog post modified through YQL
And Listing 4 demonstrates how to delete a post with YQL.
Listing 4. Deleting data with a YQL DELETE query
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// delete post from Wordpress blog
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->env('store://datatables.org/alltableswithkeys');
$client->setConfig(array('timeout' => 30));
$client->q(
"DELETE FROM wordpress.post WHERE blogurl = 'http://BLOGNAME.wordpress.com'
AND username = 'BLOGUSER' AND password = 'BLOGPASS' AND postid = '5'");
$result = $client->post();
if ($result->methodResponse->params->param->value->boolean == 1) {
echo 'Post successfully deleted';
} else {
throw new Exception ('Could not delete post');
}
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
exit;
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
exit;
}
?>
|
Note that as of this writing, the community YQL table for Wordpress.com does not include functional UPDATE and DELETE support, although a patch is in development. If you have problems running the preceding listings, you can instead reference the author's patched versions of these tables, altering the YQL query string to reference the custom table with a USE clause. Here's an example:
USE 'http://github.com/vikram0/yql-tables/raw/master/wordpress/wordpress.post.xml' AS wordpress.post; DELETE FROM wordpress.post WHERE blogurl = 'http://BLOGNAME.wordpress.com' AND username = 'BLOGUSER' AND password = 'BLOGPASS' AND postid = '5' |
You might also need to adjust the client timeout in this case, as shown here:
$client->setConfig(array('timeout' => 30));
|
Example application: Cloud shopping cart
With all this background information at hand, let's look at building an example application that utilizes the YQL data modification capabilities. Your target web service in this case is Payvment (see Resources for a link), a "shopping cart in the cloud" implementation. As the name suggests, Payvment allows you to create, edit, and modify shopping carts using a REST API, making it possible to easily integrate shopping cart capabilities into a web application. Payvment also makes a community Open Data Table available for YQL, enabling you to perform cart operations using YQL instead of accessing the Payvment REST API.
With this in mind, let's put together an example application that uses YQL and PHP to build a shopping cart in the cloud. This application defines a basic product catalog, with prices and descriptions for each SKU, and allows users to add these items to their Payvment cart using YQL INSERT queries. Users can also modify quantities or remove items from their shopping carts using UPDATE or DELETE queries. You can retrieve the current contents of the shopping cart (with subtotals) at any time by using a SELECT query.
Listing 5 has the complete code.
Listing 5. A shopping cart built using the Payvment YQL community table
<?php
// set API keys and constants
define (PAYVMENT_API_KEY, 'YOUR-API-KEY');
define (PAYVMENT_ID, 'YOUR-ID');
// define simple product catalog
$catalog = array(
'167' => array('title' => 'Flying hammer', 'price' => 35.99),
'543' => array('title' => 'Lightsaber', 'price' => 76.99),
'129' => array('title' => 'Sonic screwdriver', 'price' => 10.99)
);
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// start session
session_start();
try {
// check if a cart already exists, if not, get a new cart ID
if (!isset($_SESSION['cart'])) {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$query = "USE 'http://github.com/vikram0/yql-tables/raw/master/shoppingcart/
shoppingcart.YourProducts.xml' AS shoppingcart;
SELECT * FROM shoppingcart WHERE apiKey='" . PAYVMENT_API_KEY . "'";
$client->q($query);
$result = $client->get();
if ($result->results->cartfeed->status->StatusCode != 200) {
throw new Exception ('ERROR: Cannot get shopping cart');
}
$_SESSION['cart'] = (string) $result->results->cartfeed->results->CartID;
}
// check for add/edit/delete operations
switch(strtolower($_REQUEST['op'])) {
// add = INSERT
case 'add':
$id = $_REQUEST['id'];
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$query =
"USE 'http://github.com/vikram0/yql-tables/raw/master/shoppingcart/
shoppingcart.YourProducts.xml' AS shoppingcart;
INSERT INTO shoppingcart (apiKey, cartID, payvmentID, itemID, itemName,
itemPrice) VALUES ('" . PAYVMENT_API_KEY . "', '" .
$_SESSION['cart'] . "', '" . PAYVMENT_ID . "', '$id', '" .
$catalog[$id]['title'] . "', '" . $catalog[$id]['price'] . "')";
$client->q($query);
$result = $client->post();
if ($result->results->cartfeed->status->StatusCode != 200) {
throw new Exception ('ERROR: Cannot add item to shopping cart');
}
break;
// remove = DELETE
case 'remove':
$id = $_REQUEST['id'];
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$query =
"USE 'http://github.com/vikram0/yql-tables/raw/master/shoppingcart/
shoppingcart.YourProducts.xml' AS shoppingcart;
DELETE FROM shoppingcart WHERE apiKey = '" . PAYVMENT_API_KEY . "'
AND cartID = '" . $_SESSION['cart'] . "' AND itemID = '" . $id . "'
AND retailerID = '" . PAYVMENT_ID . "'";
$client->q($query);
$result = $client->post();
if ($result->results->cartfeed->status->StatusCode != 200) {
throw new Exception ('ERROR: Cannot remove item from shopping cart');
}
break;
// update = UPDATE
case 'update':
$id = $_REQUEST['id'];
$qty = $_REQUEST['qty'];
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$query =
"USE 'http://github.com/vikram0/yql-tables/raw/master/shoppingcart/
shoppingcart.YourProducts.xml' AS shoppingcart;
UPDATE shoppingcart SET itemQty = $qty WHERE apiKey = '" . PAYVMENT_API_KEY . "'
AND cartID = '" . $_SESSION['cart'] . "' AND itemID = '" . $id . "'
AND retailerID = '" . PAYVMENT_ID . "'";
$client->q($query);
$result = $client->post();
if ($result->results->cartfeed->status->StatusCode != 200) {
throw new Exception ('ERROR: Cannot update item in shopping cart');
}
break;
}
// get current cart contents (summary and individual items)
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$query =
"USE 'http://github.com/vikram0/yql-tables/raw/master/shoppingcart/
shoppingcart.YourProducts.xml' AS shoppingcart;
SELECT * FROM shoppingcart WHERE apiKey='" . PAYVMENT_API_KEY . "'
AND cartID='" . $_SESSION['cart'] . "'";
$client->q($query);
$result = $client->get();
if ($result->results->cartfeed->status->StatusCode == 200) {
$cart_summary = $result->results->cartfeed->results;
$cart_items = $result->results->cartfeed->items->storeitems;
} else {
throw new Exception ('ERROR: Cannot get shopping cart contents');
}
} catch (Exception $e) {
die($e->getMessage());
}
?>
<html>
<head>
<style type="text/css">
table {
border-width: 1px;
border-spacing: 2px;
border-style: outset;
border-color: red;
border-collapse: collapse;
background-color: white;
}
table th {
border-width: 1px;
padding: 4px;
border-style: inset;
border-color: red;
background-color: white;
-moz-border-radius: 0px 0px 0px 0px;
font-weight: bolder;
}
table td {
border-width: 1px;
padding: 4px;
border-style: inset;
border-color: red;
background-color: white;
-moz-border-radius: 0px 0px 0px 0px;
}
</style>
</head>
<body>
<h2>Current Shopping Cart</h2>
<table>
<tr>
<th>Item</th>
<th>Price</th>
<th>Quantity</th>
<th>Subtotal</th>
<th></th>
<th></th>
</tr>
<?php foreach ($cart_items->item as $i): // display current cart ?>
<form method="post">
<input type="hidden" name="id" value="<?php echo $i->ItemID; ?>" />
<tr>
<td><?php echo $i->ItemName; ?></td>
<td><?php echo $i->ItemPrice; ?></td>
<td><input type="text" name="qty" size="3"
value="<?php echo $i->ItemQty; ?>" /></td>
<td><?php echo $i->ItemQtyPrice; ?></td>
<td><input type="submit" name="op" value="Update" /></td>
<td><input type="submit" name="op" value="Remove" /></td>
</tr>
</form>
<?php endforeach; ?>
<tr>
<td colspan="3">Total</td>
<td><?php echo $cart_summary->GrandTotal; ?></td>
<td colspan="2"></td>
</tr>
</table>
<h2>Product Catalog</h2>
<table>
<tr>
<th>Title</th>
<th>Price</th>
<th></th>
</tr>
<?php foreach ($catalog as $id => $item): // display product catalog ?>
<form method="post">
<input type="hidden" name="id" value="<?php echo $id; ?>" />
<tr>
<td><?php echo $item['title']; ?></td>
<td><?php echo $item['price']; ?></td>
<td><input type="submit" name="op" value="Add" /></td>
</tr>
</form>
<?php endforeach; ?>
</table>
</body>
</html>
|
This code might seem intimidating at first, but it's actually far simpler than it looks. Here's a quick breakdown of the key components of Listing 5:
- The first few lines of the script define the user's Payvment API key and unique user identifier. These tokens are freely available from the Payvment website (see Resources for a link) on registration. Following this code, the script defines a simple product catalog as an array, with the array keys as SKUs and the corresponding values holding product information. This product catalog is formatted into an HTML table, with controls for the user to add each item to his or her shopping cart.
- Next, the script initializes a session and checks to see if the session already contains a shopping cart ID. This ID serves to uniquely identify the user's shopping cart across the Payvment API, and it must be included in most API calls. If the shopping cart ID is present in the session, the script continues to the next step; if not, it formulates and executes a dummy SELECT query that returns a new shopping cart ID, and then it saves this ID to the session.
- Following this code, the script checks to see if the current request includes an add, edit, or delete operation, indicated by the presence of the $_REQUEST['op'] variable. Depending on the operation requested, Zend_Rest_Client performs an INSERT, UPDATE, or DELETE query on the Payvment data table using YQL. Notice that each query includes the API key, user ID, and, in some cases, the ID of the item being modified or deleted.
- As a final step, the script performs a SELECT query to retrieve the current contents of the shopping cart. This information is then formatted into an HTML table for display, together with appropriate controls to modify quantities or delete items. Notice that the Payvment API automatically calculates item subtotals and the cart total, based on the information provided when adding each item to the cart.
Note that as of this writing, the community YQL table for the Payvment service uses an out-of-date REST API endpoint, although a patch is in development. The previous listing makes use of the author's patched version of the table, available on Github (see Resources).
Figure 5 demonstrates the application in action.
Figure 5. A cloud-based shopping cart
There is one important point to note about using INSERT, DELETE, and UPDATE queries over YQL. Most web services that allow these operations require a username, password, and/or API key to perform data modification; you can usually obtain these credentials for free from each service's website. In these cases, the procedure to add or modify data is similar to that in the previous listings, where the credentials form part of the YQL query and are passed to the YQL service by Zend_Rest_Client.
For services that require OAuth authorization, it is better tp use an OAuth-capable toolkit like the Yahoo! Open Social SDK instead of Zend_Rest_Client, as this handles all the details of OAuth authorization for you. You can find a link to this SDK, as well as links to additional documentation, in Resources.
In the previous section of this article, you saw how YQL makes it possible to filter and sort search results, with WHERE clause supports and utility functions to sort, count, and eliminate duplicates from the result set. That's not all you can do, though; YQL also makes it possible to limit the number of records in a result set with both local and remote paging limits.
A remote limit controls the number of records retrieved by YQL from the third-party web service; a local limit controls the number of records returned by YQL to the calling application.
If no limit is specified, YQL typically returns 10 records. To modify this number, pass a limit to the query in parentheses, as in the following example, which sets a remote limit of 15 records:
SELECT * FROM music.artist.popular (15) |
To return all records, set the limit to 0, as in the next example:
SELECT * FROM music.artist.popular (0) |
Local limits are enforced with the LIMIT and OFFSET clauses, which work much like their SQL counterparts: The LIMIT clause defines the number of records to be returned, while the OFFSET clause specifies the start offset. Here's an example, which returns records 5-9 of a result set:
SELECT * FROM music.artist.popular (15) LIMIT 5 OFFSET 5 |
With all this control at hand, it's quite easy to implement a simple paging system for YQL results using a pagination widget like Zend_Paginator. Consider Listing 6, which demonstrates one approach to paging YQL results:
Listing 6. Paginating YQL results with Zend_Paginator
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
Zend_Loader::loadClass('Zend_Paginator');
Zend_Loader::loadClass('Zend_Paginator_Adapter_Array');
try {
// execute YQL query
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->q("SELECT name FROM music.artist.popular (0)");
$result = $client->get();
$data = array();
foreach ($result->results->Artist as $artist) {
$data[] = (string)$artist['name'];
}
// initialize pager with data set
$pager = new Zend_Paginator(new Zend_Paginator_Adapter_Array($data));
// set page number from request
$currentPage = isset($_GET['p']) ? (int) htmlentities($_GET['p']) : 1;
$pager->setCurrentPageNumber($currentPage);
// set number of items per page from request
$itemsPerPage = isset($_GET['c']) ? (int) htmlentities($_GET['c']) : 10;
$pager->setItemCountPerPage($itemsPerPage);
// get pages
$pages = $pager->getPages();
// create page links
$pageLinks = array();
$separator = ' | ';
for ($x=1; $x<=$pages->pageCount; $x++) {
if ($x == $pages->current) {
$pageLinks[] = $x;
} else {
$q = http_build_query(array('p' => $x, 'c' => $itemsPerPage));
$pageLinks[] = "<a href=\"?$q\">$x</a>";
}
}
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over current page set
echo '<h2>Popular Artists</h2>';
$ctr = $pages->firstItemNumber;
foreach ($pager->getCurrentItems() as $item) {
echo $ctr . '. ' . $item . '<br/>';
$ctr++;
}
// print page links
echo '<div>';
echo 'Pages:' . implode($pageLinks, $separator);
echo '</div>';
?>
|
Listing 6 first loads the Zend_Paginator component. As the name suggests, this is a ready-made Zend Framework component that you can use to paginate large data sets. It also provides a set of ready-made methods to generate page links, making it easy for users to navigate through the different pages of a result set.
After Zend_Paginator loads, Listing 6 performs an unbounded SELECT query to retrieve a complete set of YQL results. It then massages the returned data into a PHP array and passes it to the Zend_Paginator array data source, which takes care of extracting the array subset required for the current page. Zend_Paginator also takes care of generating the page navigation links, which allow the user to move forward and backward through the result set.
This approach isn't the most efficient way to paginate YQL results, and you should avoid using this example in a production environment. Listing 6 must necessarily re-execute the YQL query on each request, adding a time and performance cost to the pagination exercise. A more efficient approach is to store the YQL result set in a server-side cache and to retrieve the data from there on each request instead of requesting it from the YQL service. Listing 7 illustrates this improvement, using Zend_Cache to hold the YQL result in a server-side file cache.
Listing 7. A more efficient pagination approach using Zend_Paginator and Zend_Cache
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
Zend_Loader::loadClass('Zend_Paginator');
Zend_Loader::loadClass('Zend_Cache');
Zend_Loader::loadClass('Zend_Paginator_Adapter_Array');
try {
// set up cache
$front = array(
'lifetime' => 2000,
'automatic_serialization' => true
);
$back = array(
'cache_dir' => './'
);
$cache = Zend_Cache::factory('Core', 'File', $front, $back);
// look for YQL data in cache and use if available
// if not, execute YQL query, get fresh result set and save to cache
if(!$data = $cache->load('yql')) {
// execute YQL query
// get total count of items
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->q("SELECT name FROM music.artist.popular (0)");
$result = $client->get();
$data = array();
foreach ($result->results->Artist as $artist) {
$data[] = (string)$artist['name'];
}
$cache->save($data, 'yql');
}
// initialize pager with data set
$pager = new Zend_Paginator(new Zend_Paginator_Adapter_Array($data));
// set page number from request
$currentPage = isset($_GET['p']) ? (int) htmlentities($_GET['p']) : 1;
$pager->setCurrentPageNumber($currentPage);
// set number of items per page from request
$itemsPerPage = isset($_GET['c']) ? (int) htmlentities($_GET['c']) : 10;
$pager->setItemCountPerPage($itemsPerPage);
// get pages
$pages = $pager->getPages();
// create page links
$pageLinks = array();
$separator = ' | ';
for ($x=1; $x<=$pages->pageCount; $x++) {
if ($x == $pages->current) {
$pageLinks[] = $x;
} else {
$q = http_build_query(array('p' => $x, 'c' => $itemsPerPage));
$pageLinks[] = "<a href=\"?$q\">$x</a>";
}
}
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over current page set
echo '<h2>Popular Artists</h2>';
$ctr = $pages->firstItemNumber;
foreach ($pager->getCurrentItems() as $item) {
echo $ctr . '. ' . $item . '<br/>';
$ctr++;
}
// print page links
echo '<div>';
echo 'Pages:' . implode($pageLinks, $separator);
echo '</div>';
?>
|
The basic logic behind Listing 7 is simple: On each request, the script checks the cache to see if the YQL result already exists. If it does, the cached copy is used to populate the Zend_Paginator data array, and pagination proceeds as usual. Only if a cached copy cannot be found or has expired is the YQL query executed, and the results again are saved to the cache for future use.
As an alternative, you also might consider storing the result data in a client-side variable, and using JavaScript to dynamically populate the page container as the user moves between the pages.
Figure 6 illustrates the output:
Figure 6. YQL results with pagination
YQL also provides a number of additional utility tables that can be used for various purposes, such as running multiple queries at once or parsing external feeds. To illustrate, consider Listing 8, which uses the special 'yql.query.multi' table to run a set of three queries at once:
Listing 8. Performing a YQL multi-query
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute multiple YQL queries
// combine results
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$q=<<<EOF
SELECT *
FROM yql.query.multi
WHERE queries="
SELECT * FROM geo.places WHERE text='london';
SELECT * FROM geo.places WHERE text='hong kong';
SELECT * FROM geo.places WHERE text='tanzania'
"
EOF;
$client->q($q);
$result = $client->get();
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over query result set
foreach ($result->results->results as $result) {
foreach ($result->place as $place) {
echo '<div>';
echo '<strong>' . $place->name . ', ' .
$place->country . ' (' . $place->placeTypeName .
') </strong> <br/>';
echo 'Latitude:' . $place->centroid->latitude .
' Longitude:' . $place->centroid->longitude;
echo '</div>';
}
}
?>
|
Listing 8 contains three separate YQL queries, each looking up information about a different city from the geo.places table. To make things a little more efficient, these three queries are wrapped into a single query to the yql.query.multi table, and the results of each query are merged into a single result document. Performing a multi-query in this manner offers some benefits: It results in more readable code, and it can also produce faster queries if all the queries are targeted to the same table, as YQL uses internal caching to avoid reading the same table definition file more than once.
Figure 7 illustrates the output of Listing 8.
Figure 7. The merged result of a YQL multi-query
Another set of utility tables simplifies parsing and extracting data from documents in standard XML formats, such as RSS or Atom feeds. Consider Listing 9, which illustrates how to use YQL to parse an external RSS feed and extract a subset of information from it using the rss table:
Listing 9. Parsing RSS feeds with YQL
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// get 5 top stories from Google News RSS feed
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->q("SELECT title FROM rss (5)
WHERE url='http://news.google.com/news?ned=us&topic=h&output=rss'");
$result = $client->get();
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over query result set
echo '<h2>Top Stories</h2>';
foreach ($result->item as $item) {
echo '<div>' . $item->title . '</div>';
}
?>
|
Figure 8 illustrates the output.
Figure 8. A list of Google News stories, extracted from an RSS feed with YQL
In a similar vein, with an xml table, you can apply YQL filters to any XML document, combining it with XPath to retrieve specific node collections. Listing 10 illustrates an example of this technique.
Listing 10. Parsing XML documents with YQL
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// get 5 top stories from Google News RSS feed
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->q("SELECT title, link FROM xml (5)
WHERE url='http://news.google.com/news?ned=us&topic=h&output=rss'
AND itemPath='//item'");
$result = $client->get();
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over query result set
echo '<h2>Top Stories</h2>';
foreach ($result->item as $i) {
echo '<div><a href="' . $i->link . '">' . $i->title .
'</a></div>';
}
?>
|
Listing 10 uses a SELECT query to parse an external XML document and apply an XPath expression to the result to select a specific subset of nodes. This ability to apply SQL-like syntax and filters to XML data in a generic fashion can significantly ease the task of extracting information from complex XML document trees.
And Figure 9 illustrates the output.
Figure 9. A list of Google News stories, extracted from an XML document with YQL
Extracting data from HTML pages
YQL can also extract data from HTML web pages. YQL includes an html utility table, which you can use as a filter to scrape specific pieces of data from an HTML page. XPath provides the expression language necessary to locate and retrieve individual nodes or node collections from the HTML document tree.
To illustrate how this works, assume that you want to obtain a complete list of Indian states and union territories. This information is available on the official Indian Government website, as a formatted HTML table (Figure 10). View a larger version of Figure 10.
Figure 10. The HTML source of a web page containing data on Indian states
With YQL, it's a matter of a single SELECT query to extract it from its source page into a YQL result set, as in Listing 11.
Listing 11. Parsing HTML documents with YQL
<?php
// set up Zend auto-loader
// load Zend REST client classes
require_once 'Zend/Loader.php';
Zend_Loader::loadClass('Zend_Rest_Client');
// execute YQL query
// scrape GOI page to get list of Indian states
try {
$client = new Zend_Rest_Client('http://query.yahooapis.com/v1/public/yql');
$client->q("SELECT content FROM html
WHERE url='http://india.gov.in/knowindia/state_uts.php'
AND xpath='//ul[@class=\'sec1\']/li/a'");
$result = $client->get();
} catch (Zend_Rest_Client_Exception $e) {
echo "Client error: " . $e->getResponse();
} catch (Exception $e) {
echo "Error: " . $e->getMessage();
}
// iterate over query result set
echo '<h2>List of States in India</h2>';
echo '<ul>';
foreach ($result->a as $state) {
echo "<li>$state</li>";
}
echo '</ul>';
?>
|
Figure 11 shows an illustration of the output.
Figure 11. A list of Indian states, extracted from a web page with YQL
Clearly, you can do a lot more with YQL than simply extract information from one or more web services. With YQL, it's possible to add and modify data in third-party web applications using standard INSERT, UPDATE, and DELETE queries, and this ability makes it easy to create new cloud-based applications that read and write data from multiple data sources. YQL also offers a number of utility tables that make it possible to extract information from popular file formats such as RSS, Atom, CSV, and HTML using a combination of YQL XPath filters.
YQL is currently under active development, so expect even more benefits in the future. Feel free to experiment with YQL, or add your own data tables to the community repository even as you begin integrating it into your PHP applications. Happy coding!
Learn
- Building web applications with YQL and PHP, Part 1: Use PHP and YQL to retrieve and combine data from multiple web services (Vikram Vaswani, developerWorks, November 2010): Get started with YQL, the Yahoo! Query Language, which offers a unified, SQL-like interface to retrieve data from multiple web service APIs. Access the YQL service from a PHP application and use it with SQL-type constructs to filter, sort, and link query results.
- The YQL Guide: Learn more about YQL and access Internet data with SQL-like commands.
- The YQL Console: Experiment with YQL queries.
- YDN Forums > YQL: Participate in discussions related to YQL development.
- The Zend_Rest_Client library: Read more about the Zend_Rest_Client library in a searchable Reference Guide.
- The Zend_Cache component: Read more about a generic way to cache any data.
- The Zend_Paginator component: Read more about a flexible component for paginating collections of data and presenting that data to users.
- A complete list of community YQL tables: Review the list of built-in tables that you t can use to access data.
- The Wordpress API: Learn more about the API and sign up for a free Wordpress.com blog.
- The Payvment REST API: Learn how to retrieve a specific shopping cart and display its contents in your app. Request an API key and get started.
- Stack Overflow discussion: Find out how to escape special characters in nested or multiple YQL queries.
- More articles by this author (Vikram Vaswani, developerWorks, August 2007-current): Read articles about XML, additional Google APIs and other technologies.
- XML area on developerWorks: Get the resources you need to advance your skills in the XML arena.
- My developerWorks: Personalize your developerWorks experience.
- IBM XML certification: Find out how you can become an IBM-Certified Developer in XML and related technologies.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks. Also, read more XML tips.
- developerWorks technical events and webcasts: Stay current with technology in these sessions.
- developerWorks on Twitter: Join today to follow developerWorks tweets.
- developerWorks podcasts: Listen to interesting interviews and discussions for software developers.
- developerWorks on-demand demos: Watch demos ranging from product installation and setup for beginners to advanced functionality for experienced developers.
Get products and technologies
- The Zend Framework: Download the latest Zend Framework releases.
- The Yahoo! Social SDK: Download and work with this flexible experimental SDK.
- Patched yql-tables / shoppingcart : Download the author's patched verison of the YQL tables or view the patched table.
- IBM product evaluation versions: Download or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- XML zone discussion forums: Participate in any of several XML-related discussions.
- developerWorks blogs: Check out these blogs and get involved.

Vikram Vaswani is the founder and CEO of Melonfire, a consulting services firm with special expertise in open-source tools and technologies. He is also the author of the books Zend Framework: A Beginners Guide and PHP: A Beginners Guide.




