Understanding the Zend Framework, Part 7: Searching

Tyler Anderson (tyleranderson5@yahoo.com), Engineer, Stexar Corp.
Tyler Anderson graduated with a degree in computer science from Brigham Young University in 2004 and is currently in his last semester as a master's student in computer engineering. In the past, he worked as a database programmer for DPMG.com, and he is currently an engineer for Stexar Corp., based in Beaverton, Ore.

Summary:  Let's continue on with the "Understanding the Zend Framework" series. In Part 6, you learned how to use the Zend Framework to send e-mail from within your feed reader application. Now, here in Part 7, you will use the Zend Framework to search the titles and content of articles saved via the feed reader application and view the resulting ranked results.

Date:  18 Jan 2011 (Published 22 Aug 2006)
Level:  Intermediate

About this series

This series chronicles the building of an online feed reader, Chomp, while explaining all of the major aspects of using the open source PHP Zend Framework.

Part 1 talked about the overall concepts of the Zend Framework, including a list of relevant classes and a general discussion of the MVC pattern. Part 2 expanded on that to show how MVC can be implemented in a Zend Framework application. You also created the user registration and login process, adding user information to the database and pulling it back out again.

See Part 2 for details on installing the Zend Framework and XAMPP.

Parts 3 and 4 dealt with the actual RSS and Atom feeds. In Part 3, you learned how to enable users to subscribe to individual feeds and to display the items listed in those feeds. You also discovered some of the Zend Framework's form-handling capabilities, validating data, and sanitizing feed items. Part 4 explained how to create a proxy to pull data from a site that has no feed.

The rest of the series involves adding value to the Chomp application. Part 5 explained how to use the Zend_Pdf module to enable the user to create a customized PDF of saved articles, images, and search results. In Part 6, you used the Zend_Mail module to alert users to new posts. Here in Part 7, you will look at searching saved content and returning ranked results. In Part 8, you will create your own mashup, adding information from Amazon, Flickr, and Yahoo! And in Part 9, you will add Ajax interactions to the site using JavaScript Object Notation (JSON).


Introduction

This article explains how to use the Zend_Search module to search saved blog entries for a particular search term and return ranked results. You will learn:

  • How to use the Zend_Search module and related classes to index and search data.
  • How to perform different types of simple and advanced searches using the Zend_Search module.

At the end of this article, you will be able to search feed entries that have been saved in your feed reader. First, you will build a function that creates the search index and adds new content to the index. Next, you will create two actions that will provide the search functionality: search and viewSearchResults. The search action provides a form to perform searches, and the viewSearchResults action processes the input from the form and displays the ranked results to you.


Building the search index

The Zend Framework provides a search mechanism that's simple to use. It works by creating an index in a directory that's not Web-accessible. You can then add items to the index, each with several searchable fields, and search the indexed items in various ways. That's what this section is all about, so start by creating the index.

Creating and adding an entry to the index

To start, you need a helper function that creates a new item for the search index. Then you create the index if it doesn't exist and add the item to it. Create this helper function at the top of FeedController.php, as shown in Listing 1.


Listing 1. Creating and adding items to the search index
                
<?php
// Directory that holds the search index; keep it outside the Web root.
define('INDEX', 'c:\nonWWWAcessibleDirectory\myIndex');

function addEntryToSearchIndex($url, $contents, 
                               $feedname, $articletitle='')
{
    $doc = new Zend_Search_Lucene_Document();

    // Text fields are indexed and stored, so hits can return their values.
    $doc->addField(Zend_Search_Lucene_Field::Text('url', $url));
    $doc->addField(Zend_Search_Lucene_Field::Text('feedname',
                                                  $feedname));
    if ($articletitle != '') {
        $doc->addField(Zend_Search_Lucene_Field::Text('articletitle', 
                                                      $articletitle));
    }
    // UnStored fields are indexed for searching but not stored.
    $doc->addField(Zend_Search_Lucene_Field::UnStored('contents', 
                                                      $contents));

    // Create the index on first use; otherwise open the existing one.
    if ( !is_dir(INDEX) ) {
        $index = Zend_Search_Lucene::create(INDEX);
    }
    else {
        $index = Zend_Search_Lucene::open(INDEX);
    }
    $index->addDocument($doc);
    $index->commit();
}

class FeedController extends Zend_Controller_Action
...

First, define the non-Web-accessible directory that will contain the search index. This is where the indexed items will be stored. The function takes the URL of the article, its contents, the name of the feed or Web page, and the article title. It creates the new document, which you'll add to the index later, and stores it in $doc. Next, it adds up to four fields to the document. The first is the URL, which you store as Text. A Text field is stored along with the index, so if an entry matches a later search, you'll be able to retrieve its URL and display it back to the user.

Next, store the feedname the same way as the URL and, if the articletitle is defined, store that in the index as well. Finally, store the contents of the article as an UnStored type. This type means that the data will be indexed as usual, but it will not be stored along with the index, so it won't be retrievable from a matching result. This is OK, since you'll only need the URL, the feed or Web page name, and the article title, if defined.

Finally, get the index and store it in $index. If the index directory already exists, you open the existing index; otherwise, you create a new one. Then, add the search item ($doc) created earlier to the index and commit it, saving the changes to the index.
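To make the difference between the Text and UnStored field types more concrete, here is a minimal sketch of a toy in-memory index. It is an illustration only: the ToyIndex class, its methods, and the 'store' flag are hypothetical names invented for this example, and the code does not reflect how Zend_Search_Lucene is implemented internally. It shows that every field can be searched, while only stored fields come back with a hit.

```php
<?php
// Toy model only: a hypothetical in-memory index, NOT Zend_Search_Lucene.
class ToyIndex
{
    private $postings = array(); // term => list of document ids
    private $stored = array();   // document id => stored field values

    // $fields maps a field name to array('value' => string, 'store' => bool).
    public function addDocument(array $fields)
    {
        $id = count($this->stored);
        $this->stored[$id] = array();
        foreach ($fields as $name => $field) {
            // Every field is split into lowercase terms and indexed...
            $terms = preg_split('/\W+/', strtolower($field['value']),
                                -1, PREG_SPLIT_NO_EMPTY);
            foreach (array_unique($terms) as $term) {
                $this->postings[$term][] = $id;
            }
            // ...but only "stored" fields keep their original value.
            if ($field['store']) {
                $this->stored[$id][$name] = $field['value'];
            }
        }
        return $id;
    }

    // Returns the stored fields of every document that contains $term.
    public function find($term)
    {
        $term = strtolower($term);
        $ids = isset($this->postings[$term])
            ? array_unique($this->postings[$term])
            : array();
        $hits = array();
        foreach ($ids as $id) {
            $hits[] = $this->stored[$id];
        }
        return $hits;
    }
}
```

In this model, a document whose contents field is added with 'store' => false can still be found by a word from its contents, but the hit exposes only the stored url, which mirrors why the feed reader stores the URL and titles but not the article text.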

You now have the means to create and add items to your index. Next, go to the spot in your code where you'll call this function.

Adding entries to the index

Now that you've created the addEntryToSearchIndex function, you can begin adding searchable items to your index. Go to the saveEntryAction method in the FeedController class and add the code in Listing 2.


Listing 2. Adding items to your index
                
...
                echo 'Error occurred, full text not saved,'.
                     ' please reload.';
                return;
            }
        }

        addEntryToSearchIndex($channelLink,
                              Zend_Filter::noTags($fullText),
                              $feedTitle,
                              $channelTitle);

        $db = Zend_Registry::get('db');
...

Here, you simply call your new function, passing in the URL ($channelLink) and the description or full text of the entry. You pass the $fullText string through the Zend_Filter::noTags method so that all HTML tags are removed (there's no need to index those). You also pass in the name of the feed ($feedTitle) and the article title ($channelTitle). Note that every time an entry is saved, no matter who saves it, that entry is added to your index. Another possible task you can explore is to make sure no duplicate entries get added.
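One way to approach the duplicate-entry task is to derive a key from each URL and skip entries whose key has already been seen. The following is a minimal sketch under stated assumptions: entryKey and shouldIndex are hypothetical helpers that are not part of Chomp or the Zend Framework, the normalization (trimming whitespace and a trailing slash) is deliberately naive, and persisting the set of seen keys between requests is left out.

```php
<?php
// Hypothetical helpers, not part of Chomp or the Zend Framework.

// Builds a stable key for an entry URL. Assumption: URLs that differ only
// by surrounding whitespace or a trailing slash refer to the same article.
function entryKey($url)
{
    return md5(rtrim(trim($url), '/'));
}

// Returns true the first time a URL is seen and records its key;
// returns false for any later URL that maps to the same key.
function shouldIndex($url, array &$seenKeys)
{
    $key = entryKey($url);
    if (isset($seenKeys[$key])) {
        return false;
    }
    $seenKeys[$key] = true;
    return true;
}
```

With a check like this in place, saveEntryAction could call addEntryToSearchIndex only when shouldIndex returns true, provided the seen keys are kept somewhere durable, such as the database the feed reader already uses.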

That completes this section. Now go add something to your index by saving a feed entry, as done in Part 5. In the next section, you'll start using your index in new actions you'll create in the FeedController class.


Adding new search actions to the FeedController

So you have an index. Now it's time to use it. This section creates two new action methods: searchAction and viewSearchResultsAction. The first causes a search page to display to a user, and the second performs the search, based on parameters it receives from the first, and displays the results back to the user.

searchAction method

Here, you add the searchAction method to the FeedController class, which displays the searchResults view to the user. Do so, as shown in Listing 3.


Listing 3. The searchAction method
                
    public function searchAction()
    {
        $view = Zend_Registry::get('view');
        $view->title = "Search Results";
        echo $view->render('searchResults.php');
    }

This simply takes the $view object out of the Zend registry, as you've done in previous parts of this series, sets the page title, and renders the searchResults.php view for the user. Next, add a link to the main page that will take you to this part of your feed reader.


Adding a link for searching to the main page

You don't yet have a way to reach the /feed/search area of your feed reader, do you? Add the following link, as shown in Listing 4, to the viewFeeds view in viewFeeds.php.


Listing 4. Modifying the viewFeeds view
                
...
  [<a href="feed/viewSavedEntries">View Saved Entries/
                                   Generate PDF</a>]<br>
  [<a href="feed/search">Search Saved Entries</a>]<br>
  <h1>CHOMP! The Feed Reader</h1>
...

This simply displays the link to search saved entries to users (see Figure 1).


Figure 1. The modified viewFeeds view
The modified viewFeeds view

Clicking this link results in an error because the search view doesn't exist yet. You'll create this view next.

Search view

This view allows users to enter a search query to run against your index. Create this view, searchResults.php, as shown in Listing 5.


Listing 5. The search view
                
<html>
<head>
    <title><?php echo $this->escape($this->title); ?></title>
</head>
<body>
  [<a href='/'>Back to Main Menu</a>]<br>
  <h1><?php echo $this->escape($this->title); ?></h1>
  
  <form method='GET' action='/feed/viewSearchResults'>
    Query: <input name='query'><br>
    Choose a field to search:<br>
    <input type='radio' name='field' value='raw' checked='checked'>
        Raw String (allows fancy search types)<br>
    <input type='radio' name='field' value='contents'>Contents<br>
    <input type='radio' name='field' value='feedname'>Feed Title<br>
    <input type='radio' name='field' value='articletitle'>
        Article Title<br>
    Slop (not allowed for Raw String searches):
        <input name='slop' value='0'><br>
    <input type='Submit' value='Search'>
  </form>
</body>
</html>

This page allows several types of searches, which you'll learn more about with the viewSearchResultsAction method, next. This view provides a form that allows users to enter a search string and a series of radio buttons that allow users to search a specific field in the index. Note that the method of the form is GET because performing a search has no side effects, and so GET is safe here.

Finally, a slop field is provided. Slop is the number of word positions by which the words in a phrase are allowed to move apart. If the slop is 0 and the phrase is "hey you", the words must appear exactly as written in the query. If the slop is 1, then "hey ... you" is also acceptable, where ... stands for a single word. If the slop is 2, then "hey ... ... you" and "you hey" are acceptable. Each additional unit of slop lets the words in the phrase drift further apart, which lets you tune how near the words must be for a result to match.
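The slop examples above can be checked with a small calculation. The sketch below is a simplified model, assuming each phrase word occurs at most once in the text; requiredSlop is a hypothetical helper written for this illustration and is not the algorithm Zend_Search_Lucene itself runs. For each phrase word, it measures how far the word sits from where the phrase expects it, and reports the spread between the largest and smallest of those shifts.

```php
<?php
// Hypothetical helper: a simplified slop model, NOT Zend_Search_Lucene code.
// Returns the smallest slop at which $phraseTerms would match the words in
// $documentTerms, or null if a phrase word is missing from the document.
function requiredSlop(array $phraseTerms, array $documentTerms)
{
    $phraseTerms = array_values($phraseTerms);
    if (count($phraseTerms) === 0) {
        return 0;
    }
    $shifts = array();
    foreach ($phraseTerms as $offset => $term) {
        // Position of the first occurrence of the word in the document.
        $position = array_search($term, $documentTerms, true);
        if ($position === false) {
            return null;
        }
        // How far the word is from its expected place in the phrase.
        $shifts[] = $position - $offset;
    }
    return max($shifts) - min($shifts);
}
```

For the phrase "hey you", this model gives 0 for the text "hey you", 1 for "hey there you", and 2 for both "hey a b you" and "you hey", which agrees with the cases described above.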

Preview the search view in Figure 2.


Figure 2. The search view
The search view

Next, take a look at the search types available.

Search types

There are several search types, and you'll focus on these:

  • Search any phrase with any slop in one field
  • Advanced searches using a raw string query:
    • Search a phrase by providing a space-delimited string of words: "hey you"
    • Search for some words, but not others: "+hey -you" (the document must contain "hey" and must not contain "you")
    • Combine either of the above with a field qualifier to restrict a word to one field: "hey -you feedname:Google"
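To get a feel for how a raw query string breaks down, the following sketch splits a query into its individual clauses. It is a toy tokenizer for illustration only: describeRawQuery and the array keys it returns are hypothetical, and Zend_Search_Lucene's own query parser handles far more syntax than this.

```php
<?php
// Hypothetical helper: a toy tokenizer, NOT Zend_Search_Lucene's parser.
// Splits a raw query into clauses, each with an optional modifier
// ('required' for +, 'prohibited' for -), an optional field, and a term.
function describeRawQuery($raw)
{
    $clauses = array();
    $tokens = preg_split('/\s+/', trim($raw), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        $modifier = null;
        if ($token[0] === '+' || $token[0] === '-') {
            $modifier = ($token[0] === '+') ? 'required' : 'prohibited';
            $token = (string) substr($token, 1);
        }
        // A "field:term" token restricts the term to one field.
        $field = null;
        $parts = explode(':', $token, 2);
        if (count($parts) === 2 && $parts[0] !== '') {
            $field = $parts[0];
            $token = $parts[1];
        }
        if ($token === '') {
            continue; // nothing left to search for
        }
        $clauses[] = array(
            'modifier' => $modifier,
            'field'    => $field,
            'term'     => strtolower($token),
        );
    }
    return $clauses;
}
```

Calling describeRawQuery('+hey -you feedname:Google') yields three clauses: a required "hey", a prohibited "you", and "google" restricted to the feedname field.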

Searching with the above types of advanced querying is a piece of cake for advanced Googlers, but anyone can get a feel for them with practice. Next, add the viewSearchResultsAction method.

viewSearchResultsAction method

This method performs the search and displays results back to the user. Create the viewSearchResultsAction method in the FeedController class, as shown in Listing 6.


Listing 6. The viewSearchResultsAction method
                
    public function viewSearchResultsAction()
    {
        // Trim whitespace from every GET parameter.
        $input = new Zend_Filter_Input(
            array('*'=>'StringTrim'),
            null,
            $_GET);
        $query = strtolower($input->getUnescaped('query'));
        $slop = (int) $input->getUnescaped('slop');
        $field = $input->getUnescaped('field');
        
        if ($field != "raw" || $query == '') {
            // Phrase search in a single field, with the requested slop.
            $queryObj = new
                Zend_Search_Lucene_Search_Query_Phrase(explode(" ", 
                                                               $query),
                                                       null, $field);
            $queryObj->setSlop($slop);
        }
        else {
            // Raw string: passed to the index as-is for parsing.
            $queryObj = $query;
        }

        // Open the index, creating it first if it doesn't exist yet.
        if ( !is_dir(INDEX) ) {
            $index = Zend_Search_Lucene::create(INDEX);
        }
        else {
            $index = Zend_Search_Lucene::open(INDEX);
        }
        $hits = $index->find($queryObj);

        $view = Zend_Registry::get('view');
        $view->title = "Search Results for: $query";
        $view->hits = $hits;
        echo $view->render('viewSearchResults.php');
    }

This method retrieves the $query, the $slop, and the $field from the GET array. If the $field is not "raw" or a $query wasn't entered, you'll create a special Zend_Search_Lucene_Search_Query_Phrase construct and store it in $queryObj and set the slop to the value in $slop (this allows the first query type shown earlier). Otherwise, you'll set $queryObj to $query, the raw search string (this allows the more advanced query types). Then grab the index and retrieve matching results by calling $index->find($queryObj) and store them in $hits. Last, create and render the viewSearchResults view and display it to the user. You see how this view displays the results next.
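Because the query, field, and slop all arrive from the browser, it can help to normalize them before building the query object. The following is a minimal sketch in plain PHP, not the approach Listing 6 takes: normalizeSearchParams is a hypothetical helper, and the list of allowed fields is an assumption based on the radio buttons in the search form.

```php
<?php
// Hypothetical helper, not part of Chomp or the Zend Framework.
// Returns a cleaned-up array with 'query', 'field', and 'slop' keys.
function normalizeSearchParams(array $params)
{
    // Assumption: these match the radio button values in the search form.
    $allowedFields = array('raw', 'contents', 'feedname', 'articletitle');

    // Lowercase, trimmed query; anything that isn't a string becomes ''.
    $query = (isset($params['query']) && is_string($params['query']))
        ? strtolower(trim($params['query']))
        : '';

    // Fall back to a raw search when the field is missing or unknown.
    $field = isset($params['field']) ? $params['field'] : 'raw';
    if (!in_array($field, $allowedFields, true)) {
        $field = 'raw';
    }

    // Slop must be a non-negative integer.
    $slop = isset($params['slop']) ? max(0, (int) $params['slop']) : 0;

    return array('query' => $query, 'field' => $field, 'slop' => $slop);
}
```

The design choice here is to fail soft: an unknown field or a negative slop is replaced with a safe default instead of producing an error page, which suits a search form where a stray value is more likely a typo than an attack.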

viewSearchResults view

This view iterates over the results returned to it and displays them back to the user. Create this view in a file named viewSearchResults.php, and define it as shown in Listing 7.


Listing 7. The viewSearchResults view
                
<html>
<head>
    <title><?php echo $this->escape($this->title); ?></title>
</head>
<body>
  [<a href='/'>Back to Main Menu</a>]<br>
  <h1><?php echo $this->escape($this->title); ?></h1>
  
  <table>
    <tr>
      <td></td>
      <td>Title (Click to view article)
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td>
      <td>Relevancy</td>
    </tr>
  <?php
     $i = 1;
     foreach ($this->hits as $hit) {
         $score = $hit->score;
         $feedTitle = $hit->feedname;
         $channelTitle = $hit->articletitle;
         $url = $hit->url;

         $title = $feedTitle;
         if ($channelTitle != '') {
             $title = "$title > $channelTitle";
         }
         // Escape stored values before writing them into the page.
         $safeUrl = $this->escape($url);
         $safeTitle = $this->escape($title);
         echo "<tr><td>#" . $i++ . ":</td>";
         echo "<td><a href=\"$safeUrl\">$safeTitle</a></td>";
         echo "<td>$score</td></tr>";
     }
  ?>
  </table>
</body>
</html>

This view iterates over each of the hits sent to it from the viewSearchResultsAction method. Each matching hit is returned in ranked order with the first result having the highest relevancy, stored as the score. Here, grab the $score, $feedTitle, $channelTitle, and $url from each $hit, and display them back to the user, with a link being provided so users can view the full text (see Figure 3).


Figure 3. The viewSearchResults view
The viewSearchResults view

Well, that's it. Your feed reader now has searching capabilities.


Summary

You completed Part 7 of this "Understanding the Zend Framework" series by learning to use the Zend_Search module in the Zend Framework, which allows you to search the saved entries in your feed reader.

The rest of this series involves adding even more value to the Chomp application. In Part 8, you'll create your own mashup, adding information from Amazon, Flickr, and Yahoo! Finally, in Part 9, you'll add Ajax interactions to the site using JavaScript Object Notation (JSON).



Download

Description: Part 7 source code
Name: os-php-zend7.source.zip
Size: 11KB
Download method: HTTP
