Conduct Web experiments using PHP, Part 1

Use categorical data analysis tools to analyze factorial Web experiments

Paul Meagher (paul@datavore.com), CEO, Datavore Productions
Paul Meagher is a freelance Web developer, writer, and data analyst. Paul has a graduate degree in Cognitive Science and has spent the last six years developing Web applications. His current interests include statistical computing, data mining, content management, and e-learning.

Summary:  This two-part article series offers Web developers a practical introduction to the design of experiments (DOE) and categorical data analysis (CDA). This first part demonstrates how to use PHP to implement an experimental protocol for measuring the effectiveness of a Web-based offer. The second part will examine analyzing the resulting data using CDA tools that we'll implement using PHP.

Date:  05 Oct 2004
Level:  Intermediate

It's practically impossible for you to use the Web today and not come across a Web offer -- a Web-based product or service offer that includes an online method of responding to the offer. The desired response might be to fill out an information request form, to join a site mailing list, to set up an appointment to meet with a representative of the product or service, or any number of other options.

A Web offer is a powerful marketing tool that gives a sales force entrée to a buyer's most pervasive motivating force. Through the unique structure of online technology, marketing can easily tap into a consumer's brightest fantasy or most vulnerable flank (depending on your point of view): almost instant gratification. And like any marketing tool, half of its value lies in being able to measure its effectiveness.

That's where these two articles come in. In this one, I'll demonstrate how to implement Web-based factorial experiments using PHP. Using this code, you can begin running Web-based quality control experiments in the background on your own sites. And I won't stop there. In the next article, you will learn how to analyze factorial data arising from these Web experiments using PHP-based Categorical Data Analysis (CDA) tools.

To start, ask yourself the question, How can I assess the effectiveness of a Web offer?

A question of design

You can use factorial designs to measure the effectiveness of one or more components of Web offers. You'll begin by trying to measure the effectiveness of the ad banner component of a sample Web offer.

In this sample, the ad banner is a 200- by 400-pixel ad appearing on the right-hand side of all your site pages. The ad includes an offer to join your mailing list. You'd like to determine which version of the banner generates the most mailing-list sign ups.

A marketing hypothesis is that you will get more responses by using an image of a person versus an image of the product in the ad banner. Another hypothesis is that you will get more responses by using longer sell text versus short sell text (for instance, just the product byline) in the banner.

The hypotheses suggest two corresponding factors you should manipulate, which together form a 2 by 2 matrix:

  • An image factor with two levels: Product, Person
  • A text factor with two levels: Short, Long

In a full factorial design, all possible factor-level combinations are presented equally often during the course of the experiment. In this Web offer effectiveness study, you have four possible combinations of image and text factors: product-short, product-long, person-short, and person-long. The end result is that for each factor-level combination, you have the same number of opportunities to measure whether a response occurred or not.
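
To make the idea of "all possible factor-level combinations" concrete, here is a minimal sketch that enumerates them for any number of factors. The factorialCombinations() helper is hypothetical (it is not part of the SR package), and the sketch assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the SR package): builds every
// factor-level combination for any number of factors.
function factorialCombinations(array $factors): array {
    $combos = array(array());
    foreach ($factors as $name => $levels) {
        $next = array();
        foreach ($combos as $combo) {
            foreach ($levels as $level) {
                // Extend each partial combination with one level of this factor.
                $next[] = $combo + array($name => $level);
            }
        }
        $combos = $next;
    }
    return $combos;
}

$factors = array(
    "image" => array("product", "person"),
    "text"  => array("short", "long"),
);

foreach (factorialCombinations($factors) as $combo) {
    echo implode("-", $combo), "\n";
}
// product-short
// product-long
// person-short
// person-long
```

Each pass through the outer loop multiplies the number of combinations by the number of levels in that factor, so two factors with two levels each yield 2 x 2 = 4 combinations.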


Goal decomposition

Your goal is to conduct a Web offer effectiveness study in which you factorially manipulate a TEXT aspect and an IMAGE aspect, then record whether visitors respond or not.

To implement the software to conduct this quality-control experiment, you need to consider whether your goal can be decomposed into smaller goals that might be addressed by separate software packages. My prototyping research suggests that the following decomposition is viable in this project:

  • A stimulus-response package (aka the SR Package) is required to generate the stimulus codes for the experiment, randomly assign the stimulus codes to new visitors, and log whether or not a site visitor responded to the offer. A stimulus code is simply a factor-level combination such as product-short which is used to control which stimulus is presented to the site visitor (that is, the ad banner version consisting of an image of the product accompanied by short sell text).
  • A visitor-logging package (the VLOG Package) is required to track visitors across sessions.

VLOG package

The main purpose of the visitor-logging package is to supply a visitor ID to the stimulus-response package so the allotment of stimulus codes to site visitors is properly managed and visitor responses (or lack thereof) are properly logged. To learn more about the VLOG package, you can consult my developerWorks tutorial "Web site User Modeling with PHP" in which I discuss an earlier procedural version of this user-tracking code (see Resources).

For the purposes of this article, all you need to know is that the VLOG package tracks visitors across sessions through a persistent identity cookie (in this example, $_COOKIE['vis_id']).
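
The VLOG package's own cookie handling is not shown in this article, so the following is only a rough sketch of how a persistent identity cookie might be resolved. The resolveVisitorId() function, the 32-character ID format, and the one-year lifetime in the comment are all assumptions made for illustration (PHP 7 or later is assumed for random_bytes()).

```php
<?php
// Hypothetical sketch: the VLOG package's real cookie logic is not shown
// in this article. Returns the visitor's existing ID, or creates a new one.
function resolveVisitorId(array $cookies): string {
    if (isset($cookies['vis_id']) && is_string($cookies['vis_id']) && $cookies['vis_id'] !== '') {
        return $cookies['vis_id'];        // returning visitor
    }
    return bin2hex(random_bytes(16));     // new visitor: 32 hex characters
}

$vis_id = resolveVisitorId($_COOKIE);

// In a real page you would then persist the ID before sending any output,
// for example with a one-year lifetime (an assumed value):
// setcookie('vis_id', $vis_id, time() + 365 * 24 * 60 * 60, '/');
```

A 32-character ID fits the varchar(32) vis_id column used in the WebOffer table described in the next section.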


Database schema

This Web offer study involves logging visitor information to several database tables. You will be concerned with information appearing in the WebOffer table. The MySQL schema definition for the WebOffer table looks like this:


Listing 1. MySQL schema for WebOffer table
CREATE TABLE `WebOffer` (
  `id` int(9) NOT NULL auto_increment,
  `vis_id` varchar(32) default NULL,
  `sound` enum('on','off') NOT NULL default 'off',
  `image` enum('person','product') NOT NULL default 'person',
  `text` enum('short','long') NOT NULL default 'short',
  `rand` float(5,4) NOT NULL default '0.0000',
  `used` enum('y','n') NOT NULL default 'n',
  `joined` char(1) default NULL,
  `views` int(4) default NULL,  
  `timer_start` datetime default NULL,
  `timer_stop` datetime default NULL,
  PRIMARY KEY  (`id`),
  KEY `vis_id` (`vis_id`),
  KEY `used` (`used`)
) ENGINE=MyISAM;

Note that the four response columns in this table default to NULL, which is my convention for coding that no response has occurred:

  • joined
  • views
  • timer_start
  • timer_stop

WebOffer table

The WebOffer table serves three main functions:

  • It contains the stimulus codes (also known as the factor-level combinations) that are used to determine what ad banner version to present to a site visitor.
  • It contains the ID of the visitor who has been assigned a particular stimulus code so you can use this stimulus code to control the display of the Web offer to that visitor on subsequent page views.
  • It contains all the factorial response data in a form that can be directly analyzed by CDA software (which I will discuss and implement in the next article).

The following table is an example of what the WebOffer table for a two-factor Web experiment (manipulating image and text factors) might look like halfway through the study.

Table 1. WebOffer table manipulating two factors

id  vis_id    image    text   rand    used  views  joined  timer_start          timer_stop
1   NULL      person   short  0.9976  n     NULL   NULL    NULL                 NULL
2   8e8b099f  person   long   0.1667  y     4      y       2004-08-05 10:22:28  2004-08-05 10:40:28
3   f8eac462  product  short  0.0670  y     7      NULL    2004-08-06 08:10:08  NULL
4   NULL      product  long   0.9838  n     NULL   NULL    NULL                 NULL
5   NULL      person   short  0.8058  n     NULL   NULL    NULL                 NULL
6   49f39195  person   long   0.3127  y     12     y       2004-08-05 11:22:28  2004-08-06 03:22:28
7   f9b73f6f  product  short  0.5328  y     15     NULL    2004-08-04 09:50:43  NULL
8   NULL      product  long   0.9596  n     NULL   NULL    NULL                 NULL
...

The values appearing in the image and text columns are the stimulus codes. Each new visitor to your site is assigned a random stimulus code that controls what Web offer version he will encounter. In this excerpt from the WebOffer table, note that the four possible factor-level combinations (person-short, person-long, product-short, product-long) are replicated twice.

In this next section, I will discuss the software needed to generate the stimulus codes that appear in the WebOffer table. The software requires you to specify the factor levels and the number of design replications for your experiment.

Before I move on, examine the complete data dictionary for the WebOffer table.

Table 2. WebOffer table data dictionary

Field        Usage
id           Arbitrary record ID number.
vis_id       Contains visitor ID corresponding to a persistent cookie identifier.
image        Contains stimulus code used to control graphic aspect of the Web offer.
text         Contains stimulus code used to control textual aspect of the Web offer.
rand         Random number between 0 and 1 used to assign a random stimulus code to a new site visitor.
used         Contains a "y" value if a particular stimulus code replicate has already been assigned to a site visitor. "n" is the default setting.
views        Records the number of times the visitor saw the assigned ad version. Counting stops when the visitor responds to the ad.
joined       Contains a "y" value if the visitor has joined the mailing list. The default value is NULL and can be changed to "y" at any point during the Web offer study if the visitor joins the mailing list.
timer_start  Records date and time when visitor was first exposed to assigned ad version.
timer_stop   Records date and time when visitor responded to assigned ad version. Contains a NULL value when no response has been made.

SR package

The SR package (stimulus response) consists of the following classes:

  • StimulusResponse.php
  • StimulusResponse1D.php
  • StimulusResponse2D.php
  • StimulusResponse3D.php

Web developers who want to conduct Web experiments with this package can select the SR class based on the number of factors they intend to factorially manipulate. For example, if you are only going to manipulate one factor (such as different ad banners with no factorial manipulation of ad banner components), then choose the StimulusResponse1D.php class to:

  • Generate the stimulus codes for your Web experiment
  • Assign a random stimulus code (such as a Web offer) to each new visitor
  • Log whether a visitor responded to the assigned offer

The classes with the 1D, 2D, and 3D suffixes extend the StimulusResponse.php base class by implementing the generation, assignment, and logging functions. These functions are called interface methods in the StimulusResponse.php source in Listing 2.


Listing 2. Source code for StimulusResponse.php base class
<?php
/**
* @package SR
* 
* Base class for stimulus-response package.  Implements common
* methods and defines interface methods to be implemented in
* classes that extend the base class.
*/
class StimulusResponse {

  /**
  * String to hold name of stimulus table.
  */
  var $table = "";

  /**
  * Associative array to hold factor names and factor levels.
  */
  var $factors = array();

  /**
  * Associative array to hold a retrieved stimulus code.
  */
  var $code = array();

  /*
  * Boolean variable indicating whether a response has been
  * observed or not for a particular visitor.
  */
  var $success = false;

  /*
  * Boolean variable indicating whether a stimulus code has been
  * generated or not for a particular visitor.
  */
  var $generated = false;

  /**
  * Boolean variable used to turn debugging output on or off.
  */
  var $debug = true;

  /**
  * Method for setting the table that holds, or will hold,
  * the stimulus codes and responses for the Web experiment.
  */
  function setTable($table) {
    $this->table = $table;
  }

  /**
  * Method for removing existing stimulus codes and responses.
  */
  function emptyTable() {
    global $db;
    $sql = " DELETE FROM $this->table ";
    $result = $db->query($sql);
    if (DB::isError($result)) {
      die($result->getMessage());
    } else {
      return true;
    }
  }

  /**
  * Method for defining the stimulus factors to be used
  * as well as their factor levels.
  */
  function setFactors($factors) {
    $this->factors = $factors;
  }

  /**
  * Method for generating and inserting stimulus codes into
  * the stimulus table.
  */
  function insertStimulusCodes($num_reps) {
    // interface method
  }

  /**
  * Method for selecting a random and unused stimulus code
  * from the stimulus table.
  */
  function getRandomStimulusCode($vis_id) {
    // interface method
  }

  /**
  * Method for logging a visitor's response to the stimulus, if one occurred.
  */
  function logResponse($vis_id, $resp_col, $response=false) {
    // interface method
  }

}
?>
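
Listing 2 marks its interface methods with empty bodies and a comment. If you are running PHP 5 or later, one alternative you might consider is to declare those methods abstract, so that a subclass that forgets to implement one fails immediately. The following is only a sketch of that idea: the AbstractStimulusResponse class name is hypothetical and this variant is not part of the SR package.

```php
<?php
// Sketch only: method names follow Listing 2, but this abstract variant
// is not part of the SR package.
abstract class AbstractStimulusResponse {
    protected $table = "";
    protected $factors = array();

    public function setTable($table) {
        $this->table = $table;
    }

    public function setFactors(array $factors) {
        $this->factors = $factors;
    }

    // Subclasses (1D, 2D, 3D) must supply these three methods.
    abstract public function insertStimulusCodes($num_reps);
    abstract public function getRandomStimulusCode($vis_id);
    abstract public function logResponse($vis_id, $resp_col, $response = false);
}
```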


Generating the design

Before you can run the experiment, you need to generate the stimulus codes for it. The SR Package contains an examples directory with scripts to generate the stimulus codes for one-, two-, and three-factor experiments (see Resources to download the package). The script for generating the stimulus codes for two-factor experiments is called generate_2d.php and the source code looks like this:


Listing 3. Source of generate_2d.php script
<?php
/**
* @package SR
*
* Example script demonstrating how to set up factor codes
* for a two-factor experiment.
*/ 
require_once "config.php";

require_once PHPMATH . "/SR/StimulusResponse2D.php";

$sr = new StimulusResponse2D;

$sr->setTable("WebOffer");

// You can optionally delete all existing stimulus-response
// info before adding new stimulus codes.
$sr->emptyTable();

// Specify factors and their levels. Factor names should
// correspond to database columns and the factor levels
// should correspond to the enum values for the columns.
$factors["image"] = array("person", "product");
$factors["text"]  = array("short", "long");
$sr->setFactors($factors);

// Specify number of replications of the factorial design.
$num_reps = 5;

$num_codes = $sr->insertStimulusCodes($num_reps);

if ($num_codes) {
  echo "Success: Inserted $num_codes stimulus codes.";
} else {
  echo "Failure: No stimulus codes were inserted.";
}
?>

The script in Listing 3 requires you to specify the factors you will use along with their levels. You also need to specify the number of design replications to use. The insertStimulusCodes() method, located in the StimulusResponse2D.php class, does the main work of populating the WebOffer table with stimulus codes, as shown in Listing 4.


Listing 4. Source of insertStimulusCodes() method
<?php

require_once "StimulusResponse.php";

class StimulusResponse2D extends StimulusResponse {

  function insertStimulusCodes($num_reps) {
    global $db;
    list($factor1, $factor2) = array_keys($this->factors);
    $num_codes=0;
    for($rep=0; $rep < $num_reps; $rep++) {
      foreach($this->factors[$factor1] AS $level1) {
        foreach($this->factors[$factor2] AS $level2) {
          $rand = mt_rand() / mt_getrandmax();
          $sql  = " INSERT INTO $this->table ";
          $sql .= " ( $factor1, $factor2, rand ) ";
          $sql .= " VALUES ";
          $sql .= " ( '$level1', '$level2', $rand ) ";
          $result = $db->query($sql);
          if (DB::isError($result)) {
            die($result->getMessage());
          }
          $num_codes++;
        }
      }
    }
    return $num_codes;
  }

}

?>

Note that a random value between 0 and 1 is generated through this assignment:


Listing 5. Generating a unit random value
$rand = mt_rand() / mt_getrandmax();

The result of executing the insertStimulusCodes() method is that your WebOffer table contains all the stimulus codes you will need for your Web experiment. The script generates these stimulus codes so all factorial combinations are represented equally often and the order of display is randomized.
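
If you want to inspect a design before touching the database, the same loops can be run in memory. The sketch below assumes PHP 7 or later; buildStimulusRows() is a hypothetical stand-in for insertStimulusCodes() that returns the rows instead of inserting them.

```php
<?php
// Hypothetical in-memory counterpart to insertStimulusCodes(): builds the
// rows instead of inserting them, so the design can be inspected.
function buildStimulusRows(array $factors, int $num_reps): array {
    list($factor1, $factor2) = array_keys($factors);
    $rows = array();
    for ($rep = 0; $rep < $num_reps; $rep++) {
        foreach ($factors[$factor1] as $level1) {
            foreach ($factors[$factor2] as $level2) {
                $rows[] = array(
                    $factor1 => $level1,
                    $factor2 => $level2,
                    "rand"   => mt_rand() / mt_getrandmax(), // unit random value
                );
            }
        }
    }
    return $rows;
}

$factors = array(
    "image" => array("person", "product"),
    "text"  => array("short", "long"),
);
$rows = buildStimulusRows($factors, 5);
echo count($rows), " rows\n"; // 20 rows: 5 replications x 2 x 2 levels

// Sorting by rand gives the order in which codes would be handed out.
usort($rows, function ($a, $b) {
    return $a["rand"] <=> $b["rand"];
});
```

Every factor-level combination appears exactly once per replication, which is what keeps the design balanced; the rand column only affects the order in which the rows are later assigned.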


Assigning stimulus codes

The end result of invoking the generate_2d.php script is that you have all your stimulus codes set up in a proper factorial design format. You also have a random number between 0 and 1 associated with each stimulus code. This number is used to randomly select a stimulus code for a new visitor to your site. The getRandomStimulusCode() method is responsible for the random assignment of stimulus codes to new visitors, as shown in Listing 6.


Listing 6. Source of getRandomStimulusCode() method
<?php

require_once "StimulusResponse.php";

class StimulusResponse2D extends StimulusResponse {
 
  function getRandomStimulusCode($vis_id) {
    global $db;
    list($factor1, $factor2) = array_keys($this->factors);
    $sql  = " SELECT id, $factor1, $factor2 FROM $this->table ";
    $sql .= " WHERE used='n' ORDER BY rand LIMIT 1 ";
    $result = $db->query($sql);
    if (DB::isError($result)) {
      die($result->getMessage());
    }
    if ($result->numRows()) {
      $row = $result->fetchRow();
      $this->code[$factor1] = $row[$factor1];
      $this->code[$factor2] = $row[$factor2];
      $id  = $row["id"];
      $sql  = " UPDATE $this->table ";
      $sql .= " SET used='y', vis_id='$vis_id', views=1, timer_start=now() ";
      $sql .= " WHERE id='$id' ";
      $result = $db->query($sql);
      if (DB::isError($result)) {
        die($result->getMessage());
      } else {
        return true;
      }
    } else {
      return false;
    }
  }

}

?>
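
The selection rule in Listing 6 (take the unused row with the smallest rand value, then mark it as used) can be mimicked in memory. The following sketch assumes PHP 7 or later; assignStimulusCode() is a hypothetical helper, and the rows are taken from the Table 1 excerpt.

```php
<?php
// In-memory stand-in for the WebOffer table (values from Table 1).
$table = array(
    array("id" => 1, "image" => "person",  "text" => "short", "rand" => 0.9976, "used" => "n", "vis_id" => null),
    array("id" => 4, "image" => "product", "text" => "long",  "rand" => 0.9838, "used" => "n", "vis_id" => null),
    array("id" => 5, "image" => "person",  "text" => "short", "rand" => 0.8058, "used" => "n", "vis_id" => null),
    array("id" => 3, "image" => "product", "text" => "short", "rand" => 0.0670, "used" => "y", "vis_id" => "f8eac462"),
);

// Hypothetical helper: mirrors "WHERE used='n' ORDER BY rand LIMIT 1"
// followed by the UPDATE that marks the row as used.
function assignStimulusCode(array &$table, string $vis_id) {
    $pick = null;
    foreach ($table as $i => $row) {
        if ($row["used"] === "n" && ($pick === null || $row["rand"] < $table[$pick]["rand"])) {
            $pick = $i;
        }
    }
    if ($pick === null) {
        return false; // every replicate has been handed out
    }
    $table[$pick]["used"] = "y";
    $table[$pick]["vis_id"] = $vis_id;
    return array("image" => $table[$pick]["image"], "text" => $table[$pick]["text"]);
}

$code = assignStimulusCode($table, "a1b2c3d4");
echo $code["image"], "-", $code["text"], "\n"; // person-short (row 5, rand 0.8058)
```

Note that in the database version the SELECT and the UPDATE are two separate statements. If two new visitors arrive at nearly the same moment, both could in principle be handed the same row, so on a busy site you might want some form of locking around the pair.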

Listing 7 displays an excerpt of the demo code that shows the context in which the getRandomStimulusCode() method is called.


Listing 7. Context in which getRandomStimulusCode() method is called
<?php

// Log the response, if any, in the joined column.
$response_field = "joined";
$response = isset($_GET['response']) ? $_GET['response'] : false;
$sr->logResponse($_COOKIE['vis_id'], $response_field, $response);

// If stimulus code not already generated for visitor then get one.
if (!$sr->generated) {
  $sr->getRandomStimulusCode($_COOKIE['vis_id']);
}

?>

The logResponse() method does a lookup to see if a stimulus code has already been assigned to the visitor. If so, the generated flag is set to true and the need to generate a random stimulus code is bypassed. The logResponse() method also takes care of retrieving the previously assigned stimulus codes that are used to determine which Web offer version the site visitor is shown.


Logging responses

The logResponse() method logs the number of offer views and whether an offer response occurred. It also logs when the visitor first viewed the offer and when the visitor responded to the offer.

In addition to these logging actions, this method also updates the StimulusResponse2D object with information about:

  • Whether a stimulus code already exists for the visitor (through the generated flag)
  • Whether the visitor responded to the offer (through the success flag)
  • What stimulus codes to use (through the code array)

Many of these updating actions are performed in the response-logging method because a table lookup always needs to occur in order to properly log a response. And you can satisfy two needs by using the retrieved visitor information to set variables needed later in your script.


Listing 8. Source code for logResponse method
<?php

require_once "StimulusResponse.php";

class StimulusResponse2D extends StimulusResponse {

 function logResponse($vis_id, $resp_col, $response=false) {
  global $db;
  list($factor1, $factor2) = array_keys($this->factors);
  $sql  = " SELECT id, $factor1, $factor2, $resp_col ";
  $sql .= " FROM $this->table ";
  $sql .= " WHERE vis_id='$vis_id' LIMIT 1 ";
  $result = $db->query($sql);
  if (DB::isError($result)) {
    die($result->getMessage());
  } 
  if ($result->numRows()) {
   $row = $result->fetchRow();
   $this->code[$factor1] = $row[$factor1];
   $this->code[$factor2] = $row[$factor2];
   // See if a response exists
   if (strlen($row[$resp_col]) >= 1) {
     $this->success = true;
   } else {
    // If response does not already exist, then see if the response
    // string has been set to something other than false.  If so, then
    // update the stimulus table with the response value.
    if ($response != false) {
     $sql  = " UPDATE $this->table ";
     $sql .= " SET $resp_col='$response', timer_stop=now() ";
     $sql .= " WHERE vis_id='$vis_id' ";
     $result = $db->query($sql);
     if (DB::isError($result)) {
       die($result->getMessage());
     } else {
       $this->success = true;
     }
    } else {
     $sql  = " UPDATE $this->table SET views=views+1 WHERE vis_id='$vis_id' ";
     $result = $db->query($sql);
     if (DB::isError($result)) {
       die($result->getMessage());
     }
     $this->success = false;
    }
   }
   $this->generated = true;
  } else {
   $this->generated = false;
  }
 }

}

?>
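
The branching in Listing 8 is easier to follow without the SQL. The sketch below replays the same three cases on a single in-memory record. It assumes PHP 7.1 or later; logResponseSketch() is a hypothetical function, and its test for "a response was supplied" is slightly stricter than the loose comparison used in the listing.

```php
<?php
// Hypothetical in-memory version of the branching in logResponse().
// $row is one WebOffer record, or null when the visitor has no record yet.
function logResponseSketch(?array $row, $response = false): array {
    $state = array("generated" => false, "success" => false, "row" => $row);
    if ($row === null) {
        return $state;                       // new visitor: caller assigns a code
    }
    $state["generated"] = true;
    if ($row["joined"] !== null) {
        $state["success"] = true;            // a response is already on file
    } elseif ($response !== false && $response !== null) {
        $row["joined"] = $response;          // first response: record it
        $row["timer_stop"] = date("Y-m-d H:i:s");
        $state["success"] = true;
    } else {
        $row["views"]++;                     // no response yet: count the view
    }
    $state["row"] = $row;
    return $state;
}

$record = array("joined" => null, "views" => 4, "timer_stop" => null);

$viewed = logResponseSketch($record);        // page view without a response
echo $viewed["row"]["views"], "\n";          // 5

$joined = logResponseSketch($record, "y");   // visitor submits the form
echo $joined["row"]["joined"], "\n";         // y
```

Because the view counter is only incremented in the final branch, counting stops as soon as a response has been recorded, which matches the description of the views column in Table 2.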


Understanding the demo

In the examples directory, you'll find a demo_2d.php script. To get this demo to work, set up the demo database and modify the config.php file to appropriate values. After you've done this, point your browser at the demo_2d.php script and you will see something similar to Figure 1:


Figure 1. Two-factor Web offer demo

Because $this->debug is, by default, set to true in the StimulusResponse.php class, you can see some of the experiment-related SQL statements issued prior to the generation of the page. As you click between the home, about, products, and profile links, you can see which SQL statements are used to obtain the stimulus codes and to track the number of exposures to the offer.

For your amusement, the content area of the page consists of the output of a UNIX Fortune call.

A surrogate for the actual ad banner is presented in the right-hand column in the form of a reference to the relevant image. When you test the demo, notice that the ad banner variant does not change for you unless you clear your cookies and happen to be assigned a different stimulus code.
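
The demo derives the image reference from the two stimulus-code values. A minimal sketch of that mapping follows, assuming the image_text.gif file-naming pattern used in the demo source and PHP 7 or later; the offerImageTag() helper and the alt text are hypothetical.

```php
<?php
// Hypothetical helper mirroring how the demo turns a stimulus code into
// an image reference, for example person_long.gif.
function offerImageTag(array $code): string {
    $file = $code["image"] . "_" . $code["text"] . ".gif";
    // Escape the file name in case a code value ever contains markup.
    return "<img src='" . htmlspecialchars($file, ENT_QUOTES) . "' alt='Web offer'>";
}

echo offerImageTag(array("image" => "person", "text" => "long")), "\n";
// <img src='person_long.gif' alt='Web offer'>
```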

You can also click on a link to add yourself to a mailing list. If you proceed with the demo by clicking this link, you can submit your e-mail address using a dummy form that only records that the form was submitted. When you submit the form, the joined column is set to "y" in the WebOffer record for the stimulus code replicate you were assigned. The response timer is also stopped by recording when your response occurred.


Demo source code

To see how the demo works, examine the demo_2d.php source code in Listing 9.


Listing 9. Source code of demo_2d.php script
<?php
/**
* Script demonstrating how to integrate the VLOG package
* with the SR package to log visitor state transitions,
* visitor accesses, and visitor responses to Web offers.
*/
require_once "config.php";

// 1. Instantiate VisitorLog class and start timer.
require_once PHPMATH . "/VLOG/VisitorLog.php";
$vlog = new VisitorLog;
$vlog->setStartTime();

// 2. Instantiate TransitionMatrix class before calling start method.
require_once PHPMATH . "/VLOG/TransitionMatrix.php";
$tm = new TransitionMatrix();

// 3. Start method sets an identity cookie (such as, $_COOKIE['vis_id']) and
// updates the TransitionMatrix object. $SITE_STATES is defined in the
// config.php file and corresponds to the name of each page in the demo site.
$vlog->start($SITE_STATES);

// 4. Instantiate StimulusResponse2D class for two factor experiments.
require_once PHPMATH . "/SR/StimulusResponse2D.php";
$sr = new StimulusResponse2D;

// 5. Set the stimulus-response table you will use.
$sr->setTable("WebOffer");

// 6. Set stimulus factors and levels.
$factors["image"] = array("person", "product");
$factors["text"]  = array("short", "long");
$sr->setFactors($factors);

// 7. Log any responses to the $response_field.
$response_field = "joined";
$response = isset($_GET['response']) ? $_GET['response'] : false;
$sr->logResponse($_COOKIE['vis_id'], $response_field, $response);

// 8. If stimulus code is not already generated for visitor then get one.
if (!$sr->generated) {
  $sr->getRandomStimulusCode($_COOKIE['vis_id']);
}

// 9. Use stimulus codes to control Web offer presentation.
$demo_offer = "<img src='".$sr->code["image"]."_". $sr->code["text"].".gif'>";

// 10. Give the demo site a title, then load it.
$demo_title = "Two Factor Web Offer Demo";
require_once "demo_site.php";

// 11. Log visitor accesses using the end method.
$vlog->end();
?>

The value of a visitor tracking cookie (such as, $_COOKIE['vis_id']) is set in the $vlog->start() method and passed into the $sr->logResponse() method. The last part of the script sets the $demo_title and $demo_offer values that are used in the demo_site.php script that is included. The last line of the script ($vlog->end()) indicates where many visitor-logging details are recorded.
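
One caution: the cookie value and the response value both come from the browser, and the SR methods shown in this article place them directly into SQL strings. Before using this code on a public site, you would probably want to check both values first. The sketch below shows one conservative way to do that, assuming PHP 7 or later; the cleanVisitorId() and cleanResponse() functions are hypothetical, and the assumption that visitor IDs are lowercase hexadecimal strings is based only on the sample IDs in Table 1.

```php
<?php
// Hypothetical validators; the SR package as shown does not include them.
function cleanVisitorId($value) {
    // Assumes IDs are lowercase hex strings that fit the varchar(32) column.
    if (is_string($value) && preg_match('/^[0-9a-f]{1,32}$/', $value)) {
        return $value;
    }
    return false;
}

function cleanResponse($value) {
    // The joined column is char(1); this sketch accepts only "y".
    return ($value === "y") ? "y" : false;
}

$vis_id   = cleanVisitorId($_COOKIE['vis_id'] ?? null);
$response = cleanResponse($_GET['response'] ?? null);

// Only call the SR methods when $vis_id is not false.
```

Whitelisting like this is deliberately strict. If your visitor IDs use a different format, adjust the pattern rather than dropping the check.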


In conclusion

The end result of your efforts to track Web offer effectiveness is an accumulation of categorical response data (joined='y') that you might use to draw conclusions about what Web offer is the most effective and why.
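
As a small illustration of what that accumulated data looks like once it is grouped by stimulus code, the following sketch tallies the four assigned rows from the Table 1 excerpt. It assumes PHP 7 or later, and tallyResponses() is a hypothetical helper. With so few rows the counts mean nothing statistically; the point is only the shape of the data.

```php
<?php
// Assigned rows from the Table 1 excerpt: image, text, joined.
$rows = array(
    array("image" => "person",  "text" => "long",  "joined" => "y"),
    array("image" => "product", "text" => "short", "joined" => null),
    array("image" => "person",  "text" => "long",  "joined" => "y"),
    array("image" => "product", "text" => "short", "joined" => null),
);

// Hypothetical helper: counts assigned visitors and joins per cell.
function tallyResponses(array $rows): array {
    $cells = array();
    foreach ($rows as $row) {
        $key = $row["image"] . "-" . $row["text"];
        if (!isset($cells[$key])) {
            $cells[$key] = array("assigned" => 0, "joined" => 0);
        }
        $cells[$key]["assigned"]++;
        if ($row["joined"] === "y") {
            $cells[$key]["joined"]++;
        }
    }
    return $cells;
}

foreach (tallyResponses($rows) as $key => $cell) {
    echo $key, ": ", $cell["joined"], "/", $cell["assigned"], " joined\n";
}
// person-long: 2/2 joined
// product-short: 0/2 joined
```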

Factorial experiments supply high-quality measurement data that you can use to improve your Web sites. To properly analyze the categorical response data arising from this Web offer effectiveness study, you will need to become familiar with categorical data analysis techniques.

In the next article of this series, I introduce you to a popular categorical data analysis technique. You can use the technique to determine whether the IMAGE and TEXT factors you manipulated had any effect on the main categorical response variable (encouraging a site visitor to join your mailing list or not).

Note: I want to thank Dr. Tessema Astatkie for discussions on factorial design and Cyril Meagher for discussions on quality control.


Resources
