Simple linear regression with PHP, Part 2

A data-exploration tool to address output and probability function shortcomings

Paul Meagher (paul@datavore.com), CEO, Datavore Productions
Paul Meagher is a freelance Web developer, writer, and data analyst. Paul has a graduate degree in Cognitive Science and has spent the last six years developing Web applications. His current projects and interests center around e-learning, content management, statistical computing, and database technology. Paul resides in Truro, Nova Scotia and can be reached at paul@datavore.com.

Summary:  Part one of this series ended by noting three elements that were lacking in the Simple Linear Regression class. In this article, the author, Paul Meagher, addresses these shortcomings with PHP-based probability functions; demonstrates how to integrate output methods into the SimpleLinearRegression class; and creates graphical output. He then tackles these issues by building a data-exploration tool, designed to plumb the depths of information contained in small- to medium-sized datasets. (In part one, the author demonstrated how to develop and implement the heart of a simple linear regression algorithm package using PHP as the implementation language.)

Date:  29 Apr 2003
Level:  Intermediate

In the first of this two-part series, "Simple linear regression with PHP," I explained why a math library can be useful for PHP. I also demonstrated how to develop and implement the heart of a simple linear regression algorithm using PHP as the implementation language.

The object of this article is to show you how to build a non-trivial data-exploration tool using the SimpleLinearRegression class discussed in Part 1.

Recap: The concept

The basic goal behind simple linear regression modeling is to find the line of best fit through a two-dimensional plane of paired X and Y values (that is, your X and Y measurements). Once you find this line using the least-squared-error criterion, then you can perform various statistical tests to determine how well this line accounts for the observed variance in Y scores.

A linear equation -- y = mx + b -- has two parameters that must be estimated based on the X and Y data provided, which are the slope (m) and y intercept (b). Once you have estimates of these parameters, you can enter your observed values into a linear equation and see what predicted Y values your equation generates.

To estimate the m and b parameters using a least-squared-error criterion, you want to find estimates of m and b that minimize the difference between your observed and predicted values for all values of X. The difference between an observed and a predicted value is called the error, or residual (yi - (mxi + b)). If you square each residual and sum the squares, the result is a number called the squared error of prediction. Using a least-squared-error criterion to determine the line of best fit involves finding estimates of m and b that minimize the squared error of prediction.
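As a minimal sketch of this definition (the function name and the sample data are made up for illustration and are not taken from the SimpleLinearRegression class), the squared error of prediction for any candidate line can be computed like this:

```php
<?php
// Hypothetical helper: sum of squared prediction errors for a
// candidate line y = m*x + b.
function squaredErrorOfPrediction($X, $Y, $m, $b)
{
    $sse = 0.0;
    foreach ($X as $i => $x) {
        $error = $Y[$i] - ($m * $x + $b);  // observed minus predicted
        $sse  += $error * $error;
    }
    return $sse;
}

// Illustrative data only (not from either study in this series)
$X = [1, 2, 3, 4, 5];
$Y = [2, 4, 5, 4, 5];

echo squaredErrorOfPrediction($X, $Y, 0.6, 2.2), "\n"; // least-squares line
echo squaredErrorOfPrediction($X, $Y, 1.0, 1.0), "\n"; // a worse candidate
```

The first call uses the least-squares estimates for this data, so no other choice of m and b produces a smaller result.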

The estimators, m and b, that satisfy the least-squared-error criterion can be found in two basic ways. First, you can use a numerical search procedure to propose and evaluate different values of m and b, ultimately settling on the estimates that produce the least squared error. The second approach is to use calculus to find the equations for estimating m and b. I will not go into the calculus involved to derive these equations, but I do use these analytic equations in the SimpleLinearRegression class to find the least-squared estimates of m and b (see the getSlope() and getYIntercept() methods in the SimpleLinearRegression class).
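The analytic equations are the standard textbook ones: the slope is the sum of cross-products of the X and Y deviations divided by the sum of squared X deviations, and the intercept is the mean of Y minus the slope times the mean of X. A minimal sketch follows; the function name is hypothetical and the code is not copied from getSlope() or getYIntercept().

```php
<?php
// Hypothetical helper: closed-form least-squares estimates for
// y = m*x + b. Returns [slope, intercept].
function leastSquaresFit($X, $Y)
{
    $n     = count($X);
    $meanX = array_sum($X) / $n;
    $meanY = array_sum($Y) / $n;

    $sxy = 0.0; // sum of cross-products of deviations
    $sxx = 0.0; // sum of squared X deviations
    foreach ($X as $i => $x) {
        $sxy += ($x - $meanX) * ($Y[$i] - $meanY);
        $sxx += ($x - $meanX) * ($x - $meanX);
    }

    $m = $sxy / $sxx;            // slope
    $b = $meanY - $m * $meanX;   // y intercept
    return [$m, $b];
}

// Illustrative data only
list($m, $b) = leastSquaresFit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]);
printf("y = %.2f + %.2fx\n", $b, $m);
```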

Even though you have equations that can be used to find the least squared estimates of m and b, it does not follow that plugging these parameters into the linear equation yields a line that provides a good fit to the data. The next step in the simple linear regression procedure is to determine if the remaining squared error of prediction is acceptable or not.

You can use a statistical decision procedure to decide whether to reject the alternative hypothesis that a straight line fits the data. This procedure is based upon computing a T statistic and using a probability function to find the probability of observing a value that large by chance. As mentioned in Part 1, the SimpleLinearRegression class generates a fairly large number of summary values and one important summary value is a T statistic that can be used to measure how well a linear equation fits the data. The T statistic tends toward a large value if the fit is good; if the T value is small, your linear equation should be replaced by a default model that assumes the mean of the Y values is the best predictor (because the mean of a set of values is often a useful predictor of the next observed value).

To test whether the T statistic is large enough that you can reject the mean of the Y values as the best predictor, you need to compute the probability of obtaining the T statistic by chance. If that probability is low, then you can reject the null hypothesis that the mean is the best predictor and, correspondingly, gain confidence that a simple linear model offers a good fit for the data. (For more on computing the probability of the T statistic, see Part 1.)
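For reference, the usual textbook T statistic for the slope is the slope estimate divided by its standard error, where the standard error is the square root of the mean squared error (squared error of prediction divided by n - 2) over the sum of squared X deviations. The sketch below assumes that formula; the function name is hypothetical and the Part 1 class may organize the computation differently.

```php
<?php
// Hypothetical helper: T statistic for the slope, t = m / SE(m),
// with SE(m) = sqrt(MSE / Sxx) and MSE = SSE / (n - 2).
function slopeTStatistic($X, $Y)
{
    $n     = count($X);
    $meanX = array_sum($X) / $n;
    $meanY = array_sum($Y) / $n;

    $sxx = 0.0;
    $sxy = 0.0;
    foreach ($X as $i => $x) {
        $sxx += ($x - $meanX) * ($x - $meanX);
        $sxy += ($x - $meanX) * ($Y[$i] - $meanY);
    }
    $m = $sxy / $sxx;
    $b = $meanY - $m * $meanX;

    // Squared error of prediction around the fitted line
    $sse = 0.0;
    foreach ($X as $i => $x) {
        $error = $Y[$i] - ($m * $x + $b);
        $sse  += $error * $error;
    }

    $mse           = $sse / ($n - 2);   // n - 2 degrees of freedom
    $standardError = sqrt($mse / $sxx);
    return $m / $standardError;
}

// Illustrative data only
printf("T = %.4f\n", slopeTStatistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]));
```

The resulting T value is then looked up in a Student T distribution with n - 2 degrees of freedom to obtain the probability discussed above.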

Back to the statistical decision procedure. It tells you when to reject the null hypothesis, but it does not tell you whether to accept the alternative hypothesis. In a research context, the alternative hypothesis of a linear model needs to be established by theoretical and statistical arguments.

The data-exploration tool you are building implements the statistical decision procedure for a linear model (the T test) and provides summary data that can be used to construct the theoretical and statistical arguments necessary to establish a linear model. The data-exploration tool could be classified as a decision-support tool for knowledge workers exploring patterns in small- to medium-sized datasets.

From a learning point of view, simple linear regression modeling is worth studying because it is the gateway to understanding more advanced forms of statistical modeling. Many of the core concepts from simple linear regression, for example, establish a good foundation for understanding Multiple Regression, Factor Analysis, Time Series, and so on.

Simple linear regression is also a versatile modeling technique. It can be used to model curvilinear data by transforming the raw data, typically with logarithmic or power transformations. These transformations can linearize the data so that simple linear regression can be used to model the data. The resulting linear model would be expressed as a linear formula relating the transformed values.
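To illustrate, here is a sketch that fits a power curve y = a * x^k by taking logarithms of both variables, which turns the relation into the straight line ln(y) = ln(a) + k * ln(x). The function names and the data are invented for the example.

```php
<?php
// Hypothetical helper: least-squares line, returns [slope, intercept].
function fitLine($X, $Y)
{
    $n     = count($X);
    $meanX = array_sum($X) / $n;
    $meanY = array_sum($Y) / $n;
    $sxy   = 0.0;
    $sxx   = 0.0;
    foreach ($X as $i => $x) {
        $sxy += ($x - $meanX) * ($Y[$i] - $meanY);
        $sxx += ($x - $meanX) * ($x - $meanX);
    }
    $m = $sxy / $sxx;
    return [$m, $meanY - $m * $meanX];
}

// Hypothetical helper: fit y = a * x^k on log-transformed data.
// Returns [a, k]. All X and Y values must be positive.
function fitPowerCurve($X, $Y)
{
    $logX = array_map('log', $X);
    $logY = array_map('log', $Y);
    list($k, $lnA) = fitLine($logX, $logY);
    return [exp($lnA), $k];   // un-transform the intercept
}

// Made-up data that follows y = 3 * x^2 exactly
list($a, $k) = fitPowerCurve([1, 2, 4, 8], [3, 12, 48, 192]);
printf("y = %.2f * x^%.2f\n", $a, $k);
```

Note that the fitted formula relates the transformed values, so the intercept has to be un-transformed with exp() before it can be read on the original scale.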


Probability functions

In the previous article, I side-stepped the issue of probability functions in PHP by shelling out to R for a probability value. I was not entirely happy with this solution, so I began exploring the issue of what it would take to develop PHP-based probability functions.

I began trolling the Web for information and code. One source of both was the probability functions in the book Numerical Recipes in C. I re-implemented some probability function code in PHP (the gammln.c and betai.c functions), but, again, was not happy with the results. Compared with some of the other implementations, it seemed to be a lot of code. Also, I needed to have inverse probability functions.

Luckily, I stumbled upon John Pezzullo's Interactive Statistical Calculation. John's Web site on Probability Distribution Functions had all the probability functions I needed already implemented in JavaScript for easy learning.

I ported the Student T and Fisher F functions to PHP. I changed the API a bit to conform to a Java naming style and embedded all the functions in a class called Distribution. One elegant feature of this implementation is the doCommonMath method that is reused by all the functions in this library. The other distributions, which I have not bothered to implement (Normal and ChiSquare), also use the doCommonMath method.

One other aspect of the port is noteworthy. In JavaScript, you can assign a dynamically determined value to a variable where it is declared, such as:

var PiD2 = Math.PI / 2 

You cannot do this in a PHP class declaration. The only initial values you can give an instance variable there are simple constants; a computed value has to be assigned inside a method such as the constructor. Hopefully this limitation will be addressed in PHP5.

Notice that the code in Listing 1 has no defined instance variables -- this is because in the JavaScript version, they were dynamically assigned values.
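If you did want such a value stored on the object, one workaround is to assign it in the constructor. The sketch below uses a made-up class name and the __construct() syntax, so it assumes PHP 5 or later rather than the PHP 4 style used in the listings.

```php
<?php
// Hypothetical class, for illustration only (assumes PHP 5 or later).
class DistributionConstantsSketch
{
    // A declaration cannot call a function, so a line such as
    //   public $PiD2 = pi() / 2;
    // is rejected by PHP.
    public $PiD2;

    function __construct()
    {
        // Computed values are assigned when the object is created.
        $this->PiD2 = pi() / 2;
    }
}

$constants = new DistributionConstantsSketch();
echo $constants->PiD2, "\n";
```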


Listing 1. Implementing probability functions
<?php 

 // Distribution.php 

 // Copyright John Pezzullo 
 // Released under same terms as PHP. 
 // PHP Port and OO'fying by Paul Meagher 

 class Distribution { 

  // Series summation shared by the probability functions below 
  function doCommonMath($q, $i, $j, $b) { 
       
   $zz = 1;  
   $z  = $zz;  
   $k  = $i;  
       
       
   while($k <= $j) {  
        $zz = $zz * $q * $k / ($k - $b);  
        $z  = $z + $zz;  
        $k  = $k + 2;  
   } 
   return $z; 
  } 
       
  // Two-tailed probability of a T value at least as extreme as $t 
  // with $df degrees of freedom 
  function getStudentT($t, $df) {   

   $t  = abs($t);  
   $w  = $t  / sqrt($df);  
   $th = atan($w); 
       
   if ($df == 1) {  
    return 1 - $th / (pi() / 2);  
   } 
     
   $sth = sin($th);  
   $cth = cos($th); 
     
   if( ($df % 2) ==1 ) {  
    return 
      1 - ($th + $sth * $cth * $this->doCommonMath($cth * $cth, 2, $df - 3, -1)) 
                         / (pi()/2); 
   } else { 
    return 1 - $sth * $this->doCommonMath($cth * $cth, 1, $df - 3, -1);  
   } 
     
  } 
     
  // T value whose two-tailed probability is $p, found by bisection 
  function getInverseStudentT($p, $df) {  
       
   $v =  0.5;  
   $dv = 0.5;  
   $t  = 0; 
       
   while($dv > 1e-6) {  
    $t = (1 / $v) - 1;  
    $dv = $dv / 2;  
    if ( $this->getStudentT($t, $df) > $p) {  
     $v = $v - $dv; 
    } else {  
     $v = $v + $dv; 
    }  
   } 
   return $t; 
  } 
     

  function getFisherF($f, $n1, $n2) { 
   // implemented but not shown     
  } 

  function getInverseFisherF($p, $n1, $n2) {  
   // implemented but not shown     
  } 

 } 
 ?> 
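The search in getInverseStudentT() is a bisection: $v moves within (0, 1) and is mapped to a T value in (0, infinity) by 1/v - 1. As a sketch of the same technique, the code below applies it to the one case that is easy to check by hand, df = 1, where the tail probability has the closed form used in getStudentT() and the exact inverse is tan((1 - p) * pi / 2). The function names are invented for this example.

```php
<?php
// Two-tailed tail probability of Student's T for df = 1; the same
// closed form that getStudentT() returns when $df == 1.
function tailProbabilityDf1($t)
{
    return 1 - atan(abs($t)) / (pi() / 2);
}

// Hypothetical helper: bisection in the style of getInverseStudentT().
// $tailProbability is any function that decreases as T grows.
function invertTailProbability($tailProbability, $p)
{
    $v  = 0.5;
    $dv = 0.5;
    $t  = 0.0;
    while ($dv > 1e-6) {
        $t  = (1 / $v) - 1;
        $dv = $dv / 2;
        if ($tailProbability($t) > $p) {
            $v = $v - $dv;   // probability too high: try a larger T
        } else {
            $v = $v + $dv;   // probability too low: try a smaller T
        }
    }
    return $t;
}

$p = 0.05;
printf("search: %.3f\n", invertTailProbability('tailProbabilityDf1', $p));
printf("exact:  %.3f\n", tan((1 - $p) * pi() / 2));
```

Because the step in $v stops at about one millionth, the T value returned for small probabilities is only accurate to a few decimal places, which is usually enough for a significance test.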


Output methods

Now that you have implemented the probability functions in PHP, the only remaining hurdle in developing a PHP-based data-exploration tool is to devise methods for displaying the results of the analysis.

The simple solution is to dump the values of all the instance variables to the screen as needed. I did this in the first article when I displayed the linear equation, T value, and T probability for the Burnout Study. It is useful to be able to access a particular value for particular purposes and the SimpleLinearRegression class supports this type of usage.

Another way to output the results, however, is to systematically group parts of the output. If you study the output of the leading statistical packages for regression analysis, you will notice that they tend to group the output in the same manner. They tend to have a Summary Table, an Analysis Of Variance table, a Parameter Estimates table, and R Values. Similarly, I have created output methods called:

  • showTableSummary()
  • showAnalysisOfVariance()
  • showParameterEstimates()
  • showRValues()

I also have a method for showing the linear prediction formula (showFormula()). Many statistical packages do not output the formula, expecting the user to construct the formula based on output from the above methods. This is partly because the final form of the formula you ultimately use to model the data may be different from this default formula because:

  • The Y-intercept has no meaningful interpretation, or
  • The input values may be transformed and you probably need to un-transform them for final interpretation.

All of these methods assume that the output medium is a Web page. To anticipate the possibility that you would want to output these summary values using a medium other than a Web page, I decided to wrap these output methods inside a class that extends the SimpleLinearRegression class. The code in Listing 2 is meant to demonstrate the general logic of the output class. Code implementing the various show methods is removed to make the general logic more apparent.


Listing 2. Demonstrating the general logic of the output class
<?php 

  // HTML.php 

  // Copyright 2003, Paul Meagher 
  // Distributed under GPL   

  include_once "slr/SimpleLinearRegression.php"; 

  class SimpleLinearRegressionHTML extends SimpleLinearRegression { 

    function SimpleLinearRegressionHTML($X, $Y, $conf_int) { 
      SimpleLinearRegression::SimpleLinearRegression($X, $Y, $conf_int); 
    } 

    function showTableSummary($x_name, $y_name) { } 
       
    function showAnalysisOfVariance() { } 

    function showParameterEstimates() { } 

    function showFormula($x_name, $y_name) { } 

    function showRValues() {} 
  } 

  ?> 

The constructor of this class is simply a wrapper for the constructor of the SimpleLinearRegression class. This means that when you want to display HTML output from a SimpleLinearRegression analysis, you should instantiate the SimpleLinearRegressionHTML class in lieu of instantiating the SimpleLinearRegression class directly. The benefit is that you do not bloat the SimpleLinearRegression class with unused methods and you have more freedom to define classes for other output media (perhaps implementing the same API for different media types).
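To make the idea of a second output medium concrete, here is a sketch of a plain-text counterpart. It is self-contained, so it extends a small stand-in class rather than the real SimpleLinearRegression class; the class names, the property names, and the numbers are all assumptions made for the example, and it uses PHP 5 constructor syntax.

```php
<?php
// Hypothetical stand-in for the Part 1 class: only the two values
// needed here are kept, and the property names are assumptions.
class RegressionStub
{
    public $Slope;
    public $YInt;

    function __construct($slope, $yInt)
    {
        $this->Slope = $slope;
        $this->YInt  = $yInt;
    }
}

// Plain-text counterpart of an HTML output class.
class RegressionStubText extends RegressionStub
{
    function formatFormula($x_name, $y_name)
    {
        return sprintf("%s = %.2f + (%.2f * %s)",
                       $y_name, $this->YInt, $this->Slope, $x_name);
    }

    function showFormula($x_name, $y_name)
    {
        echo $this->formatFormula($x_name, $y_name), "\n";
    }
}

// Made-up slope and intercept, for illustration only
$report = new RegressionStubText(0.6, 2.2);
$report->showFormula("Distance", "Damage");
```

Keeping the method names the same as in the HTML class means a calling script only has to change which class it instantiates.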


Graphical output

The output methods you have implemented so far display the summary values in HTML format. It would also be desirable to display scatter plots and line plots of this data in GIF, JPEG, or PNG format.

Rather than writing the code for generating line and scatter plots myself, I thought it would be best to use the PHP-based graphics library called JpGraph. JpGraph is under active development by Johan Persson and is described as follows on the project Web site:

JpGraph makes it easy to draw both "quick and dirty" graphs with a minimum of code and complex professional graphs, which requires a very fine grain control. JpGraph is equally well suited for both scientific and business types of graphs.

The JpGraph distribution contains a large number of example scripts that can be customized for your particular needs. Using JpGraph for the data-exploration tool was a simple matter of finding an example script that did something similar to what I wanted and adapting it to my particular requirements.

The script in Listing 3 is extracted from the sample data-exploration tool (explore.php) and demonstrates how the library is invoked and how the data from the SimpleLinearRegression analysis is fed into Line and Scatter classes. The comments in this code are by Johan Persson (the JpGraph codebase is well documented).


Listing 3. Detailed functions from the sample data-exploration tool explore.php
<?php 

  // Snippet extracted from explore.php script 

  include ("jpgraph/jpgraph.php"); 
  include ("jpgraph/jpgraph_scatter.php"); 
  include ("jpgraph/jpgraph_line.php"); 

  // Create the graph 
  $graph = new Graph(300,200,'auto'); 
  $graph->SetScale("linlin"); 

  // Setup title   
  $graph->title->Set("$title"); 
  $graph->img->SetMargin(50,20,20,40);    
  $graph->xaxis->SetTitle("$x_name","center"); 
  $graph->yaxis->SetTitleMargin(30);      
  $graph->yaxis->title->Set("$y_name");  

  $graph->title->SetFont(FF_FONT1,FS_BOLD); 

  // make sure that the X-axis is always at the 
  // bottom of the plot and not just at Y=0 which is 
  // the default position   
  $graph->xaxis->SetPos('min'); 

  // Create the scatter plot with some nice colors 
  $sp1 = new ScatterPlot($slr->Y, $slr->X); 
  $sp1->mark->SetType(MARK_FILLEDCIRCLE); 
  $sp1->mark->SetFillColor("red"); 
  $sp1->SetColor("blue"); 
  $sp1->SetWeight(3); 
  $sp1->mark->SetWidth(4); 

  // Create the regression line 
  $lplot = new LinePlot($slr->PredictedY, $slr->X); 
  $lplot->SetWeight(2); 
  $lplot->SetColor('navy'); 

  // Add the plots to the graph 
  $graph->Add($sp1); 
  $graph->Add($lplot); 

  // ... and stroke 
  $graph_name = "temp/test.png"; 
  $graph->Stroke($graph_name); 
  ?> 
  <img src='<?php echo $graph_name ?>' vspace='15'> 


Data-exploration script

The data-exploration tool consists of a single script (explore.php) that calls methods from the SimpleLinearRegressionHTML class and the JpGraph library.

The script uses a simple processing logic. The first part of the script performs basic validation on submitted form data. If the form data validates, then the second part of the script is executed.

The second part of the script contains code for analyzing the data and displaying summary results in HTML and graphics formats. The essential structure of the explore.php script is shown in Listing 4:


Listing 4. The structure of explore.php
<?php 

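  // explore.php 
  // Note: $title, $x_name, $x_values, $y_name, $y_values, and $conf_int 
  // arrive from the submitted form; this listing relies on PHP's 
  // register_globals setting to make them available as variables. 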

  if (!empty($x_values)) { 
    $X    = explode(",", $x_values); 
    $numX = count($X); 
  }   

  if (!empty($y_values)) { 
    $Y    = explode(",", $y_values); 
    $numY = count($Y); 
  }   

  // display data entry form if any variable is not set 
  // or the number of X and Y values differs 

  if ( (empty($title)) OR (empty($x_name)) OR (empty($x_values)) OR  
       (empty($y_name)) OR (empty($conf_int)) OR (empty($y_values)) OR  
       ($numX != $numY) ) {          

    // Omitted code for displaying entry form 
     
  } else { 
     
    include_once "slr/SimpleLinearRegressionHTML.php";                       
    $slr = new SimpleLinearRegressionHTML($X, $Y, $conf_int);    

    echo "<h2>$title</h2>"; 
     
    $slr->showTableSummary($x_name, $y_name); 
    echo "<br><br>"; 
     
    $slr->showAnalysisOfVariance();   
    echo "<br><br>"; 

    $slr->showParameterEstimates($x_name, $y_name);  
    echo "<br>"; 

    $slr->showFormula($x_name, $y_name); 
    echo "<br><br>"; 

    $slr->showRValues($x_name, $y_name); 
    echo "<br>"; 

    include ("jpgraph/jpgraph.php"); 
    include ("jpgraph/jpgraph_scatter.php"); 
    include ("jpgraph/jpgraph_line.php");   
                       
    // The code for displaying the graphics is inline in the 
    // explore.php script.  The code for these two line plots 
    // finishes off the script: 
     
    // Omitted code for displaying scatter plus line plot 
    // Omitted code for displaying residuals plot 
     
  } 

  ?>
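The validation in Listing 4 checks only that the fields are non-empty and that the X and Y counts match. A stricter check might also confirm that every entry is numeric and that there are enough pairs to leave at least one degree of freedom (n - 2 > 0). The following sketch shows one way to do that; the function names are hypothetical and are not part of explore.php.

```php
<?php
// Hypothetical helper: parse a comma-separated string into floats.
// Returns null when any entry is not numeric.
function parseNumberList($csv)
{
    $values = [];
    foreach (explode(",", $csv) as $item) {
        $item = trim($item);
        if (!is_numeric($item)) {
            return null;
        }
        $values[] = (float) $item;
    }
    return $values;
}

// Hypothetical helper: true when both lists are numeric, of equal
// length, and long enough for a regression T test.
function validatePairs($x_values, $y_values)
{
    $X = parseNumberList($x_values);
    $Y = parseNumberList($y_values);
    return $X !== null && $Y !== null
        && count($X) === count($Y)
        && count($X) >= 3;
}

var_dump(validatePairs("1, 2, 3, 4", "2.5, 3.1, 4.8, 5.0"));
var_dump(validatePairs("1, 2, three", "2.5, 3.1, 4.8"));
```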


Fire damage study

To demonstrate how to use the data-exploration tool, I will use data from a hypothetical fire damage study. This study correlates the amount of fire damage in major residential fires to their distance from the nearest fire station. Insurance companies, for example, would be interested in studying this relationship for the purpose of determining premiums.

The data for the study are shown in the input screen in Figure 1.


Figure 1. Input screen shows study data

When the data is submitted, it is analyzed and the results of those analyses are displayed. The first result set to display is the Table summary, shown in Figure 2.


Figure 2. Table summary is first set of results displayed

The table summary displays, in tabular form, the input data along with other columns indicating the predicted Y value for the observed X value, the difference between the predicted and observed Y values, and the lower and upper confidence intervals for the predicted Y value.
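The first two derived columns are straightforward to compute once the slope and intercept are known. The sketch below builds those rows with a made-up function name and made-up data; it takes the difference as observed minus predicted, and it leaves out the confidence limits, which additionally need a critical T value.

```php
<?php
// Hypothetical helper: one row per observation with the predicted Y
// and the residual (observed Y minus predicted Y).
function summaryRows($X, $Y, $m, $b)
{
    $rows = [];
    foreach ($X as $i => $x) {
        $predicted = $m * $x + $b;
        $rows[] = [
            'x'         => $x,
            'y'         => $Y[$i],
            'predicted' => $predicted,
            'residual'  => $Y[$i] - $predicted,
        ];
    }
    return $rows;
}

// Illustrative data and its least-squares estimates (m = 0.6, b = 2.2)
$rows = summaryRows([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], 0.6, 2.2);
foreach ($rows as $row) {
    printf("%6.2f %6.2f %6.2f %6.2f\n",
           $row['x'], $row['y'], $row['predicted'], $row['residual']);
}
```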

Figure 3 shows three higher-level summaries of the data that come after the table summary.


Figure 3. Three high-level data summaries follow table summary

The Analysis of variance table shows how the variance of the Y scores is partitioned into two main sources of variance -- the variance accounted for by the model (see the Model row) and the variance unaccounted for (see the Error row). A large F value tells you that the linear model captures most of the variance in your Y measurements. This table becomes even more useful in Multiple Regression contexts in which each independent variable has a row in the table.
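The partition can be sketched as follows, assuming the standard formulas for simple linear regression: the total sum of squares of Y splits into a model part (1 degree of freedom) and an error part (n - 2 degrees of freedom), and F is the ratio of their mean squares. The function name and data are invented for the example.

```php
<?php
// Hypothetical helper: analysis-of-variance quantities for a simple
// linear regression of Y on X.
function anovaTable($X, $Y)
{
    $n     = count($X);
    $meanX = array_sum($X) / $n;
    $meanY = array_sum($Y) / $n;

    $sxx = 0.0;
    $sxy = 0.0;
    $syy = 0.0;
    foreach ($X as $i => $x) {
        $dx   = $x - $meanX;
        $dy   = $Y[$i] - $meanY;
        $sxx += $dx * $dx;
        $sxy += $dx * $dy;
        $syy += $dy * $dy;
    }

    $slope   = $sxy / $sxx;
    $ssModel = $slope * $sxy;      // variance accounted for by the line
    $ssError = $syy - $ssModel;    // variance left unaccounted for
    $msError = $ssError / ($n - 2);

    return [
        'ss_model' => $ssModel,
        'ss_error' => $ssError,
        'ss_total' => $syy,
        'df_model' => 1,
        'df_error' => $n - 2,
        'f'        => $ssModel / $msError,  // model mean square / error mean square
    ];
}

// Illustrative data only
print_r(anovaTable([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]));
```

With a single predictor, the F value equals the square of the T value for the slope, so the two tests agree.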

The Parameter estimates table shows the estimated Y Intercept and Slope. Each row includes a T value and the probability of observing a T value that extreme (see the Prob > T column). The Prob > T for the Slope can be used to reject a linear model.

If the probability of the T value is less than 0.05, or some similarly low probability, then you can reject the null hypothesis because a value that extreme has a low likelihood of being observed by chance. Otherwise you must retain the null hypothesis.

In the fire damage study, the probability of obtaining a T value of 12.57 by chance is displayed as 0.00000; that is, it is too small to register at five decimal places. A linear model is therefore a useful predictor of Y values (better than the mean of the Y values) for the range of X values observed in the study.

The final report displays correlation coefficients or R values. They can be used to assess how well your linear model fits the data. High R values indicate a good fit.
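As a sketch of how such values are obtained, the Pearson correlation coefficient r is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations, and r squared is the proportion of the Y variance accounted for by the line. The function name and data below are invented for the example.

```php
<?php
// Hypothetical helper: Pearson correlation coefficient between X and Y.
function correlationCoefficient($X, $Y)
{
    $n     = count($X);
    $meanX = array_sum($X) / $n;
    $meanY = array_sum($Y) / $n;

    $sxx = 0.0;
    $sxy = 0.0;
    $syy = 0.0;
    foreach ($X as $i => $x) {
        $dx   = $x - $meanX;
        $dy   = $Y[$i] - $meanY;
        $sxx += $dx * $dx;
        $sxy += $dx * $dy;
        $syy += $dy * $dy;
    }
    return $sxy / sqrt($sxx * $syy);
}

// Illustrative data only
$r = correlationCoefficient([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]);
printf("R = %.4f, R squared = %.4f\n", $r, $r * $r);
```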

Each summary report provides answers to different analytic questions that you might have about the relationship of your linear model to the data. Consult textbooks by Hamilton, Neter, or Pedhazur for more advanced treatment of regression analysis (see Resources).

The final report elements to display are the scatter and line plots of the data, as seen in Figure 4.


Figure 4. Final report elements -- scatter and line plots

Most people are familiar with interpreting line graphs such as the top graphic in this series, so I won't comment except to say that the JpGraph library produces high-quality scientific plots for the Web. It also does the right thing when you feed in your scatter and line data.

The second plot relates residuals (Observed Y minus Predicted Y) to your predicted Y scores. This is an example of a graph used by proponents of Exploratory Data Analysis (EDA) to help maximize the analyst's ability to detect and understand patterns in data. This graph can be used by the trained eye to answer questions about:

  • Potential outliers or overly influential cases
  • Possible curvilinear relation (use Transformation?)
  • Non-normal residual distribution
  • Non-constant error variance or heteroscedasticity

This data-exploration tool could easily be extended to generate more types of graphs -- histograms, box plots, quartile plots -- that are standard EDA tools.


Math library architecture

My math hobby has continued to pique my interest for the last few months. Such explorations have motivated me to think about ways in which my code base might be organized to anticipate future growth.

I've provisionally settled on the directory structure in Listing 5:


Listing 5. A growth-friendly directory structure
phpmath/
    
    burnout_study.php    
    explore.php
    fire_study.php  
    navbar.php   
      
    dist/
      Distribution.php  
      fisher.php
      student.php
      source.php
      
    jpgraph/  
      etc...
    
    slr/ 
      SimpleLinearRegression.php 
      SimpleLinearRegressionHTML.php
      
    temp/

Future work on multiple regression, for example, would involve extending this library to include a matrix directory to house PHP code for performing matrix operations (a requirement for more advanced forms of regression analysis). I would also create an mr directory to house PHP code that implements the input, logic, and output methods for multiple regression analysis.

Note that this directory structure contains a temp directory. Permissions on this directory must be set so that the explore.php script can write output plots to this directory. Keep this in mind when you try to install the phpmath_002.tar.gz source code. Also, read the instructions for installing JpGraph on the JpGraph project Web site (see Resources).
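A script can check the directory before it tries to write a plot, which gives a clearer message than a failed image. This is a minimal sketch with a made-up function name; it is not part of the distributed explore.php.

```php
<?php
// Hypothetical helper: true when $dir exists and the current PHP
// process is allowed to write to it.
function plotDirIsUsable($dir)
{
    return is_dir($dir) && is_writable($dir);
}

$plot_dir = "temp";
if (!plotDirIsUsable($plot_dir)) {
    echo "Cannot write plots to '$plot_dir'; check its permissions.\n";
}
```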

On a final note, it is possible to move all software classes to a location outside the Web root if you:

  • Define a global PHP_MATH constant that holds the path to the non-Web root location, and
  • Make sure you prefix this defined constant to all required or included file paths.

In the future, setting the PHP_MATH constant might be done through a config file for the PHP math library as a whole.
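The two steps might look like the following sketch. The directory shown is a placeholder and the helper function is hypothetical; substitute the location you actually use.

```php
<?php
// Hypothetical location outside the Web root; adjust to your install.
define('PHP_MATH', '/usr/local/lib/phpmath/');

// Hypothetical helper: build the full path to a library file.
function phpMathPath($relative)
{
    return PHP_MATH . ltrim($relative, '/');
}

echo phpMathPath('slr/SimpleLinearRegressionHTML.php'), "\n";

// In a script you would then write, for example:
// include_once phpMathPath('slr/SimpleLinearRegressionHTML.php');
```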


What have you learned?

In this article, I demonstrated how the SimpleLinearRegression class could be used to develop a data-exploration tool for small- to medium-sized datasets. Along the way, I also developed native probability functions for the SimpleLinearRegression class to use and extended the class with HTML output methods and graph-generating code based upon the JpGraph library.

From a learning point of view, simple linear regression modeling is worth further study because it is arguably the gateway to understanding more advanced forms of statistical modeling. Before you plunge into learning more advanced techniques, like multiple regression or MANOVA, you could benefit from having a solid understanding of simple linear regression.

Even though simple linear regression only uses one variable to account for, or predict, the variance in another variable, looking for simple linear relations between all your study variables is often the first step in exploratory data analysis. Just because your data is multivariate does not mean you only have to examine it with multivariate tools. Indeed, using basic tools like simple linear regression initially is a good way to begin probing data for patterns.

This series has studied two applications of simple linear regression analysis. In this article, I looked at an example of a strong linear relationship between "Distance from a Fire Station" and "Fire Damage". In the first article, I looked at a weaker but nevertheless significant linear relationship between a measure called "Social Concentration" and a measure called the "Exhaustion Index". (As an exercise, it might be interesting to re-examine the messier data from the first study with the data-exploration tool discussed in this article. One thing you will note is that the y intercept is a negative number, meaning that when "Social Concentration" is 0, the predicted Exhaustion Index is -29.50. Does this make sense? When modeling a phenomenon you should ask whether your equation should include the optional y intercept and, if so, what role the y intercept would play in your linear equation.)

Further studies into simple linear regression might include research into such topics as:

  • When to omit the intercept term from your equation and alternative computational formulas you can use if you decide to do so
  • When and how to use power, logarithmic, and other transformations to linearize the data so that simple linear regression can be used to model the data
  • Other visualizations that can be used to assess the adequacy of your modeling assumptions and to gain deeper insight into the patterns in your data

These are some of the more advanced topics awaiting the student of simple linear regression. Resources contains pointers to several advanced texts that you can consult for more information on regression analysis.
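As a small taste of the first topic, when the intercept is omitted the model becomes y = mx and the least-squares slope reduces to the sum of the x*y products divided by the sum of the squared x values. The sketch below uses a made-up function name and made-up data; whether dropping the intercept is appropriate is a modeling question that the texts in Resources discuss.

```php
<?php
// Hypothetical helper: least-squares slope for a line forced through
// the origin (y = m*x, no intercept term).
function slopeThroughOrigin($X, $Y)
{
    $sumXY = 0.0;
    $sumXX = 0.0;
    foreach ($X as $i => $x) {
        $sumXY += $x * $Y[$i];
        $sumXX += $x * $x;
    }
    return $sumXY / $sumXX;
}

// Illustrative data only
printf("y = %.2fx\n", slopeThroughOrigin([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]));
```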

Standard PHP installations provide many of the resources necessary to develop non-trivial mathematics-based applications. I hope that this series of articles inspires other developers to implement math routines in PHP, whether for pleasure or for the technical or learning challenges involved.


Resources

  • Get the source code for this article at Datavore Productions.

  • Check out the popular undergraduate textbook, Statistics, 9th ed., by James T. McClave and Terry Sincich (Prentice-Hall, online) that was consulted for the algorithm steps and the "Burnout Study" example in this article.

  • Look at the PEAR repository's low-level PHP math classes. Eventually, it would be nice to see PEAR contain packages that implement standard higher-level numerical methods, such as SimpleLinearRegression, MultipleRegression, TimeSeries, ANOVA, FactorAnalysis, or FourierAnalysis.

  • Visit the Numerical Python Project which extends Python with a full scientific array language complete with sophisticated indexing. Mathematical operations with this extension are close to what one would expect from a compiled language.

  • Explore several math resources available for Perl, including an index of CPAN Math modules and the modules in the Algorithm section at CPAN, as well as the Perl Data Language, designed to deliver to Perl the ability to compactly store and speedily manipulate large N-dimensional data arrays.

  • For more on John Chambers' S programming language, check out these links to his publications and various research projects at Bell Labs.

  • R is a language and environment for statistical computing and graphics, similar to the award-winning S system, that provides such statistical and graphical techniques as linear and nonlinear modeling, statistical tests, time series analysis, classification, and clustering. Learn about R at the R Project homepage.

  • Discover a catalog of code-optimization techniques in Steven Gould's IBM tutorial "Writing Efficient PHP" (developerWorks, July 2002).

  • Read this developerWorks roundup of math library articles:

  • To learn more about PHP, read the IBM tutorial "Creating dynamic Web sites with PHP and MySQL" (developerWorks, May 2001).

  • Visit John Pezzullo's excellent site dedicated to Web pages that perform statistical calculations. The PHP-based probability functions were based upon code found on John's probability functions page.

  • Learn more about the M. Abramowitz and I.A. Stegun book, The Handbook of Mathematical Functions (also known as the AMS55), at the Digital Library of Mathematical Functions.

  • Check out the JpGraph site for a wealth of information about PHP's premier OO Graph Library.

  • Read the Engineering Statistics Handbook, published by the National Institute of Standards and Technology (NIST), which has an excellent section on Exploratory Data Analysis.

  • Try these useful references, if you are interested in studying the topic of Regression in more detail:
    • Hamilton, L. C. (1992). Regression with Graphics. Pacific Grove, California: Brooks/Cole Publishing Company.
    • Neter, J., Kutner, M. H., Wasserman, W. (1990). Applied Linear Regression Models (3rd Edition). Irwin, Chicago.
    • Pedhazur, E. J. (1982). Multiple regression in behavioral research. New York, NY: Holt, Rinehart and Winston.


  • Read Cameron Laird's article "Open source in the biosciences." PHP needs better math tools to participate in this growth market (developerWorks, November 2002).

  • Check out RWeb, a Web-based interface to R.
