Major web properties like Wikipedia, Facebook, and Yahoo! use the LAMP architecture to serve millions of requests a day, while web application software like WordPress, Joomla, Drupal, and SugarCRM uses this architecture to enable organizations to deploy web-based applications easily.
The strength of the architecture lies in its simplicity. While stacks like .NET and Java™ technology may use massive hardware, expensive software stacks, and complex performance tuning, the LAMP stack can run on commodity hardware, using open source software stacks. The flip side is that, because the software stack is a loose set of components rather than a monolithic stack, tuning for performance can be a challenge, since each component needs to be analyzed and tuned on its own.
However, there are several simple performance tasks that can have a huge impact on the performance of websites of any size. In this article, we will look at five such tasks designed to optimize LAMP application performance. These items should require few, if any, architecture changes to your application, making them safe and easy options for maximizing the responsiveness of your web application and minimizing its hardware requirements.
The easiest way to boost the performance of any PHP application (the "P" in LAMP, of course) is to take advantage of an opcode cache. For any website I work with, it's the one thing I make sure is present, since the performance impact is huge (response times are often half of what they are without an opcode cache). But the big question most people new to PHP have is why the improvement is so drastic. The answer lies in how PHP handles web requests. Figure 1 outlines the flow of a PHP request.
Figure 1. PHP request
Since PHP is an interpreted language rather than a compiled one like C or the Java language, the entire parse-compile-execute sequence is carried out for every request. You can see how this can be time- and resource-consuming, especially when scripts rarely change between requests. After the script is parsed and compiled, it is in a machine-parseable state as a series of opcodes. This is where an opcode cache comes into play. It caches these compiled scripts as a series of opcodes to avoid the parse and compile steps on every request. You can see how such a workflow would work in Figure 2.
Figure 2. PHP request that utilizes an opcode cache
So when the cached opcodes of a PHP script exist, we can skip the parse and compile steps of the PHP request process, directly execute the cached opcodes, and output the results. The checking algorithm takes care of situations where you may have made a change to the script file: on the first request for the changed script, the opcodes are automatically recompiled and cached for subsequent requests, replacing the previously cached script.
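The check-then-reuse flow can be pictured with a toy model. The sketch below is only an illustration of the idea, not how APC or any real opcode cache is implemented: the function name loadCompiled() is made up, the "compile" step is simulated by a callable, and freshness is judged by the file's modification time, which is an assumption of this sketch.

```php
<?php
/**
 * Toy model of an opcode cache lookup (illustration only).
 * Returns the stored result while the file is unchanged, and
 * "recompiles" when the file's modification time differs.
 */
function loadCompiled($path, array &$cache, $compile)
{
    clearstatcache(true, $path);          // make sure we see the current mtime
    $mtime = filemtime($path);

    if (isset($cache[$path]) && $cache[$path]['mtime'] === $mtime) {
        return $cache[$path]['opcodes'];  // hit: skip "parse" and "compile"
    }

    // Miss or stale entry: "compile" again and replace the cached copy.
    $cache[$path] = array(
        'mtime'   => $mtime,
        'opcodes' => call_user_func($compile, $path),
    );
    return $cache[$path]['opcodes'];
}

// Usage with a throwaway file and a stand-in for the parse + compile steps.
$script = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($script, 'placeholder script body');
$cache = array();
$fakeCompile = function ($path) { return md5_file($path); };

loadCompiled($script, $cache, $fakeCompile); // miss: compiles and stores
loadCompiled($script, $cache, $fakeCompile); // hit: reuses the stored result
unlink($script);
```

A real opcode cache keeps its entries in shared memory so that every web server process benefits, which a per-request PHP array cannot do.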
Opcode caches have long been popular for PHP, with some of the first ones coming about back in the heyday of PHP V4. Today there are a few popular choices that are in active development and being used:
- Alternative PHP Cache (APC) is probably the most popular opcode cache for PHP (see Resources). It is developed by several core PHP developers and owes much of its speed and stability to major contributions from engineers at Facebook and Yahoo!. It also sports several other speed improvements for handling PHP requests, including a user cache component we'll look at later in this article.
- Wincache is an opcode cache that is most actively developed by the Internet Information Services (IIS) team at Microsoft® for use only on Windows® with the IIS web server (see Resources). It was developed predominantly in an effort to make PHP a first-class development platform on the Windows-IIS-PHP stack, as APC was known not to work well on that stack. It is very similar to APC in function and sports a user cache component, as well as a built-in session handler so that Wincache can be used directly for session storage.
- eAccelerator is a fork of one of the original PHP caches, the Turck MMCache opcode cache (see Resources). Unlike APC and Wincache, it is only an opcode cache and optimizer, so it does not contain the user cache components. It is fully compatible across UNIX® and Windows stacks, and it is quite popular for sites that don't intend to leverage the additional features APC or Wincache provide. This is often the case if you will be using a solution like memcache to have a separate user cache server for a multi-web server environment.
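Before tuning anything else, it can help to confirm which of these caches, if any, is actually loaded. The sketch below assumes the extensions register under the names apc, wincache, and eaccelerator; those names and the function detectOpcodeCaches() are assumptions of this example, so check the output of php -m on your own server.

```php
<?php
/**
 * Return the display names of the opcode caches that appear to be loaded.
 * $isLoaded defaults to PHP's extension_loaded() and is injectable so the
 * check can be tried out without installing anything.
 */
function detectOpcodeCaches($isLoaded = 'extension_loaded')
{
    // Extension names are assumptions; verify them with "php -m".
    $candidates = array(
        'apc'          => 'APC',
        'wincache'     => 'Wincache',
        'eaccelerator' => 'eAccelerator',
    );

    $found = array();
    foreach ($candidates as $extension => $label) {
        if (call_user_func($isLoaded, $extension)) {
            $found[] = $label;
        }
    }
    return $found;
}

$caches = detectOpcodeCaches();
echo $caches
    ? 'Opcode cache loaded: ' . implode(', ', $caches) . "\n"
    : "No known opcode cache extension is loaded.\n";
```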
Without a doubt, an opcode cache is the first step in speeding up PHP, because it removes the need to parse and compile a script on every request. Once this first step is completed, you should see an improvement in response time and server load. But there is more you can do to optimize PHP, which we'll look at next.
While implementing an opcode cache gives the biggest bang for the buck, there are a number of other tweaks you can make to optimize your PHP setup through the settings in your php.ini file. These settings are more appropriate for production instances; on development or testing instances, you may not want to make these changes, as they can make it more difficult to debug application issues.
Let's take a look at a few items that are important to help performance.
Things that should be disabled
There are several php.ini settings that should be disabled, since they exist mainly for backward compatibility:
- register_globals: This functionality used to be the default before PHP V4.2, where incoming request variables are automatically assigned to normal PHP variables. Apart from the major security issues in doing this (unfiltered incoming request data gets mixed with normal PHP variable content), there is also the overhead of doing it on every request. Turning this off keeps your application safer and improves performance.
- magic_quotes_*: This is another relic of PHP V4, where risky characters in incoming form data are automatically escaped. It was designed as a security feature to help sanitize incoming data before it is sent to a database, but it isn't very effective, since it doesn't protect against the more common types of SQL injection attacks. Since most database layers support prepared statements that handle this risk much better, turning this off removes an annoying performance problem.
- always_populate_raw_post_data: This is only needed if for some reason you need to look at the entire unfiltered payload of the incoming POST data. Otherwise, it just stores a duplicate copy of the POST data in memory, which isn't needed.
Disabling these options for legacy code can be risky, however, since that code may depend on them being set for proper execution. New code should never be written to depend on these options, and you should look for ways to refactor your existing code away from using them if possible.
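A quick way to see where a server stands is to ask PHP which of these directives are still switched on. The following is a minimal sketch: the function name findEnabledLegacySettings() is made up, magic_quotes_* is expanded to the gpc, runtime, and sybase variants, and on PHP V5.4 and later (where register_globals and the magic quotes directives no longer exist) ini_get() simply returns false for them.

```php
<?php
/**
 * List legacy php.ini directives that are still enabled.
 * $read defaults to ini_get() and is injectable for experimentation.
 */
function findEnabledLegacySettings($read = 'ini_get')
{
    $legacy = array(
        'register_globals',
        'magic_quotes_gpc',
        'magic_quotes_runtime',
        'magic_quotes_sybase',
        'always_populate_raw_post_data',
    );
    // Values that mean "disabled" ("-1" is used by later PHP 5 releases
    // for always_populate_raw_post_data).
    $off = array('', '0', '-1', 'off', 'no', 'false');

    $enabled = array();
    foreach ($legacy as $name) {
        $value = call_user_func($read, $name);
        if ($value === false) {
            continue; // directive unknown to this PHP build
        }
        if (!in_array(strtolower(trim((string) $value)), $off, true)) {
            $enabled[$name] = $value;
        }
    }
    return $enabled;
}

foreach (findEnabledLegacySettings() as $name => $value) {
    echo "$name is still enabled ($value)\n";
}
```

Running this on a staging copy of a legacy application first gives you a list of settings to test against before changing production.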
Things that should be enabled or have their settings tweaked
There are some good performance options you can enable in the php.ini file to give your scripts a bit of a speed boost:
- output_buffering: Make sure this is on, since PHP then sends output back to the browser in large chunks rather than on every echo or print statement, which can noticeably slow down your response time.
- variables_order: This directive controls the order of the EGPCS (Environment, Get, Post, Cookie, and Server) variable parsing for the incoming request. If you aren't using certain superglobals (such as environment variables), you can safely remove them to gain a small speedup from not having to parse them on every request.
- date.timezone: This directive was added in PHP V5.1 to set the default timezone for use with the DateTime functions introduced then. If you don't set it in the php.ini file, PHP makes a number of system requests to figure out what it is, and in PHP V5.3, a warning is emitted on every request.
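The effect of output buffering can be seen at the script level with ob_start(), which does for one script what the output_buffering directive does globally. This is a small sketch rather than a recommendation to wrap every page by hand; renderBuffered() is a hypothetical helper name.

```php
<?php
/**
 * Run $render with output buffering on and return everything it printed
 * as one string, so the caller can send it in a single write.
 */
function renderBuffered($render)
{
    ob_start();
    call_user_func($render);
    return ob_get_clean();
}

$page = renderBuffered(function () {
    echo "<ul>\n";
    foreach (array('Apache', 'MySQL', 'PHP') as $item) {
        echo "  <li>$item</li>\n";
    }
    echo "</ul>\n";
});

echo $page; // one write instead of five separate echo calls
```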
The settings above are the "low-hanging fruit" that should be configured on your production instance. There is one more thing to look at as far as PHP is concerned: the use of require() and include() (as well as their siblings require_once() and include_once()) in your application. Tuning your PHP configuration and code around these calls prevents unneeded file status checks on every request, which can slow down response times.
Manage your require()s and include()s
File status calls (meaning calls made to the underlying file system to check for the existence of a file) can be quite costly in terms of performance. One of the biggest sources of file stats is the require() and include() statements, which are used to bring code into your script. The sibling calls require_once() and include_once() can be more problematic, as they need to verify not only that the file exists, but also that it hasn't been included before.
So what's the best way to deal with this? There are a few things you can do to speed this up.
- Use absolute paths for all require() and include() calls. This makes clear to PHP exactly which file you want to include, so it doesn't need to check the entire include_path for your file.
- Keep the number of entries in the include_path low. This helps in situations where it's difficult to provide an absolute path for every require() and include() call (often the case in large, legacy applications), since PHP won't check locations where the file you are including can't be.
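Both pointers can be sketched in a few lines. The directory layout (a lib directory next to the script), the file name helpers.php, and the function buildIncludePath() are all made up for this example.

```php
<?php
/**
 * Build a short include_path from the directories that actually exist,
 * dropping duplicates and missing entries.
 */
function buildIncludePath(array $dirs)
{
    $kept = array();
    foreach ($dirs as $dir) {
        $real = realpath($dir);
        if ($real !== false && is_dir($real) && !in_array($real, $kept, true)) {
            $kept[] = $real;
        }
    }
    return implode(PATH_SEPARATOR, $kept);
}

// Hypothetical layout: application libraries live in ./lib next to this file.
$base = dirname(__FILE__);
set_include_path(buildIncludePath(array($base . '/lib', $base)));

// With an absolute path, PHP does not need to walk include_path at all:
// require $base . '/lib/helpers.php';   // helpers.php is a made-up file name
```

Anchoring every path to dirname(__FILE__) also keeps includes working when the script is started from a different working directory.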
APC and Wincache also have mechanisms for caching the results of file status checks made by PHP, so repeated file-system checks are not needed. They are most effective when you keep your include file names static rather than variable-driven, so it's important to try to do this whenever possible.
Database optimization can become a pretty advanced topic quickly, and I don't have nearly the space here to do it full justice. But if you are looking at optimizing the speed of your database, there are a few steps you should take first that address the most common issues.
Put the database on its own machine
Database queries can become quite intense on their own, often pegging a CPU at 100 percent while running simple SELECT statements against reasonably sized datasets. If your web server and database server are competing for CPU time on a single machine, this will definitely slow down your requests. Thus I consider it a good first step to put the web server and database server on separate machines, and to make the database server the beefier of the two (database servers love lots of memory and multiple CPUs).
Properly design and index tables
Probably the biggest issues with database performance come from poor database design and missing indexes. SELECT statements are usually by far the most common type of query run in a typical web application. They are often also the most time-consuming queries run on a database server. Additionally, these kinds of SQL statements are the most sensitive to proper indexing and database design, so follow these pointers for optimal performance.
- Make sure each table has a primary key. This provides the table a default order and a fast way to join the table against other tables.
- Make sure any foreign keys in a table (that is, keys that link a record to a record in another table) are properly indexed. Many databases will enforce constraints on these keys automatically, so that the value actually matches a record in the other table, which helps keep the data consistent.
- Try to limit the number of columns in a table. Too many columns in a table can make the scan time for queries much longer than if there are just a few columns. In addition, if you have a table with many columns that aren't typically used, you are also wasting disk space with NULL value fields. The same is true of variable-size fields, such as text or blob, where the table size can grow much larger than needed. In this case, consider splitting off the additional columns into a different table and joining them together on the primary key of the records.
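As a rough illustration of these pointers, the sketch below holds a small schema as SQL strings that can be run through PDO. All table and column names are invented for this example (they are not SugarCRM's real schema), and the DDL assumes MySQL with the InnoDB engine.

```php
<?php
/**
 * Hypothetical schema: every table has a primary key, the foreign key
 * column is indexed, and a large, rarely read column is split off.
 */
function demoSchema()
{
    return array(
        'CREATE TABLE demo_accounts (
            id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
            name VARCHAR(150) NOT NULL,
            PRIMARY KEY (id)
        ) ENGINE=InnoDB',

        'CREATE TABLE demo_leads (
            id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
            account_id INT UNSIGNED NOT NULL,
            last_name  VARCHAR(100) NOT NULL,
            PRIMARY KEY (id),
            INDEX idx_demo_leads_account (account_id),
            FOREIGN KEY (account_id) REFERENCES demo_accounts (id)
        ) ENGINE=InnoDB',

        // Bulky text lives in its own table, joined on the primary key.
        'CREATE TABLE demo_leads_notes (
            lead_id INT UNSIGNED NOT NULL,
            notes   TEXT,
            PRIMARY KEY (lead_id),
            FOREIGN KEY (lead_id) REFERENCES demo_leads (id)
        ) ENGINE=InnoDB',
    );
}

/** Run each statement on an open PDO connection. */
function applySchema(PDO $conn, array $statements)
{
    foreach ($statements as $sql) {
        $conn->exec($sql);
    }
}
```

Queries that only list leads never touch demo_leads_notes, so the rows they scan stay narrow.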
Analyze the queries being run on the server
The best tool for improving database performance is analyzing which queries are being run on your database server and how long they take to run. Just about every database out there has tools for doing this. With MySQL, you can take advantage of the slow query log to find the problematic queries. To use it, set the slow_query_log setting to 1 in the MySQL configuration file, then set log_output to FILE to have the queries logged to the file hostname-slow.log. The long_query_time threshold sets how long a query must run, in seconds, to be considered a "slow query." I'd recommend setting this to 5 seconds at first and moving it down to 1 second over time, depending upon your data set. If you look at this file, you'll see the queries detailed similar to Listing 1.
Listing 1. MySQL slow query log
```
/usr/local/mysql/bin/mysqld, Version: 5.1.49-log, started with:
Tcp port: 3306  Unix socket: /tmp/mysql.sock
Time                 Id Command    Argument
# Time: 030207 15:03:33
# User@Host: user[user] @ localhost.localdomain [127.0.0.1]
# Query_time: 13  Lock_time: 0  Rows_sent: 117  Rows_examined: 234
use sugarcrm;
select * from accounts inner join leads on accounts.id = leads.account_id;
```
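Once the log grows, reading it by hand gets tedious. As a rough sketch, entries in the format shown in Listing 1 can be pulled apart with a regular expression; parseSlowLog() is a hypothetical helper, and it assumes the four statistics always appear on one line in this order, which may not hold for every MySQL version.

```php
<?php
/**
 * Parse slow query log text into an array of entries with the timing
 * statistics and the SQL that follows them.
 */
function parseSlowLog($log)
{
    $pattern = '/^# Query_time: (\S+)\s+Lock_time: (\S+)\s+Rows_sent: (\d+)\s+Rows_examined: (\d+)/';
    $entries = array();
    $current = null;

    foreach (preg_split('/\r?\n/', $log) as $line) {
        if (preg_match($pattern, $line, $m)) {
            if ($current !== null) {
                $entries[] = $current; // previous entry is complete
            }
            $current = array(
                'query_time'    => (float) $m[1],
                'lock_time'     => (float) $m[2],
                'rows_sent'     => (int) $m[3],
                'rows_examined' => (int) $m[4],
                'sql'           => '',
            );
        } elseif ($current !== null && $line !== '' && $line[0] !== '#') {
            // Statement lines belonging to the current entry.
            $current['sql'] .= ($current['sql'] === '' ? '' : ' ') . trim($line);
        }
    }
    if ($current !== null) {
        $entries[] = $current;
    }
    return $entries;
}

$sample = "# Time: 030207 15:03:33\n"
    . "# User@Host: user[user] @ localhost.localdomain [127.0.0.1]\n"
    . "# Query_time: 13  Lock_time: 0  Rows_sent: 117  Rows_examined: 234\n"
    . "use sugarcrm;\n"
    . "select * from accounts inner join leads on accounts.id = leads.account_id;\n";

foreach (parseSlowLog($sample) as $entry) {
    echo $entry['query_time'] . "s, " . $entry['rows_examined'] . " rows examined: " . $entry['sql'] . "\n";
}
```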
The key thing we want to look at is Query_time, which shows how long the query took. Another thing to look at is the number of Rows_sent and Rows_examined, since these can point to a query that is written incorrectly if it's looking at too many rows or returning too many rows. You can delve deeper into how a query is executed by prepending EXPLAIN to the query, which returns the query plan instead of the result set, as shown in Listing 2.
Listing 2. MySQL EXPLAIN results
```
mysql> explain select * from accounts inner join leads on accounts.id = leads.account_id;
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
| id | select_type | table    | type   | possible_keys            | key     | key_len | ref                       | rows | Extra |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
|  1 | SIMPLE      | leads    | ALL    | idx_leads_acct_del       | NULL    | NULL    | NULL                      |  200 |       |
|  1 | SIMPLE      | accounts | eq_ref | PRIMARY,idx_accnt_id_del | PRIMARY | 108     | sugarcrm.leads.account_id |    1 |       |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
2 rows in set (0.00 sec)
```
The MySQL manual dives much deeper into the topic of the EXPLAIN output (see Resources), but the big thing I look for is places where the 'type' column is 'ALL', since this requires MySQL to do a full table scan rather than use a key for the lookup. Such rows point you to places where adding indexes will significantly help query speed.
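The same check can be automated once the EXPLAIN rows are in a PHP array. In the sketch below, findFullTableScans() is a hypothetical helper, and the rows are typed in by hand from Listing 2; with a live connection they could come from something like $conn->query('EXPLAIN ' . $sql)->fetchAll(PDO::FETCH_ASSOC).

```php
<?php
/**
 * Return the names of tables that an EXPLAIN plan reads with a full
 * table scan (type "ALL").
 */
function findFullTableScans(array $explainRows)
{
    $scans = array();
    foreach ($explainRows as $row) {
        if (isset($row['type']) && strtoupper($row['type']) === 'ALL') {
            $scans[] = $row['table'];
        }
    }
    return $scans;
}

// The two plan rows from Listing 2, reduced to the columns used here.
$rows = array(
    array('table' => 'leads',    'type' => 'ALL',    'key' => null,      'rows' => 200),
    array('table' => 'accounts', 'type' => 'eq_ref', 'key' => 'PRIMARY', 'rows' => 1),
);

foreach (findFullTableScans($rows) as $table) {
    echo "Full table scan on $table: consider adding an index.\n";
}
```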
As we saw in the previous section, databases can easily be the biggest pain point of performance in your web application. But what if the data you are querying doesn't change very often? In this case, it may be a good option to store those results locally instead of calling the query on every request.
Two of the opcode caches we looked at earlier, APC and Wincache, have facilities for doing just this: you can store PHP data directly in a shared memory segment for quick retrieval. Listing 3 provides an example of how to do this.
Listing 3. Example of using APC for caching database results
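```php
<?php
function getListOfUsers()
{
    // $success distinguishes a cache miss from a cached empty list.
    $list = apc_fetch('getListOfUsers', $success);
    if (!$success) {
        $conn = new PDO('mysql:dbname=testdb;host=127.0.0.1', 'dbuser', 'dbpass');
        $sql  = 'SELECT id, name FROM users ORDER BY name';
        $list = array();
        foreach ($conn->query($sql) as $row) {
            $list[] = $row;
        }
        // Cache for five minutes (an arbitrary TTL) so the entry eventually expires.
        apc_store('getListOfUsers', $list, 300);
    }
    return $list;
}
```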
We only need to run the query one time. Afterward, we push the result into the APC user cache under the key getListOfUsers. From here on out, until the cache expires, you can fetch the result array directly out of the cache, skipping the SQL query.
APC and Wincache aren't the only choices for a user cache; memcache and Redis are other popular choices that don't require you to run the user cache on the same server as the web server. This gives added performance and flexibility, especially if your web application is scaled out across several web servers.
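The same fetch-or-load pattern carries over to a networked cache. The sketch below is written against any object that offers get($key) and set($key, $value, $ttl), which matches the method names of PHP's Memcached extension; the helper fetchThroughCache() and the in-memory ArrayCache stand-in are made up so the example runs without a cache server. It treats a false return as a miss, so a cached value of false would be reloaded every time.

```php
<?php
/**
 * Return the cached value for $key, or run $loader, store its result
 * for $ttl seconds, and return that.
 */
function fetchThroughCache($cache, $key, $ttl, $loader)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value;               // served from the cache
    }
    $value = call_user_func($loader); // for example, a database query
    $cache->set($key, $value, $ttl);
    return $value;
}

/** In-memory stand-in for a cache server (ignores the TTL). */
class ArrayCache
{
    private $items = array();

    public function get($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : false;
    }

    public function set($key, $value, $ttl)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// With the Memcached extension this could instead be:
// $cache = new Memcached(); $cache->addServer('127.0.0.1', 11211);
$cache  = new ArrayCache();
$calls  = 0;
$loader = function () use (&$calls) {
    $calls++;                        // counts how often the "query" really runs
    return array('alice', 'bob');
};

fetchThroughCache($cache, 'getListOfUsers', 300, $loader);
$users = fetchThroughCache($cache, 'getListOfUsers', 300, $loader);
echo "Loader ran $calls time(s) for two lookups.\n";
```

Because the helper only depends on get() and set(), the backing store can be swapped without touching the calling code.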
In this article, we looked at five simple ways to tune your LAMP application for better performance. We looked at techniques not only at the PHP level, by leveraging an opcode cache and optimizing the PHP configuration, but also looked at optimizing your database design for proper indexing. We also took a look at leveraging a user cache (using APC as an example) to show how you can avoid repeated database calls when the data doesn't change very often.
Learn
- "A PHP V5 migration guide": Learn how to migrate code developed in PHP V4 to V5.
- Planet PHP is the PHP developer community news source.
- The MySQL manual dives much deeper into the topic of the EXPLAIN output.
- PHP.net is the central resource for PHP developers.
- Check out the "Recommended PHP reading list."
- Browse all the PHP content on developerWorks.
- Follow developerWorks on Twitter.
- Expand your PHP skills by checking out IBM developerWorks' PHP project resources.
- To listen to interesting interviews and discussions for software developers, check out developerWorks podcasts.
- Using a database with PHP? Check out the Zend Core for IBM, a seamless, out-of-the-box, easy-to-install PHP development and production environment that supports IBM DB2 V9.
- Stay current with developerWorks' Technical events and webcasts.
- Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products, as well as our most popular articles and tutorials.
- Watch and learn about IBM and open source technologies and product functions with the no-cost developerWorks On demand demos.
Get products and technologies
- Alternative PHP Cache is probably the most popular opcode cache for PHP.
- Wincache is an opcode cache that is most actively developed by the IIS team at Microsoft for use only on Windows using the IIS (Internet Information Services) Web server.
- eAccelerator is a fork of one of the original PHP caches, the Turck MMCache opcode cache.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
- Download IBM product evaluation versions or explore the online trials in the IBM SOA Sandbox and get your hands on application development tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.
Discuss
- Get involved in the developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.
- Participate in developerWorks blogs and get involved in the developerWorks community.
- Participate in the developerWorks PHP Forum: Developing PHP applications with IBM Information Management products (DB2, IDS).
John Mertic is a software engineer at SugarCRM and has several years of experience with PHP web applications. At SugarCRM, he has specialized in data integration, mobile, and user interface architecture. An avid writer, he has been published in php|architect, IBM developerWorks, and the Apple Developer Connection, and is the author of the book "The Definitive Guide to SugarCRM: Better Business Applications." He has also contributed to many open source projects, most notably the PHP project, where he is the creator and maintainer of the PHP Windows Installer.




