You can do three things to make your MySQL server faster, from least effective to most effective:
- Throw hardware at the problem.
- Tune the settings of the MySQL process.
- Optimize your queries.
Throwing hardware at a problem is often the first thought, especially because databases are resource hogs. This solution can only take you so far, though. In practical terms, you can usually double your central processing unit (CPU) or disk speed, and maybe increase your memory by a factor of 4 or 8.
The second best thing to do is to tune the MySQL server, also called
mysqld. Tuning the process means allocating memory to
the right places and giving mysqld an idea of what type
of load to expect. Rather than make the disks faster, it's better to reduce the
number of disk accesses needed. Similarly, making sure the MySQL process is
operating correctly means it can spend more time servicing queries than taking
care of background tasks like temporary disk tables and opening and closing files.
Tuning mysqld is the focus of this article.
The best thing you can do is make sure your queries are optimized. This means the
proper indexes are applied to tables, and queries are written in such a way that
they take advantage of MySQL's strengths. Even though this article doesn't cover
query tuning (books have been written on the subject), it configures
mysqld to report queries that may need tuning.
Just because these tasks have been assigned an order doesn't mean you can ignore
the hardware and mysqld settings in favor of properly
tuned queries. A slow machine is a slow machine, and I've seen fast machines with
well-written queries fail under load because mysqld was
consumed with busy-work instead of servicing queries.
In a SQL server, the data tables sit on disk. Indexes provide a means for the server to find a particular row of data in the table without having to search the entire table. When the entire table has to be searched, it's called a table scan. Most often, you want only a small subset of the data in the table, so a full table scan wastes a lot of disk I/O and therefore time.This problem is compounded when data must be joined, because many more rows must be compared between the two sides of the join.
Of course, table scans aren't always a sign of a problem; sometimes it's more efficient to read the whole table than it is to pick through it (making these decisions is the job of the query planner in the server process). Inefficient use of indexes, or not being able to use indexes at all, slows the queries, and this issue becomes more pronounced as the load on the server and the size of the tables increases. Queries that take more than a given amount of time to execute are called slow queries.
You can configure mysqld to log slow queries in the
aptly named slow query log. Administrators then look at this log to help them
determine which parts of the application need further investigation. Listing 1
shows the configuration required in my.cnf to enable the slow query log.
Listing 1. Enable the MySQL slow query log
[mysqld] ; enable the slow query log, default 10 seconds log-slow-queries ; log queries taking longer than 5 seconds long_query_time = 5 ; log queries that don't use indexes even if they take less than long_query_time ; MySQL 4.1 and newer only log-queries-not-using-indexes |
These three settings, used together, log any queries that take longer than 5
seconds and any queries that don't use indexes. Note the caveat about
log-queries-not-using-indexes: You must have MySQL 4.1
or newer. The slow query log is in your MySQL data directory and is called
hostname-slow.log. If you'd rather use a different name or path, you can
do so with log-slow-queries = /new/path/to/file in
my.cnf.
Reading through the slow query log is best done with the
mysqldumpslow command. Specify the path to the logfile,
and you're given a sorted list of the slow queries, along with how many times
they're found in the log. One helpful feature is that
mysqldumpslow removes any user-specified data before
collating the results, so different invocations of the same query are counted as
one; this helps point out queries in need of the most work.
Many LAMP applications rely heavily on the database but make the same queries over and over. Each time the query is made, the database must do the same work -- parse the query, determine how to execute it, load information from disk, and return it to the client. MySQL has a feature called the query cache that stores the result of a query in memory, should it be needed again. In many instances, this increases performance drastically. The catch, though, is that the query cache is disabled by default.
Adding query_cache_size = 32M to /etc/my.conf enables
a 32MB query cache.
After you enable the query cache, it's important to understand whether it's being used effectively. MySQL has several variables you can watch to see how things are going in the cache. Listing 2 shows the status of the cache.
Listing 2. Display the query cache statistics
mysql> SHOW STATUS LIKE 'qcache%'; +-------------------------+------------+ | Variable_name | Value | +-------------------------+------------+ | Qcache_free_blocks | 5216 | | Qcache_free_memory | 14640664 | | Qcache_hits | 2581646882 | | Qcache_inserts | 360210964 | | Qcache_lowmem_prunes | 281680433 | | Qcache_not_cached | 79740667 | | Qcache_queries_in_cache | 16927 | | Qcache_total_blocks | 47042 | +-------------------------+------------+ 8 rows in set (0.00 sec) |
The breakdown of these items is shown in Table 1.
Table 1. MySQL query cache variables
| Variable name | Description |
|---|---|
Qcache_free_blocks | The number of contiguous memory blocks in the cache. Higher numbers are a
sign of fragmentation. FLUSH QUERY CACHE
defragments the cache so there's one free block. |
Qcache_free_memory | The free memory in the cache. |
Qcache_hits | Incremented each time a query is served from the cache. |
Qcache_inserts | Incremented each time a query is inserted. Divide the number of inserts by the hits to get your miss rate; subtract this value from 1 to get your hit rate. In the previous example, approximately 87% of the queries are getting served from cache. |
Qcache_lowmem_prunes | How many times the cache ran out of memory and had to be cleaned up to make
room for more queries. This number is best looked at over time; if it's
increasing, it's a sign that either fragmentation is serious or memory is low
(free_blocks and
free_memory, above, tell you which it is). |
Qcache_not_cached | The number of queries that weren't candidates for caching, usually because
they weren't SELECT statements. |
Qcache_queries_in_cache | The number of queries (and responses) current cached. |
Qcache_total_blocks | The number of blocks in the cache. |
Often, showing the variables several seconds apart indicates change, which helps
determine whether the cache is being used effectively. Running
FLUSH STATUS resets some of the counters, which is
helpful if the server has been running for a while.
It's tempting to make an excessively large query cache in the hopes of caching
everything. Because mysqld must perform maintenance on
the cache, such as pruning when memory becomes low, the server can get bogged down
trying to manage the cache. As a rule, if
FLUSH QUERY CACHE takes a long time, the cache is too
large.
You should enforce a few limits in mysqld to ensure
that the system load doesn't cause resource starvations. Listing 3 shows some
important resource-related settings from my.cnf.
Listing 3. MySQL resource settings
set-variable=max_connections=500 set-variable=wait_timeout=10 max_connect_errors = 100 |
The maximum connections are governed in the first line. Like
MaxClients from Apache, the idea is to make sure only
the number of connections you can serve are allowed. To determine the maximum
number of connections your server has seen so far, execute
SHOW STATUS LIKE 'max_used_connections'.
The second line tells mysqld to terminate any
connections that have been idle for more than 10 seconds. In LAMP applications,
the connection to the database is usually only as long as the Web server takes to
process the request. Sometimes, under load, connections hang around and take up
connection table space. If you have many interactive users or use persistent
connections to the database, then setting this low isn't a good idea!
The final line is a safety measure. If a host has problems connecting to the
server and ends up aborting the request too many times, the host is locked until
FLUSH HOSTS can be run. By default, 10 failures are
enough to cause blocking. Changing this value to 100 gives the server enough time
to recover from whatever problems it has. Using a higher value doesn't help you
much because if the server can't connect once in 100 tries, chances are it's not
going to connect at all.
MySQL supports well over 100 tunable settings; but luckily, mastering a small
handful will take care of most needs. Finding the right value for these settings
involves looking at status variables via the
SHOW STATUS command and, from that, determining whether
mysqld is behaving as you wish. You can't allocate more
memory to buffers and caches than exists in the system, so tuning often involves
making compromises.
MySQL tunables apply to either the whole mysqld
process or each individual client session.
Each table is represented as a file on disk and must be opened before it can be
read. To speed up the process of reading from the file,
mysqld caches these open files up to the limit
specified by table_cache in /etc/mysqld.conf. Listing 4
shows how to display the activity associated with opening tables.
Listing 4. Display table-open activity
mysql> SHOW STATUS LIKE 'open%tables'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Open_tables | 5000 | | Opened_tables | 195 | +---------------+-------+ 2 rows in set (0.00 sec) |
Listing 4 shows that 5,000 tables are currently open and that 195 tables had to
be opened because there was no available file descriptor in the cache (the
statistics were cleared earlier, so it's feasible to have 5,000 open tables with a
history of only 195 opens). If Opened_tables increases
quickly as you rerun the SHOW STATUS command, you
aren't getting enough hits out of your cache. If
Open_tables is much lower than your
table_cache setting, you have too many (some room to
grow is never a bad thing, though). Adjust your table cache with
table_cache = 5000, for example.
Like the table cache, there is also a cache for threads.
mysqld spawns threads as needed when receiving
connections. On a busy server where connections are torn up and down quickly,
caching threads for use later speeds up the initial connection.
Listing 5 shows how to determine if you have enough threads cached.
Listing 5. Show thread-usage statistics
mysql> SHOW STATUS LIKE 'threads%'; +-------------------+--------+ | Variable_name | Value | +-------------------+--------+ | Threads_cached | 27 | | Threads_connected | 15 | | Threads_created | 838610 | | Threads_running | 3 | +-------------------+--------+ 4 rows in set (0.00 sec) |
The important value here is Threads_created, which is
incremented each time mysqld has to create a new
thread. If this number increases quickly between successive
SHOW STATUS commands, you should look at increasing
your thread cache. You do this with thread_cache = 40,
for example, in my.cnf.
The key buffer stores index blocks for MyISAM tables. Ideally, requests for these blocks should come from memory instead of disk. Listing 6 shows how to determine how many blocks were read from disk versus those from memory.
Listing 6. Determine key efficiency
mysql> show status like '%key_read%'; +-------------------+-----------+ | Variable_name | Value | +-------------------+-----------+ | Key_read_requests | 163554268 | | Key_reads | 98247 | +-------------------+-----------+ 2 rows in set (0.00 sec) |
Key_reads represents the number of requests that hit
disk, and Key_read_requests is the total number.
Dividing the reads by the read requests gives the miss rate -- in this case, 0.6
misses per 1,000 requests. If you're missing more than 1 per 1,000 requests, you
should consider increasing your key buffer.
key_buffer = 384M, for example, sets the buffer to 384MB.
Temporary tables are used in more advanced queries where data must be stored
temporarily before further processing happens, such as in
GROUP BY clauses. Ideally, such tables are created in
memory; but if a temporary table gets too large, it's written to disk. Listing 7
shows the statistics associated with temporary-table creation.
Listing 7. Determine temporary-table usage
mysql> SHOW STATUS LIKE 'created_tmp%'; +-------------------------+-------+ | Variable_name | Value | +-------------------------+-------+ | Created_tmp_disk_tables | 30660 | | Created_tmp_files | 2 | | Created_tmp_tables | 32912 | +-------------------------+-------+ 3 rows in set (0.00 sec) |
Each use of a temporary table increases
Created_tmp_tables; disk-based tables also increment
Created_tmp_disk_tables. There is no hard-and-fast rule
for the ratio because it depends on the queries involved. Watching
Created_tmp_disk_tables over time shows the rate of
created disk tables, and you can determine the effectiveness of the settings. Both
tmp_table_size and
max_heap_table_size control the maximum size of
temporary tables, so make sure you set them both in my.cnf.
The following settings are per session. Take care when you set these numbers, because when multiplied by the number of potential connections, these options represent a lot of memory! You can change these numbers in the session through code or for all sessions in my.cnf.
When MySQL must perform a sort, it allocates a sort buffer to store the rows as
they're read from disk. If the size of the data to sort is too large, the data
must go to temporary files on disk and be sorted again. If the
sort_merge_passes status variable is high, this is an
indication of this disk activity. Listing 8 shows some of the sort-related status
counters.
Listing 8. Show-sort statistics
mysql> SHOW STATUS LIKE "sort%"; +-------------------+---------+ | Variable_name | Value | +-------------------+---------+ | Sort_merge_passes | 1 | | Sort_range | 79192 | | Sort_rows | 2066532 | | Sort_scan | 44006 | +-------------------+---------+ 4 rows in set (0.00 sec) |
If sort_merge_passes is high, this is an indication
that sort_buffer_size needs attention. For example,
sort_buffer_size = 4M sets the sort buffer to 4MB.
MySQL also allocates memory to read tables. Ideally, the indexes provide enough
information to read in only the needed rows, but sometimes queries (through poor
design or the nature of the data) require large chunks of the table to be read. To
understand this behavior, you need to know how many
SELECT statements were run and the number of times you
had to read the next row in the table (rather than a direct access through an
index). The commands to do so are shown in Listing 9.
Listing 9. Determine table-scan ratio
mysql> SHOW STATUS LIKE "com_select"; +---------------+--------+ | Variable_name | Value | +---------------+--------+ | Com_select | 318243 | +---------------+--------+ 1 row in set (0.00 sec) mysql> SHOW STATUS LIKE "handler_read_rnd_next"; +-----------------------+-----------+ | Variable_name | Value | +-----------------------+-----------+ | Handler_read_rnd_next | 165959471 | +-----------------------+-----------+ 1 row in set (0.00 sec) |
The Handler_read_rnd_next /
Com_select gives your table-scan ratio -- in this case,
521:1. Anything over 4000, and you should look at your
read_buffer_size, such as
read_buffer_size = 4M. If you're growing this number
beyond 8M, it's time to talk to your developers about tuning those queries!
Even though the SHOW STATUS commands are helpful when
you're drilling down into specific settings, you need some tools to help you
interpret the vast amounts of data provided by mysqld.
I've found three tools to be indispensable; you can find links in the
Resources section.
Most sysadmins are familiar with the top command,
which provides a constantly updated view of the CPU and memory consumed by tasks.
mytop is modelled after top;
it provides a view of all the connected clients along with any queries they're
currently running. mytop also provides real-time and
historical data about key-buffer and query-cache efficiency, and statistics about
the queries being run. It's a useful tool to see what's going on -- within 10
seconds, you can get a view of the server's health and display any connections
that are causing problems.
mysqlard is a daemon that connects to the MySQL server
and collects data every 5 minutes, storing it in a Round Robin Database backend. A
Web page displays the data, such as table-cache usage, key efficiency, connected
clients, and temporary-table usage. Whereas mytop
provides a snapshot of server health, mysqlard provides
long-term health information. As a bonus, mysqlard uses
some of the information it collects to make suggestions about how to tune your
server.
Another tool for collecting SHOW STATUS information is
mysqlreport. It's far more verbose in its reporting
than mysqlard because it analyzes every facet of the
server. It's an excellent tool for tuning a server because it performs the
appropriate calculations on the status variables to help you determine what needs
fixing.
This article covered the basics of MySQL tuning and concludes this three-part series on tuning LAMP components. Tuning is largely about understanding how things work, determining if they're working properly, making adjustments, and re-evaluating. Each component -- Linux, Apache, PHP, or MySQL -- has various needs. Understanding them individually helps eliminate the bottlenecks that can slow your application.
Learn
-
"Migrate from MySQL or PostgreSQL to DB2 Express-C"
(developerWorks, June 2006) offers an easy way to migrate from MySQL
to DB2 Express-C.
- IBM also provides help for MySQL administrators
who are moving to DB2 Express-C by means of
"Leveraging MySQL skills to learn DB2 Express"
(developerWorks, Feb 2006) and other articles in the series.
-
"Using MySQL in a federated database environment"
(developerWorks, Dec 2004) is a tutorial on accessing data stored in a MySQL
database from WebSphere. IBM makes sure that WebSphere® software works
well with MySQL.
- SHOW VARIABLES
and
SHOW
STATUS are defined well in the MySQL documentation.
- If you like blogs,
MySQL Performance Blog,
Xaprb, and
MySQL DBA are worth
reading.
- In the
Architecture area on developerWorks,
get the resources you need to advance your skills in the architecture arena.
Developing the proper architecture is the key to scaling LAMP applications.
- In the
developerWorks Linux zone,
find more resources for Linux developers, including
Linux tutorials,
as well as
our readers' favorite Linux articles and tutorials
over the last month.
- Stay current with
developerWorks technical events and Webcasts.
Get products and technologies
- Despite being three
years old by now, High Performance MySQL is still a valuable book. The author also has a Web site with
various articles about MySQL.
- mytop tells you exactly
what's going on with your MySQL server at that instant and provides some key
statistics. This is the first program I turn to when I hear of database problems.
- mysqlard graphs key performance indicators in
MySQL servers and provides tuning advice.
- mysqlreport is a must-have tool. It
analyzes
SHOW STATUSvariables on your behalf. - A MySQL article isn't
complete without a link to phpMyAdmin.
Even though some interpretation of the status variables is provided, this
product's strength is how easy it makes administration.
- With
IBM trial software,
available for download directly from developerWorks, build your next development
project on Linux.
Discuss
- Get involved in the
developerWorks community
through our developer blogs, forums, podcasts, and community topics in our new
developerWorks spaces.






