Mysql: (date) functions and indexes

Mysql is usually pretty fast by default, but to keep performance to the max sometimes requires knowledge on how mysql works and how to write queries that does their job most efficiently. Today we ran across a simple example which illustrates that mysql’s ability to use an index depends on the way you write the query. Let’s make a simple table and add an index on the date column:

create table members (
	m_id integer auto_increment primary key,
	m_name char(20),
	m_since date not null
);

create index ix_m_since on members (m_since);

Then we populate the table with some sample data:

insert into members (m_name, m_since) values ('memeber a', 1994-01-22);
insert into members (m_name, m_since) values ('memeber b', 1996-02-12);
insert into members (m_name, m_since) values ('memeber c', 1998-03-02);
insert into members (m_name, m_since) values ('memeber d', 2000-04-31);
insert into members (m_name, m_since) values ('memeber e', 2002-05-21);
insert into members (m_name, m_since) values ('memeber f', 2004-06-11);

Now suppose we want all members which has a m_since-date after 2002. We could write:

select count(*) from members where year(m_since)>2002;
but we could also get the same data by writing:

select count(*) from members where m_since >= "2002-01-01";

The data is the same but the efficiency between the two queries is quite different. Mysql can explain what it does with queries and you get it explained simply by adding “explain” in front of your query.

Let’s see the difference between the two queries:

mysql> explain select count(*) from members where year(m_since)>=2002;
+---------+-------+---------------+------------+---------+------+------+--------------------------+
| table   | type  | possible_keys | key        | key_len | ref  | rows | Extra                    |
+---------+-------+---------------+------------+---------+------+------+--------------------------+
| members | index | NULL          | ix_m_since |       3 | NULL |    6 | Using where; Using index |
+---------+-------+---------------+------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> explain select count(*) from members where m_since >= "2002-01-01";
+---------+-------+---------------+------------+---------+------+------+--------------------------+
| table   | type  | possible_keys | key        | key_len | ref  | rows | Extra                    |
+---------+-------+---------------+------------+---------+------+------+--------------------------+
| members | range | ix_m_since    | ix_m_since |       3 | NULL |    1 | Using where; Using index |
+---------+-------+---------------+------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

The field to notice in the two explain outputs is “rows” – In the first query it examines 6 rows, in the second it only examines one row. The second query is much more efficient than the first.

In such a small example, the difference is hardly noticeable, but in our production systems with thousands and thousands of rows, and the difference between the two queries is several seconds – and that makes a difference to the user.

Lesson learned: Be careful when using functions in WHERE clauses. They may cause a huge decrease in the performance of your SQL queries.