In this post I'll cover a lecture on MySQL performance tuning.
This lecture is given by Jay Pipes. Jay works at MySQL and has written a book on MySQL. It's called Pro MySQL and it covers intermediate and advanced features of the database. He also has an interesting blog, which I am long subscribed to - Jay Pipes blog.
In this lecture Mr. Pipes talks about core concepts of profiling and benchmarking, about the most common sources of performance problems, about indexing, schema, coding guidelines, and a little about server parameter tuning.
Here is his talk at Google:
The most interesting performance tuning tips from the video:
- [02:20] Don't benchmark without a goal. Have a goal like "improve performance by 20%". Otherwise you'll waste a lot of time tuning milliseconds out of your application.
- [02:50] Change just one thing at a time and re-run the benchmarks.
- [03:40] Disable the query cache by setting the cache size to 0 when running MySQL benchmarks.
- [05:22] The best tool for profiling MySQL queries is the EXPLAIN command. Understand it!
- [06:40] Log slow queries and use mysqldumpshow to parse the log. It also has an option (--log-queries-not-using-indexes) of logging any query that does not use an index on a table.
- [07:40] Jeremy Zawodny wrote the mytop utility for monitoring the threads and overall performance of MySQL servers.
- [08:55 && 11:30] Repeated queries on an un-indexed field will kill your application faster than anything else.
- [09:30] Don't de-normalize just because you think it will be faster. Start with normalized database schemes.
- [10:15] Server parameter tweaking is not a catch-all. Tuning server parameters can help but it's very specific to certain situations.
- [12:05] If you use MyISAM storage engine, exploit covering indexes.
- [12:50] Ensure good selectivity on index fields.
- [14:45] On multi-column indexes, pay attention to order of fields within the index definition.
- [15:40] Be aware that as your database grows, the data in the indexed fields can gradate, deteriorating the usefulness of that index. As you data grows, always examine if the indexes you originally thought are still relevant to the data.
- [17:02] Example of a common index problem, where an index is created on multiple fields.
- [20:30] Use the smallest data types possible. Don't use bigint, when int will do. Or, don't use char(200), when a varchar or smaller char() would do. Using the right type will fit more records in memory or index key block, meaning fewer reads, resulting in faster performance.
- [21:30] Consider horizontally spitting many-columned tables if they contain a lot of NULLs or rarely used columns.
- [23:35] Get rid of surrogate keys (with example).
- [24:05 && 33:20] Be an SQL programmer who thinks in sets, not procedural programming paradigms.
- [24:35] InnoDB can't optimize SELECT COUNT(*) queries. Use counter tables! That's how to scale InnoDB.
- [27:20] Always try to isolate index fields on one side of condition in a query (with example).
- [28:20] Avoid using CURRENT_DATE() as it invalidates the cache.
- [29:34] Example of using calculated fields when searching on top level domain. Idea - put a reversed TLD in the table.
- [33:20] Avoid correlated subqueries. Think in sets not loops! Here is a great article on visualizing SQL joins.
- [34:50] Example of using derived tables to avoid correlated subqueries.
- [36:25] Be aware of global and per-thread server variables.
- [37:50] Enable query cache if your application is doing a lot more reads than writes!
- [28:50] MySQL uses MyISAM for internal data storage.
- [40:00] MySQL loves ram!
- [40:35] Q and A.
Jay also recently published slides from his Join-Fu talk. Get them here.
I enjoyed this talk a lot. I am an intermediate MySQL user and I had not read his book. It was really informative.