Mon, 19 Sep 2005
MySQL InnoDB Defrag HOWTO
So you’re running MySQL and you have a lot of data in an InnoDB. Over time, your InnoDB files will fragment just like an old FAT based Windows box. Unlike that Windows box MySQL doesn’t ship with a defrag utility so you have to brute force it. The fragmentation problem gets much worse when you want to drop tables; the free space that generates is generally scattered all throughout the InnoDB which causes performance to get worse and worse as writes occur. The following procedure will demonstrate how to delete large tables and defrag your database with a minimal amount of downtime.
This HOWTO assumes that you have a master/slave replication setup; if you don’t you’ll need to incur enough downtime to dump and reload your database. This can take several hours for a 20GB database, and therefor isn’t practical for many organizations. This HOWTO also assumes that you’re using MySQL 4.0.24 or higher (it should work fine with 4.1, too).
We’re going to call your master database master, and your replicant database
slave in this document.
On master, create a new database to hold your garbage tables called “trash”:
master# mysqladmin -u root create trash
Then, rename all of your tables that you would otherwise DROP into trash:
master# mysql -u root -e "RENAME TABLE mydb.mytable TO trash.mytable"
The idea here is that this will get them out of your database (so you can verify that you don’t actually need them), but doesn’t free up the InnoDB blocks, so your master DB performance won’t take a hit during the defrag process.
Now, we move over to slave, and do the rest of the work. Make sure that
nothing is referencing your slave, as it will have to be offline for the
duration of the dump and reload.
On your slave:
slave# mysql -u root -e "STOP SLAVE IO_THREAD"
This stops the slave from pulling binlog data from master but allows it to complete processing all of the binlogs that it has already pulled.
Now, run:
slave# mysql -u root -e "SHOW PROCESSLIST"
until the state for the SQL thread says “Has read all relay log; waiting for the I/O slave thread to update it”.
Now, run:
slave# mysql -u root -e "SHOW SLAVE STATUS\G"
and note the Master_Log_File and Read_Master_Log_Pos values; you’ll need them later.
Now shutdown mysql on slave' (/etc/init.d/mysql stop on Debian) and edit the
my.cnf. Addskip-networking(to minimize the chance of someone accidentaly
connecting to your DB while you're working on it) andskip-slave-start` (so it
doesn’t automatically resume replication when you bring it back online).
Start mysql up again, and prepare to dump your data:
For every database except for trash, mysqldump it to somewhere safe:
slave# mysqldump -u root --opt --verbose --databases db1 db2 db3 ... > /var/tmp/mysql-dump.sql
This step will take a very long time, potentially.
Once the dump completes, shut mysql down again, and move the mysql data directory out of the way:
slave# mv /var/lib/mysql /var/lib/mysql-fragmented
Then create an empty mysql datadir, and populate the `mysql’ database (so you have users and passwords):
slave# mkdir /var/lib/mysql
slave# chown mysql:mysql /var/lib/mysql
slave# chmod 0770 /var/lib/mysql
slave# cp -av /var/lib/mysql-fragmented/mysql /var/lib/mysql
Also, if you run your `slave’ with read-only set, you should comment it out now.
Now, start up mysql. It will create fresh, empty InnoDB files (ibdata* and ib_logfile*). This can also take a long time, especially if you have a large-ish initial InnoDB datafile allocation (some of our servers pre-allocate 100GB of space for the InnoDB).
Now, create an empty trash database:
slave# mysqladmin -u root create trash
And reload your database:
slave# mysql -u root < /var/tmp/mysql-dump.sql
This will also take a very long time. At this point, `slave’ is fully loaded and defragged.
Now edit your my.cnf, and comment out skip-networking, and uncomment read-only.
Restart mysql.
Now, configure your slave:
slave# mysql -u root -e "CHANGE MASTER TO MASTER_HOST='master', MASTER_USER='replicator_user', MASTER_PASSWORD='s3cr3tp@ssw0rd', MASTER_LOG_FILE='Master_Log_File', MASTER_LOG_POS=Read_Master_Log_Pos"
where Master_Log_File and Read_Master_Log_Pos are the values you wrote down from before.
Now start your slave:
slave# mysql -u root -e "START SLAVE"
Verify that things are working:
slave# mysql -u root -e "SHOW SLAVE STATUS\G"
Now your slave is defragged assuming it replicates okay. If it doesn’t, it’s likely you’ve missed a database that you’re actually still using. Restore your backup (/var/lib/mysql-fragmented), let it catch up with your master, and then try again.
Do one final restart to make sure that replication starts up again
automatically, and then you’re done. Well, almost. At this point, your master
is still fragmented. So, the next step is to switch your MySQL master from
master to slave, which is described in the yet to be written
MySQL-Master-Slave-Switch-HOWTO.