[Gluster-devel] Performance tuning for MySQL


From: David Sickmiller
Subject: [Gluster-devel] Performance tuning for MySQL
Date: Wed, 11 Feb 2009 01:53:02 -0500
User-agent: Thunderbird 2.0.0.19 (Windows/20081209)

Hi,

I really appreciate the information from Raghavendra regarding how the performance translators affect MySQL's integrity.  This week I spent some hours coarsely testing various performance options, and I would be interested in validating whether I'm getting typical results as well as learning ways to improve from here.  Perhaps my experience would be useful for others.

I'm running 2.0rc1 with the 2.6.27 kernel.  I have a 2-node cluster.  GlusterFS runs on both nodes, and MySQL runs on the active node.  If the active node fails or is put on standby, MySQL fires up on the other node.  Unlike MySQL Replication with its slave lag, I know my data changes are durable in the event of a server failure.  Most people use DRBD for this, but I'm hoping to enjoy GlusterFS's benefits of handling split-brain situations at the file level instead of the volume level, future scalability avenues, and general ease of use.  Hopefully DRBD doesn't have unmatchable performance advantages I'm overlooking.

I've been running two tests.  They aren't necessarily realistic usage, but I'm just looking for the big settings that affect performance by a factor of two or more.  My database files are about 1GB.  The first test is "time mysqldump --no-data" which simply prints out the schema.  The second test is "time mysqldump | gzip > /glusterfs/export.gz" which exports the entire database, compresses it, and saves it onto the GlusterFS filesystem.  The 450MB of exported SQL statements compress to 75MB.
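For the record, the invocations looked roughly like this (the database name is a placeholder):

    # Test 1: schema only -- metadata-heavy, many small reads
    time mysqldump --no-data mydb > /dev/null

    # Test 2: full export, compressed, written back onto GlusterFS
    time mysqldump mydb | gzip > /glusterfs/export.gz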

I'm going to report my testing in order, because the changes were cumulative.  I used server-side io-threads from the start.  Before I started recording the speed, I discovered that running in single-process mode was dramatically faster.  At that time, I also configured read-subvolume to use the local server.  (The relevant volfile stanzas are sketched after the numbers below.)  At this point I started measuring:
  • Printing schema: 18s
  • Compressed export: 2m45s
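The io-threads and read-subvolume pieces look roughly like this in my volfiles (the thread count and subvolume names are illustrative, not my exact config):

    # server volfile: io-threads layered over the locks translator
    volume iothreads
      type performance/io-threads
      option thread-count 8
      subvolumes locks
    end-volume

    # client volfile: prefer the local brick for reads
    volume afr
      type cluster/afr
      option read-subvolume local
      subvolumes local remote
    end-volume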
For a benchmark, I moved MySQL's datafiles to the local ext3 disk (but kept writing the export to GlusterFS).  It was 10-100X faster!
  • Printing schema: 0.2s
  • Compressed export: 28s
There were no appreciable changes from installing fuse-2.7.4glfs11, using Booster, or running blockdev to raise readahead from 256 to 16384 sectors.
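The readahead change, for reference, was the standard blockdev call (the device name will differ on your system):

    blockdev --getra /dev/sda        # current readahead, in 512-byte sectors
    blockdev --setra 16384 /dev/sda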

Adding the client-side io-cache translator didn't affect printing the schema but cut the export time in half (stanza sketched below):
  • Compressed export: 1m10s
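The stanza I added was along these lines (the cache size is illustrative, and "afr" is whatever your replicate volume is named):

    volume iocache
      type performance/io-cache
      option cache-size 64MB
      subvolumes afr
    end-volume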
Going off on a tangent, I shut down the remote node.  This increased the performance by an order of magnitude:
  • Printing schema: 2s
  • Compressed export: 24s
I resumed testing with both servers running.  Switching the I/O scheduler to deadline had no appreciable effect.  Neither did adding client-side io-threads or server-side write-behind.  Surprisingly, I found that changing read-subvolume to the remote server carried only a minor penalty.
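For completeness, the scheduler switch was the usual sysfs write (device name will vary):

    cat /sys/block/sda/queue/scheduler               # list available schedulers
    echo deadline > /sys/block/sda/queue/scheduler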

Then I noticed that the remote server was listed first in the volfile, which means it gets used as the lock server.  Swapping the order in the volfile on one server seemed to cause split-brain errors -- does the order need to be the same on both servers?  When I changed both servers' volfiles to use the active MySQL server as the lock server, there was a dramatic performance increase, to roughly the 2s/24s speed I saw with one server down.  (I lost the exact stats.)
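In other words, the fix was to make the first subvolume listed under the replicate translator -- the one AFR uses as its lock server -- the node where MySQL runs, and to keep that order identical on both machines.  Roughly (subvolume names are placeholders):

    volume afr
      type cluster/afr
      # the first subvolume listed acts as the lock server
      subvolumes local remote
    end-volume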

In summary, the changes that made a significant difference were running in single-process mode, adding client-side io-cache, and using a local lock server.


Since I'm only going to have one server writing to the filesystem at a time, I could mount it read-only (or not at all) on the other server.  Would that mean I could safely set data-lock-server-count=0 and entry-lock-server-count=0 because I can be confident that there won't be any conflicting writes?  I don't want to take unnecessary risks, but it seems like unnecessary overhead for my use case.
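Concretely, I'm asking whether a stanza like this would be safe under single-writer access (just a sketch of what I mean, not something I've deployed):

    volume afr
      type cluster/afr
      # skip data/entry locking entirely -- only safe if a single
      # client ever writes to the volume
      option data-lock-server-count 0
      option entry-lock-server-count 0
      subvolumes local remote
    end-volume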

Are there any additional recommended performance changes?  Would server-side AFR change things?  Printing the schema still runs 10X faster when the database is on a local ext3 filesystem.

Thank you,
David
-- 
David Sickmiller
