Studying I/O performance in a complex system

We have a pair of identical computers (Dell R610s) clustered as our Subversion server. The repo administrators began reporting poor performance about six weeks ago. Poor performance like this: a directory listing (ls -l) sometimes takes over 30 seconds to return. That by itself is just annoying, but it points to an underlying problem that is causing very, very long delays in loading data into and exporting data from the repositories. And that is more than annoying: the administrators do small loads and exports all day long to keep the master repo server in sync with the remote servers in our other offices.

The Subversion repositories live in a ~450 GB OCFS2 filesystem, /srv/data1, on a DRBD device, /dev/drbd1.  The DRBD device is active/active and configured to mirror data between the two computers; both members of the cluster can read from and write to the device at the same time.  On each computer, the DRBD device is hosted on a Linux software RAID mirror, /dev/md3.  At the very bottom of this pile are two 500 GB SATA HDDs on each computer.  They each have a single partition, /dev/sdc1 and /dev/sdd1, which are mirrored with Linux software RAID (mdadm).  If you’re with me so far, you’ll understand that we have four copies of the Subversion repos.
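For my own reference, here is a minimal sketch (in Python, just dumping the kernel's /proc views) that prints each layer of that stack so the layering can be confirmed at a glance on both boxes. The device names in the comments are the ones above; nothing here changes anything on disk.

```python
#!/usr/bin/env python3
"""Quick sanity check of the storage stack described above.

A rough sketch: it only prints the kernel's view of each layer
(OCFS2 mount, DRBD resource, MD mirror) so the layering can be
confirmed at a glance.
"""

def show(path, label):
    print(f"--- {label} ({path}) ---")
    try:
        with open(path) as f:
            print(f.read().rstrip())
    except OSError as e:
        print(f"could not read: {e}")

# OCFS2 filesystem on top
show("/proc/mounts", "mounted filesystems (look for /srv/data1)")
# DRBD resource in the middle
show("/proc/drbd", "DRBD status (the resource backing /dev/drbd1)")
# MD RAID1 mirror at the bottom
show("/proc/mdstat", "MD RAID status (look for md3 = sdc1 + sdd1)")
```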

The challenge is to find out where the bottleneck is and fix it.  Or them.

One idea we’ve tossed around is that the software RAID is slower than hardware RAID would be.  However, I’ve found results from a test showing software RAID outperforming hardware RAID in situations similar to ours.  The wiki article on RAID also seems to indicate that software RAID can usually outperform hardware RAID in our situation.  From what the author(s) write and what I’ve read elsewhere, I think I can believe that, at least until you bring dedicated SANs and NASes into the picture.
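If we want numbers instead of hearsay, a crude timing harness like the one below could be run against a scratch file on the filesystem under test and again on a hardware-RAID box for comparison. It is only a sketch, not a substitute for a real benchmark tool like fio or bonnie++; the path and sizes are placeholders.

```python
#!/usr/bin/env python3
"""Crude I/O timing harness (a sketch, not a proper benchmark).

Writes a test file with fsync, then reads it back at random offsets,
and reports throughput. TEST_FILE and the sizes are placeholders.
Note: the random reads may be served from the page cache right after
the write; drop caches first if you want a disk-bound read number.
"""
import os, random, time

TEST_FILE = "/srv/data1/iotest.tmp"   # placeholder path on the volume under test
FILE_SIZE = 1 << 30                   # 1 GiB test file
BLOCK     = 1 << 20                   # 1 MiB blocks for the sequential write
READS     = 2000                      # number of 4 KiB random reads

def timed_write():
    buf = os.urandom(BLOCK)
    start = time.time()
    with open(TEST_FILE, "wb") as f:
        for _ in range(FILE_SIZE // BLOCK):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())          # make sure the data really hits the mirror
    return FILE_SIZE / (time.time() - start) / 1e6   # MB/s

def timed_random_reads():
    start = time.time()
    with open(TEST_FILE, "rb") as f:
        for _ in range(READS):
            f.seek(random.randrange(0, FILE_SIZE - 4096))
            f.read(4096)
    return READS / (time.time() - start)              # reads per second

if __name__ == "__main__":
    print(f"sequential write: {timed_write():.1f} MB/s")
    print(f"random 4 KiB reads: {timed_random_reads():.0f} reads/s")
    os.unlink(TEST_FILE)
```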

Another idea is that “chunk” or “stripe” size mismatches could be to blame.  We have OCFS2 on DRBD on MD RAID.  However, according to the linux.org RAID HowTo authors, mirrors don’t use “stripes”, so I can eliminate RAID stripe size from the question.

Perhaps the problem is in the DRBD configuration. Maybe having the metadata “internal” isn’t so good for this application; in DRBD, “internal” means the metadata lives on the same backing device as the data. I don’t see any other tunables so far. I have plenty of space on the system drives to store the metadata. Per the formula on http://www.drbd.org/users-guide-emb/ch-internals.html#s-meta-data-size, I need about 15 MB. I wonder if I can “move” the metadata….
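As a sanity check on that number, the formula on that page, as I read it, gives the metadata size in 512-byte sectors as ceil(data_sectors / 2^18) * 8 + 72. Worked out for a ~450 GB backing device:

```python
#!/usr/bin/env python3
"""Back-of-the-envelope DRBD metadata size, per the formula on the
users-guide page linked above (as I read it):
    meta_sectors = ceil(data_sectors / 2**18) * 8 + 72
"""
import math

data_bytes   = 450 * 2**30                       # ~450 GB backing device
data_sectors = data_bytes // 512
meta_sectors = math.ceil(data_sectors / 2**18) * 8 + 72
print(f"metadata: {meta_sectors} sectors ≈ {meta_sectors * 512 / 2**20:.1f} MiB")
# prints roughly 14.1 MiB, which matches the ~15 MB figure above
```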
