From: Andy Lutomirski <luto@amacapital.net>
To: stan@hardwarefreak.com
Cc: John Robinson <john.robinson@anonymous.org.uk>,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: O_DIRECT to md raid 6 is slow
Date: Wed, 15 Aug 2012 15:10:44 -0700
Message-ID: <CALCETrUTNV0r6xeF+mbqqw7w_StxoF2qFxzCLfb-LVH7ay_SHw@mail.gmail.com>
In-Reply-To: <502C1C01.1040509@hardwarefreak.com>

On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> On 8/15/2012 12:57 PM, Andy Lutomirski wrote:
>> On Wed, Aug 15, 2012 at 4:50 AM, John Robinson
>> <john.robinson@anonymous.org.uk> wrote:
>>> On 15/08/2012 01:49, Andy Lutomirski wrote:
>>>>
>>>> If I do:
>>>> # dd if=/dev/zero of=/dev/md0p1 bs=8M
>>>
>>> [...]
>>>
>>>> It looks like md isn't recognizing that I'm writing whole stripes when
>>>> I'm in O_DIRECT mode.
>>>
>>>
>>> I see your md device is partitioned. Is the partition itself stripe-aligned?
>>
>> Crud.
>>
>> md0 : active raid6 sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
>> 11720536064 blocks super 1.2 level 6, 512k chunk, algorithm 2
>> [6/6] [UUUUUU]
>>
>> IIUC this means that I/O should be aligned on 2MB boundaries (512k
>> chunk * 4 non-parity disks). gdisk put my partition on a 2048 sector
>> (i.e. 1MB) boundary.
>
> It's time to blow away the array and start over. You're already
> misaligned, and a 512KB chunk is insanely unsuitable for parity RAID,
> except for a handful of niche all-streaming workloads with little or no
> rewrite, such as video surveillance or DVR workloads.
>
> Yes, 512KB is the md 1.2 default. And yes, it is insane. Here's why:
> Deleting a single file changes only a few bytes of directory metadata.
> With your 6 drive md/RAID6 with 512KB chunk, you must read 3MB of data,
> modify the directory block in question, calculate parity, then write out
> 3MB of data to rust. So you consume 6MB of bandwidth to write less than
> a dozen bytes. With a 12 drive RAID6 that's 12MB of bandwidth to modify
> a few bytes of metadata. Yes, insane.

Grr. I thought the bad old days of filesystem and related defaults
sucking were over. cryptsetup aligns sanely these days, xfs is
sensible, etc. wtf? <rant>Why is there no sensible filesystem for
huge disks? zfs can't cp --reflink and has all kinds of source
availability and licensing issues, xfs can't dedupe at all, and btrfs
isn't nearly stable enough.</rant>

Anyhow, I'll try the patch from Wu Fengguang. There's still a bug here...

--Andy
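
The arithmetic quoted in this message can be reproduced with a short
sketch. It covers two things: the data stripe width and partition
misalignment (512k chunk, 6-device RAID 6, partition starting at sector
2048), and the 6MB and 12MB traffic figures from the quoted reply. The
function names, the 512-byte sector size, and the "read the whole stripe,
write the whole stripe" cost model are assumptions made for illustration.
The last function only restates the quoted reply's model; it is not a
description of what md does internally.

```python
SECTOR_BYTES = 512  # assumed logical sector size
KIB = 1024
MIB = 1024 * KIB


def data_stripe_bytes(chunk_bytes, n_devices, n_parity=2):
    """User data held by one full stripe (parity chunks excluded)."""
    data_devices = n_devices - n_parity
    if data_devices < 1:
        raise ValueError("need at least one non-parity device")
    return chunk_bytes * data_devices


def misalignment_bytes(start_sector, stripe_bytes, sector_bytes=SECTOR_BYTES):
    """Distance of a partition start past the previous stripe boundary.

    Zero means the partition begins exactly on a stripe boundary.
    """
    return (start_sector * sector_bytes) % stripe_bytes


def quoted_rmw_traffic_bytes(chunk_bytes, n_devices):
    """Read-plus-write traffic under the quoted reply's model.

    Assumption: every chunk in the stripe, parity included, is read
    once and then written once.
    """
    return 2 * chunk_bytes * n_devices


if __name__ == "__main__":
    chunk = 512 * KIB
    stripe = data_stripe_bytes(chunk, n_devices=6)
    print("data per stripe (MiB):", stripe // MIB)
    print("misalignment at sector 2048 (KiB):",
          misalignment_bytes(2048, stripe) // KIB)
    for devices in (6, 12):
        print(devices, "devices, quoted model traffic (MiB):",
              quoted_rmw_traffic_bytes(chunk, devices) // MIB)
```

With these inputs the partition start lands halfway into a 2MB data
stripe, which matches the "2048 sector (i.e. 1MB) boundary" observation
above. A start sector that is a multiple of 4096 would give zero
misalignment under the same assumptions.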