From: Doug Ledford <dledford@redhat.com>
To: "Finlayson, James M CIV (USA)" <james.m.finlayson4.civ@mail.mil>,
	'Matt Wallis' <mattw@madmonks.org>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: [Non-DoD Source] Can't get RAID5/RAID6 NVMe randomread IOPS - AMD ROME what am I missing?????
Date: Wed, 18 Aug 2021 15:59:04 -0400
Message-ID: <aa5c7b6aa143ca75c0d5840af1700b1a85b05efd.camel@redhat.com>
In-Reply-To: <5EAED86C53DED2479E3E145969315A238585E1AD@UMECHPA7B.easf.csd.disa.mil>

> > The question I have for the list, given my large drive sizes, it
> > takes me a day to set up and build an mdraid/lvm configuration.   
> > Has anybody found the "sweet spot" for how many partitions per
> > drive?    I now have a script to generate the drive partitions, a
> > script for building the mdraid volumes, and a procedure for
> > unwinding from all of this and starting again.    

I don't have a feeling for the sweet spot on the number of partitions,
but if you put too many devices in a raid5/6 array, you virtually
guarantee all writes will have to be read-modify-write writes instead of
full stripe writes.
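
To put rough numbers on that, here is a minimal sketch in Python. The
512 KiB chunk size and the helper names are assumptions for
illustration only; nothing in this thread specifies them. It shows how
the amount of data needed for a full stripe write grows with the member
count, which is why a wide array rarely sees one.

```python
CHUNK = 512 * 1024  # assumed chunk size in bytes (illustrative, not from the thread)


def full_stripe_bytes(members, parity_disks, chunk_bytes=CHUNK):
    """Data bytes in one full stripe: one chunk per non-parity member."""
    data_disks = members - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return data_disks * chunk_bytes


def is_full_stripe_write(offset, length, members, parity_disks, chunk_bytes=CHUNK):
    """True if the write starts on a stripe boundary and covers whole stripes only."""
    stripe = full_stripe_bytes(members, parity_disks, chunk_bytes)
    return length > 0 and offset % stripe == 0 and length % stripe == 0


if __name__ == "__main__":
    # Compare a narrow raid6 (8 members) with a wide one (24 members).
    for members in (8, 24):
        size = full_stripe_bytes(members, parity_disks=2)
        print(f"{members} members: full stripe = {size // 1024} KiB of data")
```

Under these assumptions, an aligned 1 MiB write is a full stripe write
on a 3-member raid5 but only a partial stripe write on a 24-member
raid6.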

When keeping the parity on the array in sync, a full stripe write allows
you to simply write all blocks in the stripe, calculate the parity as
you do so, and then write the parity out.  For a partial stripe write,
you have two options.  You can read in the blocks you aren't writing,
treat the result as a full stripe write, and calculate the parity from
scratch.  Or you can read in the old contents of the blocks being
written plus the current parity block, xor the old contents out of the
parity, xor the new blocks in, and then write the new blocks and the new
parity out.
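
The three parity paths described above can be sketched in Python as
follows. This is a toy model of single xor parity (the raid5 case; the
second raid6 syndrome is not plain xor and is not modelled here), and
the function names are made up for illustration. It is not how the md
driver is actually written.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise xor of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_parity(data_blocks):
    """Full stripe write: parity comes straight from the data being written."""
    return xor_blocks(*data_blocks)


def reconstruct_write_parity(new_blocks, untouched_blocks):
    """Partial write, option 1: read the untouched blocks, then recompute."""
    return xor_blocks(*new_blocks, *untouched_blocks)


def rmw_parity(old_parity, old_blocks, new_blocks):
    """Partial write, option 2: xor the old data out and the new data in."""
    return xor_blocks(old_parity, *old_blocks, *new_blocks)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = full_stripe_parity(stripe)

    # Overwrite only the middle block.
    new_block = b"\xaa\xbb"
    via_reconstruct = reconstruct_write_parity([new_block], [stripe[0], stripe[2]])
    via_rmw = rmw_parity(parity, [stripe[1]], [new_block])
    print(via_reconstruct == via_rmw)
```

Both partial-write paths arrive at the same parity; they differ only in
which blocks have to be read from disk first.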

For that reason, I usually try to keep my arrays to no more than 7 or 8
members.  A lot of times, in streaming tests, a parity raid array with a
really high number of drives will seem to perform fine, but under real
world conditions it might not do so well.  There are also several
filesystems (xfs and ext4) that will optimize their metadata layout when
put on an mdraid device, but I'm pretty sure that gets blocked when you
put lvm between the filesystem and the mdraid device.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
