From: Marcin Wanat <marcin.wanat@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Re: Slow initial resync in RAID6 with 36 SAS drives
Date: Wed, 25 Aug 2021 12:06:09 +0200
Message-ID: <CAFDAVzmjGYsdgx0Yyn3n8NWVpAZQqmhBSneZY9fagV5PGTrgGw@mail.gmail.com>
In-Reply-To: <CAFDAVznKiKC7YrCTJ4oj6NimXrhnY-=PUnJhFopw6Ur5LvOCjg@mail.gmail.com>

On Thu, Aug 19, 2021 at 11:28 AM Marcin Wanat <marcin.wanat@gmail.com> wrote:
>
> Sorry, this will be a long email with everything I find to be relevant.
> I have an mdraid6 array with 36 SAS HDDs, each able to do more than
> 200MB/s, but I am unable to get more than 38MB/s resync speed on a
> fast system (48cores/96GB ram) with no other load.

I have done a bit more research on a server with 24 NVMe drives and
found that the resync speed bottleneck affects RAID6 arrays with more
than 16 drives:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=16 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 \
    /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 \
    /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 \
    /dev/nvme16n1
# iostat -dx 5
Device         r/s   w/s     rkB/s wkB/s   rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme0n1       0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme1n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.88    0.00   0.99   470.84     2.25  2.51 86.04
nvme4n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.89    0.00   0.99   470.84     2.25  2.51 86.06
nvme5n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.89    0.00   0.99   470.84     2.25  2.51 86.14
nvme10n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.90    0.00   0.99   470.84     2.25  2.51 86.20
nvme9n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.91    0.00   1.00   470.84     2.25  2.53 86.76
nvme13n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.93    0.00   1.00   470.84     2.25  2.54 87.00
nvme12n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.94    0.00   1.01   470.84     2.25  2.54 87.08
nvme8n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.93    0.00   1.00   470.84     2.25  2.54 87.02
nvme14n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.96    0.00   1.01   470.84     2.25  2.56 87.64
nvme22n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme17n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme16n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    3.05    0.00   1.04   470.84     2.25  2.58 88.56
nvme19n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme2n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.94    0.00   1.01   470.84     2.25  2.54 87.20
nvme6n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.95    0.00   1.01   470.84     2.25  2.55 87.52
nvme7n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.94    0.00   1.01   470.84     2.25  2.54 87.22
nvme21n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme11n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.96    0.00   1.02   470.84     2.25  2.56 87.72
nvme15n1    342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.99    0.00   1.02   470.84     2.25  2.53 86.84
nvme23n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme18n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme3n1     342.60  0.40 161311.20  0.90 39996.60   0.00 99.15  0.00    2.97    0.00   1.02   470.84     2.25  2.53 86.66
nvme20n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00

As you can see, each drive handles about 342 read IOPS with an average
request size (rareq-sz) of ~470 kB, but when I create a RAID6 array
with 17 or more drives:

# mdadm --create --verbose /dev/md0 --level=6 --raid-devices=17 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 \
    /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 \
    /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 /dev/nvme14n1 /dev/nvme15n1 \
    /dev/nvme16n1 /dev/nvme17n1
# iostat -dx 5
Device         r/s   w/s     rkB/s wkB/s   rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme0n1       0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme1n1   21484.20  0.40  85936.80  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.82     4.00     2.25  0.05 99.16
nvme4n1   21484.00  0.40  85936.00  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.74     4.00     2.25  0.05 99.16
nvme5n1   21484.00  0.40  85936.00  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.84     4.00     2.25  0.05 99.16
nvme10n1  21483.80  0.40  85935.20  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.65     4.00     2.25  0.04 83.64
nvme9n1   21483.80  0.40  85935.20  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.67     4.00     2.25  0.04 85.86
nvme13n1  21483.60  0.40  85934.40  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.63     4.00     2.25  0.04 83.66
nvme12n1  21483.60  0.40  85934.40  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.65     4.00     2.25  0.04 83.66
nvme8n1   21483.60  0.40  85934.40  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.81     4.00     2.25  0.05 99.22
nvme14n1  21481.80  0.40  85927.20  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.67     4.00     2.25  0.04 83.66
nvme22n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme17n1  21482.00  0.40  85928.00  0.90     0.00   0.00  0.00  0.00    0.02    0.00   0.49     4.00     2.25  0.03 67.12
nvme16n1  21481.60  0.40  85926.40  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.75     4.00     2.25  0.04 83.66
nvme19n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme2n1   21481.60  0.40  85926.40  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.95     4.00     2.25  0.05 99.26
nvme6n1   21481.60  0.40  85926.40  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.91     4.00     2.25  0.05 99.26
nvme7n1   21481.60  0.40  85926.40  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.87     4.00     2.25  0.05 99.24
nvme21n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme11n1  21481.20  0.40  85924.80  0.90     0.00   0.00  0.00  0.00    0.03    0.00   0.75     4.00     2.25  0.04 83.66
nvme15n1  21480.20  0.40  85920.80  0.90     0.00   0.00  0.00  0.00    0.04    0.00   0.80     4.00     2.25  0.04 83.66
nvme23n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme18n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00
nvme3n1   21480.40  0.40  85921.60  0.90     0.00   0.00  0.00  0.00    0.05    0.00   1.02     4.00     2.25  0.05 99.26
nvme20n1      0.00  0.00      0.00  0.00     0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00  0.00

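rareq-sz drops to 4 kB, read requests are no longer merged (rrqm/s
falls from ~40000 to 0), IOPS increase to ~21483 per drive, and the
resync speed drops from ~161 MB/s to ~85 MB/s.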
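
As a rough cross-check of the iostat figures above, rkB/s should be
approximately r/s multiplied by rareq-sz. This is only a minimal
sketch: the helper name expected_rkb is made up for illustration, and
the inputs are the nvme1n1 values copied from the two runs.

```shell
#!/bin/sh
# Cross-check iostat columns: rkB/s ~= r/s * rareq-sz (rareq-sz is in kB).
# expected_rkb is a hypothetical helper, not part of sysstat or mdadm.
expected_rkb() {
    # $1 = r/s, $2 = rareq-sz in kB; prints the expected rkB/s, rounded
    awk -v iops="$1" -v sz="$2" 'BEGIN { printf "%.0f\n", iops * sz }'
}

expected_rkb 342.60 470.84      # 16 drives: prints 161310
expected_rkb 21484.20 4.00      # 17 drives: prints 85937
```

Both results match the reported rkB/s (161311.20 and 85936.80) to
within rounding, so the columns are consistent: with 17 drives each
device is serving many more, much smaller reads.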

Why is that? Could someone point me to the part of the mdraid kernel
code responsible for this limitation? Would it be safe to change it and
recompile the kernel on a machine with 512GB+ of RAM?

Regards,
Marcin Wanat
