linux-raid.vger.kernel.org archive mirror
* [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request
@ 2021-02-04  7:50 Xiao Ni
  2021-02-04  7:50 ` [PATCH V2 1/5] md: add md_submit_discard_bio() for submitting discard bio Xiao Ni
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Xiao Ni @ 2021-02-04  7:50 UTC (permalink / raw)
  To: songliubraving
  Cc: linux-raid, matthew.ruffell, colyli, guoqing.jiang, ncroxon

Hi all

Currently, running mkfs on a raid10 array built from SSD/NVMe disks takes a
long time. This patch set tries to resolve this problem.

This patch set had been reverted because of a data corruption problem. This
version fixes that problem. The root cause of the data corruption was the
wrong calculation of the start address for the near-copy disks.

We now handle discard requests for raid10 in a similar way to raid0. Because
the discard region is usually very large, we can calculate the start/end
address for each disk and then submit the discard request to each disk
directly. But raid10 has copies. For the near layout, if the discard request
is not aligned with the chunk size, we calculate a start_disk_offset.
Previously we only used start_disk_offset for the first disk, but it should
be used for the near-copy disks too.
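
To illustrate the layout this calculation relies on, here is a minimal
sketch (not the raid10.c code; the helper name and the simplified geometry
are assumptions) of how the stripe-aligned middle of a large discard maps
to the same device range on every disk of a near layout:

#include <linux/types.h>

struct discard_range {
	sector_t dev_start;	/* first sector to discard on each member */
	sector_t dev_sectors;	/* number of sectors to discard */
};

/*
 * Sketch only: map the stripe-aligned middle of a discard request
 * [start, end) (in array sectors) to the per-disk range for a raid10
 * "near" layout.  Assumes raid_disks is a multiple of near_copies and
 * ignores the far layouts, the unaligned head/tail (which this series
 * splits off and resubmits) and all locking.  Plain 64-bit divisions
 * are used for clarity; kernel code would use the div_u64 helpers.
 */
static bool near_discard_middle(sector_t start, sector_t end,
				int raid_disks, int near_copies,
				unsigned int chunk_sects,
				struct discard_range *r)
{
	int columns = raid_disks / near_copies;		/* chunks per stripe */
	sector_t stripe_sects = (sector_t)chunk_sects * columns;
	sector_t first_stripe = (start + stripe_sects - 1) / stripe_sects;
	sector_t last_stripe = end / stripe_sects;	/* exclusive */

	if (first_stripe >= last_stripe)
		return false;		/* no whole stripe to discard */

	/*
	 * A full stripe occupies exactly one chunk on every member, and
	 * each near copy mirrors its column, so all raid_disks members
	 * get exactly the same device range.
	 */
	r->dev_start = first_stripe * chunk_sects;
	r->dev_sectors = (last_stripe - first_stripe) * chunk_sects;
	return true;
}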

[  789.709501] discard bio start : 70968, size : 191176
[  789.709507] first stripe index 69, start disk index 0, start disk offset 70968
[  789.709509] last stripe index 256, end disk index 0, end disk offset 262144
[  789.709511] disk 0, dev start : 70968, dev end : 262144
[  789.709515] disk 1, dev start : 70656, dev end : 262144

For example, this test case has 2 near copies. The start_disk_offset for the
first disk is 70968, and the same offset should be used for the second disk.
Instead, the reverted code used the start address of the chunk for the second
disk, so it discarded a larger region than was requested. This version simply
splits the unaligned part at stripe size.
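
A quick check with the numbers from the log above, assuming the chunk size
implied by the log (first stripe index 69 maps to sector 69 * 1024 = 70656,
i.e. 1024-sector chunks):

  request start                 : 70968
  start of chunk 69 (69 * 1024) : 70656
  in-chunk offset               : 70968 - 70656 = 312 sectors

  reverted code : disk 0 starts at 70968
                  disk 1 starts at 70656  -> discards 312 sectors that
                                             were never requested
  this series   : the unaligned head is split off at the stripe
                  boundary, so both near copies see the same range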

This version also fixes another problem: the calculation of stripe_size was
wrong in the reverted version.

V2: Fix the problems pointed out by Christoph Hellwig.

Xiao Ni (5):
  md: add md_submit_discard_bio() for submitting discard bio
  md/raid10: extend r10bio devs to raid disks
  md/raid10: pull the code that wait for blocked dev into one function
  md/raid10: improve raid10 discard request
  md/raid10: improve discard request for far layout
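
The first patch in the list above factors the per-device submission into a
helper shared with raid0. As a rough sketch of the idea (hypothetical name
and simplified arguments, not the helper the patch actually adds), issuing
one member's discard and chaining it to the original bio could look like:

#include <linux/blkdev.h>
#include "md.h"			/* struct md_rdev */

/*
 * Sketch only: issue a discard for one member device and chain it to
 * the parent bio, so the parent completes only after every per-device
 * discard has completed.  Uses the 5.12-era __blkdev_issue_discard(),
 * which still takes a flags argument.
 */
static void submit_one_member_discard(struct md_rdev *rdev,
				      struct bio *parent,
				      sector_t dev_start, sector_t nr_sects)
{
	struct bio *discard_bio = NULL;

	/* build a discard bio for [dev_start, dev_start + nr_sects) */
	if (__blkdev_issue_discard(rdev->bdev, dev_start + rdev->data_offset,
				   nr_sects, GFP_NOIO, 0, &discard_bio) ||
	    !discard_bio)
		return;

	bio_chain(discard_bio, parent);
	submit_bio_noacct(discard_bio);
}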

 drivers/md/md.c     |  20 +++
 drivers/md/md.h     |   2 +
 drivers/md/raid0.c  |  14 +-
 drivers/md/raid10.c | 434 +++++++++++++++++++++++++++++++++++++++++++++-------
 drivers/md/raid10.h |   1 +
 5 files changed, 402 insertions(+), 69 deletions(-)

-- 
2.7.5


* RE: [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request
@ 2021-03-11  3:55 Adrian Huang12
  0 siblings, 0 replies; 15+ messages in thread
From: Adrian Huang12 @ 2021-03-11  3:55 UTC (permalink / raw)
  To: Xiao Ni, songliubraving
  Cc: linux-raid, matthew.ruffell, colyli, guoqing.jiang, ncroxon, hch

> -----Original Message-----
> From: Xiao Ni <xni@redhat.com>
> Sent: Thursday, February 4, 2021 1:57 PM
> To: songliubraving@fb.com
> Cc: linux-raid@vger.kernel.org; matthew.ruffell@canonical.com;
> colyli@suse.de; guoqing.jiang@cloud.ionos.com; ncroxon@redhat.com;
> hch@infradead.org
> Subject: [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request
> 
> Xiao Ni (5):
>   md: add md_submit_discard_bio() for submitting discard bio
>   md/raid10: extend r10bio devs to raid disks
>   md/raid10: pull the code that wait for blocked dev into one function
>   md/raid10: improve raid10 discard request
>   md/raid10: improve discard request for far layout

Hi Xiao Ni,

Thanks for this series. I also reproduced this issue when creating a RAID10
array via Intel VROC.

The xfs formatting process did not finish on 5.4.0-66 or 5.12.0-rc2 (I waited
for one hour), and there were lots of IO timeouts in dmesg.

With this series (on top of 5.12.0-rc2), the xfs formatting process took only
1 second, and I did not see any IO timeouts in dmesg.

The test details are available at [0].

So, feel free to add my Tested-by.

[0] https://gist.githubusercontent.com/AdrianHuang/56daafe1b4dbd8b5744d02c5a473e5cd/raw/82f33862698be2567af48b7662f08ccd8e8d27fd/raid10-issue-test-detail.log

-- Adrian

* [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request
@ 2021-02-04  5:57 Xiao Ni
  2021-02-04  7:38 ` Xiao Ni
  0 siblings, 1 reply; 15+ messages in thread
From: Xiao Ni @ 2021-02-04  5:57 UTC (permalink / raw)
  To: songliubraving
  Cc: linux-raid, matthew.ruffell, colyli, guoqing.jiang, ncroxon, hch

Hi all

Currently, running mkfs on a raid10 array built from SSD/NVMe disks takes a
long time. This patch set tries to resolve this problem.

This patch set had been reverted because of a data corruption problem. This
version fixes that problem. The root cause of the data corruption was the
wrong calculation of the start address for the near-copy disks.

We now handle discard requests for raid10 in a similar way to raid0. Because
the discard region is usually very large, we can calculate the start/end
address for each disk and then submit the discard request to each disk
directly. But raid10 has copies. For the near layout, if the discard request
is not aligned with the chunk size, we calculate a start_disk_offset.
Previously we only used start_disk_offset for the first disk, but it should
be used for the near-copy disks too.

[  789.709501] discard bio start : 70968, size : 191176
[  789.709507] first stripe index 69, start disk index 0, start disk offset 70968
[  789.709509] last stripe index 256, end disk index 0, end disk offset 262144
[  789.709511] disk 0, dev start : 70968, dev end : 262144
[  789.709515] disk 1, dev start : 70656, dev end : 262144

For example, this test case has 2 near copies. The start_disk_offset for the
first disk is 70968, and the same offset should be used for the second disk.
Instead, the reverted code used the start address of the chunk for the second
disk, so it discarded a larger region than was requested. This version simply
splits the unaligned part at stripe size.
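
The split itself is straightforward; a minimal sketch of the approach
(hypothetical helper, not a verbatim excerpt of the patch, and assuming the
request spans at least one full stripe) could look like this: the unaligned
head is chopped off at the next stripe boundary and resubmitted so that it
is handled like an ordinary small request.

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/math64.h>

/*
 * Sketch only: split the unaligned head of a discard bio off at the
 * next stripe boundary and resubmit it.  'stripe_sects' is the stripe
 * size in sectors and 'bs' a bio_set to allocate the split from
 * (raid10 keeps one in its r10conf).  Returns the remaining,
 * stripe-aligned bio.
 */
static struct bio *split_unaligned_head(struct bio *bio,
					unsigned int stripe_sects,
					struct bio_set *bs)
{
	u32 remainder;
	struct bio *split;

	div_u64_rem(bio->bi_iter.bi_sector, stripe_sects, &remainder);
	if (!remainder)
		return bio;		/* already stripe aligned */

	split = bio_split(bio, stripe_sects - remainder, GFP_NOIO, bs);
	bio_chain(split, bio);		/* head completes into the parent */
	submit_bio_noacct(split);	/* resend the unaligned head */
	return bio;			/* now starts on a stripe boundary */
}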

This version also fixes another problem: the calculation of stripe_size was
wrong in the reverted version.

V2: Fix the problems pointed out by Christoph Hellwig.

Xiao Ni (5):
  md: add md_submit_discard_bio() for submitting discard bio
  md/raid10: extend r10bio devs to raid disks
  md/raid10: pull the code that wait for blocked dev into one function
  md/raid10: improve raid10 discard request
  md/raid10: improve discard request for far layout

 drivers/md/md.c     |  20 +++
 drivers/md/md.h     |   2 +
 drivers/md/raid0.c  |  14 +-
 drivers/md/raid10.c | 434 +++++++++++++++++++++++++++++++++++++++++++++-------
 drivers/md/raid10.h |   1 +
 5 files changed, 402 insertions(+), 69 deletions(-)

-- 
2.7.5




Thread overview: 15+ messages
2021-02-04  7:50 [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request Xiao Ni
2021-02-04  7:50 ` [PATCH V2 1/5] md: add md_submit_discard_bio() for submitting discard bio Xiao Ni
2021-02-04  7:50 ` [PATCH V2 2/5] md/raid10: extend r10bio devs to raid disks Xiao Ni
2021-02-04  7:50 ` [PATCH V2 3/5] md/raid10: pull the code that wait for blocked dev into one function Xiao Ni
2021-02-04  7:50 ` [PATCH V2 4/5] md/raid10: improve raid10 discard request Xiao Ni
2021-02-04  7:50 ` [PATCH V2 5/5] md/raid10: improve discard request for far layout Xiao Ni
2021-02-15  4:05 ` [PATCH V2 0/5] md/raid10: Improve handling raid10 discard request Matthew Ruffell
2021-02-20  8:12   ` Xiao Ni
2021-02-24  8:41     ` Song Liu
  -- strict thread matches above, loose matches on Subject: below --
2021-03-11  3:55 Adrian Huang12
2021-02-04  5:57 Xiao Ni
2021-02-04  7:38 ` Xiao Ni
2021-02-04  8:12   ` Song Liu
2021-02-04  8:39     ` Xiao Ni
2021-02-04 17:29       ` Song Liu
