From: Avi Kivity <avi@scylladb.com>
To: Coly Li <colyli@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid0 vs. mkfs
Date: Sun, 27 Nov 2016 19:25:18 +0200
Message-ID: <14c4b1d4-2fd3-b97f-934e-414a8d45fb18@scylladb.com>
In-Reply-To: <470ba5d0-e54d-3e5e-c639-4591549b9574@suse.de>

On 11/27/2016 07:09 PM, Coly Li wrote:
> On 2016/11/27 11:24 PM, Avi Kivity wrote:
>> mkfs /dev/md0 can take a very long time, if /dev/md0 is a very large
>> disk that supports TRIM/DISCARD (erase whichever is inappropriate).
>> That is because mkfs issues a TRIM/DISCARD (erase whichever is
>> inappropriate) for the entire partition. As far as I can tell, md
>> converts the large TRIM/DISCARD (erase whichever is inappropriate) into
>> a large number of TRIM/DISCARD (erase whichever is inappropriate)
>> requests, one per chunk-size worth of disk, and issues them to the RAID
>> components individually.
>>
>>
>> It seems to me that md can convert the large TRIM/DISCARD (erase
>> whichever is inappropriate) request it gets into one TRIM/DISCARD (erase
>> whichever is inappropriate) per RAID component, converting an O(disk
>> size / chunk size) operation into an O(number of RAID components)
>> operation, which is much faster.
>>
>>
>> I observed this with mkfs.xfs on a RAID0 of four 3TB NVMe devices, with
>> the operation taking about a quarter of an hour, continuously pushing
>> half-megabyte TRIM/DISCARD (erase whichever is inappropriate) requests
>> to the disk. Linux 4.1.12.
> Your suggestion might make it possible to improve DISCARD performance
> a bit. The implementation might be tricky, but it is worth trying.
>
> Indeed, this is not only for DISCARD; it might help read and write
> performance as well. We can check the bio size: if
> 	bio_sectors(bio)/conf->nr_strip_zones >= SOMETHRESHOLD
> then on each underlying device we have more than SOMETHRESHOLD
> contiguous chunks to issue, and they can be merged into a larger bio.

It's true that this does not strictly apply to TRIM/DISCARD (erase 
whichever is inappropriate), but to see any gain for READ/WRITE, you 
need a request that is larger than (chunk size) * (raid elements), which 
is unlikely for reasonable values of those parameters.  But a common 
implementation can of course work for multiple request types.
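
To make the per-component idea concrete, here is a rough sketch in Python of how a large logical range could be mapped to one contiguous range per RAID0 member. It is not the md code. It assumes all members are the same size (a single strip zone) and plain round-robin striping, and the names split_discard, chunk and ndevs are made up for illustration.

```python
def split_discard(start, end, chunk, ndevs):
    """Map the logical sector range [start, end) of a striped array to at
    most one contiguous (dev_start, dev_end) range per member device.

    Assumptions (not taken from md): equal-size members, round-robin
    chunk placement, chunk size given in sectors.
    Returns {device index: (dev_start, dev_end)}.
    """
    if start < 0 or end <= start or chunk <= 0 or ndevs <= 0:
        raise ValueError("need 0 <= start < end, chunk > 0, ndevs > 0")

    first = start // chunk        # index of the first chunk touched
    last = (end - 1) // chunk     # index of the last chunk touched

    ranges = {}
    for dev in range(ndevs):
        # First and last touched chunks that live on this device.
        kf = first + (dev - first) % ndevs
        if kf > last:
            continue              # range is too short to reach this device
        kl = last - (last - dev) % ndevs

        # Only the very first and very last chunks can be partial.
        head = start % chunk if kf == first else 0
        tail = (end - 1) % chunk + 1 if kl == last else chunk

        ranges[dev] = ((kf // ndevs) * chunk + head,
                       (kl // ndevs) * chunk + tail)
    return ranges


if __name__ == "__main__":
    # Two members, 4-sector chunks, logical sectors [2, 14).
    print(split_discard(2, 14, chunk=4, ndevs=2))  # {0: (2, 8), 1: (0, 6)}
```

The loop runs once per member regardless of how long the range is, which is the O(number of RAID components) behaviour described above. For the array in the original report, assuming decimal terabytes and 512 KiB requests, per-chunk splitting comes to roughly 12e12 / 524288, or about 23 million requests, against four with this approach.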

> IMHO it's interesting, good suggestion!

Looking forward to seeing an implementation!

>
> Coly
>

