All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kinga Tanska <kinga.tanska@linux.intel.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-raid@vger.kernel.org, Jes Sorensen <jes@trained-monkey.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Coly Li <colyli@suse.de>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Jonmichael Hands <jm@chia.net>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>
Subject: Re: [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays
Date: Thu, 3 Nov 2022 09:14:15 +0100	[thread overview]
Message-ID: <20221103091415.00000b8c@intel.linux.com> (raw)
In-Reply-To: <20221007201037.20263-1-logang@deltatee.com>

On Fri,  7 Oct 2022 14:10:30 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:

> Hi,
> 
> This is the next iteration of the patchset that added the discard
> option to mdadm. Per feedback from Martin, it's more desirable
> to use the write-zeroes functionality than rely on devices to zero
> the data on a discard request. This is because standards typically
> only require the device to do the best effort to discard data and
> may not actually discard (and thus zero) it all in some circumstances.
> 
> This version of the patch set adds the --write-zeroes option which
> will imply --assume-clean and write zeros to the data region in
> each disk before starting the array. This can take some time so
> each disk is done in parallel in its own fork. To make the forking
> code easier to understand this patch set also starts with some
> cleanup of the existing Create code.
> 
> We tested write-zeroes requests on a number of modern nvme drives of
> various manufacturers and found most are not as optimized as the
> discard path. A couple drives that were tested did not support
> write-zeroes at all but still performed similarly with the kernel
> falling back to writing zero pages. Typically we see it take on the
> order of one minute per 100GB of data zeroed.
> 
> One reason write-zeroes is slower than discard is that today's NVMe
> devices only allow about 2MB to be zeroed in one command where as
> the entire drive can typically be discarded in one command. Partly,
> this is a limitation of the spec as there are only 16 bits avalaible
> in the write-zeros command size but drives still don't max this out.
> Hopefully, in the future this will all be optimized a bit more
> and this work will be able to take advantage of that.
> 
> Logan
> 
> --
> 
> Changes since v3:
>    * Store the pid in a local variable instead of the mdinfo struct
>     (per Mariusz and Xiao)
> 
> Changes since v2:
> 
>    * Use write-zeroes instead of discard to zero the disks (per
>      Martin)
>    * Due to the time required to zero the disks, each disk is
>      now done in parallel with separate forks of the process.
>    * In order to add the forking some refactoring was done on the
>      Create() function to make it easier to understand
>    * Added a pr_info() call so that some prints can be done
>      to stdout instead of stdour (per Mariusz)
>    * Added KIB_TO_BYTES and SEC_TO_BYTES helpers (per Mariusz)
>    * Added a test to the mdadm test suite to test the option
>      works.
>    * Fixed up how the size and offset are calculated with some
>      great information from Xiao.
> 
> Changes since v1:
> 
>    * Discard the data in the devices later in the create process
>      while they are already open. This requires treating the
>      s.discard option the same as the s.assume_clean option.
>      Per Mariusz.
>    * A couple other minor cleanup changes from Mariusz.
> 
> 
> *** BLURB HERE ***
> 
> Logan Gunthorpe (7):
>   Create: goto abort_locked instead of return 1 in error path
>   Create: remove safe_mode_delay local variable
>   Create: Factor out add_disks() helpers
>   mdadm: Introduce pr_info()
>   mdadm: Add --write-zeros option for Create
>   tests/00raid5-zero: Introduce test to exercise --write-zeros.
>   manpage: Add --write-zeroes option to manpage
> 
>  Create.c           | 479
> ++++++++++++++++++++++++++++----------------- ReadMe.c           |
> 2 + mdadm.8.in         |  16 ++
>  mdadm.c            |   9 +
>  mdadm.h            |   7 +
>  tests/00raid5-zero |  12 ++
>  6 files changed, 350 insertions(+), 175 deletions(-)
>  create mode 100644 tests/00raid5-zero
> 
> 
> base-commit: 8b668d4aa3305af5963162b7499b128bd71f8f29
> --
> 2.30.2

Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>

      parent reply	other threads:[~2022-11-03  8:14 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-07 20:10 [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 1/7] Create: goto abort_locked instead of return 1 in error path Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 2/7] Create: remove safe_mode_delay local variable Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 3/7] Create: Factor out add_disks() helpers Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 4/7] mdadm: Introduce pr_info() Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 5/7] mdadm: Add --write-zeros option for Create Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 6/7] tests/00raid5-zero: Introduce test to exercise --write-zeros Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 7/7] manpage: Add --write-zeroes option to manpage Logan Gunthorpe
2022-10-12  1:09 ` [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays Xiao Ni
2022-10-12 16:59   ` Logan Gunthorpe
2022-10-13  1:33     ` Martin K. Petersen
2022-10-13  7:51       ` Xiao Ni
2022-10-26  2:41         ` Martin K. Petersen
2022-10-27  8:44           ` Xiao Ni
2022-11-16 17:11       ` Logan Gunthorpe
2022-11-03  8:14 ` Kinga Tanska [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221103091415.00000b8c@intel.linux.com \
    --to=kinga.tanska@linux.intel.com \
    --cc=David.Sloan@eideticom.com \
    --cc=Martin.Oliveira@eideticom.com \
    --cc=chaitanyak@nvidia.com \
    --cc=colyli@suse.de \
    --cc=guoqing.jiang@linux.dev \
    --cc=jes@trained-monkey.org \
    --cc=jm@chia.net \
    --cc=linux-raid@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=mariusz.tkaczyk@linux.intel.com \
    --cc=sbates@raithlin.com \
    --cc=xni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.