From: Logan Gunthorpe <logang@deltatee.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Xiao Ni <xni@redhat.com>,
	linux-raid@vger.kernel.org, Jes Sorensen <jes@trained-monkey.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Coly Li <colyli@suse.de>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Jonmichael Hands <jm@chia.net>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>
Subject: Re: [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays
Date: Wed, 16 Nov 2022 10:11:17 -0700
Message-ID: <716b86e4-e7eb-8d07-c1cb-962c10537ea3@deltatee.com>
In-Reply-To: <yq15ygo4jkv.fsf@ca-mkp.ca.oracle.com>

Sorry, I'm a little late responding to this.

On 2022-10-12 19:33, Martin K. Petersen wrote:
> 
> Logan,
> 
>> 2) We could split up the fallocate call into multiple calls to zero
>> the entire disk. This would allow a quicker ctrl-c to occur, however
>> it's not clear what the best size would be to split it into. Even
>> zeroing 1GB can take a few seconds,
> 
> FWIW, we default to 32MB per request in SCSI unless the device
> explicitly advertises wanting something larger.
> 
>> (with NVMe, discard only requires a single command to handle the
>> entire disk
> 
> In NVMe there's a limit of 64K blocks per range and 256 ranges per
> request. So 8GB or 64GB per request for discard depending on the block
> size. So presumably it will take several operations to deallocate an
> entire drive.
> 
>> where as write-zeroes requires a minimum of one command per 2MB of
>> data to zero).
> 
> 32MB for 512-byte blocks and 256MB for 4096-byte blocks. Which matches
> how it currently works for SCSI devices.

The 2MB I was referring to is the typical maximum we see on real
devices. We tested NVMe drives from a number of different vendors and
found that most have a maximum of 2MB, and some only 512KB, which is
unfortunate.
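
To put rough numbers on the gap between the limits quoted above and the
per-command maximums seen on real drives, here is a small sketch. The
constants are simply the figures from this thread (64K blocks per range,
256 ranges per discard request); the function names are made up for
illustration, and nothing is queried from an actual device.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in this thread; not read from any device. */
#define BLOCKS_PER_RANGE   (64ULL * 1024)  /* 64K logical blocks per range */
#define RANGES_PER_REQUEST 256ULL          /* ranges per discard request */

/* Sanity check: 64K blocks * 256 ranges * 512 bytes is 8 GiB. */
static_assert(BLOCKS_PER_RANGE * RANGES_PER_REQUEST * 512 == 8ULL << 30,
	      "8 GiB per discard request with 512-byte blocks");

/* Largest discard (Deallocate) request in bytes for a given block size. */
uint64_t max_discard_bytes(uint64_t block_size)
{
	return BLOCKS_PER_RANGE * RANGES_PER_REQUEST * block_size;
}

/* Largest write-zeroes request in bytes: one range of 64K blocks. */
uint64_t max_write_zeroes_bytes(uint64_t block_size)
{
	return BLOCKS_PER_RANGE * block_size;
}

/* Requests needed to cover dev_bytes when each covers at most max_bytes. */
uint64_t requests_needed(uint64_t dev_bytes, uint64_t max_bytes)
{
	return (dev_bytes + max_bytes - 1) / max_bytes;
}
```

For a 1 TiB device, requests_needed() gives 32768 commands at the 32MB
limit, 524288 at 2MB, and 2097152 at 512KB.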

>> I was hoping write-zeroes could be made faster in the future, at least
>> for NVMe.
> 
> Deallocate had a bit of a head start and vendors are still catching up
> in the zeroing department. Some drives do support using Deallocate for
> zeroing and we quirk those in the driver so they should perform OK with
> your change.

Yeah, my hope is that larger zeroing requests can be supported and
handled efficiently by deallocating the device, so I'd rather mdadm not
slow this down by splitting its request to the kernel into a number of
smaller ones. But splitting seems to be the only way forward, because a
single request is uninterruptible and we don't want to hang the user for
several minutes.
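
A minimal sketch of what the split could look like, not the actual
mdadm code: it assumes zeroing is done with fallocate(2) using
FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE, and that a SIGINT handler
sets a flag which is checked between chunks. The names
zero_in_chunks(), on_sigint() and interrupted are hypothetical, and the
chunk size is left to the caller since this thread does not settle on
one.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <linux/falloc.h>

/* Set by the SIGINT handler; hypothetical name, not from mdadm. */
static volatile sig_atomic_t interrupted;

/* Install with signal(SIGINT, on_sigint) or sigaction() before zeroing. */
void on_sigint(int sig)
{
	(void)sig;
	interrupted = 1;
}

/*
 * Zero bytes [0, size) of fd in steps of at most chunk bytes.
 * Returns 0 on success, -EINVAL for a zero chunk size, -EINTR if the
 * flag was set before a chunk was issued, or a negative errno from
 * fallocate().
 */
int zero_in_chunks(int fd, uint64_t size, uint64_t chunk)
{
	uint64_t off = 0;

	if (chunk == 0)
		return -EINVAL;

	while (off < size) {
		uint64_t len = size - off < chunk ? size - off : chunk;

		/* Each fallocate() call is uninterruptible, so the only
		 * place to honour ctrl-c is between chunks. */
		if (interrupted)
			return -EINTR;

		if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
			      off, len) < 0)
			return -errno;

		off += len;
	}

	assert(off == size);
	return 0;
}
```

The trade-off is the one described above: a smaller chunk gives a
quicker ctrl-c, but caps the size of any single request the kernel sees,
even on a device that could handle a much larger one.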

Logan
