From: Mike Snitzer <snitzer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Eric Wheeler
	<drbd-dev-Himy5ogN2wUERf3Jot9Y56xOck334EZe@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>,
	axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
	martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org,
	agk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	shli-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	philipp.reisner-63ez5xqkn6DQT0dZR+AlfA@public.gmane.org,
	linux-block-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-raid-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	drbd-dev-cunTk1MwBs8qoQakbn7OcQ@public.gmane.org
Cc: ejt-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Subject: Re: [PATCH 23/27] drbd: make intelligent use of blkdev_issue_zeroout
Date: Mon, 15 Jan 2018 10:07:38 -0500
Message-ID: <20180115150738.GA20967@redhat.com>
In-Reply-To: <20180115124635.GA4107-w1SgEEioFePxa46PmUWvFg@public.gmane.org>

On Mon, Jan 15 2018 at  7:46am -0500,
Lars Ellenberg <lars.ellenberg-63ez5xqkn6DQT0dZR+AlfA@public.gmane.org> wrote:
 
> As I understood it,
> blkdev_issue_zeroout() was supposed to "always try to unmap",
> deprovision, the relevant region, and zero-out any unaligned
> head or tail, just like my work around above was doing.
> 
> And that device mapper thin was "about to" learn this, "soon",
> or maybe block core would do the equivalent of my workaround
> described above.
> 
> But it then did not.
> 
> See also:
> https://www.redhat.com/archives/dm-devel/2017-March/msg00213.html
> https://www.redhat.com/archives/dm-devel/2017-March/msg00226.html

Right, now that you mention it, it is starting to ring a bell (especially
after I read your 2nd dm-devel archive URL above).

> I then did not follow this closely enough anymore,
> and I missed that with recent enough kernel,
> discard on DRBD on dm-thin would fully allocate.
> 
> In our out-of-tree module, we had to keep the older code for
> compat reasons, anyways. I will just re-enable our zeroout
> workaround there again.
> 
> In tree, either dm-thin learns to do REQ_OP_WRITE_ZEROES "properly",
> so the result in this scenario is what we expect:
> 
>   _: unprovisioned, not allocated, returns zero on read anyways
>   *: provisioned, some arbitrary data
>   0: explicitly zeroed:
> 
>   |gran|ular|ity |    |    |    |
>   |****|****|____|****|
>      to|-be-|zero|ed
>   |**00|____|____|00**|
> 
> (leave unallocated blocks alone,
>  de-allocate full blocks just like with discard,
>  explicitly zero unaligned head and tail)

"de-allocate full blocks just like with discard" is an interesting take
on what it means for dm-thin to handle REQ_OP_WRITE_ZEROES "properly".
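
A minimal sketch of the range split that the quoted diagram implies,
written as a plain Python model rather than kernel code. The function
name and the (start, length) tuple layout are assumptions made for
illustration; units are whatever the caller uses for the allocation
granularity (sectors, for example).

```python
def split_zeroout_range(start, length, granularity):
    """Split [start, start + length) into head, middle, and tail.

    head and tail are the unaligned pieces that would be zeroed
    explicitly; middle covers only whole blocks, which could be
    de-allocated the way a discard would. Each piece is a
    (start, length) tuple and may have length 0.
    """
    if start < 0 or length < 0 or granularity <= 0:
        raise ValueError("start and length must be >= 0, granularity > 0")

    end = start + length
    # First block boundary at or after start (round up).
    aligned_start = -(-start // granularity) * granularity
    # Last block boundary at or before end (round down).
    aligned_end = (end // granularity) * granularity

    if aligned_start >= aligned_end:
        # No whole block is covered, so the entire range is "head".
        return (start, length), (end, 0), (end, 0)

    head = (start, aligned_start - start)
    middle = (aligned_start, aligned_end - aligned_start)
    tail = (aligned_end, end - aligned_end)
    return head, middle, tail
```

With a granularity of 4 and a request covering offsets 2 through 13,
as in the diagram, this yields a 2-unit head, an 8-unit middle (two
whole blocks), and a 2-unit tail.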

> Or DRBD will have to resurrect that reinvented zeroout again,
> with exactly those semantics. I did reinvent it for a reason ;)

Yeah, I now recall dropping that line of development because it
became "hard" (or at least harder than originally thought).

Don't people use REQ_OP_WRITE_ZEROES to initialize a portion of the
disk?  E.g. zeroing superblocks, metadata areas, or whatever?

If we just discarded the logical extent and then a user did a partial
write to the block, areas that a user might expect to be zeroed wouldn't
be (at least in the case of dm-thinp if "skip_block_zeroing" is
enabled).  And yes if discard passdown is enabled and the device's
discard implementation does "discard_zeroes_data" then it'd be
fine.. but there are a lot of things that need to line up for drbd's
REQ_OP_WRITE_ZEROES to "just work" (as it expects).
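
The conditions in the paragraph above can be restated as a small
predicate. This is only a reading of that paragraph, not dm-thin's
actual logic; the function and parameter names are hypothetical and
merely mirror the settings named in the text.

```python
def discard_reads_back_zeroes(skip_block_zeroing,
                              discard_passdown,
                              device_discard_zeroes_data):
    """Return True if discarding a block, then partially rewriting it,
    would still leave the untouched part of the block reading as zeroes,
    according to the conditions listed in the message."""
    if not skip_block_zeroing:
        # Newly provisioned blocks are zeroed at allocation time.
        return True
    # Without block zeroing, everything depends on the discard being
    # passed down to a device whose discard zeroes the data.
    return discard_passdown and device_discard_zeroes_data
```

Only one of the four combinations with skip_block_zeroing enabled
returns True, which is the "lot of things that need to line up".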

(now I'm just echoing the kinds of concerns I had in that 2nd dm-devel
post above).

This post from mkp is interesting:
https://www.redhat.com/archives/dm-devel/2017-March/msg00228.html

Specifically:
"You don't have a way to mark those blocks as being full of zeroes
without actually writing them?

Note that the fallback to a zeroout command is to do a regular write. So
if DM doesn't zero the blocks, the block layer is going to it."

No, dm-thinp doesn't have an easy way to mark an allocated block as
containing zeroes (without actually zeroing).  I toyed with adding that
but then realized that even if we had it it'd still require block
zeroing be enabled.  But block zeroing is done at allocation time.  So
we'd need to interpret the "this block is zeroes" flag to mean "on first
write or read to this block it needs to first zero it".  Fugly to say
the least...
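
To make the "zero on first access" idea concrete, here is a toy model
of such a flag. It is purely hypothetical: dm-thinp has no such flag,
per the message, and the class and method names are invented for this
sketch.

```python
class LazyZeroBlock:
    """Toy model of an allocated block carrying a 'this block is
    zeroes' flag that is honoured on the first read or write."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.needs_zeroing = False

    def mark_zeroed(self):
        # Cheap: only sets the flag, the stale data stays in place.
        self.needs_zeroing = True

    def _zero_if_pending(self):
        if self.needs_zeroing:
            self.data[:] = bytes(len(self.data))
            self.needs_zeroing = False

    def read(self, offset, length):
        self._zero_if_pending()
        return bytes(self.data[offset:offset + length])

    def write(self, offset, payload):
        # A partial write must zero the whole block first, otherwise
        # the stale bytes around the payload would become visible.
        self._zero_if_pending()
        self.data[offset:offset + len(payload)] = payload
```

Every read and write path has to check the flag, which is the extra
cost the message calls "fugly".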

I've been quite busy with other things but I can revisit all this with
Joe Thornber and see what we come up with after a 2nd discussion.

But sadly, in general, this is a low priority for me, so you might do
well to reintroduce your drbd workaround.. sorry about that :(

Mike
