From: Eric Blake <eblake@redhat.com>
To: Fam Zheng <famz@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	"nbd-general@lists.sourceforge.net"
	<nbd-general@lists.sourceforge.net>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server
Date: Tue, 19 Jul 2016 21:47:18 -0600
Message-ID: <578EF446.70202@redhat.com>
In-Reply-To: <20160720033402.GA7641@ad.usersys.redhat.com>


On 07/19/2016 09:34 PM, Fam Zheng wrote:
> On Tue, 07/19 17:45, Paolo Bonzini wrote:
>>
>>
>> On 19/07/2016 17:28, Eric Blake wrote:
>>>> If I'm reading the NBD proto.md correctly, this is not enough if
>>>> NBD_CMD_FLAG_NO_HOLE is specified. We probably need to use a zeroed buffer with
>>>> blk_pwrite, or pass a new flag (BDRV_REQ_NO_HOLE) to blk_pwrite_zeroes to
>>>> enforce the bdrv_driver_pwritev() branch in bdrv_co_do_pwrite_zeroes().
>>
>> I agree with Eric's interpretation.  It's a bit weird to have the
>> direction inverted, but I'm not sure I see the ambiguity.  Can you explain?
> 
> Write zeroes _means_ "punch hole" on a raw file.
> 
> In block/raw-posix.c:handle_aiocb_write_zeroes():
>> #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>>     if (s->has_discard && s->has_fallocate) {
>>         int ret = do_fallocate(s->fd,
>>                                FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>>                                aiocb->aio_offset, aiocb->aio_nbytes);
>>         if (ret == 0) {
>>             ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);

That is just implementation: punch a hole, BUT THEN reallocate it back,
so that in the end, the file is still not sparse in that region.  Or am
I reading it wrong?

But the implementation under the hood is not visible to the client - as
long as the end result is that a client requesting NO_HOLE ends up with a
non-sparse file, and the data reads back as all zeros, the client doesn't
care whether the zeros were written byte-by-byte or sped up by punching
a hole then reallocating.
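
To make the punch-then-reallocate sequence concrete, here is a minimal
standalone sketch using fallocate(2) directly on Linux.  It is not the
QEMU code: zero_range_no_hole() and range_reads_zero() are hypothetical
helper names, and the sketch assumes a filesystem that supports
FALLOC_FL_PUNCH_HOLE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>          /* fallocate() */
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: zero [offset, offset + len) and leave it allocated.
 * Step 1 drops the old blocks (the range now reads as zeros); step 2
 * preallocates the same range again, so no hole remains afterwards.
 * Returns 0 on success, -1 with errno set on failure. */
static int zero_range_no_hole(int fd, off_t offset, off_t len)
{
    assert(len > 0);
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) != 0) {
        return -1;
    }
    return fallocate(fd, 0, offset, len);
}

/* Hypothetical helper: returns 1 if the whole range reads back as zeros,
 * 0 if any byte is nonzero, -1 on a read error, short file, or OOM. */
static int range_reads_zero(int fd, off_t offset, size_t len)
{
    unsigned char *buf = malloc(len);
    size_t done = 0;
    int result = 1;

    if (buf == NULL) {
        return -1;
    }
    while (done < len) {
        ssize_t n = pread(fd, buf + done, len - done, offset + (off_t)done);
        if (n <= 0) {           /* error, or EOF before the range ended */
            result = -1;
            break;
        }
        done += (size_t)n;
    }
    for (size_t i = 0; result == 1 && i < len; i++) {
        if (buf[i] != 0) {
            result = 0;
        }
    }
    free(buf);
    return result;
}
```

A caller only ever sees the end state: range_reads_zero() returns 1 for
the range, whichever way the zeros got there.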

> 
> And unmap is translated to "punch hole", too.
> 
> In block/raw-posix.c:handle_aiocb_discard():
>> #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>>         ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>>                            aiocb->aio_offset, aiocb->aio_nbytes);
>> #endif

No, this call is different - it punches a hole, then stops.  There is no
followup do_fallocate(,0,,) to reallocate, so the file remains sparse.
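
For contrast, a discard-style call can be sketched the same way.  Again
this is an illustration rather than the QEMU code: discard_range() and
allocated_bytes() are hypothetical names, and whether st_blocks shrinks
right away depends on the filesystem.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>          /* fallocate() */
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: punch a hole and stop.  There is no second
 * fallocate(fd, 0, ...) call, so the range stays unallocated (sparse).
 * FALLOC_FL_KEEP_SIZE keeps the apparent file size unchanged.
 * Returns 0 on success, -1 with errno set on failure. */
static int discard_range(int fd, off_t offset, off_t len)
{
    assert(len > 0);
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}

/* Hypothetical helper: storage currently backing the file, in bytes.
 * st_blocks is counted in 512-byte units.  Returns -1 on error. */
static long long allocated_bytes(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0) {
        return -1;
    }
    return (long long)st.st_blocks * 512;
}
```

Comparing allocated_bytes() before and after discard_range() is one way
to observe the hole; on filesystems that report it promptly, the value
drops while st_size stays the same.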

> 
> So I agree that NBD_CMD_FLAG_NO_HOLE is a poorly named flag, because there is
> always going to be a hole even if it's set.

If we are punching holes even when BDRV_REQ_MAY_UNMAP is not set, that
seems like we have a bug in qemu (unless we are immediately then
reallocating so that there is no resulting hole).
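
The inverted direction between the two flags can be sketched as below.
This is a self-contained illustration, not the patch itself:
write_zeroes_flags() and write_zeroes_flags_selfcheck() are hypothetical
names, and the numeric flag values are my reading of the NBD spec and the
QEMU headers, so treat them as assumptions and check the real
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define NBD_CMD_FLAG_NO_HOLE  (1u << 1)   /* set by the client: no hole wanted */
#define BDRV_REQ_MAY_UNMAP    0x4u        /* set for the block layer: hole allowed */

/* Hypothetical helper: translate NBD request flags into block-layer
 * request flags.  The sense is inverted: unmapping is permitted only
 * when the client did NOT ask for NO_HOLE. */
static unsigned write_zeroes_flags(uint16_t nbd_flags)
{
    unsigned bdrv_flags = 0;

    if (!(nbd_flags & NBD_CMD_FLAG_NO_HOLE)) {
        bdrv_flags |= BDRV_REQ_MAY_UNMAP;
    }
    return bdrv_flags;
}

/* Usage example: the two cases discussed in this thread. */
void write_zeroes_flags_selfcheck(void)
{
    assert(write_zeroes_flags(0) == BDRV_REQ_MAY_UNMAP);
    assert(write_zeroes_flags(NBD_CMD_FLAG_NO_HOLE) == 0);
}
```

With that mapping, a request carrying NO_HOLE reaches the driver without
MAY_UNMAP, and the driver is then expected to leave the range allocated.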

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


