QEMU-Devel Archive on lore.kernel.org
From: Eric Blake <eblake@redhat.com>
To: "Richard W.M. Jones" <rjones@redhat.com>
Cc: libguestfs@redhat.com, QEMU <qemu-devel@nongnu.org>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	nbd@other.debian.org
Subject: Re: [Qemu-devel] [Libguestfs] cross-project patches: Add NBD Fast Zero support
Date: Tue, 27 Aug 2019 08:23:14 -0500
Message-ID: <5ee3861d-1591-5328-d04c-63dff4c8cb0f@redhat.com> (raw)
In-Reply-To: <20190827121449.GX7304@redhat.com>


On 8/27/19 7:14 AM, Richard W.M. Jones wrote:

> 
> Is the plan to wait until NBD_CMD_FLAG_FAST_ZERO gets into the NBD
> protocol doc before doing the rest?  Also I would like to release both
> libnbd 1.0 and nbdkit 1.14 before we introduce any large new features.
> Both should be released this week, in fact maybe even today or
> tomorrow.

Sure, I don't mind this being the first feature for the eventual libnbd
1.2 and nbdkit 1.16.

> 
> [...]
>> First, I had to create a scenario where falling back to writes is
>> noticeably slower than performing a zero operation, and where
>> pre-zeroing also shows an effect.  My choice: let's test 'qemu-img
>> convert' on an image that is half-sparse (every other megabyte is a
>> hole) to an in-memory nbd destination.  Then I use a series of nbdkit
>> filters to force the destination to behave in various manners:
>>  log logfile=>(sed ...|uniq -c) (track how many normal/fast zero
>> requests the client makes)
>>  nozero $params (fine-tune how zero requests behave - the parameters
>> zeromode and fastzeromode are the real drivers of my various tests)
>>  blocksize maxdata=256k (allows large zero requests, but forces large
>> writes into smaller chunks, to magnify the effects of write delays and
>> allow testing to provide obvious results with a smaller image)
>>  delay delay-write=20ms delay-zero=5ms (also to magnify the effects on a
>> smaller image, with writes penalized more than zeroing)
>>  stats statsfile=/dev/stderr (to track overall time and a decent summary
>> of how much I/O occurred).
>>  noextents (forces the entire image to report that it is allocated,
>> which eliminates any testing variability based on whether qemu-img uses
>> that to bypass a zeroing operation [1])
> 
> I can't help thinking that a sh plugin might have been simpler ...

Maybe, but the extra cost of forking per request may have also made
obvious timing comparisons harder.  I'm just glad that nbdkit's
filtering system was flexible enough to do what I wanted, even if I did
have fun stringing together 6 filters :)
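
For readers who want a feel for why the delay and blocksize parameters
quoted above magnify the difference, here is a back-of-envelope sketch.
It is not taken from the test runs: it assumes requests are served one
at a time, that each 1 MiB hole arrives as a single zero request, and it
ignores real I/O and network time. The function name and the 100 MiB
image size are hypothetical.

```python
# Rough model of the delays injected by the filter stack (assumptions labelled).
MIB = 1024 * 1024
MAXDATA = 256 * 1024      # blocksize maxdata=256k (from the thread)
DELAY_WRITE_MS = 20       # delay-write=20ms (from the thread)
DELAY_ZERO_MS = 5         # delay-zero=5ms (from the thread)


def injected_delay_ms(image_mib: int, zero_is_native: bool) -> int:
    """Total injected delay for a half-sparse image, assuming serial requests."""
    holes = image_mib // 2              # every other megabyte is a hole
    data = image_mib - holes
    writes_per_mib = MIB // MAXDATA     # large writes are split into 4 chunks
    data_cost = data * writes_per_mib * DELAY_WRITE_MS
    if zero_is_native:
        # Assumption: one zero request per 1 MiB hole, delayed once.
        hole_cost = holes * DELAY_ZERO_MS
    else:
        # Zeroes emulated as writes pay the write delay for every chunk.
        hole_cost = holes * writes_per_mib * DELAY_WRITE_MS
    return data_cost + hole_cost


if __name__ == "__main__":
    for native in (True, False):
        print(native, injected_delay_ms(100, native), "ms")
```

Under these assumptions the emulated case pays 80 ms per hole instead of
5 ms, which is the kind of gap the filter parameters were chosen to make
visible on a small image.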

> 
>> I hope you enjoyed reading this far, and agree with my interpretation of
>> the numbers about why this feature is useful!
> 
> Yes it seems reasonable.
> 
> The only thought I had is whether the qemu block layer does or should
> combine requests in flight so that a write-zero (offset) followed by a
> write-data (same offset) would erase the earlier request.  In some
> circumstances that might provide a performance improvement without
> needing any changes to protocols.

As in, maintain a queue of requests that are needed but have not yet
been sent over the wire because of backlog, and merge those requests (by
splitting an existing large zero request into smaller pieces) if write
requests come in that window before actually transmitting to the NBD
server?  I know qemu has some write coalescing when servicing guest
behaviors; but I was testing on 'qemu-img convert' which does not depend
on guest behavior and therefore has already sent the zero request to the
NBD server before sending any data writes, so coalescing wouldn't see
anything to combine.  Or are you worried about qemu as the NBD server,
performing coalescing of incoming requests from the client?  But you are
right that some smarts about I/O coalescing at various points in the
data path may show some slight optimizations.
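
To make the idea concrete, here is a minimal sketch of the merge
described above: a queue of not-yet-sent requests in which an incoming
write trims any pending zero request it overlaps, splitting the zero
into the pieces that still need zeroing. This is purely illustrative and
is not how qemu's block layer is implemented; Req and queue_write are
hypothetical names.

```python
from typing import List, NamedTuple


class Req(NamedTuple):
    kind: str      # "zero" or "write"
    offset: int
    length: int


def queue_write(pending: List[Req], offset: int, length: int) -> List[Req]:
    """Return a new pending list with zero requests trimmed around the write."""
    end = offset + length
    out: List[Req] = []
    for r in pending:
        r_end = r.offset + r.length
        # Keep writes, and zeroes that do not overlap the new write, untouched.
        if r.kind != "zero" or r_end <= offset or r.offset >= end:
            out.append(r)
            continue
        # Keep only the parts of the zero request outside the written range.
        if r.offset < offset:
            out.append(Req("zero", r.offset, offset - r.offset))
        if r_end > end:
            out.append(Req("zero", end, r_end - end))
    out.append(Req("write", offset, length))
    return out


if __name__ == "__main__":
    pending = [Req("zero", 0, 4096)]
    print(queue_write(pending, 1024, 1024))
```

A merge like this only helps while the zero request is still queued; as
noted above, in the 'qemu-img convert' case the zero request has already
been sent before any data write arrives.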

> 
>> - NBD should have a way to advertise (probably via NBD_INFO_ during
>> NBD_OPT_GO) if the initial image is known to begin life with all zeroes
>> (if that is the case, qemu-img can skip the extents calls and
>> pre-zeroing pass altogether)
> 
> Yes, I really think we should do this one as well.

Stay tuned for my next cross-project post ;)  Hopefully in the next week
or so.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

