* Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
@ 2020-02-10 21:37 Eric Blake
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:37 UTC (permalink / raw)
  To: nbd, QEMU, qemu-block, libguestfs
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, Richard W.M. Jones,
	Max Reitz

I will be following up to this email with four separate threads each 
addressed to the appropriate single list, with proposed changes to:
- the NBD protocol
- qemu: both server and client
- libnbd: client
- nbdkit: server

The feature in question adds a new optional NBD_INFO_ packet to the 
NBD_OPT_GO portion of handshake, adding up to 16 bits of information 
that the server can advertise to the client at connection time about any 
known initial state of the export [review of this series may propose 
slight changes, such as using 32 bits; but hopefully, by having all four 
series posted in tandem, it becomes easier to see whether any such 
tweaks are warranted and to keep them interoperable before any of the 
projects land the series upstream].  For now, only 2 of those 16 bits 
are defined: NBD_INIT_SPARSE (the image has at least one hole) and 
NBD_INIT_ZERO (the image reads completely as zero); the two bits are 
orthogonal and can be set independently, although it is easy enough to 
see completely sparse files with both bits set.  Also, advertising the 
bits is orthogonal to whether the base:allocation metacontext is used, 
although a server with all possible extensions is likely to have the two 
concepts match one another.
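To illustrate the proposed bit layout (a Python sketch only, not code
from any of the four series; the constant values mirror this proposal
and may change during review), a client could decode the 16-bit field
along these lines:

```python
# Proposed NBD_INFO_INIT_STATE bits (values from this proposal; they
# may change if review settles on a different layout).
NBD_INIT_SPARSE = 1 << 0  # export has at least one hole
NBD_INIT_ZERO = 1 << 1    # export reads completely as zeroes

def decode_init_state(init_state):
    """Return the names of the init-state bits that are set."""
    names = []
    if init_state & NBD_INIT_SPARSE:
        names.append("sparse")
    if init_state & NBD_INIT_ZERO:
        names.append("zero")
    return names
```

A completely sparse file would typically report both bits, so
decode_init_state(3) yields ['sparse', 'zero'].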

The new bits are added as an information chunk rather than as runtime 
flags; this is because the intended consumer of this information is an 
operation like copying a sparse image into an NBD server destination. 
Such a client only cares at initialization if it needs to perform a 
pre-zeroing pass or if it can rely on the destination already reading as 
zero.  Once the client starts making modifications, burdening the server 
with the ability to do a live runtime probe of current reads-as-zero 
state does not help the client, and burning per-export flags for 
something that quickly goes stale on the first edit was not thought to 
be wise.  Similarly, adding a new NBD_CMD did not seem worthwhile.

The existing 'qemu-img convert source... nbd://...' is the first command 
line example that can benefit from the new information; the goal of 
adding a protocol extension was to make this benefit automatic, where 
possible, without the user having to specify the proposed --target-is-zero. 
I have a similar thread pending for qemu which adds similar 
known-reads-zero information to qcow2 files:
https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg08075.html

That qemu series is at v1; based on the review it has received so far, 
it will need some interface changes for v2, which means the qemu series 
here will need a slight rebase.  But I'm posting this series to all 
lists now to at least demonstrate what is possible when we have better 
startup information.

Note that with this new bit, it is possible to learn if a destination is 
sparse as part of NBD_OPT_GO rather than having to use block-status 
commands.  With existing block-status commands, you can use an O(n) scan 
of block-status to learn if an image reads as all zeroes (or 
short-circuit in O(1) time if the first offset is reported as probable 
data rather than reading as zero); but with this new bit, the answer is 
O(1).  So even with Vladimir's recent change to make the spec permit 4G 
block-status even when max block size is 32M, or the proposed work to 
add 64-bit block-status, you still end up with more on-the-wire traffic 
for block-status to learn if an image is all zeroes than if the server 
just advertises this bit.  But by keeping both extensions orthogonal, a 
server can implement whichever one or both reporting methods it finds 
easiest, and a client can work with whatever a server supplies with sane 
fallbacks when the server lacks either extension.  Conversely, 
block-status tracks live changes to the image, while this bit is only 
valid at connection time.
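In pseudo-client terms (a hypothetical helper, not code from any of the 
series), the fallback logic for a copy destination looks like:

```python
NBD_INIT_ZERO = 1 << 1  # value from this proposal

def destination_reads_as_zero(init_state, block_status_scan):
    """Decide whether a pre-zeroing pass can be skipped.

    init_state: the 16-bit NBD_INFO_INIT_STATE value, or None when the
    server did not send the info chunk.
    block_status_scan: callable implementing the O(n) block-status
    fallback; returns True if every extent reads as zero.
    """
    if init_state is not None and init_state & NBD_INIT_ZERO:
        return True  # server advertised known-zero: O(1), no extra traffic
    # Absence of the bit only means the server made no promise, so
    # fall back to scanning block-status over the wire.
    return block_status_scan()
```

The scan callable stands in for whatever block-status loop the client 
already has; the point is only that the new bit lets it be skipped.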

My repo for each of the four projects contains a tag 'nbd-init-v1':
  https://repo.or.cz/nbd/ericb.git/shortlog/refs/tags/nbd-init-v1
  https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/nbd-init-v1
  https://repo.or.cz/libnbd/ericb.git/shortlog/refs/tags/nbd-init-v1
  https://repo.or.cz/nbdkit/ericb.git/shortlog/refs/tags/nbd-init-v1

For doing interoperability testing, I find it handy to use:

PATH=/path/to/built/qemu:/path/to/built/nbdkit:$PATH
/path/to/libnbd/run your command here

to pick up just-built qemu-nbd, nbdsh, and nbdkit that all support the 
feature.

For quickly setting flags:
nbdkit eval init_sparse='exit 0' init_zero='exit 0' ...

For quickly checking flags:
qemu-nbd --list ... | grep init
nbdsh -u uri... -c 'print(h.get_init_flags())'

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension
  2020-02-10 21:37 Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Eric Blake
@ 2020-02-10 21:41 ` Eric Blake
  2020-02-10 21:41   ` [PATCH 1/3] nbd: Preparation for NBD_INFO_INIT_STATE Eric Blake
                     ` (4 more replies)
  2020-02-10 22:12 ` Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Richard W.M. Jones
  2020-02-17 15:13 ` Max Reitz
  2 siblings, 5 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, rjones, qemu-block, mreitz

See the cross-posted cover letter for more details.

Based-on: https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg08075.html

Eric Blake (3):
  nbd: Preparation for NBD_INFO_INIT_STATE
  nbd: Add .bdrv_known_zeroes() client support
  nbd: Add .bdrv_known_zeroes() server support

 block/block-backend.c          |  9 +++++++++
 block/nbd.c                    | 15 +++++++++++++++
 docs/interop/nbd.txt           |  1 +
 include/block/nbd.h            | 13 +++++++++++++
 include/sysemu/block-backend.h |  1 +
 nbd/client.c                   | 24 ++++++++++++++++++++----
 nbd/common.c                   |  2 ++
 nbd/server.c                   | 11 +++++++++++
 nbd/trace-events               |  1 +
 qemu-nbd.c                     | 13 +++++++++++++
 tests/qemu-iotests/223.out     |  4 ++++
 tests/qemu-iotests/233.out     |  1 +
 12 files changed, 91 insertions(+), 4 deletions(-)

-- 
2.24.1




* [PATCH 1/3] nbd: Preparation for NBD_INFO_INIT_STATE
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
@ 2020-02-10 21:41   ` Eric Blake
  2020-02-10 21:41   ` [PATCH 2/3] nbd: Add .bdrv_known_zeroes() client support Eric Blake
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, vsementsov, rjones, qemu-block, mreitz

Declare the constants being proposed for an NBD extension, which will
let qemu advertise/learn if an image began life with all zeroes.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/interop/nbd.txt | 1 +
 include/block/nbd.h  | 9 +++++++++
 nbd/common.c         | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
index 45118809618e..35ba85367153 100644
--- a/docs/interop/nbd.txt
+++ b/docs/interop/nbd.txt
@@ -55,3 +55,4 @@ the operation of that feature.
 NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
 * 4.2: NBD_FLAG_CAN_MULTI_CONN for sharable read-only exports,
 NBD_CMD_FLAG_FAST_ZERO
+* 5.0: NBD_INFO_INIT_STATE
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 7f46932d80f1..0de020904a37 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -200,6 +200,15 @@ enum {
 #define NBD_INFO_NAME           1
 #define NBD_INFO_DESCRIPTION    2
 #define NBD_INFO_BLOCK_SIZE     3
+#define NBD_INFO_INIT_STATE     4
+
+/* Initial state bits, when replying to NBD_INFO_INIT_STATE */
+enum {
+    NBD_INIT_SPARSE_BIT       = 0,
+    NBD_INIT_ZERO_BIT         = 1,
+};
+#define NBD_INIT_SPARSE         (1 << NBD_INIT_SPARSE_BIT)
+#define NBD_INIT_ZERO           (1 << NBD_INIT_ZERO_BIT)

 /* Request flags, sent from client to server during transmission phase */
 #define NBD_CMD_FLAG_FUA        (1 << 0) /* 'force unit access' during write */
diff --git a/nbd/common.c b/nbd/common.c
index ddfe7d118371..3e24feb0d502 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -129,6 +129,8 @@ const char *nbd_info_lookup(uint16_t info)
         return "description";
     case NBD_INFO_BLOCK_SIZE:
         return "block size";
+    case NBD_INFO_INIT_STATE:
+        return "init state";
     default:
         return "<unknown>";
     }
-- 
2.24.1




* [PATCH 2/3] nbd: Add .bdrv_known_zeroes() client support
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
  2020-02-10 21:41   ` [PATCH 1/3] nbd: Preparation for NBD_INFO_INIT_STATE Eric Blake
@ 2020-02-10 21:41   ` Eric Blake
  2020-02-10 21:41   ` [PATCH 3/3] nbd: Add .bdrv_known_zeroes() server support Eric Blake
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, vsementsov, rjones, qemu-block, mreitz

Using the new NBD extension of NBD_INFO_INIT_STATE, we can pass on the
information when a server reports that an image initially reads as all
zeroes.  The server information is treated as stale the moment we
request a write operation, even across reconnections to the server,
which is fine since our intended usage of BDRV_ZERO_OPEN is to
optimize qemu-img at startup, and not something relied on during later
image use.

Update iotests to reflect improved output of 'qemu-nbd --list'.

As NBD still cannot create or resize images, we don't need to worry
about BDRV_ZERO_CREATE or BDRV_ZERO_TRUNCATE.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/nbd.c                | 15 +++++++++++++++
 include/block/nbd.h        |  4 ++++
 nbd/client.c               | 24 ++++++++++++++++++++----
 nbd/trace-events           |  1 +
 qemu-nbd.c                 | 13 +++++++++++++
 tests/qemu-iotests/223.out |  4 ++++
 tests/qemu-iotests/233.out |  1 +
 7 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index d085554f21ea..2e1fbd6152f6 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1204,6 +1204,7 @@ static int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
     };

     assert(!(s->info.flags & NBD_FLAG_READ_ONLY));
+    s->info.modified = true;
     if (flags & BDRV_REQ_FUA) {
         assert(s->info.flags & NBD_FLAG_SEND_FUA);
         request.flags |= NBD_CMD_FLAG_FUA;
@@ -1276,6 +1277,7 @@ static int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset,
     };

     assert(!(s->info.flags & NBD_FLAG_READ_ONLY));
+    s->info.modified = true;
     if (!(s->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) {
         return 0;
     }
@@ -1909,6 +1911,16 @@ static int nbd_co_flush(BlockDriverState *bs)
     return nbd_client_co_flush(bs);
 }

+static int nbd_known_zeroes(BlockDriverState *bs)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (!s->info.modified && s->info.init_state & NBD_INIT_ZERO) {
+        return BDRV_ZERO_OPEN;
+    }
+    return 0;
+}
+
 static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
@@ -2027,6 +2039,7 @@ static BlockDriver bdrv_nbd = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_known_zeroes          = nbd_known_zeroes,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
@@ -2052,6 +2065,7 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_known_zeroes          = nbd_known_zeroes,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
@@ -2077,6 +2091,7 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_close                 = nbd_close,
     .bdrv_co_flush_to_os        = nbd_co_flush,
     .bdrv_co_pdiscard           = nbd_client_co_pdiscard,
+    .bdrv_known_zeroes          = nbd_known_zeroes,
     .bdrv_refresh_limits        = nbd_refresh_limits,
     .bdrv_getlength             = nbd_getlength,
     .bdrv_detach_aio_context    = nbd_client_detach_aio_context,
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 0de020904a37..5103053bed49 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -307,6 +307,7 @@ struct NBDExportInfo {
     uint32_t min_block;
     uint32_t opt_block;
     uint32_t max_block;
+    uint16_t init_state;

     uint32_t context_id;

@@ -314,6 +315,9 @@ struct NBDExportInfo {
     char *description;
     int n_contexts;
     char **contexts;
+
+    /* Set during runtime to track if init_state is still trustworthy. */
+    bool modified;
 };
 typedef struct NBDExportInfo NBDExportInfo;

diff --git a/nbd/client.c b/nbd/client.c
index ba173108baab..199a8a2bc49e 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -350,16 +350,17 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,

     assert(opt == NBD_OPT_GO || opt == NBD_OPT_INFO);
     trace_nbd_opt_info_go_start(nbd_opt_lookup(opt), info->name);
-    buf = g_malloc(4 + len + 2 + 2 * info->request_sizes + 1);
+    buf = g_malloc(4 + len + 2 + 2 * (info->request_sizes + 1) + 1);
     stl_be_p(buf, len);
     memcpy(buf + 4, info->name, len);
-    /* At most one request, everything else up to server */
-    stw_be_p(buf + 4 + len, info->request_sizes);
+    /* One or two requests, everything else up to server */
+    stw_be_p(buf + 4 + len, info->request_sizes + 1);
     if (info->request_sizes) {
         stw_be_p(buf + 4 + len + 2, NBD_INFO_BLOCK_SIZE);
     }
+    stw_be_p(buf + 4 + len + 2 + 2 * info->request_sizes, NBD_INFO_INIT_STATE);
     error = nbd_send_option_request(ioc, opt,
-                                    4 + len + 2 + 2 * info->request_sizes,
+                                    4 + len + 2 + 2 * (info->request_sizes + 1),
                                     buf, errp);
     g_free(buf);
     if (error < 0) {
@@ -484,6 +485,21 @@ static int nbd_opt_info_or_go(QIOChannel *ioc, uint32_t opt,
                                           info->max_block);
             break;

+        case NBD_INFO_INIT_STATE:
+            if (len != sizeof(info->init_state)) {
+                error_setg(errp, "remaining export info len %" PRIu32
+                           " is unexpected size", len);
+                nbd_send_opt_abort(ioc);
+                return -1;
+            }
+            if (nbd_read16(ioc, &info->init_state, "info init state",
+                           errp) < 0) {
+                nbd_send_opt_abort(ioc);
+                return -1;
+            }
+            trace_nbd_opt_info_init_state(info->init_state);
+            break;
+
         default:
             /*
              * Not worth the bother to check if NBD_INFO_NAME or
diff --git a/nbd/trace-events b/nbd/trace-events
index a955918e9707..12589b2afb84 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -10,6 +10,7 @@ nbd_opt_info_go_start(const char *opt, const char *name) "Attempting %s for expo
 nbd_opt_info_go_success(const char *opt) "Export is ready after %s request"
 nbd_opt_info_unknown(int info, const char *name) "Ignoring unknown info %d (%s)"
 nbd_opt_info_block_size(uint32_t minimum, uint32_t preferred, uint32_t maximum) "Block sizes are 0x%" PRIx32 ", 0x%" PRIx32 ", 0x%" PRIx32
+nbd_opt_info_init_state(unsigned int flags) "Initial state flags 0x%x"
 nbd_receive_query_exports_start(const char *wantname) "Querying export list for '%s'"
 nbd_receive_query_exports_success(const char *wantname) "Found desired export name '%s'"
 nbd_receive_starttls_new_client(void) "Setting up TLS"
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 4aa005004ebd..856df85823bc 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -220,6 +220,19 @@ static int qemu_nbd_client_list(SocketAddress *saddr, QCryptoTLSCreds *tls,
             printf("  opt block: %u\n", list[i].opt_block);
             printf("  max block: %u\n", list[i].max_block);
         }
+        {
+            static const char *const init_names[] = {
+                [NBD_INIT_SPARSE_BIT]            = "sparse",
+                [NBD_INIT_ZERO_BIT]              = "zero",
+            };
+            printf("  init state: 0x%x (", list[i].init_state);
+            for (size_t bit = 0; bit < ARRAY_SIZE(init_names); bit++) {
+                if (init_names[bit] && (list[i].init_state & (1 << bit))) {
+                    printf(" %s", init_names[bit]);
+                }
+            }
+            printf(" )\n");
+        }
         if (list[i].n_contexts) {
             printf("  available meta contexts: %d\n", list[i].n_contexts);
             for (j = 0; j < list[i].n_contexts; j++) {
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index 80c0cf65095b..ce7945aa7cf6 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -59,6 +59,7 @@ exports available: 2
   min block: 1
   opt block: 4096
   max block: 33554432
+  init state: 0x0 ( )
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b
@@ -69,6 +70,7 @@ exports available: 2
   min block: 1
   opt block: 4096
   max block: 33554432
+  init state: 0x0 ( )
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b2
@@ -140,6 +142,7 @@ exports available: 2
   min block: 1
   opt block: 4096
   max block: 33554432
+  init state: 0x0 ( )
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b
@@ -150,6 +153,7 @@ exports available: 2
   min block: 1
   opt block: 4096
   max block: 33554432
+  init state: 0x0 ( )
   available meta contexts: 2
    base:allocation
    qemu:dirty-bitmap:b2
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index c3c344811b2b..5be30d6b7c9c 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -43,6 +43,7 @@ exports available: 1
   min block: 1
   opt block: 4096
   max block: 33554432
+  init state: 0x0 ( )
   available meta contexts: 1
    base:allocation

-- 
2.24.1




* [PATCH 3/3] nbd: Add .bdrv_known_zeroes() server support
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
  2020-02-10 21:41   ` [PATCH 1/3] nbd: Preparation for NBD_INFO_INIT_STATE Eric Blake
  2020-02-10 21:41   ` [PATCH 2/3] nbd: Add .bdrv_known_zeroes() client support Eric Blake
@ 2020-02-10 21:41   ` Eric Blake
  2020-02-10 21:51   ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension no-reply
  2020-02-10 21:53   ` no-reply
  4 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, vsementsov, rjones, qemu-block, mreitz

Using the new NBD extension of NBD_INFO_INIT_STATE, we can advertise
at least the NBD_INIT_ZERO bit based on what the block layer already
knows.  Advertising NBD_INIT_SPARSE might also be possible by
inspecting blk_probe_blocksizes, but as the block layer does not
consume that information at present, the effort of advertising it for
a third party is less important.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/block-backend.c          |  9 +++++++++
 include/sysemu/block-backend.h |  1 +
 nbd/server.c                   | 11 +++++++++++
 3 files changed, 21 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 8b8f2a80a0d5..d7e01f2e67de 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2127,6 +2127,15 @@ int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size)
     return bdrv_load_vmstate(blk_bs(blk), buf, pos, size);
 }

+int blk_known_zeroes(BlockBackend *blk)
+{
+    if (!blk_is_available(blk)) {
+        return 0;
+    }
+
+    return bdrv_known_zeroes(blk_bs(blk));
+}
+
 int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz)
 {
     if (!blk_is_available(blk)) {
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b198deca0b24..2a9b750bb775 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -245,6 +245,7 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
 int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
 int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz);
 int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo);
+int blk_known_zeroes(BlockBackend *blk);
 BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
                                   BlockCompletionFunc *cb,
                                   void *opaque, int ret);
diff --git a/nbd/server.c b/nbd/server.c
index 11a31094ff83..f6bb7d944daa 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -661,6 +661,17 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
         return rc;
     }

+    /* Send NBD_INFO_INIT_STATE always */
+    trace_nbd_negotiate_new_style_size_flags(exp->size, myflags);
+    /* Is it worth using blk_probe_blocksizes for setting NBD_INIT_SPARSE? */
+    stw_be_p(buf, ((blk_known_zeroes(exp->blk) & BDRV_ZERO_OPEN)
+                   ? NBD_INIT_ZERO : 0));
+    rc = nbd_negotiate_send_info(client, NBD_INFO_INIT_STATE,
+                                 sizeof(uint16_t), buf, errp);
+    if (rc < 0) {
+        return rc;
+    }
+
     /*
      * If the client is just asking for NBD_OPT_INFO, but forgot to
      * request block sizes in a situation that would impact
-- 
2.24.1




* Re: [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
                     ` (2 preceding siblings ...)
  2020-02-10 21:41   ` [PATCH 3/3] nbd: Add .bdrv_known_zeroes() server support Eric Blake
@ 2020-02-10 21:51   ` no-reply
  2020-02-10 21:54     ` Eric Blake
  2020-02-10 21:53   ` no-reply
  4 siblings, 1 reply; 19+ messages in thread
From: no-reply @ 2020-02-10 21:51 UTC (permalink / raw)
  To: eblake; +Cc: mreitz, vsementsov, qemu-devel, qemu-block, rjones

Patchew URL: https://patchew.org/QEMU/20200210214109.751734-1-eblake@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      block/block-copy.o
  CC      block/crypto.o
/tmp/qemu-test/src/block/block-backend.c: In function 'blk_known_zeroes':
/tmp/qemu-test/src/block/block-backend.c:2136:12: error: implicit declaration of function 'bdrv_known_zeroes'; did you mean 'blk_known_zeroes'? [-Werror=implicit-function-declaration]
     return bdrv_known_zeroes(blk_bs(blk));
            ^~~~~~~~~~~~~~~~~
            blk_known_zeroes
/tmp/qemu-test/src/block/block-backend.c:2136:12: error: nested extern declaration of 'bdrv_known_zeroes' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: block/block-backend.o] Error 1
make: *** Waiting for unfinished jobs....
  CC      block/aio_task.o
/tmp/qemu-test/src/block/nbd.c: In function 'nbd_known_zeroes':
/tmp/qemu-test/src/block/nbd.c:1919:16: error: 'BDRV_ZERO_OPEN' undeclared (first use in this function); did you mean 'BDRV_REQ_MASK'?
         return BDRV_ZERO_OPEN;
                ^~~~~~~~~~~~~~
                BDRV_REQ_MASK
/tmp/qemu-test/src/block/nbd.c:1919:16: note: each undeclared identifier is reported only once for each function it appears in
/tmp/qemu-test/src/block/nbd.c: At top level:
/tmp/qemu-test/src/block/nbd.c:2042:6: error: 'BlockDriver' {aka 'struct BlockDriver'} has no member named 'bdrv_known_zeroes'; did you mean 'bdrv_join_options'?
     .bdrv_known_zeroes          = nbd_known_zeroes,
      ^~~~~~~~~~~~~~~~~
      bdrv_join_options
/tmp/qemu-test/src/block/nbd.c:2042:35: error: initialization of 'int (*)(BlockDriverState *, BdrvChild *, uint64_t,  BdrvChild *, uint64_t,  uint64_t,  BdrvRequestFlags,  BdrvRequestFlags)' {aka 'int (*)(struct BlockDriverState *, struct BdrvChild *, long long unsigned int,  struct BdrvChild *, long long unsigned int,  long long unsigned int,  enum <anonymous>,  enum <anonymous>)'} from incompatible pointer type 'int (*)(BlockDriverState *)' {aka 'int (*)(struct BlockDriverState *)'} [-Werror=incompatible-pointer-types]
     .bdrv_known_zeroes          = nbd_known_zeroes,
                                   ^~~~~~~~~~~~~~~~
/tmp/qemu-test/src/block/nbd.c:2042:35: note: (near initialization for 'bdrv_nbd.bdrv_co_copy_range_from')
/tmp/qemu-test/src/block/nbd.c:2068:6: error: 'BlockDriver' {aka 'struct BlockDriver'} has no member named 'bdrv_known_zeroes'; did you mean 'bdrv_join_options'?
     .bdrv_known_zeroes          = nbd_known_zeroes,
      ^~~~~~~~~~~~~~~~~
      bdrv_join_options
/tmp/qemu-test/src/block/nbd.c:2068:35: error: initialization of 'int (*)(BlockDriverState *, BdrvChild *, uint64_t,  BdrvChild *, uint64_t,  uint64_t,  BdrvRequestFlags,  BdrvRequestFlags)' {aka 'int (*)(struct BlockDriverState *, struct BdrvChild *, long long unsigned int,  struct BdrvChild *, long long unsigned int,  long long unsigned int,  enum <anonymous>,  enum <anonymous>)'} from incompatible pointer type 'int (*)(BlockDriverState *)' {aka 'int (*)(struct BlockDriverState *)'} [-Werror=incompatible-pointer-types]
     .bdrv_known_zeroes          = nbd_known_zeroes,
                                   ^~~~~~~~~~~~~~~~
/tmp/qemu-test/src/block/nbd.c:2068:35: note: (near initialization for 'bdrv_nbd_tcp.bdrv_co_copy_range_from')
/tmp/qemu-test/src/block/nbd.c:2094:6: error: 'BlockDriver' {aka 'struct BlockDriver'} has no member named 'bdrv_known_zeroes'; did you mean 'bdrv_join_options'?
     .bdrv_known_zeroes          = nbd_known_zeroes,
      ^~~~~~~~~~~~~~~~~
      bdrv_join_options
/tmp/qemu-test/src/block/nbd.c:2094:35: error: initialization of 'int (*)(BlockDriverState *, BdrvChild *, uint64_t,  BdrvChild *, uint64_t,  uint64_t,  BdrvRequestFlags,  BdrvRequestFlags)' {aka 'int (*)(struct BlockDriverState *, struct BdrvChild *, long long unsigned int,  struct BdrvChild *, long long unsigned int,  long long unsigned int,  enum <anonymous>,  enum <anonymous>)'} from incompatible pointer type 'int (*)(BlockDriverState *)' {aka 'int (*)(struct BlockDriverState *)'} [-Werror=incompatible-pointer-types]
     .bdrv_known_zeroes          = nbd_known_zeroes,
                                   ^~~~~~~~~~~~~~~~
/tmp/qemu-test/src/block/nbd.c:2094:35: note: (near initialization for 'bdrv_nbd_unix.bdrv_co_copy_range_from')
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: block/nbd.o] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=811a6c8d2b1f4332b71aff13616f8b14', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-n_zjkark/src/docker-src.2020-02-10-16.49.18.9920:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=811a6c8d2b1f4332b71aff13616f8b14
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-n_zjkark/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m18.509s
user    0m7.961s


The full log is available at
http://patchew.org/logs/20200210214109.751734-1-eblake@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
                     ` (3 preceding siblings ...)
  2020-02-10 21:51   ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension no-reply
@ 2020-02-10 21:53   ` no-reply
  4 siblings, 0 replies; 19+ messages in thread
From: no-reply @ 2020-02-10 21:53 UTC (permalink / raw)
  To: eblake; +Cc: mreitz, vsementsov, qemu-devel, qemu-block, rjones

Patchew URL: https://patchew.org/QEMU/20200210214109.751734-1-eblake@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      block/linux-aio.o
  CC      block/null.o
/tmp/qemu-test/src/block/block-backend.c: In function 'blk_known_zeroes':
/tmp/qemu-test/src/block/block-backend.c:2136:5: error: implicit declaration of function 'bdrv_known_zeroes' [-Werror=implicit-function-declaration]
     return bdrv_known_zeroes(blk_bs(blk));
     ^
/tmp/qemu-test/src/block/block-backend.c:2136:5: error: nested extern declaration of 'bdrv_known_zeroes' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [block/block-backend.o] Error 1
make: *** Waiting for unfinished jobs....
rm tests/qemu-iotests/socket_scm_helper.o
Traceback (most recent call last):
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=61fa79588f1947f192503fe5de848280', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-ags5e8nv/src/docker-src.2020-02-10-16.52.09.15629:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=61fa79588f1947f192503fe5de848280
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-ags5e8nv/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    1m38.251s
user    0m8.315s


The full log is available at
http://patchew.org/logs/20200210214109.751734-1-eblake@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension
  2020-02-10 21:51   ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension no-reply
@ 2020-02-10 21:54     ` Eric Blake
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-10 21:54 UTC (permalink / raw)
  To: qemu-devel; +Cc: vsementsov, rjones, qemu-block, mreitz

On 2/10/20 3:51 PM, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200210214109.751734-1-eblake@redhat.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-mingw@fedora build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #! /bin/bash
> export ARCH=x86_64
> make docker-image-fedora V=1 NETWORK=1
> time make docker-test-mingw@fedora J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
>    CC      block/block-copy.o
>    CC      block/crypto.o
> /tmp/qemu-test/src/block/block-backend.c: In function 'blk_known_zeroes':
> /tmp/qemu-test/src/block/block-backend.c:2136:12: error: implicit declaration of function 'bdrv_known_zeroes'; did you mean 'blk_known_zeroes'? [-Werror=implicit-function-declaration]
>       return bdrv_known_zeroes(blk_bs(blk));
>              ^~~~~~~~~~~~~~~~~

Patchew didn't find my Based-on tag in 0/3 (maybe because it wasn't the 
actual cover letter?)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 21:37 Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Eric Blake
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
@ 2020-02-10 22:12 ` Richard W.M. Jones
  2020-02-10 22:29   ` Eric Blake
  2020-02-17 15:13 ` Max Reitz
  2 siblings, 1 reply; 19+ messages in thread
From: Richard W.M. Jones @ 2020-02-10 22:12 UTC (permalink / raw)
  To: Eric Blake
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
> For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
> image has at least one hole) and NBD_INIT_ZERO (the image reads
> completely as zero); the two bits are orthogonal and can be set
> independently, although it is easy enough to see completely sparse
> files with both bits set.

I think I'm confused about the exact meaning of NBD_INIT_SPARSE.  Do
you really mean the whole image is sparse; or (as you seem to have
said above) that there exists a hole somewhere in the image but we're
not saying where it is and there can be non-sparse parts of the image?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 22:12 ` Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Richard W.M. Jones
@ 2020-02-10 22:29   ` Eric Blake
  2020-02-10 22:52     ` Richard W.M. Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2020-02-10 22:29 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

On 2/10/20 4:12 PM, Richard W.M. Jones wrote:
> On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
>> For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
>> image has at least one hole) and NBD_INIT_ZERO (the image reads
>> completely as zero); the two bits are orthogonal and can be set
>> independently, although it is easy enough to see completely sparse
>> files with both bits set.
> 
> I think I'm confused about the exact meaning of NBD_INIT_SPARSE.  Do
> you really mean the whole image is sparse; or (as you seem to have
> said above) that there exists a hole somewhere in the image but we're
> not saying where it is and there can be non-sparse parts of the image?

As implemented:

NBD_INIT_SPARSE - there is at least one hole somewhere (allocation would 
be required to write to that part of the file), but there may be
allocated data elsewhere in the image.  Most disk images will fit this 
definition (for example, it is very common to have a hole between the 
MBR or GPT and the first partition containing a file system, or for file 
systems themselves to be sparse within the larger block device).

NBD_INIT_ZERO - all bytes read as zero.

The combination NBD_INIT_SPARSE|NBD_INIT_ZERO is common (generally, if 
you use lseek(SEEK_DATA) to prove the entire image reads as zeroes, you 
also know the entire image is sparse), but NBD_INIT_ZERO in isolation is 
also possible (especially with the qcow2 proposal of a persistent 
autoclear bit, where even with a fully preallocated qcow2 image you 
still know it reads as zeroes but there are no holes).  But you are also 
right that for servers that can advertise both bits efficiently, 
NBD_INIT_SPARSE in isolation may be more common than 
NBD_INIT_SPARSE|NBD_INIT_ZERO (the former for most disk images, the 
latter only for a freshly-created image that happens to be created 
with zero initialization).
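
As an aside, a minimal sketch of how a server backed by a raw file 
might derive both bits with lseek(), assuming the filesystem reports 
holes via SEEK_DATA/SEEK_HOLE (the helper name is illustrative, not 
taken from any of the posted patches):

```c
#define _GNU_SOURCE             /* SEEK_DATA / SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Derive the two bits for a raw file: INIT_ZERO if no data extent
 * exists anywhere, INIT_SPARSE if the first hole starts before EOF. */
static void probe_init_state(int fd, bool *sparse, bool *zero)
{
    off_t end = lseek(fd, 0, SEEK_END);
    off_t data = lseek(fd, 0, SEEK_DATA);

    if (data < 0 && errno == ENXIO) {
        /* No data at all: one big hole, so both bits apply. */
        *sparse = true;
        *zero = true;
        return;
    }
    *zero = false;                        /* some data exists somewhere */
    /* SEEK_HOLE reports EOF as a virtual hole, so compare against end. */
    *sparse = lseek(fd, 0, SEEK_HOLE) < end;
}
```

A fully written file yields neither bit; a ftruncate()d but 
never-written file yields both, matching the SPARSE|ZERO combination 
described above.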

What's more, in my patches, I did NOT patch qemu to set or consume 
INIT_SPARSE; so far, it only sets/consumes INIT_ZERO.  Of course, if we 
can find a reason WHY qemu should track whether a qcow2 image is 
fully-allocated, by demonstrating a qemu-img algorithm that becomes 
easier when we know whether an image is sparse (even if our justification is: 
"when copying an image, I want to know if the _source_ is sparse, to 
know whether I have to bend over backwards to preallocate the 
destination"), then using that in qemu makes sense for my v2 patches. 
But for v1, my only justification was "when copying an image, I can skip 
holes in the source if I know the _destination_ already reads as 
zeroes", which only needed INIT_ZERO.

Some of the nbdkit patches demonstrate the some-vs.-all nature of the 
two bits; for example, in the split plugin, I initialize h->init_sparse 
= false; h->init_zero = true; then in a loop over each file change 
h->init_sparse to true if at least one file was sparse, and change 
h->init_zero to false if at least one file had non-zero contents.
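
Condensed into pseudo-C (names illustrative, paraphrasing the split 
plugin description rather than quoting its source), that loop is just 
an any-vs.-all fold:

```c
#include <assert.h>
#include <stdbool.h>

/* SPARSE is existential (any constituent file has a hole); ZERO is
 * universal (every constituent file must read as zero). */
struct init_state {
    bool init_sparse;
    bool init_zero;
};

static struct init_state
aggregate(const bool *file_sparse, const bool *file_zero, int nfiles)
{
    struct init_state h = { .init_sparse = false, .init_zero = true };

    for (int i = 0; i < nfiles; i++) {
        if (file_sparse[i])
            h.init_sparse = true;    /* one hole anywhere is enough */
        if (!file_zero[i])
            h.init_zero = false;     /* one non-zero file spoils ZERO */
    }
    return h;
}
```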

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 22:29   ` Eric Blake
@ 2020-02-10 22:52     ` Richard W.M. Jones
  2020-02-11 14:33       ` Eric Blake
  2020-02-12  7:27       ` Wouter Verhelst
  0 siblings, 2 replies; 19+ messages in thread
From: Richard W.M. Jones @ 2020-02-10 22:52 UTC (permalink / raw)
  To: Eric Blake
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

On Mon, Feb 10, 2020 at 04:29:53PM -0600, Eric Blake wrote:
> On 2/10/20 4:12 PM, Richard W.M. Jones wrote:
> >On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
> >>For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
> >>image has at least one hole) and NBD_INIT_ZERO (the image reads
> >>completely as zero); the two bits are orthogonal and can be set
> >>independently, although it is easy enough to see completely sparse
> >>files with both bits set.
> >
> >I think I'm confused about the exact meaning of NBD_INIT_SPARSE.  Do
> >you really mean the whole image is sparse; or (as you seem to have
> >said above) that there exists a hole somewhere in the image but we're
> >not saying where it is and there can be non-sparse parts of the image?
> 
> As implemented:
> 
> NBD_INIT_SPARSE - there is at least one hole somewhere (allocation
> would be required to write to that part of the file), but there may
> be allocated data elsewhere in the image.  Most disk images will fit
> this definition (for example, it is very common to have a hole
> between the MBR or GPT and the first partition containing a file
> system, or for file systems themselves to be sparse within the
> larger block device).

I think I'm still confused about why this particular flag would be
useful for clients (I can completely understand why clients need
NBD_INIT_ZERO).

But anyway ... could a flag indicating that the whole image is sparse
be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
could use it to avoid an initial disk trim, which is something that
mke2fs does:

  https://github.com/tytso/e2fsprogs/blob/0670fc20df4a4bbbeb0edb30d82628ea30a80598/misc/mke2fs.c#L2768

and which is painfully slow over NBD for very large devices because of
the 32 bit limit on request sizes - try doing mke2fs on a 1E nbdkit
memory disk some time.
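
For scale, a back-of-the-envelope sketch (rounding the 32-bit cap up 
to an even 4 GiB per request):

```c
#include <assert.h>
#include <stdint.h>

/* Number of trim requests needed to cover a device when each request
 * is capped by a 32-bit length field (rounded to 4 GiB here). */
static uint64_t trims_needed(uint64_t device_bytes, uint64_t max_req)
{
    return (device_bytes + max_req - 1) / max_req;   /* ceiling division */
}
```

A 1 EiB device at 4 GiB per trim comes out to 2^28, roughly 268 
million, round trips.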

> NBD_INIT_ZERO - all bytes read as zero.
> 
> The combination NBD_INIT_SPARSE|NBD_INIT_ZERO is common (generally,
> if you use lseek(SEEK_DATA) to prove the entire image reads as
> zeroes, you also know the entire image is sparse), but NBD_INIT_ZERO
> in isolation is also possible (especially with the qcow2 proposal of
> a persistent autoclear bit, where even with a fully preallocated
> qcow2 image you still know it reads as zeroes but there are no
> holes).  But you are also right that for servers that can advertise
> both bits efficiently, NBD_INIT_SPARSE in isolation may be more
> common than NBD_INIT_SPARSE|NBD_INIT_ZERO (the former for most disk
> images, the latter only for a freshly-created image that happens to
> be created with zero initialization).
> 
> What's more, in my patches, I did NOT patch qemu to set or consume
> INIT_SPARSE; so far, it only sets/consumes INIT_ZERO.  Of course, if
> we can find a reason WHY qemu should track whether a qcow2 image is
> fully-allocated, by demonstrating a qemu-img algorithm that becomes
> easier for knowing if an image is sparse (even if our justification
> is: "when copying an image, I want to know if the _source_ is
> sparse, to know whether I have to bend over backwards to preallocate
> the destination"), then using that in qemu makes sense for my v2
> patches. But for v1, my only justification was "when copying an
> image, I can skip holes in the source if I know the _destination_
> already reads as zeroes", which only needed INIT_ZERO.
> 
> Some of the nbdkit patches demonstrate the some-vs.-all nature of
> the two bits; for example, in the split plugin, I initialize
> h->init_sparse = false; h->init_zero = true; then in a loop over
> each file change h->init_sparse to true if at least one file was
> sparse, and change h->init_zero to false if at least one file had
> non-zero contents.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 22:52     ` Richard W.M. Jones
@ 2020-02-11 14:33       ` Eric Blake
  2020-02-12  7:27       ` Wouter Verhelst
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-11 14:33 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

On 2/10/20 4:52 PM, Richard W.M. Jones wrote:
> On Mon, Feb 10, 2020 at 04:29:53PM -0600, Eric Blake wrote:
>> On 2/10/20 4:12 PM, Richard W.M. Jones wrote:
>>> On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
>>>> For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
>>>> image has at least one hole) and NBD_INIT_ZERO (the image reads
>>>> completely as zero); the two bits are orthogonal and can be set
>>>> independently, although it is easy enough to see completely sparse
>>>> files with both bits set.
>>>
>>> I think I'm confused about the exact meaning of NBD_INIT_SPARSE.  Do
>>> you really mean the whole image is sparse; or (as you seem to have
>>> said above) that there exists a hole somewhere in the image but we're
>>> not saying where it is and there can be non-sparse parts of the image?
>>
>> As implemented:
>>
>> NBD_INIT_SPARSE - there is at least one hole somewhere (allocation
>> would be required to write to that part of the file), but there may
>> be allocated data elsewhere in the image.  Most disk images will fit
>> this definition (for example, it is very common to have a hole
>> between the MBR or GPT and the first partition containing a file
>> system, or for file systems themselves to be sparse within the
>> larger block device).
> 
> I think I'm still confused about why this particular flag would be
> useful for clients (I can completely understand why clients need
> NBD_INIT_ZERO).
> 
> But anyway ... could a flag indicating that the whole image is sparse
> be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
> could use it to avoid an initial disk trim, which is something that
> mke2fs does:
> 
>    https://github.com/tytso/e2fsprogs/blob/0670fc20df4a4bbbeb0edb30d82628ea30a80598/misc/mke2fs.c#L2768

I'm open to suggestions on how many initial bits should be provided.  In 
fact, if we wanted, we could have a pair of mutually-exclusive bits, 
advertising:
00 - no information known
01 - image is completely sparse
10 - image is completely allocated
11 - error
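
A sketch of how a client might sanity-check such a pair (the names and 
bit positions here are hypothetical, not part of any posted series):

```c
#include <assert.h>

/* Hypothetical mutually-exclusive pair: both bits set is an error. */
enum {
    INIT_ALL_SPARSE    = 1 << 0,   /* image is completely sparse */
    INIT_ALL_ALLOCATED = 1 << 1,   /* image is completely allocated */
};

static int init_alloc_bits_valid(unsigned bits)
{
    unsigned pair = bits & (INIT_ALL_SPARSE | INIT_ALL_ALLOCATED);

    return pair != (INIT_ALL_SPARSE | INIT_ALL_ALLOCATED);
}
```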

The goal of providing a 16-bit answer (or we could mandate 32 or 64 
bits, if we think we will ever want to extend that far) was to make it 
easier to add whatever other initial-state extensions someone might 
find useful.  Until we're happy with the design, the size and any 
given bit assignment are not locked down; once we do start committing any 
of this series, we've locked in what interoperability will demand, but 
we still have spare bits for future extensions.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 22:52     ` Richard W.M. Jones
  2020-02-11 14:33       ` Eric Blake
@ 2020-02-12  7:27       ` Wouter Verhelst
  2020-02-12 12:09         ` Eric Blake
  1 sibling, 1 reply; 19+ messages in thread
From: Wouter Verhelst @ 2020-02-12  7:27 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

Hi,

On Mon, Feb 10, 2020 at 10:52:55PM +0000, Richard W.M. Jones wrote:
> But anyway ... could a flag indicating that the whole image is sparse
> be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
> could use it to avoid an initial disk trim, which is something that
> mke2fs does:

Yeah, I think that could definitely be useful. I honestly can't see a
use for NBD_INIT_SPARSE as defined in this proposal; and I don't think
it's generally useful to have a feature if we can't think of a use case
for it (that creates added complexity for no benefit).

If we can find a reasonable use case for NBD_INIT_SPARSE as defined in
this proposal, then just add a third bit (NBD_INIT_ALL_SPARSE or
something) that says "the whole image is sparse". Otherwise, I think we
should redefine NBD_INIT_SPARSE to say that.

-- 
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard



* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-12  7:27       ` Wouter Verhelst
@ 2020-02-12 12:09         ` Eric Blake
  2020-02-12 12:36           ` Richard W.M. Jones
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2020-02-12 12:09 UTC (permalink / raw)
  To: Wouter Verhelst, Richard W.M. Jones
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, nbd, libguestfs

On 2/12/20 1:27 AM, Wouter Verhelst wrote:
> Hi,
> 
> On Mon, Feb 10, 2020 at 10:52:55PM +0000, Richard W.M. Jones wrote:
>> But anyway ... could a flag indicating that the whole image is sparse
>> be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
>> could use it to avoid an initial disk trim, which is something that
>> mke2fs does:
> 
> Yeah, I think that could definitely be useful. I honestly can't see a
> use for NBD_INIT_SPARSE as defined in this proposal; and I don't think
> it's generally useful to have a feature if we can't think of a use case
> for it (that creates added complexity for no benefit).
> 
> If we can find a reasonable use case for NBD_INIT_SPARSE as defined in
> this proposal, then just add a third bit (NBD_INIT_ALL_SPARSE or
> something) that says "the whole image is sparse". Otherwise, I think we
> should redefine NBD_INIT_SPARSE to say that.

Okay, in v2, I will start with just two bits, NBD_INIT_SPARSE (entire 
image is sparse, nothing is allocated) and NBD_INIT_ZERO (entire image 
reads as zero), and save any future bits for later additions.  Do we 
think that 16 bits is sufficient for the amount of initial information 
likely to be exposed?  Are we in agreement that my addition of an 
NBD_INFO_ response to NBD_OPT_GO is the best way to expose initial state 
bits?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-12 12:09         ` Eric Blake
@ 2020-02-12 12:36           ` Richard W.M. Jones
  2020-02-12 12:47             ` Eric Blake
  0 siblings, 1 reply; 19+ messages in thread
From: Richard W.M. Jones @ 2020-02-12 12:36 UTC (permalink / raw)
  To: Eric Blake
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, Wouter Verhelst, nbd, libguestfs


On Wed, Feb 12, 2020 at 06:09:11AM -0600, Eric Blake wrote:
> On 2/12/20 1:27 AM, Wouter Verhelst wrote:
> >Hi,
> >
> >On Mon, Feb 10, 2020 at 10:52:55PM +0000, Richard W.M. Jones wrote:
> >>But anyway ... could a flag indicating that the whole image is sparse
> >>be useful, either as well as NBD_INIT_SPARSE or instead of it?  You
> >>could use it to avoid an initial disk trim, which is something that
> >>mke2fs does:
> >
> >Yeah, I think that could definitely be useful. I honestly can't see a
> >use for NBD_INIT_SPARSE as defined in this proposal; and I don't think
> >it's generally useful to have a feature if we can't think of a use case
> >for it (that creates added complexity for no benefit).
> >
> >If we can find a reasonable use case for NBD_INIT_SPARSE as defined in
> >this proposal, then just add a third bit (NBD_INIT_ALL_SPARSE or
> >something) that says "the whole image is sparse". Otherwise, I think we
> >should redefine NBD_INIT_SPARSE to say that.
> 
> Okay, in v2, I will start with just two bits, NBD_INIT_SPARSE
> (entire image is sparse, nothing is allocated) and NBD_INIT_ZERO
> (entire image reads as zero), and save any future bits for later
> additions.  Do we think that 16 bits is sufficient for the amount of
> initial information likely to be exposed?

So as I understand the proposal, the 16 bit limit comes about because
we want a round 4 byte reply: 16 bits are used by the NBD_INFO_INIT_STATE
type field, which leaves 16 feature bits.  Therefore the only way to
grow from there is to have 32 feature bits in an awkward unaligned 6
byte structure, or 48 feature bits (an 8 byte structure).
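
In struct form, the 4-byte payload being discussed would look 
something like this (field names are illustrative; on the wire both 
fields would be big-endian):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the 4-byte NBD_REP_INFO payload under discussion: a 16-bit
 * info type followed by 16 flag bits.  Widening flags to uint32_t is
 * what would produce the awkward 6-byte payload. */
struct nbd_info_init_state {
    uint16_t type;    /* would be NBD_INFO_INIT_STATE */
    uint16_t flags;   /* NBD_INIT_* bits */
};
```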

I guess given those constraints we can stick with 16 feature bits, and
if we ever needed more then we'd have to introduce NBD_INFO_INIT_STATE2.

The only thing I can think of which might be useful is a "fully
preallocated" bit which might be used as an indication that writes are
fast and are unlikely to fail with ENOSPC.

> Are we in agreement that
> my addition of an NBD_INFO_ response to NBD_OPT_GO is the best way
> to expose initial state bits?

Seems reasonable to me.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-12 12:36           ` Richard W.M. Jones
@ 2020-02-12 12:47             ` Eric Blake
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Blake @ 2020-02-12 12:47 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, qemu-block, QEMU,
	Max Reitz, Wouter Verhelst, nbd, libguestfs

On 2/12/20 6:36 AM, Richard W.M. Jones wrote:

>> Okay, in v2, I will start with just two bits, NBD_INIT_SPARSE
>> (entire image is sparse, nothing is allocated) and NBD_INIT_ZERO
>> (entire image reads as zero), and save any future bits for later
>> additions.  Do we think that 16 bits is sufficient for the amount of
>> initial information likely to be exposed?
> 
> So as I understand the proposal, the 16 bit limit comes about because
> we want a round 4 byte reply, 16 bits are used by NBD_INFO_INIT_STATE
> and that leaves 16 bits feature bits.  Therefore the only way to go
> from there is to have 32 feature bits but an awkward unaligned 6 byte
> structure, or 48 feature bits (8 byte structure).

In general, the NBD protocol has NOT focused on alignment issues (for 
good or for bad).  For example, NBD_INFO_BLOCK_SIZE is 18 bytes; all 
NBD_CMD_* 32-bit requests are 28 bytes except for NBD_CMD_WRITE which 
can send unaligned payload with no further padding, and so forth.

> 
> I guess given those constraints we can stick with 16 feature bits, and
> if we ever needed more then we'd have to introduce NBD_INFO_INIT_STATE2.
> 
> The only thing I can think of which might be useful is a "fully
> preallocated" bit which might be used as an indication that writes are
> fast and are unlikely to fail with ENOSPC.

and which would be mutually-exclusive with NBD_INIT_SPARSE (except for 
an image of size 0).  That bit would ALSO be an indication that the user 
may not want to punch holes into the image, but rather preserve the 
fully-allocated state (and thus avoid NBD_CMD_TRIM, and pass 
NBD_CMD_FLAG_NO_HOLE to any WRITE_ZEROES request).

> 
>> Are we in agreement that
>> my addition of an NBD_INFO_ response to NBD_OPT_GO is the best way
>> to expose initial state bits?
> 
> Seems reasonable to me.
> 
> Rich.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-10 21:37 Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Eric Blake
  2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
  2020-02-10 22:12 ` Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Richard W.M. Jones
@ 2020-02-17 15:13 ` Max Reitz
  2020-02-18 20:55   ` Eric Blake
  2 siblings, 1 reply; 19+ messages in thread
From: Max Reitz @ 2020-02-17 15:13 UTC (permalink / raw)
  To: Eric Blake, nbd, QEMU, qemu-block, libguestfs
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, Richard W.M. Jones


[-- Attachment #1.1: Type: text/plain, Size: 1994 bytes --]

Hi,

It’s my understanding that without some is_zero infrastructure for QEMU,
it’s impossible to implement this flag in qemu’s NBD server.

At the same time, I still haven’t understood what we need the flag for.

As far as I understood in our discussion on your qemu series, there is
no case where anyone would need to know whether an image is zero.  All
practical cases involve someone having to ensure that some image is
zero.  Knowing whether an image is zero can help with that, but that can
be an implementation detail.

For qcow2, the idea would be that there is some flag that remains true
as long as the image is guaranteed to be zero.  Then we’d have some
bdrv_make_zero function, and qcow2’s implementation would use this
information to gauge whether there’s something to do at all.

For NBD, we cannot use this idea directly because to implement such a
flag (as you’re describing in this mail), we’d need separate is_zero
infrastructure, and that kind of makes the point of “drivers’
bdrv_make_zero() implementations do the right thing by themselves” moot.

OTOH, we wouldn’t need such a flag for the implementation, because we
could just send a 64-bit discard/make_zero over the whole block device
length to the NBD server, and then the server internally does the right
thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
length fields, but there were plans for adding support for 64 bit
versions anyway.  From my naïve outsider perspective, doing that doesn’t
seem a more complicated protocol addition than adding some way to tell
whether an NBD export is zero.

So I’m still wondering whether there are actually cases where we need to
tell whether some image or NBD export is zero that do not involve making
it zero if it isn’t.

(I keep asking because it seems to me that if all we ever really want to
do is to ensure that some images/exports are zero, we should implement
that.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-17 15:13 ` Max Reitz
@ 2020-02-18 20:55   ` Eric Blake
  2020-02-19 11:10     ` Max Reitz
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Blake @ 2020-02-18 20:55 UTC (permalink / raw)
  To: Max Reitz, nbd, QEMU, qemu-block, libguestfs
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, Richard W.M. Jones

On 2/17/20 9:13 AM, Max Reitz wrote:
> Hi,
> 
> It’s my understanding that without some is_zero infrastructure for QEMU,
> it’s impossible to implement this flag in qemu’s NBD server.

You're right that we may need some more infrastructure before being able 
to decide when to report this bit in all cases.  But for raw files, that 
infrastructure already exists: a block_status query at offset 0 with the 
entire image as its length can report that the entire file is a hole. 
And for qcow2 files, it would not be that hard to teach a similar 
block_status request to report the entire image as a hole based on my 
proposed qcow2 autoclear bit tracking that the image still reads as zero.
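
Reduced to code, that check might look like the following sketch; the 
callback type and the ZERO flag are stand-ins, not qemu's actual 
block_status API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Answer "does the whole image read as zero?" with a single
 * block-status style query over [0, length). */
#define STATUS_ZERO 1u

typedef unsigned (*status_fn)(int64_t offset, int64_t bytes, int64_t *pnum);

static bool image_reads_as_zero(status_fn status, int64_t length)
{
    int64_t pnum = 0;
    unsigned s = status(0, length, &pnum);

    /* Only trust the answer if one extent covers the entire image. */
    return pnum == length && (s & STATUS_ZERO);
}

/* Example backends: one fully-unallocated image, one with data up front. */
static unsigned all_hole(int64_t off, int64_t bytes, int64_t *pnum)
{
    (void)off;
    *pnum = bytes;
    return STATUS_ZERO;
}

static unsigned data_first(int64_t off, int64_t bytes, int64_t *pnum)
{
    (void)off;
    (void)bytes;
    *pnum = 65536;    /* 64k of data before anything else */
    return 0;
}
```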

> 
> At the same time, I still haven’t understood what we need the flag for.
> 
> As far as I understood in our discussion on your qemu series, there is
> no case where anyone would need to know whether an image is zero.  All
> practical cases involve someone having to ensure that some image is
> zero.  Knowing whether an image is zero can help with that, but that can
> be an implementation detail.
> 
> For qcow2, the idea would be that there is some flag that remains true
> as long as the image is guaranteed to be zero.  Then we’d have some
> bdrv_make_zero function, and qcow2’s implementation would use this
> information to gauge whether there’s something to do at all.
> 
> For NBD, we cannot use this idea directly because to implement such a
> flag (as you’re describing in this mail), we’d need separate is_zero
> infrastructure, and that kind of makes the point of “drivers’
> bdrv_make_zero() implementations do the right thing by themselves” moot.

We don't necessarily need a separate is_zero infrastructure if we can 
instead teach the existing block_status infrastructure to report that 
the entire image reads as zero.  You're right that clients that need to 
force an entire image to be zero won't need to directly call 
block_status (they can just call bdrv_make_zero, and let that worry 
about whether a block status call makes sense among its list of steps to 
try).  But since block_status can report all-zero status for some cases, 
it's not hard to use that for feeding the NBD bit.

However, there's a difference between qemu's block status (which is 
already typed correctly to return a 64-bit answer, even if it may need a 
few tweaks for clients that currently don't expect it to request more 
than 32 bits) and NBD's block status (which can only report 32 bits 
barring a new extension to the protocol), and where a single all-zero 
bit at NBD_OPT_GO is just as easy an extension as a way to report a 
64-bit all-zero response to NBD_CMD_BLOCK_STATUS.

> 
> OTOH, we wouldn’t need such a flag for the implementation, because we
> could just send a 64-bit discard/make_zero over the whole block device
> length to the NBD server, and then the server internally does the right
> thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
> length fields, but there were plans for adding support for 64 bit
> versions anyway.  From my naïve outsider perspective, doing that doesn’t
> seem a more complicated protocol addition than adding some way to tell
> whether an NBD export is zero.

Adding 64-bit commands to NBD is more invasive than adding a single 
startup status bit.  Both ideas can be done - doing one does not 
preclude the other.  But at the same time, not all servers will 
implement both ideas - if one is easy to implement while the other is 
hard, it is not unlikely that qemu will still encounter NBD servers that 
advertise startup state but not support 64-bit make_zero (even if qemu 
as NBD server starts supporting 64-bit make zero) or even 64-bit block 
status results.

Another thing to think about here is timing.  With the proposed NBD 
addition, it is the server telling the client that "the image you are 
connecting to started zero", prior to the point that the client even has 
a chance to request "can you make the image all zero in a quick manner 
(and if not, I'll fall back to writing zeroes as I go)".  And even if 
NBD gains a 64-bit block status and/or make zero command, it is still 
less network traffic for the server to advertise up-front that the image 
is all zero than it is for the client to have to issue command requests 
of the server (network traffic is not always the bottleneck, but it can 
be a consideration).

> 
> So I’m still wondering whether there are actually cases where we need to
> tell whether some image or NBD export is zero that do not involve making
> it zero if it isn’t.

Just because we don't think that qemu-img has such a case does not mean 
that other NBD clients will not be able to come up with some use for 
knowing if an image starts all zero.

> 
> (I keep asking because it seems to me that if all we ever really want to
> do is to ensure that some images/exports are zero, we should implement
> that.)

The problem is WHERE do you implement it.  Is it more efficient to 
implement make_zero in the NBD server (the client merely requests to 
make zero, but lets the server do all the work) or in the NBD client 
(the client learns whether the server is already zero, and not hearing 
yes, the client proceeds to do all the work to write zeroes).  From the 
qemu perspective, qemu-img convert needs the image to be zero, and 
bdrv_make_zero will report back either "yes I quickly made it zero, 
possibly by doing nothing" or "no, making it zero now is no more 
efficient than you just writing zeroes as you go".  But although the 
code in qemu-img is the same whether bdrv_make_zero is able to request 
the work be done in the server or whether the work has to be done in the 
client, the code in the block layer that implements bdrv_make_zero may 
itself care about knowing whether the NBD server started all zero.
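
The division of labor described above can be sketched as follows (hypothetical names, not qemu's actual block layer): the convert loop is identical either way; only the bdrv_make_zero stand-in knows whether the destination started zero or supports server-side zeroing.

```python
# Toy sketch of "qemu-img code is the same; the block layer decides".
# bdrv_make_zero and the dict-based images are invented for this model.

def bdrv_make_zero(dest):
    """Return True iff the image is now known zero at little cost."""
    if dest.get("starts_zero"):       # e.g. learned via NBD_INIT_ZERO
        return True                   # nothing to do at all
    if dest.get("fast_server_zero"):  # e.g. a server-side make-zero op
        dest["data"] = {}             # server does the work
        return True
    return False                      # caller must write zeroes itself

def convert(src_extents, dest, size, chunk=1 << 20):
    dest_is_zero = bdrv_make_zero(dest)
    writes = 0
    for off in range(0, size, chunk):
        data = src_extents.get(off)   # None models a zero/hole extent
        if data is None:
            if not dest_is_zero:
                dest.setdefault("data", {})[off] = b"\0" * chunk
                writes += 1           # explicit zero write needed
            continue
        dest.setdefault("data", {})[off] = data
        writes += 1
    return writes

sparse_src = {0: b"x" * (1 << 20)}    # data only in the first 1 MiB chunk
zero_dest = {"starts_zero": True}
plain_dest = {}
writes_with_flag = convert(sparse_src, zero_dest, 4 << 20)
writes_without = convert(sparse_src, plain_dest, 4 << 20)
```

With the zero-advertising destination only the one data chunk is written; without it, the client also writes every zero chunk itself.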

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
  2020-02-18 20:55   ` Eric Blake
@ 2020-02-19 11:10     ` Max Reitz
  0 siblings, 0 replies; 19+ messages in thread
From: Max Reitz @ 2020-02-19 11:10 UTC (permalink / raw)
  To: Eric Blake, nbd, QEMU, qemu-block, libguestfs
  Cc: Vladimir Sementsov-Ogievskiy, Alberto Garcia, Richard W.M. Jones



On 18.02.20 21:55, Eric Blake wrote:
> On 2/17/20 9:13 AM, Max Reitz wrote:
>> Hi,
>>
>> It’s my understanding that without some is_zero infrastructure for QEMU,
>> it’s impossible to implement this flag in qemu’s NBD server.
> 
> You're right that we may need some more infrastructure before being able
> to decide when to report this bit in all cases.  But for raw files, that
> infrastructure already exists: check whether block_status at offset 0,
> with the entire image as the length, reports that the entire file is a
> hole.

Hm, except that only works if the file is just completely unallocated.
Calling that existing infrastructure is a bit of a stretch, I think.

Or are you saying that bdrv_co_block_status(..., 0, bdrv_getlength(),
...) is our existing infrastructure?  Actually, why not.  Can we make
block drivers catch that special case?  (Or might the generic block code
somehow truncate such requests?)
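
For raw files on Linux, the "entire file is one hole" check being discussed maps onto lseek(SEEK_DATA): if no data extent exists anywhere, the call fails with ENXIO. A minimal standalone sketch (not qemu code; requires a filesystem with SEEK_HOLE support — others report the whole file as data):

```python
# Detect a fully-unallocated (all-hole) raw file via SEEK_DATA (Linux).
import errno
import os
import tempfile

def file_is_all_hole(path):
    """True if the file contains no allocated data at all."""
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            os.lseek(fd, 0, os.SEEK_DATA)  # find the first data extent
        except OSError as e:
            if e.errno == errno.ENXIO:     # no data anywhere: all hole
                return True
            raise
        return False                       # some extent holds data
    finally:
        os.close(fd)

# A freshly truncated file is one big hole; writing one byte changes that.
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
os.truncate(path, 1 << 30)                 # 1 GiB, fully sparse
sparse_result = file_is_all_hole(path)
with open(path, "r+b") as f:
    f.write(b"\x01")
dense_result = file_is_all_hole(path)
os.unlink(path)
```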

> And
> for qcow2 files, it would not be that hard to teach a similar
> block_status request to report the entire image as a hole based on my
> proposed qcow2 autoclear bit tracking that the image still reads as zero.
> 
>>
>> At the same time, I still haven’t understood what we need the flag for.
>>
>> As far as I understood in our discussion on your qemu series, there is
>> no case where anyone would need to know whether an image is zero.  All
>> practical cases involve someone having to ensure that some image is
>> zero.  Knowing whether an image is zero can help with that, but that can
>> be an implementation detail.
>>
>> For qcow2, the idea would be that there is some flag that remains true
>> as long as the image is guaranteed to be zero.  Then we’d have some
>> bdrv_make_zero function, and qcow2’s implementation would use this
>> information to gauge whether there’s something to do at all.
>>
>> For NBD, we cannot use this idea directly because to implement such a
>> flag (as you’re describing in this mail), we’d need separate is_zero
>> infrastructure, and that kind of makes the point of “drivers’
>> bdrv_make_zero() implementations do the right thing by themselves” moot.
> 
> We don't necessarily need a separate is_zero infrastructure if we can
> instead teach the existing block_status infrastructure to report that
> the entire image reads as zero.  You're right that clients that need to
> force an entire image to be zero won't need to directly call
> block_status (they can just call bdrv_make_zero, and let that worry
> about whether a block status call makes sense among its list of steps to
> try).  But since block_status can report all-zero status for some cases,
> it's not hard to use that for feeding the NBD bit.

OK.  I’m not 100% sure there’s nothing that would bite us in the butt
here, but I seem to remember you made all block_status things 64-bit, so
I suppose you know. :)

> However, there's a difference between qemu's block status (which is
> already typed correctly to return a 64-bit answer, even if it may need a
> few tweaks for clients that currently don't expect it to request more
> than 32 bits) and NBD's block status (which can only report 32 bits
> barring a new extension to the protocol), and where a single all-zero
> bit at NBD_OPT_GO is just as easy of an extension as a way to report a
> 64-bit all-zero response to NBD_CMD_BLOCK_STATUS.

Agreed.

>> OTOH, we wouldn’t need such a flag for the implementation, because we
>> could just send a 64-bit discard/make_zero over the whole block device
>> length to the NBD server, and then the server internally does the right
>> thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
>> length fields, but there were plans for adding support for 64 bit
>> versions anyway.  From my naïve outsider perspective, doing that doesn’t
>> seem a more complicated protocol addition than adding some way to tell
>> whether an NBD export is zero.
> 
> Adding 64-bit commands to NBD is more invasive than adding a single
> startup status bit.

True.  But if we/you want 64-bit commands anyway, then it doesn’t really
matter what’s more invasive.

> Both ideas can be done - doing one does not
> preclude the other.

Absolutely.  It’s just that if you do one anyway and it supersedes the
other, then we don’t have to do both.  Hence my wondering whether one
does supersede the other.

> But at the same time, not all servers will
> implement both ideas - if one is easy to implement while the other is
> hard, it is not unlikely that qemu will still encounter NBD servers that
> advertise startup state but not support 64-bit make_zero (even if qemu
> as NBD server starts supporting 64-bit make zero) or even 64-bit block
> status results.

Hm.  You know better than me whether that’s a good argument, because it
mostly depends on how many NBD server implementations there are;
specifically whether there are any that are decidedly not feature-complete.

> Another thing to think about here is timing.  With the proposed NBD
> addition, it is the server telling the client that "the image you are
> connecting to started zero", prior to the point that the client even has
> a chance to request "can you make the image all zero in a quick manner
> (and if not, I'll fall back to writing zeroes as I go)".  And even if
> NBD gains a 64-bit block status and/or make zero command, it is still
> less network traffic for the server to advertise up-front that the image
> is all zero than it is for the client to have to issue command requests
> of the server (network traffic is not always the bottleneck, but it can
> be a consideration).

I suppose one 64-bit discard/write_zeroes to the whole image wouldn’t be
too bad, regardless of the network speed.

>> So I’m still wondering whether there are actually cases where we need to
>> tell whether some image or NBD export is zero that do not involve making
>> it zero if it isn’t.
> 
> Just because we don't think that qemu-img has such a case does not mean
> that other NBD clients will not be able to come up with some use for
> knowing if an image starts all zero.

Sure, but implementing a feature on the basis of “Somebody may come up
with a use for it” sounds fishy to me.

OTOH, you completely convinced me with the fact that we can start by
letting qemu’s NBD server just invoke a block-status call over the whole
image, and then (potentially later) letting various block drivers
recognize that case.  But I suppose that means we no longer need a
dedicated has_zero_open() function, right?

>> (I keep asking because it seems to me that if all we ever really want to
>> do is to ensure that some images/exports are zero, we should implement
>> that.)
> 
> The problem is WHERE do you implement it.  Is it more efficient to
> implement make_zero in the NBD server (the client merely requests to
> make zero, but lets the server do all the work) or in the NBD client
> (the client learns whether the server is already zero and, if the
> answer is not yes, does all the work of writing zeroes itself)?  From the
> qemu perspective, qemu-img convert needs the image to be zero, and
> bdrv_make_zero will report back either "yes I quickly made it zero,
> possibly by doing nothing" or "no, making it zero now is no more
> efficient than you just writing zeroes as you go".  But although the
> code in qemu-img is the same whether bdrv_make_zero is able to request
> the work be done in the server or whether the work has to be done in the
> client, the code in the block layer that implements bdrv_make_zero may
> itself care about knowing whether the NBD server started all zero.

If we have both 64-bit write_zeroes and a zero flag in the NBD protocol,
then I don’t think there’s much difference in terms of efficiency.

However, if we had to decide between which of the features is more
important for efficiency, then the difference that appears to me is that:
- With just a flag but no 64-bit write_zeroes, zeroing a non-zero image
will be inefficient.
- With just a 64-bit write_zeroes but no flag, the client has to issue a
NOP whole-image write_zeroes call for images that are zero already, but
I don’t really see that as an issue.

So *if* we had to decide, it appears to me that 64-bit write_zeroes
would be more important.

Max




end of thread, other threads:[~2020-02-19 11:11 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
2020-02-10 21:37 Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Eric Blake
2020-02-10 21:41 ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension Eric Blake
2020-02-10 21:41   ` [PATCH 1/3] nbd: Preparation for NBD_INFO_INIT_STATE Eric Blake
2020-02-10 21:41   ` [PATCH 2/3] nbd: Add .bdrv_known_zeroes() client support Eric Blake
2020-02-10 21:41   ` [PATCH 3/3] nbd: Add .bdrv_known_zeroes() server support Eric Blake
2020-02-10 21:51   ` [qemu PATCH 0/3] NBD_INFO_INIT_STATE extension no-reply
2020-02-10 21:54     ` Eric Blake
2020-02-10 21:53   ` no-reply
2020-02-10 22:12 ` Cross-project NBD extension proposal: NBD_INFO_INIT_STATE Richard W.M. Jones
2020-02-10 22:29   ` Eric Blake
2020-02-10 22:52     ` Richard W.M. Jones
2020-02-11 14:33       ` Eric Blake
2020-02-12  7:27       ` Wouter Verhelst
2020-02-12 12:09         ` Eric Blake
2020-02-12 12:36           ` Richard W.M. Jones
2020-02-12 12:47             ` Eric Blake
2020-02-17 15:13 ` Max Reitz
2020-02-18 20:55   ` Eric Blake
2020-02-19 11:10     ` Max Reitz
