All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] question on ioctl NBD_SET_FLAGS
@ 2016-04-20 15:42 Eric Blake
  2016-04-20 16:18 ` [Qemu-devel] [Nbd] " Wouter Verhelst
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Blake @ 2016-04-20 15:42 UTC (permalink / raw)
  To: nbd-general, qemu-devel, qemu block

[-- Attachment #1: Type: text/plain, Size: 3902 bytes --]

NBD commit 39e20207 (2.9.22) introduced the use of the ioctl
NBD_SET_FLAGS, and tries to pass the concatenation of the 16-bit global
flags in the most significant half + the 16-bit transmission flags in
the lower half (during newstyle negotiation).  Of course, at that point
in time, the global flags were ALWAYS 0, so this results in just giving
the kernel the transmission flags.

While oldstyle negotiation gives 32 bits for flags, it never defined any
bits in the upper 16, and is now no longer under development, so no
(working) oldstyle server will ever advertise more than 16 bits of
information, so blindly using the 32 bit flags field to hand to the
kernel via NBD_SET_FLAGS is just fine.

But when commit 626c2a37 (3.1) introduced the first global flag,
NBD_FLAG_FIXED_NEWSTYLE, it computes:

                if(read(sock, &tmp, sizeof(uint16_t)) < 0) {
                        err("Failed reading flags: %m");
                }
                *flags = ((u32)ntohs(tmp));
...
                if(read(sock, &tmp, sizeof(tmp)) < 0)
                        err("Failed/4: %m\n");
                *flags |= (uint32_t)ntohs(tmp);

Whoops - this computes an overlap of flags (it leaves the upper 16 bits
of *flags clear, and sets the lower 16 bits to the bitwise OR of the
global flags and transmission flags) - although no one cared in any
release between there and 3.8, because it so happens that
NBD_FLAG_FIXED_NEWSTYLE and NBD_FLAG_HAS_FLAGS happen to be the same bit.

But in 3.9, the overlap bug was still present, and the set of global
flags had grown to include NBD_FLAG_NO_ZEROS (commit 038e066), which
overlaps with NBD_FLAG_READ_ONLY.  Ouch.  This means that a client
talking to a server that advertises NO_ZEROES means that the client will
mistakenly tell the kernel to treat EVERY export as read-only, even if
the client doesn't respond with NBD_FLAG_C_NO_ZEROES.

3.10 fixed things; negotiate() now uses uint16_t *flags (instead of u32
*), and no longer tries to merge global flags with transmission flags;
only the transmission flags are ever passed to the kernel via
NBD_SET_FLAGS.  Maybe it's good that there was only 2 weeks between 3.9
and 3.10, so hopefully few distros are trying to ship that broken version.

Meanwhile, qemu 2.5 (and probably 2.6, because it's now hard freeze and
too late to change things without good reason) will populate things as:

        if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) {
            error_setg(errp, "Failed to read server flags");
            goto fail;
        }
        *flags = be16_to_cpu(tmp) << 16;
...
        if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) {
            error_setg(errp, "Failed to read export flags");
            goto fail;
        }
        *flags |= be16_to_cpu(tmp);

which means it has been calling the kernel with NBD_SET_FLAGS 0x10001
when talking to the same server where the upstream NBD client would use
merely 0x0001.

I plan on fixing qemu to no longer set the upper 16 bits with global
flags (as I don't see how NBD_FLAG_FIXED_NEWSTYLE or NBD_FLAG_NO_ZEROES
could ever affect what the kernel wants to do in transmission phase).
But do we need to document in the kernel code that existing clients
mistakenly pass too many bits to the NBD_SET_FLAGS ioctl, so that if we
ever reach the future point where we need more than 16 transmission
flags, AND where we have more than 2 global flags defined, existing qemu
2.5 clients don't confuse the kernel when calling NBD_SET_FLAGS?  Or do
we think that it is unlikely enough to worry about, where by the time
there are more than 16 transmission flags, users are likely to already
be using new-enough qemu that doesn't send global flags to the kernel?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [Nbd] question on ioctl NBD_SET_FLAGS
  2016-04-20 15:42 [Qemu-devel] question on ioctl NBD_SET_FLAGS Eric Blake
@ 2016-04-20 16:18 ` Wouter Verhelst
  2016-05-04 10:29   ` Alex Bligh
  0 siblings, 1 reply; 3+ messages in thread
From: Wouter Verhelst @ 2016-04-20 16:18 UTC (permalink / raw)
  To: Eric Blake; +Cc: nbd-general, qemu-devel, qemu block

Hi Eric,

On Wed, Apr 20, 2016 at 09:42:02AM -0600, Eric Blake wrote:
[...]
> But in 3.9, the overlap bug was still present, and the set of global
> flags had grown to include NBD_FLAG_NO_ZEROS (commit 038e066), which
> overlaps with NBD_FLAG_READ_ONLY.  Ouch.  This means that a client
> talking to a server that advertises NO_ZEROES means that the client will
> mistakenly tell the kernel to treat EVERY export as read-only, even if
> the client doesn't respond with NBD_FLAG_C_NO_ZEROES.
> 
> 3.10 fixed things; negotiate() now uses uint16_t *flags (instead of u32
> *), and no longer tries to merge global flags with transmission flags;
> only the transmission flags are ever passed to the kernel via
> NBD_SET_FLAGS.  Maybe it's good that there was only 2 weeks between 3.9
> and 3.10, so hopefully few distros are trying to ship that broken version.

Well, yeah, since 3.10 was an "oops" release when 3.9 exposed that bug
(which indeed had existed for a while) and which was reported quite
quickly on the list. Released versions of nbd which have the bug exist
though, and trying to have a 3.8 (or below) client talk to a 3.9 (or
above) server has the same issue.

I decided that there was no way in which I could fix it, and that "the
export is readonly" is bad but not a "critical data loss" kind of bug,
so releasing 3.10 was pretty much the only sane thing I could do (other
than delaying NO_ZEROES, which might have worked too).

[...]
> But do we need to document in the kernel code that existing clients
> mistakenly pass too many bits to the NBD_SET_FLAGS ioctl, so that if we
> ever reach the future point where we need more than 16 transmission
> flags, AND where we have more than 2 global flags defined, existing qemu
> 2.5 clients don't confuse the kernel when calling NBD_SET_FLAGS?  Or do
> we think that it is unlikely enough to worry about, where by the time
> there are more than 16 transmission flags, users are likely to already
> be using new-enough qemu that doesn't send global flags to the kernel?

I'm going to assume the latter is the case, yeah, but we could skip bits
16 and 17 (the only two handshake flags currently defined) and use the
upper 14 bits before using the lower 2 if it does turn out to be a
problem.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] [Nbd] question on ioctl NBD_SET_FLAGS
  2016-04-20 16:18 ` [Qemu-devel] [Nbd] " Wouter Verhelst
@ 2016-05-04 10:29   ` Alex Bligh
  0 siblings, 0 replies; 3+ messages in thread
From: Alex Bligh @ 2016-05-04 10:29 UTC (permalink / raw)
  To: Wouter Verhelst
  Cc: Alex Bligh, Eric Blake, nbd-general, qemu-devel, qemu block


On 20 Apr 2016, at 17:18, Wouter Verhelst <w@uter.be> wrote:

> Hi Eric,
> 
> On Wed, Apr 20, 2016 at 09:42:02AM -0600, Eric Blake wrote:
> [...]
>> But in 3.9, the overlap bug was still present, and the set of global
>> flags had grown to include NBD_FLAG_NO_ZEROS (commit 038e066), which
>> overlaps with NBD_FLAG_READ_ONLY.  Ouch.  This means that a client
>> talking to a server that advertises NO_ZEROES means that the client will
>> mistakenly tell the kernel to treat EVERY export as read-only, even if
>> the client doesn't respond with NBD_FLAG_C_NO_ZEROES.
>> 
>> 3.10 fixed things; negotiate() now uses uint16_t *flags (instead of u32
>> *), and no longer tries to merge global flags with transmission flags;
>> only the transmission flags are ever passed to the kernel via
>> NBD_SET_FLAGS.  Maybe it's good that there was only 2 weeks between 3.9
>> and 3.10, so hopefully few distros are trying to ship that broken version.
> 
> Well, yeah, since 3.10 was an "oops" release when 3.9 exposed that bug
> (which indeed had existed for a while) and which was reported quite
> quickly on the list. Released versions of nbd which have the bug exist
> though, and trying to have a 3.8 (or below) client talk to a 3.9 (or
> above) server has the same issue.
> 
> I decided that there was no way in which I could fix it, and that "the
> export is readonly" is bad but not a "critical data loss" kind of bug,
> so releasing 3.10 was pretty much the only sane thing I could do (other
> than delaying NO_ZEROES, which might have worked too).

For what it's worth, Ubuntu 14.04 (still under long term
support) ships with nbd-client 3.7 and I've just had a bug
report filed against gonbdserver where all exports go readonly.
   https://github.com/abligh/gonbdserver/issues/4

I've added an option to disable NBD_FLAG_NO_ZEROS on nbd-server
but that's pretty horrible. And I think this will affect all
pre-3.10 clients talking to 3.9 and later servers.

I'm going to find a minimal patch to nbd-client and offer that
to Ubuntu / Debian.

This message is here in part so I have something to point them at
on the mailing list :-)

-- 
Alex Bligh

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-05-04 10:30 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-20 15:42 [Qemu-devel] question on ioctl NBD_SET_FLAGS Eric Blake
2016-04-20 16:18 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-05-04 10:29   ` Alex Bligh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.