All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Nicholas Thomas <nick@bytemark.co.uk>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] NBD block device backend - 'improvements'
Date: Wed, 16 Feb 2011 13:00:59 +0100	[thread overview]
Message-ID: <4D5BBC7B.9020807@redhat.com> (raw)
In-Reply-To: <1297805193.12551.39.camel@den>

Hi Nick,

[ Please use "reply to all" so that the CC list is kept intact,
re-adding Stefan ]

Am 15.02.2011 22:26, schrieb Nicholas Thomas:
> Hi Kevin, Stefan.
> 
> On Tue, 2011-02-15 at 12:09 +0100, Kevin Wolf wrote:
>> Am 14.02.2011 21:32, schrieb Stefan Hajnoczi:
> [...]
>>> block/nbd.c needs to be made asynchronous in order for this change to
>>> work.  
>>
>> And even then it's not free of problem: For example qemu_aio_flush()
>> will hang. We're having all kinds of fun with NFS servers that go away
>> and let requests hang indefinitely.
>>
>> So maybe what we should add is a timeout option which defaults to 0
>> (fail immediately, like today)
> 
> Noted, so long as we can have -1 as "forever". 

As long as it's optional, that's fine with me.

> I'm currently spending
> time reworking block/nbd.c to be asynchronous, following the model in
> block/sheepdog.c
> 
> There does seem to be a lot of scope for code duplication (setting up
> the TCP connection, taking it down, the mechanics of actually reading /
> writing bytes using the aio interface, etc) between the two, and
> presumably for rbd as well. 
> 
> Reading http://www.mail-archive.com/qemu-devel@nongnu.org/msg36479.html
> suggests it should be possible to have a "tcp" (+ "unix") protocol /
> transport, which nbd+sheepdog could stack on top of (curl+rbd seem to
> depend on their own libraries for managing the TCP part of the
> connection).
> 
> They would implement talking the actual protocol, while the tcp/unix
> transports would have the duplicatable bits.
> 
> I've not investigated it in code yet - it's possible I'm just letting my
> appetite for abstraction get away with me. Thoughts?

I'm not sure about how much duplication there actually is, but if you
can take a closer look and think it's worthwhile, we should probably
consider it.

>> Unconditionally stopping the VM from a block driver sounds wrong to me.
>> If you want to have this behaviour, the block driver should return an
>> error and you should use werror=stop.
> 
> Unconditional? - if the socket manages to re-establish, the process
> continues on its way (I guess we'd see the same behaviour if a send/recv
> happened to take an unconscionably long time with the current code).
> 
> Making just the I/O hang until the network comes back, keeping guest
> execution and qemu monitor working, is obviously better than that
> (although not /strictly/ necessary for our particular use case), so I
> hope to be able to offer an AIO NBD patch for review "soon". 

Maybe I wasn't very clear. I was talking about Stefan's suggestion to
completely stop the VM, like we already can do for I/O errors (see the
werror and rerror options for -drive). I think it's not what you're
looking for, you just need the timeout=-1 thing.

>>> IPv6 would be nice and if you can consolidate that in qemu_socket.h,
>>> then that's a win for non-nbd socket users too.
>>
>> Agreed.
> 
> We'd get it for free with a unified TCP transport, as described above
> (sheepdog already uses getaddrinfo and friends) - but if that's not
> feasible, I'll be happy to supply a patch just for this. Much easier
> than aio! :)

Sure, I think it would be a good thing to have. And even if you
implemented this unified TCP transport (I'm not sure yet what it would
look like), I think the basic support could still be in qemu_socket.h,
so that users outside the block layer can benefit from it, too.

Kevin

  reply	other threads:[~2011-02-16 11:59 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-14 19:40 [Qemu-devel] NBD block device backend - 'improvements' Nicholas Thomas
2011-02-14 20:32 ` Stefan Hajnoczi
2011-02-15 11:09   ` Kevin Wolf
2011-02-15 21:26     ` Nicholas Thomas
2011-02-16 12:00       ` Kevin Wolf [this message]
2011-02-17 16:27         ` [Qemu-devel] " Nicholas Thomas
2011-02-17 16:34         ` [Qemu-devel] [PATCH 1/3] NBD library: whitespace changes Nicholas Thomas
2011-02-17 16:34         ` [Qemu-devel] [PATCH 2/3] NBD library: add aio-compatible read/write function Nicholas Thomas
2011-02-17 16:34         ` [Qemu-devel] [PATCH 3/3] block/nbd: Make the NBD block device use the AIO interface Nicholas Thomas
2011-02-17 19:28           ` Nicholas Thomas
2011-02-18 12:16             ` [Qemu-devel] [PATCH 3/3 v2] " Nicholas Thomas
2011-02-18 12:23               ` Kevin Wolf
2011-02-18 12:55                 ` Nicholas Thomas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D5BBC7B.9020807@redhat.com \
    --to=kwolf@redhat.com \
    --cc=nick@bytemark.co.uk \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.