qemu-devel.nongnu.org archive mirror
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, berrange@redhat.com,
	qemu-devel@nongnu.org, mreitz@redhat.com, kraxel@redhat.com,
	den@openvz.org
Subject: [PATCH 4/4] block/nbd: use non-blocking connect: fix vm hang on connect()
Date: Mon, 20 Jul 2020 21:07:15 +0300
Message-ID: <20200720180715.10521-5-vsementsov@virtuozzo.com>
In-Reply-To: <20200720180715.10521-1-vsementsov@virtuozzo.com>

This makes the nbd connection_co yield during reconnects, so that a
reconnect doesn't hang the main thread. This is especially important
when the NBD server host is unavailable: the connect() call may take a
long time, blocking the main thread (and because of the reconnect loop,
it will hang again and again, with only short gaps of working time
during the pauses between connection attempts).

How to reproduce the bug:

1. Create an image on node1:
   qemu-img create -f qcow2 xx 100M

2. Start an NBD server on node1:
   qemu-nbd xx

3. Start a VM with a second (NBD) disk on node2, like this:

  ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
    file=/work/images/cent7.qcow2 -drive file=nbd+tcp://192.168.100.2 \
    -vnc :0 -qmp stdio -m 2G -enable-kvm -vga std

4. Access the VM through VNC (or some other way) and check that the
   NBD drive works:

   dd if=/dev/sdb of=/dev/null bs=1M count=10

   - the command should succeed.

5. Now trigger the NBD reconnect loop in the QEMU process:

5.1 Kill the NBD server on node1.

5.2 Run "dd if=/dev/sdb of=/dev/null bs=1M count=10" in the guest
    again. The command should fail, and a lot of error messages about
    a failing disk may appear as well.

    Now the NBD client driver in QEMU tries to reconnect.
    The VM still works well.

6. Make node1 unavailable on the NBD port, so that connect() from node2
   takes a long time:

   On node1 (note that 10809 is just the default NBD port):

   sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

   After some time the guest hangs, and you can check in gdb that QEMU
   is stuck in the connect() call, issued from the main thread. This
   is the bug.

7. Don't forget to drop the iptables rule on node1:

   sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
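For readers unfamiliar with the pattern, here is a minimal sketch in
plain POSIX of what a non-blocking connect looks like in general. It is
not QEMU code and makes no claim about how
qio_channel_socket_connect_non_blocking_sync() is implemented; the
helper name connect_nonblocking() and its timeout parameter are
hypothetical, chosen only for illustration. The point is that connect()
returns immediately with EINPROGRESS and the caller waits for
writability, instead of sleeping inside connect() as in step 6 above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not part of QEMU): start a TCP
 * connect in non-blocking mode and wait at most timeout_ms for it to
 * complete. Returns a connected fd, or a negative errno value.
 */
int connect_nonblocking(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct pollfd pfd;
    socklen_t len;
    int fd, flags, ret, err = 0;

    assert(timeout_ms >= 0);    /* a negative poll() timeout waits forever */

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -EINVAL;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -errno;
    }

    /* Switch to non-blocking mode before calling connect(). */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        err = errno;
        close(fd);
        return -err;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        return fd;              /* connected immediately */
    }
    if (errno != EINPROGRESS) {
        err = errno;
        close(fd);
        return -err;
    }

    /*
     * The handshake continues in the background. Wait until the socket
     * becomes writable. An event loop would register a handler here
     * instead of calling poll() directly. Note that restarting after
     * EINTR also restarts the timeout; good enough for a sketch.
     */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret == 0) {
        close(fd);
        return -ETIMEDOUT;
    }
    if (ret < 0) {
        err = errno;
        close(fd);
        return -err;
    }

    /* Writable does not mean success: fetch the final connect status. */
    len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) {
        err = errno;
    }
    if (err) {
        close(fd);
        return -err;
    }
    return fd;
}
```

With a DROP rule like the one in step 6, a call such as
connect_nonblocking("192.168.100.2", 10809, 1000) would be expected to
return -ETIMEDOUT after about a second rather than block for the
kernel's full connect timeout.
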
 block/nbd.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index d9cde30457..931eadbe6f 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1421,16 +1421,19 @@ static void nbd_client_close(BlockDriverState *bs)
     nbd_teardown_connection(bs);
 }
 
-static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
+static QIOChannelSocket *nbd_establish_connection(BlockDriverState *bs,
+                                                  SocketAddress *saddr,
                                                   Error **errp)
 {
     ERRP_GUARD();
     QIOChannelSocket *sioc;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
 
     sioc = qio_channel_socket_new();
     qio_channel_set_name(QIO_CHANNEL(sioc), "nbd-client");
 
-    qio_channel_socket_connect_sync(sioc, saddr, errp);
+    qio_channel_attach_aio_context(QIO_CHANNEL(sioc), aio_context);
+    qio_channel_socket_connect_non_blocking_sync(sioc, saddr, errp);
     if (*errp) {
         object_unref(OBJECT(sioc));
         return NULL;
@@ -1451,7 +1454,7 @@ static int nbd_client_connect(BlockDriverState *bs, Error **errp)
      * establish TCP connection, return error if it fails
      * TODO: Configurable retry-until-timeout behaviour.
      */
-    QIOChannelSocket *sioc = nbd_establish_connection(s->saddr, errp);
+    QIOChannelSocket *sioc = nbd_establish_connection(bs, s->saddr, errp);
 
     if (!sioc) {
         return -ECONNREFUSED;
@@ -1461,8 +1464,6 @@ static int nbd_client_connect(BlockDriverState *bs, Error **errp)
 
     /* NBD handshake */
     trace_nbd_client_connect(s->export);
-    qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL);
-    qio_channel_attach_aio_context(QIO_CHANNEL(sioc), aio_context);
 
     s->info.request_sizes = true;
     s->info.structured_reply = true;
-- 
2.21.0


