* [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree
@ 2023-12-21 15:35 Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 1/6] fixup block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

v2:
- Drop useless if (client->quiescing) check [Kevin]
- Run qio_channel_wake_read() in export AioContext to avoid race [Kevin]
- Introduce nbd_client_put_nonzero() optimization [Paolo]
- Reach goto label disconnect with client->lock taken [Paolo]
- Add doc comment explaining nbd_co_receive_request() loop in nbd_trip() [Kevin]

Kevin merged several of my outstanding multi-queue block layer patch series and
found that qemu-iotests -qcow2 was broken. This patch series fixes the block branch.

Most of the fixes are easy but the NBD server required deeper debugging and
thread-safety fixes. The NBD server patches can be inserted before "aio: make
aio_context_acquire()/aio_context_release() a no-op" to preserve bisectability.
The other patches are fixups that can be squashed into the original patches.

Stefan Hajnoczi (6):
  fixup block-coroutine-wrapper: use qemu_get_current_aio_context()
  fixup block: remove AioContext locking
  fixup scsi: only access SCSIDevice->requests from one thread
  nbd/server: avoid per-NBDRequest nbd_client_get/put()
  nbd/server: only traverse NBDExport->clients from main loop thread
  nbd/server: introduce NBDClient->lock to protect fields

 hw/scsi/scsi-bus.c            |   3 +-
 migration/block.c             |   7 ++
 nbd/server.c                  | 205 ++++++++++++++++++++++++++--------
 tests/qemu-iotests/051.pc.out |   4 +-
 4 files changed, 170 insertions(+), 49 deletions(-)

-- 
2.43.0




* [PATCH v2 1/6] fixup block-coroutine-wrapper: use qemu_get_current_aio_context()
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 2/6] fixup block: remove AioContext locking Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

qemu-iotests 051 fails on my machine, so the change to 051.pc.out made by
the commit being fixed up appears to be incorrect, at least against the
current QEMU source tree. Revert it so that 051 passes again.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/051.pc.out | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/051.pc.out b/tests/qemu-iotests/051.pc.out
index 86a3e113c4..7e10c5fa1b 100644
--- a/tests/qemu-iotests/051.pc.out
+++ b/tests/qemu-iotests/051.pc.out
@@ -177,11 +177,11 @@ QEMU X.Y.Z monitor - type 'help' for more information
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device lsi53c895a,id=lsi0 -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-QEMU_PROG: -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on: HBA does not support iothreads
+(qemu) QEMU_PROG: -device scsi-hd,bus=lsi0.0,drive=disk,share-rw=on: HBA does not support iothreads
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-scsi,id=virtio-scsi1 -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-QEMU_PROG: -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on: Cannot change iothread of active block backend
+(qemu) QEMU_PROG: -device scsi-hd,bus=virtio-scsi1.0,drive=disk,share-rw=on: Cannot change iothread of active block backend
 
 Testing: -drive file=TEST_DIR/t.qcow2,if=none,node-name=disk -object iothread,id=thread0 -device virtio-scsi,iothread=thread0,id=virtio-scsi0 -device scsi-hd,bus=virtio-scsi0.0,drive=disk,share-rw=on -device virtio-blk-pci,drive=disk,iothread=thread0,share-rw=on
 QEMU X.Y.Z monitor - type 'help' for more information
-- 
2.43.0




* [PATCH v2 2/6] fixup block: remove AioContext locking
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 1/6] fixup block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 3/6] fixup scsi: only access SCSIDevice->requests from one thread Stefan Hajnoczi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 migration/block.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/migration/block.c b/migration/block.c
index 2bcfcbfdf6..6ec6a1d6e6 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -311,10 +311,17 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
     block_mig_state.submitted++;
     blk_mig_unlock();
 
+    /*
+     * The migration thread does not have an AioContext. Lock the BQL so that
+     * I/O runs in the main loop AioContext (see
+     * qemu_get_current_aio_context()).
+     */
+    qemu_mutex_lock_iothread();
     bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, cur_sector * BDRV_SECTOR_SIZE,
                             nr_sectors * BDRV_SECTOR_SIZE);
     blk->aiocb = blk_aio_preadv(bb, cur_sector * BDRV_SECTOR_SIZE, &blk->qiov,
                                 0, blk_mig_read_cb, blk);
+    qemu_mutex_unlock_iothread();
 
     bmds->cur_sector = cur_sector + nr_sectors;
     return (bmds->cur_sector >= total_sectors);
-- 
2.43.0




* [PATCH v2 3/6] fixup scsi: only access SCSIDevice->requests from one thread
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 1/6] fixup block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 2/6] fixup block: remove AioContext locking Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 4/6] nbd/server: avoid per-NBDRequest nbd_client_get/put() Stefan Hajnoczi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/scsi/scsi-bus.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index df68a44b6a..5b08cbf60a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -127,7 +127,8 @@ static void scsi_device_for_each_req_async_bh(void *opaque)
      */
     ctx = blk_get_aio_context(s->conf.blk);
     if (ctx != qemu_get_current_aio_context()) {
-        aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh, data);
+        aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh,
+                                g_steal_pointer(&data));
         return;
     }
 
-- 
2.43.0




* [PATCH v2 4/6] nbd/server: avoid per-NBDRequest nbd_client_get/put()
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2023-12-21 15:35 ` [PATCH v2 3/6] fixup scsi: only access SCSIDevice->requests from one thread Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 5/6] nbd/server: only traverse NBDExport->clients from main loop thread Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields Stefan Hajnoczi
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

nbd_trip() processes a single NBD request from start to finish and holds
an NBDClient reference throughout. NBDRequest does not outlive the scope
of nbd_trip(). Therefore it is unnecessary to ref/unref NBDClient for
each NBDRequest.

Removing these nbd_client_get()/nbd_client_put() calls will make
thread-safety easier in the commits that follow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 nbd/server.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 895cf0a752..0b09ccc8dc 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1557,7 +1557,6 @@ static NBDRequestData *nbd_request_get(NBDClient *client)
     client->nb_requests++;
 
     req = g_new0(NBDRequestData, 1);
-    nbd_client_get(client);
     req->client = client;
     return req;
 }
@@ -1578,8 +1577,6 @@ static void nbd_request_put(NBDRequestData *req)
     }
 
     nbd_client_receive_next_request(client);
-
-    nbd_client_put(client);
 }
 
 static void blk_aio_attached(AioContext *ctx, void *opaque)
-- 
2.43.0




* [PATCH v2 5/6] nbd/server: only traverse NBDExport->clients from main loop thread
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2023-12-21 15:35 ` [PATCH v2 4/6] nbd/server: avoid per-NBDRequest nbd_client_get/put() Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 15:35 ` [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields Stefan Hajnoczi
  5 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

The NBD clients list is currently accessed from both the export
AioContext and the main loop thread. When the AioContext lock is removed
there will be nothing protecting the clients list.

Adding a lock around the clients list is tricky because NBDClient
structs are refcounted and may be freed from the export AioContext or
the main loop thread. nbd_export_request_shutdown() -> client_close() ->
nbd_client_put() is also tricky because the list lock would be held
while indirectly dropping references to NBDClients.

A simpler approach is to only allow nbd_client_put() and client_close()
calls from the main loop thread. Then the NBD clients list is only
accessed from the main loop thread and no fancy locking is needed.

nbd_trip() just needs to reschedule itself in the main loop AioContext
before calling nbd_client_put() and client_close(). This costs more CPU
cycles per NBD request, so add nbd_client_put_nonzero() to optimize the
common case where other references to the NBDClient remain.

Note that nbd_client_get() can still be called from either thread, so
make NBDClient->refcount atomic.
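
As an aside, here is a minimal standalone sketch of the
nbd_client_put_nonzero() decrement-unless-last pattern on an atomic
refcount (editor's illustration using C11 atomics; the QEMU
implementation with qatomic_cmpxchg() is in the hunk below):

/* Editor's illustration only -- not QEMU code. Drop a reference unless it
 * is the last one, so the final put can be deferred to another thread. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static bool put_nonzero(atomic_int *refcount)
{
    int old = atomic_load(refcount);

    do {
        if (old == 1) {
            return false; /* last reference: caller takes the slow path */
        }
        /* on failure, cmpxchg reloads old with the current value */
    } while (!atomic_compare_exchange_weak(refcount, &old, old - 1));

    return true;
}

int main(void)
{
    atomic_int refcount = 2;

    printf("%d\n", put_nonzero(&refcount)); /* 1: dropped 2 -> 1 */
    printf("%d\n", put_nonzero(&refcount)); /* 0: refused, last reference */
    return 0;
}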

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 nbd/server.c | 61 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 51 insertions(+), 10 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 0b09ccc8dc..e91e2e0903 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -122,7 +122,7 @@ struct NBDMetaContexts {
 };
 
 struct NBDClient {
-    int refcount;
+    int refcount; /* atomic */
     void (*close_fn)(NBDClient *client, bool negotiated);
 
     NBDExport *exp;
@@ -1501,14 +1501,17 @@ static int coroutine_fn nbd_receive_request(NBDClient *client, NBDRequest *reque
 
 #define MAX_NBD_REQUESTS 16
 
+/* Runs in export AioContext and main loop thread */
 void nbd_client_get(NBDClient *client)
 {
-    client->refcount++;
+    qatomic_inc(&client->refcount);
 }
 
 void nbd_client_put(NBDClient *client)
 {
-    if (--client->refcount == 0) {
+    assert(qemu_in_main_thread());
+
+    if (qatomic_fetch_dec(&client->refcount) == 1) {
         /* The last reference should be dropped by client->close,
          * which is called by client_close.
          */
@@ -1529,8 +1532,35 @@ void nbd_client_put(NBDClient *client)
     }
 }
 
+/*
+ * Tries to release the reference to @client, but only if other references
+ * remain. This is an optimization for the common case where we want to avoid
+ * the expense of scheduling nbd_client_put() in the main loop thread.
+ *
+ * Returns true upon success or false if the reference was not released because
+ * it is the last reference.
+ */
+static bool nbd_client_put_nonzero(NBDClient *client)
+{
+    int old = qatomic_read(&client->refcount);
+    int expected;
+
+    do {
+        if (old == 1) {
+            return false;
+        }
+
+        expected = old;
+        old = qatomic_cmpxchg(&client->refcount, expected, expected - 1);
+    } while (old != expected);
+
+    return true;
+}
+
 static void client_close(NBDClient *client, bool negotiated)
 {
+    assert(qemu_in_main_thread());
+
     if (client->closing) {
         return;
     }
@@ -2933,15 +2963,20 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
 static coroutine_fn void nbd_trip(void *opaque)
 {
     NBDClient *client = opaque;
-    NBDRequestData *req;
+    NBDRequestData *req = NULL;
     NBDRequest request = { 0 };    /* GCC thinks it can be used uninitialized */
     int ret;
     Error *local_err = NULL;
 
+    /*
+     * Note that nbd_client_put() and client_close() must be called from the
+     * main loop thread. Use aio_co_reschedule_self() to switch AioContext
+     * before calling these functions.
+     */
+
     trace_nbd_trip();
     if (client->closing) {
-        nbd_client_put(client);
-        return;
+        goto done;
     }
 
     if (client->quiescing) {
@@ -2949,10 +2984,9 @@ static coroutine_fn void nbd_trip(void *opaque)
          * We're switching between AIO contexts. Don't attempt to receive a new
          * request and kick the main context which may be waiting for us.
          */
-        nbd_client_put(client);
         client->recv_coroutine = NULL;
         aio_wait_kick();
-        return;
+        goto done;
     }
 
     req = nbd_request_get(client);
@@ -3012,8 +3046,13 @@ static coroutine_fn void nbd_trip(void *opaque)
 
     qio_channel_set_cork(client->ioc, false);
 done:
-    nbd_request_put(req);
-    nbd_client_put(client);
+    if (req) {
+        nbd_request_put(req);
+    }
+    if (!nbd_client_put_nonzero(client)) {
+        aio_co_reschedule_self(qemu_get_aio_context());
+        nbd_client_put(client);
+    }
     return;
 
 disconnect:
@@ -3021,6 +3060,8 @@ disconnect:
         error_reportf_err(local_err, "Disconnect client, due to: ");
     }
     nbd_request_put(req);
+
+    aio_co_reschedule_self(qemu_get_aio_context());
     client_close(client, true);
     nbd_client_put(client);
 }
-- 
2.43.0




* [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields
  2023-12-21 15:35 [PATCH v2 0/6] qemu-iotests fixes for Kevin's block tree Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2023-12-21 15:35 ` [PATCH v2 5/6] nbd/server: only traverse NBDExport->clients from main loop thread Stefan Hajnoczi
@ 2023-12-21 15:35 ` Stefan Hajnoczi
  2023-12-21 17:38   ` Kevin Wolf
  5 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 15:35 UTC (permalink / raw)
  To: Kevin Wolf, qemu-devel
  Cc: Juan Quintela, Fabiano Rosas, Vladimir Sementsov-Ogievskiy,
	Hanna Reitz, Paolo Bonzini, Leonardo Bras, Stefan Hajnoczi,
	Peter Xu, Fam Zheng, Eric Blake, qemu-block

NBDClient has a number of fields that are accessed by both the export
AioContext and the main loop thread. When the AioContext lock is removed
these fields will need another form of protection.

Add NBDClient->lock and protect fields that are accessed by both
threads. Also add assertions where possible and otherwise add doc
comments stating assumptions about which thread runs the code and which
locks are held.

Note this patch moves the client->recv_coroutine assertion from
nbd_co_receive_request() to nbd_trip() where client->lock is held.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 nbd/server.c | 141 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 108 insertions(+), 33 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index e91e2e0903..5b08aae535 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -125,23 +125,25 @@ struct NBDClient {
     int refcount; /* atomic */
     void (*close_fn)(NBDClient *client, bool negotiated);
 
+    QemuMutex lock;
+
     NBDExport *exp;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
     QIOChannelSocket *sioc; /* The underlying data channel */
     QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
 
-    Coroutine *recv_coroutine;
+    Coroutine *recv_coroutine; /* protected by lock */
 
     CoMutex send_lock;
     Coroutine *send_coroutine;
 
-    bool read_yielding;
-    bool quiescing;
+    bool read_yielding; /* protected by lock */
+    bool quiescing; /* protected by lock */
 
     QTAILQ_ENTRY(NBDClient) next;
-    int nb_requests;
-    bool closing;
+    int nb_requests; /* protected by lock */
+    bool closing; /* protected by lock */
 
     uint32_t check_align; /* If non-zero, check for aligned client requests */
 
@@ -1415,11 +1417,15 @@ nbd_read_eof(NBDClient *client, void *buffer, size_t size, Error **errp)
 
         len = qio_channel_readv(client->ioc, &iov, 1, errp);
         if (len == QIO_CHANNEL_ERR_BLOCK) {
-            client->read_yielding = true;
+            WITH_QEMU_LOCK_GUARD(&client->lock) {
+                client->read_yielding = true;
+            }
             qio_channel_yield(client->ioc, G_IO_IN);
-            client->read_yielding = false;
-            if (client->quiescing) {
-                return -EAGAIN;
+            WITH_QEMU_LOCK_GUARD(&client->lock) {
+                client->read_yielding = false;
+                if (client->quiescing) {
+                    return -EAGAIN;
+                }
             }
             continue;
         } else if (len < 0) {
@@ -1528,6 +1534,7 @@ void nbd_client_put(NBDClient *client)
             blk_exp_unref(&client->exp->common);
         }
         g_free(client->contexts.bitmaps);
+        qemu_mutex_destroy(&client->lock);
         g_free(client);
     }
 }
@@ -1561,11 +1568,13 @@ static void client_close(NBDClient *client, bool negotiated)
 {
     assert(qemu_in_main_thread());
 
-    if (client->closing) {
-        return;
-    }
+    WITH_QEMU_LOCK_GUARD(&client->lock) {
+        if (client->closing) {
+            return;
+        }
 
-    client->closing = true;
+        client->closing = true;
+    }
 
     /* Force requests to finish.  They will drop their own references,
      * then we'll close the socket and free the NBDClient.
@@ -1579,6 +1588,7 @@ static void client_close(NBDClient *client, bool negotiated)
     }
 }
 
+/* Runs in export AioContext with client->lock held */
 static NBDRequestData *nbd_request_get(NBDClient *client)
 {
     NBDRequestData *req;
@@ -1591,6 +1601,7 @@ static NBDRequestData *nbd_request_get(NBDClient *client)
     return req;
 }
 
+/* Runs in export AioContext with client->lock held */
 static void nbd_request_put(NBDRequestData *req)
 {
     NBDClient *client = req->client;
@@ -1614,14 +1625,18 @@ static void blk_aio_attached(AioContext *ctx, void *opaque)
     NBDExport *exp = opaque;
     NBDClient *client;
 
+    assert(qemu_in_main_thread());
+
     trace_nbd_blk_aio_attached(exp->name, ctx);
 
     exp->common.ctx = ctx;
 
     QTAILQ_FOREACH(client, &exp->clients, next) {
-        assert(client->nb_requests == 0);
-        assert(client->recv_coroutine == NULL);
-        assert(client->send_coroutine == NULL);
+        WITH_QEMU_LOCK_GUARD(&client->lock) {
+            assert(client->nb_requests == 0);
+            assert(client->recv_coroutine == NULL);
+            assert(client->send_coroutine == NULL);
+        }
     }
 }
 
@@ -1629,6 +1644,8 @@ static void blk_aio_detach(void *opaque)
 {
     NBDExport *exp = opaque;
 
+    assert(qemu_in_main_thread());
+
     trace_nbd_blk_aio_detach(exp->name, exp->common.ctx);
 
     exp->common.ctx = NULL;
@@ -1639,8 +1656,12 @@ static void nbd_drained_begin(void *opaque)
     NBDExport *exp = opaque;
     NBDClient *client;
 
+    assert(qemu_in_main_thread());
+
     QTAILQ_FOREACH(client, &exp->clients, next) {
-        client->quiescing = true;
+        WITH_QEMU_LOCK_GUARD(&client->lock) {
+            client->quiescing = true;
+        }
     }
 }
 
@@ -1649,28 +1670,48 @@ static void nbd_drained_end(void *opaque)
     NBDExport *exp = opaque;
     NBDClient *client;
 
+    assert(qemu_in_main_thread());
+
     QTAILQ_FOREACH(client, &exp->clients, next) {
-        client->quiescing = false;
-        nbd_client_receive_next_request(client);
+        WITH_QEMU_LOCK_GUARD(&client->lock) {
+            client->quiescing = false;
+            nbd_client_receive_next_request(client);
+        }
     }
 }
 
+/* Runs in export AioContext */
+static void nbd_wake_read_bh(void *opaque)
+{
+    NBDClient *client = opaque;
+    qio_channel_wake_read(client->ioc);
+}
+
 static bool nbd_drained_poll(void *opaque)
 {
     NBDExport *exp = opaque;
     NBDClient *client;
 
+    assert(qemu_in_main_thread());
+
     QTAILQ_FOREACH(client, &exp->clients, next) {
-        if (client->nb_requests != 0) {
-            /*
-             * If there's a coroutine waiting for a request on nbd_read_eof()
-             * enter it here so we don't depend on the client to wake it up.
-             */
-            if (client->recv_coroutine != NULL && client->read_yielding) {
-                qio_channel_wake_read(client->ioc);
+        WITH_QEMU_LOCK_GUARD(&client->lock) {
+            if (client->nb_requests != 0) {
+                /*
+                 * If there's a coroutine waiting for a request on nbd_read_eof()
+                 * enter it here so we don't depend on the client to wake it up.
+                 *
+                 * Schedule a BH in the export AioContext to avoid missing the
+                 * wake up due to the race between qio_channel_wake_read() and
+                 * qio_channel_yield().
+                 */
+                if (client->recv_coroutine != NULL && client->read_yielding) {
+                    aio_bh_schedule_oneshot(nbd_export_aio_context(client->exp),
+                                            nbd_wake_read_bh, client);
+                }
+
+                return true;
             }
-
-            return true;
         }
     }
 
@@ -1681,6 +1722,8 @@ static void nbd_eject_notifier(Notifier *n, void *data)
 {
     NBDExport *exp = container_of(n, NBDExport, eject_notifier);
 
+    assert(qemu_in_main_thread());
+
     blk_exp_request_shutdown(&exp->common);
 }
 
@@ -2566,7 +2609,6 @@ static int coroutine_fn nbd_co_receive_request(NBDRequestData *req,
     int ret;
 
     g_assert(qemu_in_coroutine());
-    assert(client->recv_coroutine == qemu_coroutine_self());
     ret = nbd_receive_request(client, request, errp);
     if (ret < 0) {
         return ret;
@@ -2975,6 +3017,9 @@ static coroutine_fn void nbd_trip(void *opaque)
      */
 
     trace_nbd_trip();
+
+    qemu_mutex_lock(&client->lock);
+
     if (client->closing) {
         goto done;
     }
@@ -2990,7 +3035,21 @@ static coroutine_fn void nbd_trip(void *opaque)
     }
 
     req = nbd_request_get(client);
-    ret = nbd_co_receive_request(req, &request, &local_err);
+
+    /*
+     * nbd_co_receive_request() returns -EAGAIN when nbd_drained_begin() has
+     * set client->quiescing but by the time we get back nbd_drained_end() may
+     * have already cleared client->quiescing. In that case we try again
+     * because nothing else will spawn an nbd_trip() coroutine until we set
+     * client->recv_coroutine = NULL further down.
+     */
+    do {
+        assert(client->recv_coroutine == qemu_coroutine_self());
+        qemu_mutex_unlock(&client->lock);
+        ret = nbd_co_receive_request(req, &request, &local_err);
+        qemu_mutex_lock(&client->lock);
+    } while (ret == -EAGAIN && !client->quiescing);
+
     client->recv_coroutine = NULL;
 
     if (client->closing) {
@@ -3002,15 +3061,16 @@ static coroutine_fn void nbd_trip(void *opaque)
     }
 
     if (ret == -EAGAIN) {
-        assert(client->quiescing);
         goto done;
     }
 
     nbd_client_receive_next_request(client);
+
     if (ret == -EIO) {
         goto disconnect;
     }
 
+    qemu_mutex_unlock(&client->lock);
     qio_channel_set_cork(client->ioc, true);
 
     if (ret < 0) {
@@ -3030,6 +3090,10 @@ static coroutine_fn void nbd_trip(void *opaque)
         g_free(request.contexts->bitmaps);
         g_free(request.contexts);
     }
+
+    qio_channel_set_cork(client->ioc, false);
+    qemu_mutex_lock(&client->lock);
+
     if (ret < 0) {
         error_prepend(&local_err, "Failed to send reply: ");
         goto disconnect;
@@ -3044,11 +3108,13 @@ static coroutine_fn void nbd_trip(void *opaque)
         goto disconnect;
     }
 
-    qio_channel_set_cork(client->ioc, false);
 done:
     if (req) {
         nbd_request_put(req);
     }
+
+    qemu_mutex_unlock(&client->lock);
+
     if (!nbd_client_put_nonzero(client)) {
         aio_co_reschedule_self(qemu_get_aio_context());
         nbd_client_put(client);
@@ -3059,13 +3125,19 @@ disconnect:
     if (local_err) {
         error_reportf_err(local_err, "Disconnect client, due to: ");
     }
+
     nbd_request_put(req);
+    qemu_mutex_unlock(&client->lock);
 
     aio_co_reschedule_self(qemu_get_aio_context());
     client_close(client, true);
     nbd_client_put(client);
 }
 
+/*
+ * Runs in export AioContext and main loop thread. Caller must hold
+ * client->lock.
+ */
 static void nbd_client_receive_next_request(NBDClient *client)
 {
     if (!client->recv_coroutine && client->nb_requests < MAX_NBD_REQUESTS &&
@@ -3091,7 +3163,9 @@ static coroutine_fn void nbd_co_client_start(void *opaque)
         return;
     }
 
-    nbd_client_receive_next_request(client);
+    WITH_QEMU_LOCK_GUARD(&client->lock) {
+        nbd_client_receive_next_request(client);
+    }
 }
 
 /*
@@ -3108,6 +3182,7 @@ void nbd_client_new(QIOChannelSocket *sioc,
     Coroutine *co;
 
     client = g_new0(NBDClient, 1);
+    qemu_mutex_init(&client->lock);
     client->refcount = 1;
     client->tlscreds = tlscreds;
     if (tlscreds) {
-- 
2.43.0




* Re: [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields
  2023-12-21 15:35 ` [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields Stefan Hajnoczi
@ 2023-12-21 17:38   ` Kevin Wolf
  2023-12-21 19:21     ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Kevin Wolf @ 2023-12-21 17:38 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Juan Quintela, Fabiano Rosas,
	Vladimir Sementsov-Ogievskiy, Hanna Reitz, Paolo Bonzini,
	Leonardo Bras, Peter Xu, Fam Zheng, Eric Blake, qemu-block

On 21.12.2023 at 16:35, Stefan Hajnoczi wrote:
> NBDClient has a number of fields that are accessed by both the export
> AioContext and the main loop thread. When the AioContext lock is removed
> these fields will need another form of protection.
> 
> Add NBDClient->lock and protect fields that are accessed by both
> threads. Also add assertions where possible and otherwise add doc
> comments stating assumptions about which thread runs the code and which
> locks are held.
> 
> Note this patch moves the client->recv_coroutine assertion from
> nbd_co_receive_request() to nbd_trip() where client->lock is held.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

> +/* Runs in export AioContext */
> +static void nbd_wake_read_bh(void *opaque)
> +{
> +    NBDClient *client = opaque;
> +    qio_channel_wake_read(client->ioc);
> +}
> +
>  static bool nbd_drained_poll(void *opaque)
>  {
>      NBDExport *exp = opaque;
>      NBDClient *client;
>  
> +    assert(qemu_in_main_thread());
> +
>      QTAILQ_FOREACH(client, &exp->clients, next) {
> -        if (client->nb_requests != 0) {
> -            /*
> -             * If there's a coroutine waiting for a request on nbd_read_eof()
> -             * enter it here so we don't depend on the client to wake it up.
> -             */
> -            if (client->recv_coroutine != NULL && client->read_yielding) {
> -                qio_channel_wake_read(client->ioc);
> +        WITH_QEMU_LOCK_GUARD(&client->lock) {
> +            if (client->nb_requests != 0) {
> +                /*
> +                 * If there's a coroutine waiting for a request on nbd_read_eof()
> +                 * enter it here so we don't depend on the client to wake it up.
> +                 *
> +                 * Schedule a BH in the export AioContext to avoid missing the
> +                 * wake up due to the race between qio_channel_wake_read() and
> +                 * qio_channel_yield().
> +                 */
> +                if (client->recv_coroutine != NULL && client->read_yielding) {
> +                    aio_bh_schedule_oneshot(nbd_export_aio_context(client->exp),
> +                                            nbd_wake_read_bh, client);
> +                }

Doesn't the condition have to move inside the BH to avoid the race?

Checking client->recv_coroutine != NULL could work here because I don't
think it can go from NULL to something while we're quiescing, but
client->read_yielding can still change until the BH runs and we know
that the nbd_trip() coroutine has yielded. It seems easiest to just
move the whole condition to the BH.
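
Editor's note: a rough sketch of what that alternative could look like,
assuming the condition is re-checked under client->lock inside the BH
(hypothetical, not an actual patch):

/* Hypothetical sketch only -- not an actual patch */
static void nbd_wake_read_bh(void *opaque)
{
    NBDClient *client = opaque;
    bool wake;

    WITH_QEMU_LOCK_GUARD(&client->lock) {
        wake = client->recv_coroutine != NULL && client->read_yielding;
    }

    /* wake outside the lock so the resumed coroutine can take it */
    if (wake) {
        qio_channel_wake_read(client->ioc);
    }
}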

> +                return true;
>              }
> -
> -            return true;
>          }
>      }

The rest looks good to me.

Kevin




* Re: [PATCH v2 6/6] nbd/server: introduce NBDClient->lock to protect fields
  2023-12-21 17:38   ` Kevin Wolf
@ 2023-12-21 19:21     ` Stefan Hajnoczi
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2023-12-21 19:21 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, Juan Quintela, Fabiano Rosas,
	Vladimir Sementsov-Ogievskiy, Hanna Reitz, Paolo Bonzini,
	Leonardo Bras, Peter Xu, Fam Zheng, Eric Blake, qemu-block

On Thu, Dec 21, 2023 at 06:38:16PM +0100, Kevin Wolf wrote:
> On 21.12.2023 at 16:35, Stefan Hajnoczi wrote:
> > NBDClient has a number of fields that are accessed by both the export
> > AioContext and the main loop thread. When the AioContext lock is removed
> > these fields will need another form of protection.
> > 
> > Add NBDClient->lock and protect fields that are accessed by both
> > threads. Also add assertions where possible and otherwise add doc
> > comments stating assumptions about which thread runs the code and which
> > locks are held.
> > 
> > Note this patch moves the client->recv_coroutine assertion from
> > nbd_co_receive_request() to nbd_trip() where client->lock is held.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> > +/* Runs in export AioContext */
> > +static void nbd_wake_read_bh(void *opaque)
> > +{
> > +    NBDClient *client = opaque;
> > +    qio_channel_wake_read(client->ioc);
> > +}
> > +
> >  static bool nbd_drained_poll(void *opaque)
> >  {
> >      NBDExport *exp = opaque;
> >      NBDClient *client;
> >  
> > +    assert(qemu_in_main_thread());
> > +
> >      QTAILQ_FOREACH(client, &exp->clients, next) {
> > -        if (client->nb_requests != 0) {
> > -            /*
> > -             * If there's a coroutine waiting for a request on nbd_read_eof()
> > -             * enter it here so we don't depend on the client to wake it up.
> > -             */
> > -            if (client->recv_coroutine != NULL && client->read_yielding) {
> > -                qio_channel_wake_read(client->ioc);
> > +        WITH_QEMU_LOCK_GUARD(&client->lock) {
> > +            if (client->nb_requests != 0) {
> > +                /*
> > +                 * If there's a coroutine waiting for a request on nbd_read_eof()
> > +                 * enter it here so we don't depend on the client to wake it up.
> > +                 *
> > +                 * Schedule a BH in the export AioContext to avoid missing the
> > +                 * wake up due to the race between qio_channel_wake_read() and
> > +                 * qio_channel_yield().
> > +                 */
> > +                if (client->recv_coroutine != NULL && client->read_yielding) {
> > +                    aio_bh_schedule_oneshot(nbd_export_aio_context(client->exp),
> > +                                            nbd_wake_read_bh, client);
> > +                }
> 
> Doesn't the condition have to move inside the BH to avoid the race?
> 
> Checking client->recv_coroutine != NULL could work here because I don't
> think it can go from NULL to something while we're quiescing, but
> client->read_yielding can still change until the BH runs and we know
> that the nbd_trip() coroutine has yielded. It seems easiest to just
> move the whole condition to the BH.

I will add aio_wait_kick() into nbd_read_eof() immediately after setting
client->read_yielding. That way nbd_drained_poll() re-runs after
client->read_yielding is set to true.
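
Editor's note: a rough sketch of the change described above, based on
the nbd_read_eof() hunk in patch 6 (hypothetical, not the actual v3
patch):

             WITH_QEMU_LOCK_GUARD(&client->lock) {
                 client->read_yielding = true;
+
+                /* Prompt the main loop thread to re-run nbd_drained_poll() */
+                aio_wait_kick();
             }
             qio_channel_yield(client->ioc, G_IO_IN);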

Stefan



