* [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2
@ 2017-05-31  9:43 Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex Paolo Bonzini
                   ` (11 more replies)
  0 siblings, 12 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

This part takes care of drivers and devices, making sure that they can
accept concurrent I/O from multiple AioContexts.  It is only an RFC for now
because it depends on Kevin's QED conversion to coroutines, which is still
in flux.

The following drivers are thread-safe without using any QemuMutex/CoMutex:
crypto, gluster, null, rbd, win32-aio.

The following drivers already use mutexes for everything except possibly
snapshots, which do not (yet?) need protection: bochs, cloop, dmg, qcow,
parallels, vhdx, vmdk, curl, iscsi, nfs.

The following drivers already use mutexes for _almost_ everything: vpc
(missing get_block_status), vdi (missing bitmap access), vvfat (missing
commit), nbd (s->in_flight not protected), qcow2 (must call CoQueue APIs
under CoMutex).  They are fixed by patches 1-6.

The following drivers must be changed to use CoMutex to protect internal
data: qed (patches 7-9), sheepdog (patch 10).

The following driver must be changed to support I/O from any AioContext:
ssh.  It is fixed by patch 11.
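
As a rough illustration of what "use a CoMutex to protect internal data"
means in practice, the recurring pattern is a CoMutex in the driver state,
initialized at open time and taken by every coroutine_fn that touches shared
metadata.  A minimal sketch with invented names (BDRVFooState and foo_* are
not from any patch):

    typedef struct BDRVFooState {
        CoMutex lock;                 /* protects the fields below */
        uint64_t allocated_clusters;
    } BDRVFooState;

    static coroutine_fn int foo_co_bump_allocation(BlockDriverState *bs)
    {
        BDRVFooState *s = bs->opaque;

        /* s->lock was set up with qemu_co_mutex_init() at open time */
        qemu_co_mutex_lock(&s->lock);
        s->allocated_clusters++;
        qemu_co_mutex_unlock(&s->lock);
        return 0;
    }

Drain callbacks and requests that do slow I/O under the lock need extra care
(patches 8 and 9), but the basic shape is the above.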

Paolo

Paolo Bonzini (11):
  qcow2: call CoQueue APIs under CoMutex
  coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
  vdi: make it thread-safe
  vpc: make it thread-safe
  vvfat: make it thread-safe
  nbd: make it thread-safe
  qed: move tail of qed_aio_write_main to qed_aio_write_{cow,alloc}
  block: invoke .bdrv_drain callback in coroutine context and from AioContext
  qed: protect table cache with CoMutex
  sheepdog: add queue_lock
  ssh: support I/O from any AioContext

 block/io.c                 |  42 +++++++---
 block/nbd-client.c         |  30 +++----
 block/qcow2.c              |   4 +-
 block/qed-cluster.c        |   4 +-
 block/qed-l2-cache.c       |   6 ++
 block/qed-table.c          |  24 ++++--
 block/qed.c                | 195 +++++++++++++++++++++++++++------------------
 block/qed.h                |  11 ++-
 block/sheepdog.c           |  21 ++++-
 block/ssh.c                |  24 ++++--
 block/vdi.c                |  48 +++++------
 block/vpc.c                |  20 ++---
 block/vvfat.c              |   8 +-
 include/block/block_int.h  |   2 +-
 include/qemu/coroutine.h   |  19 +++++
 util/qemu-coroutine-lock.c |  35 ++++++++
 16 files changed, 326 insertions(+), 167 deletions(-)

-- 
2.13.0

* [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 15:18   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade Paolo Bonzini
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1c2697732b..c9193603b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1696,8 +1696,6 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
     ret = 0;
 
 fail:
-    qemu_co_mutex_unlock(&s->lock);
-
     while (l2meta != NULL) {
         QCowL2Meta *next;
 
@@ -1711,6 +1709,8 @@ fail:
         l2meta = next;
     }
 
+    qemu_co_mutex_unlock(&s->lock);
+
     qemu_iovec_destroy(&hd_qiov);
     qemu_vfree(cluster_data);
     trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
-- 
2.13.0

* [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 20:50   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe Paolo Bonzini
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

These functions are more efficient in the presence of contention.
qemu_co_rwlock_downgrade also guarantees not to block, which may
be useful in some algorithms too.
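
The intended calling pattern is the usual check/upgrade/re-check idiom
(patch 3 applies it to VDI's block bitmap).  A sketch with invented helper
names, cache_contains() and cache_insert(), standing in for whatever the
rwlock protects:

    static coroutine_fn void ensure_cached(CoRwlock *lock, uint64_t key)
    {
        qemu_co_rwlock_rdlock(lock);
        if (!cache_contains(key)) {
            qemu_co_rwlock_upgrade(lock);
            /* upgrade may block and let another writer run first: re-check */
            if (!cache_contains(key)) {
                cache_insert(key);
            }
            /* back to the read side; downgrade never blocks */
            qemu_co_rwlock_downgrade(lock);
        }
        /* ...read-side work... */
        qemu_co_rwlock_unlock(lock);
    }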

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/coroutine.h   | 19 +++++++++++++++++++
 util/qemu-coroutine-lock.c | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index a4509bd977..0ca96fe3d3 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -229,6 +229,24 @@ void qemu_co_rwlock_init(CoRwlock *lock);
 void qemu_co_rwlock_rdlock(CoRwlock *lock);
 
 /**
+ * Write Locks the CoRwlock from a reader.  This is a bit more efficient than
+ * @qemu_co_rwlock_unlock followed by a separate @qemu_co_rwlock_wrlock.
+ * However, if the lock cannot be upgraded immediately, control is transferred
+ * to the caller of the current coroutine.  Also, @qemu_co_rwlock_upgrade
+ * only overrides CoRwlock's fairness if there are no concurrent readers;
+ * another writer might run while @qemu_co_rwlock_upgrade blocks.
+ */
+void qemu_co_rwlock_upgrade(CoRwlock *lock);
+
+/**
+ * Downgrades a write-side critical section to a reader.  Downgrading with
+ * @qemu_co_rwlock_downgrade never blocks, unlike @qemu_co_rwlock_unlock
+ * followed by @qemu_co_rwlock_rdlock.  This makes it more efficient, and
+ * may also sometimes be necessary for correctness.
+ */
+void qemu_co_rwlock_downgrade(CoRwlock *lock);
+
+/**
  * Write Locks the mutex. If the lock cannot be taken immediately because
  * of a parallel reader, control is transferred to the caller of the current
  * coroutine.
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 6328eed26b..bcdcb91ee1 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -387,6 +387,21 @@ void qemu_co_rwlock_unlock(CoRwlock *lock)
     qemu_co_mutex_unlock(&lock->mutex);
 }
 
+void qemu_co_rwlock_downgrade(CoRwlock *lock)
+{
+    Coroutine *self = qemu_coroutine_self();
+
+    /* lock->mutex critical section started in qemu_co_rwlock_wrlock or
+     * qemu_co_rwlock_upgrade.
+     */
+    assert(lock->reader == 0);
+    lock->reader++;
+    qemu_co_mutex_unlock(&lock->mutex);
+
+    /* The rest of the read-side critical section is run without the mutex.  */
+    self->locks_held++;
+}
+
 void qemu_co_rwlock_wrlock(CoRwlock *lock)
 {
     qemu_co_mutex_lock(&lock->mutex);
@@ -401,3 +416,23 @@ void qemu_co_rwlock_wrlock(CoRwlock *lock)
      * There is no need to update self->locks_held.
      */
 }
+
+void qemu_co_rwlock_upgrade(CoRwlock *lock)
+{
+    Coroutine *self = qemu_coroutine_self();
+
+    qemu_co_mutex_lock(&lock->mutex);
+    assert(lock->reader > 0);
+    lock->reader--;
+    lock->pending_writer++;
+    while (lock->reader) {
+        qemu_co_queue_wait(&lock->queue, &lock->mutex);
+    }
+    lock->pending_writer--;
+
+    /* The rest of the write-side critical section is run with
+     * the mutex taken, similar to qemu_co_rwlock_wrlock.  Do
+     * not account for the lock twice in self->locks_held.
+     */
+    self->locks_held--;
+}
-- 
2.13.0

* [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 20:55   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 04/11] vpc: " Paolo Bonzini
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

The VirtualBox (VDI) driver uses a mutex to order all allocating writes,
but it does not protect accesses to the bitmap because they implicitly
happen under the AioContext mutex.  Change this to use a CoRwlock
explicitly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/vdi.c | 48 ++++++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index d12d9cdc79..57f1a037c8 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -172,7 +172,7 @@ typedef struct {
     /* VDI header (converted to host endianness). */
     VdiHeader header;
 
-    CoMutex write_lock;
+    CoRwlock bmap_lock;
 
     Error *migration_blocker;
 } BDRVVdiState;
@@ -485,7 +485,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail_free_bmap;
     }
 
-    qemu_co_mutex_init(&s->write_lock);
+    qemu_co_rwlock_init(&s->bmap_lock);
 
     return 0;
 
@@ -557,7 +557,9 @@ vdi_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
                n_bytes, offset);
 
         /* prepare next AIO request */
+        qemu_co_rwlock_rdlock(&s->bmap_lock);
         bmap_entry = le32_to_cpu(s->bmap[block_index]);
+        qemu_co_rwlock_unlock(&s->bmap_lock);
         if (!VDI_IS_ALLOCATED(bmap_entry)) {
             /* Block not allocated, return zeros, no need to wait. */
             qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
@@ -595,6 +597,7 @@ vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
     uint32_t block_index;
     uint32_t offset_in_block;
     uint32_t n_bytes;
+    uint64_t data_offset;
     uint32_t bmap_first = VDI_UNALLOCATED;
     uint32_t bmap_last = VDI_UNALLOCATED;
     uint8_t *block = NULL;
@@ -614,10 +617,19 @@ vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
                n_bytes, offset);
 
         /* prepare next AIO request */
+        qemu_co_rwlock_rdlock(&s->bmap_lock);
         bmap_entry = le32_to_cpu(s->bmap[block_index]);
         if (!VDI_IS_ALLOCATED(bmap_entry)) {
             /* Allocate new block and write to it. */
-            uint64_t data_offset;
+            qemu_co_rwlock_upgrade(&s->bmap_lock);
+            bmap_entry = le32_to_cpu(s->bmap[block_index]);
+            if (VDI_IS_ALLOCATED(bmap_entry)) {
+                /* A concurrent allocation did the work for us.  */
+                qemu_co_rwlock_downgrade(&s->bmap_lock);
+                goto nonallocating_write;
+            }
+
             bmap_entry = s->header.blocks_allocated;
             s->bmap[block_index] = cpu_to_le32(bmap_entry);
             s->header.blocks_allocated++;
@@ -635,30 +647,18 @@ vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
             memset(block + offset_in_block + n_bytes, 0,
                    s->block_size - n_bytes - offset_in_block);
 
-            /* Note that this coroutine does not yield anywhere from reading the
-             * bmap entry until here, so in regards to all the coroutines trying
-             * to write to this cluster, the one doing the allocation will
-             * always be the first to try to acquire the lock.
-             * Therefore, it is also the first that will actually be able to
-             * acquire the lock and thus the padded cluster is written before
-             * the other coroutines can write to the affected area. */
-            qemu_co_mutex_lock(&s->write_lock);
+            /* Write the new block under CoRwLock write-side protection,
+             * so this full-cluster write does not overlap a partial write
+             * of the same cluster, issued from the "else" branch.
+             */
             ret = bdrv_pwrite(bs->file, data_offset, block, s->block_size);
-            qemu_co_mutex_unlock(&s->write_lock);
+            qemu_co_rwlock_unlock(&s->bmap_lock);
         } else {
-            uint64_t data_offset = s->header.offset_data +
-                                   (uint64_t)bmap_entry * s->block_size +
-                                   offset_in_block;
-            qemu_co_mutex_lock(&s->write_lock);
-            /* This lock is only used to make sure the following write operation
-             * is executed after the write issued by the coroutine allocating
-             * this cluster, therefore we do not need to keep it locked.
-             * As stated above, the allocating coroutine will always try to lock
-             * the mutex before all the other concurrent accesses to that
-             * cluster, therefore at this point we can be absolutely certain
-             * that that write operation has returned (there may be other writes
-             * in flight, but they do not concern this very operation). */
-            qemu_co_mutex_unlock(&s->write_lock);
+nonallocating_write:
+            data_offset = s->header.offset_data +
+                           (uint64_t)bmap_entry * s->block_size +
+                           offset_in_block;
+            qemu_co_rwlock_unlock(&s->bmap_lock);
 
             qemu_iovec_reset(&local_qiov);
             qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
-- 
2.13.0

* [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 20:58   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 05/11] vvfat: " Paolo Bonzini
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/vpc.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index ecfee77149..d61754d4d5 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -496,12 +496,6 @@ static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
     return block_offset;
 }
 
-static inline int64_t get_sector_offset(BlockDriverState *bs,
-                                        int64_t sector_num, bool write)
-{
-    return get_image_offset(bs, sector_num * BDRV_SECTOR_SIZE, write);
-}
-
 /*
  * Writes the footer to the end of the image file. This is needed when the
  * file grows as it overwrites the old footer
@@ -696,6 +690,7 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
     VHDFooter *footer = (VHDFooter*) s->footer_buf;
     int64_t start, offset;
     bool allocated;
+    int64_t ret;
     int n;
 
     if (be32_to_cpu(footer->type) == VHD_FIXED) {
@@ -705,10 +700,13 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
                (sector_num << BDRV_SECTOR_BITS);
     }
 
-    offset = get_sector_offset(bs, sector_num, 0);
+    qemu_co_mutex_lock(&s->lock);
+
+    offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);
     start = offset;
     allocated = (offset != -1);
     *pnum = 0;
+    ret = 0;
 
     do {
         /* All sectors in a block are contiguous (without using the bitmap) */
@@ -723,15 +721,17 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
          * sectors since there is always a bitmap in between. */
         if (allocated) {
             *file = bs->file->bs;
-            return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
+            ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
+            break;
         }
         if (nb_sectors == 0) {
             break;
         }
-        offset = get_sector_offset(bs, sector_num, 0);
+        offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);
     } while (offset == -1);
 
-    return 0;
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
 }
 
 /*
-- 
2.13.0

* [Qemu-devel] [PATCH 05/11] vvfat: make it thread-safe
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 04/11] vpc: " Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 21:09   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 06/11] nbd: " Paolo Bonzini
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/vvfat.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 9c82371360..457b73a35c 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2983,8 +2983,14 @@ static int coroutine_fn
 write_target_commit(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
                     QEMUIOVector *qiov, int flags)
 {
+    int ret;
+
     BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
-    return try_commit(s);
+    qemu_co_mutex_lock(&s->lock);
+    ret = try_commit(s);
+    qemu_co_mutex_unlock(&s->lock);
+
+    return ret;
 }
 
 static void write_target_close(BlockDriverState *bs) {
-- 
2.13.0

* [Qemu-devel] [PATCH 06/11] nbd: make it thread-safe
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (4 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 05/11] vvfat: " Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31 21:20   ` Eric Blake
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 07/11] qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} Paolo Bonzini
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/nbd-client.c | 30 +++++++++---------------------
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 1e2952fdae..43e0292ac1 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -114,6 +114,10 @@ static int nbd_co_send_request(BlockDriverState *bs,
     int rc, ret, i;
 
     qemu_co_mutex_lock(&s->send_mutex);
+    while (s->in_flight == MAX_NBD_REQUESTS) {
+        qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
+    }
+    s->in_flight++;
 
     for (i = 0; i < MAX_NBD_REQUESTS; i++) {
         if (s->recv_coroutine[i] == NULL) {
@@ -176,20 +180,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
     }
 }
 
-static void nbd_coroutine_start(NBDClientSession *s,
-                                NBDRequest *request)
-{
-    /* Poor man semaphore.  The free_sema is locked when no other request
-     * can be accepted, and unlocked after receiving one reply.  */
-    if (s->in_flight == MAX_NBD_REQUESTS) {
-        qemu_co_queue_wait(&s->free_sema, NULL);
-        assert(s->in_flight < MAX_NBD_REQUESTS);
-    }
-    s->in_flight++;
-
-    /* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
-}
-
 static void nbd_coroutine_end(BlockDriverState *bs,
                               NBDRequest *request)
 {
@@ -197,13 +187,16 @@ static void nbd_coroutine_end(BlockDriverState *bs,
     int i = HANDLE_TO_INDEX(s, request->handle);
 
     s->recv_coroutine[i] = NULL;
-    s->in_flight--;
-    qemu_co_queue_next(&s->free_sema);
 
     /* Kick the read_reply_co to get the next reply.  */
     if (s->read_reply_co) {
         aio_co_wake(s->read_reply_co);
     }
+
+    qemu_co_mutex_lock(&s->send_mutex);
+    s->in_flight--;
+    qemu_co_queue_next(&s->free_sema);
+    qemu_co_mutex_unlock(&s->send_mutex);
 }
 
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
@@ -221,7 +214,6 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
     assert(bytes <= NBD_MAX_BUFFER_SIZE);
     assert(!flags);
 
-    nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
@@ -251,7 +243,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
 
     assert(bytes <= NBD_MAX_BUFFER_SIZE);
 
-    nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, qiov);
     if (ret < 0) {
         reply.error = -ret;
@@ -286,7 +277,6 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
         request.flags |= NBD_CMD_FLAG_NO_HOLE;
     }
 
-    nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
@@ -311,7 +301,6 @@ int nbd_client_co_flush(BlockDriverState *bs)
     request.from = 0;
     request.len = 0;
 
-    nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
@@ -337,7 +326,6 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int count)
         return 0;
     }
 
-    nbd_coroutine_start(client, &request);
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         reply.error = -ret;
-- 
2.13.0

* [Qemu-devel] [PATCH 07/11] qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc}
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (5 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 06/11] nbd: " Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext Paolo Bonzini
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

This part is never called for in-place writes; move it away to avoid
the "backwards" coding style typical of callback-based code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/qed.c | 68 ++++++++++++++++++++++++++++---------------------------------
 1 file changed, 31 insertions(+), 37 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index ab505faa84..83a0973cfb 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -981,39 +981,11 @@ static int qed_aio_write_main(QEDAIOCB *acb)
     BDRVQEDState *s = acb_to_s(acb);
     uint64_t offset = acb->cur_cluster +
                       qed_offset_into_cluster(s, acb->cur_pos);
-    int ret;
 
     trace_qed_aio_write_main(s, acb, 0, offset, acb->cur_qiov.size);
 
     BLKDBG_EVENT(s->bs->file, BLKDBG_WRITE_AIO);
-    ret = bdrv_pwritev(s->bs->file, offset, &acb->cur_qiov);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (acb->find_cluster_ret != QED_CLUSTER_FOUND) {
-        if (s->bs->backing) {
-            /*
-             * Flush new data clusters before updating the L2 table
-             *
-             * This flush is necessary when a backing file is in use.  A crash
-             * during an allocating write could result in empty clusters in the
-             * image.  If the write only touched a subregion of the cluster,
-             * then backing image sectors have been lost in the untouched
-             * region.  The solution is to flush after writing a new data
-             * cluster and before updating the L2 table.
-             */
-            ret = bdrv_flush(s->bs->file->bs);
-            if (ret < 0) {
-                return ret;
-            }
-        }
-        ret = qed_aio_write_l2_update(acb, acb->cur_cluster);
-        if (ret < 0) {
-            return ret;
-        }
-    }
-    return 0;
+    return bdrv_pwritev(s->bs->file, offset, &acb->cur_qiov);
 }
 
 /**
@@ -1048,7 +1020,29 @@ static int qed_aio_write_cow(QEDAIOCB *acb)
         return ret;
     }
 
-    return qed_aio_write_main(acb);
+    ret = qed_aio_write_main(acb);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (s->bs->backing) {
+        /*
+         * Flush new data clusters before updating the L2 table
+         *
+         * This flush is necessary when a backing file is in use.  A crash
+         * during an allocating write could result in empty clusters in the
+         * image.  If the write only touched a subregion of the cluster,
+         * then backing image sectors have been lost in the untouched
+         * region.  The solution is to flush after writing a new data
+         * cluster and before updating the L2 table.
+         */
+        ret = bdrv_flush(s->bs->file->bs);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
 }
 
 /**
@@ -1101,6 +1095,7 @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
         if (acb->find_cluster_ret == QED_CLUSTER_ZERO) {
             return 0;
         }
+        acb->cur_cluster = 1;
     } else {
         acb->cur_cluster = qed_alloc_clusters(s, acb->cur_nclusters);
     }
@@ -1113,15 +1108,14 @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
         }
     }
 
-    if (acb->flags & QED_AIOCB_ZERO) {
-        ret = qed_aio_write_l2_update(acb, 1);
-    } else {
+    if (!(acb->flags & QED_AIOCB_ZERO)) {
         ret = qed_aio_write_cow(acb);
+        if (ret < 0) {
+            return ret;
+        }
     }
-    if (ret < 0) {
-        return ret;
-    }
-    return 0;
+
+    return qed_aio_write_l2_update(acb, acb->cur_cluster);
 }
 
 /**
-- 
2.13.0

* [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (6 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 07/11] qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-06-08 12:46   ` Stefan Hajnoczi
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 09/11] qed: protect table cache with CoMutex Paolo Bonzini
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

This will let the callback take a CoMutex in the next patch.
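
A CoMutex can only be acquired from coroutine context, which is why the
callback becomes a coroutine_fn and is entered through a coroutine.  As a
hypothetical example (names invented), a driver that needs its lock in the
drain callback would now be able to write:

    static void coroutine_fn bdrv_foo_co_drain(BlockDriverState *bs)
    {
        BDRVFooState *s = bs->opaque;   /* hypothetical driver state */

        qemu_co_mutex_lock(&s->lock);
        /* quiesce driver-internal request sources while holding the lock */
        qemu_co_mutex_unlock(&s->lock);
    }

Patch 9 relies on this to take qed's table_lock from the drain callback.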

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/io.c                | 42 +++++++++++++++++++++++++++++++++---------
 block/qed.c               |  6 +++---
 include/block/block_int.h |  2 +-
 3 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/block/io.c b/block/io.c
index 03c77da66f..d5d9107885 100644
--- a/block/io.c
+++ b/block/io.c
@@ -157,6 +157,37 @@ bool bdrv_requests_pending(BlockDriverState *bs)
     return false;
 }
 
+typedef struct {
+    Coroutine *co;
+    BlockDriverState *bs;
+    bool done;
+} BdrvCoDrainData;
+
+static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
+{
+    BdrvCoDrainData *data = opaque;
+    BlockDriverState *bs = data->bs;
+
+    bs->drv->bdrv_co_drain(bs);
+
+    /* Set data->done before reading bs->wakeup.  */
+    atomic_mb_set(&data->done, true);
+    bdrv_wakeup(bs);
+}
+
+static void bdrv_drain_invoke(BlockDriverState *bs)
+{
+    BdrvCoDrainData data = { .bs = bs, .done = false };
+
+    if (!bs->drv || !bs->drv->bdrv_co_drain) {
+        return;
+    }
+
+    data.co = qemu_coroutine_create(bdrv_drain_invoke_entry, &data);
+    bdrv_coroutine_enter(bs, data.co);
+    BDRV_POLL_WHILE(bs, !data.done);
+}
+
 static bool bdrv_drain_recurse(BlockDriverState *bs)
 {
     BdrvChild *child, *tmp;
@@ -164,9 +195,8 @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
 
     waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
 
-    if (bs->drv && bs->drv->bdrv_drain) {
-        bs->drv->bdrv_drain(bs);
-    }
+    /* Ensure any pending metadata writes are submitted to bs->file.  */
+    bdrv_drain_invoke(bs);
 
     QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
         BlockDriverState *bs = child->bs;
@@ -192,12 +222,6 @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
     return waited;
 }
 
-typedef struct {
-    Coroutine *co;
-    BlockDriverState *bs;
-    bool done;
-} BdrvCoDrainData;
-
 static void bdrv_co_drain_bh_cb(void *opaque)
 {
     BdrvCoDrainData *data = opaque;
diff --git a/block/qed.c b/block/qed.c
index 83a0973cfb..990210cd9c 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -351,7 +351,7 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
     }
 }
 
-static void bdrv_qed_drain(BlockDriverState *bs)
+static void coroutine_fn bdrv_qed_co_drain(BlockDriverState *bs)
 {
     BDRVQEDState *s = bs->opaque;
 
@@ -360,7 +360,7 @@ static void bdrv_qed_drain(BlockDriverState *bs)
      */
     if (s->need_check_timer && timer_pending(s->need_check_timer)) {
         qed_cancel_need_check_timer(s);
-        qed_need_check_timer_cb(s);
+        qed_need_check_timer_entry(s);
     }
 }
 
@@ -1544,7 +1544,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_check               = bdrv_qed_check,
     .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
     .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
-    .bdrv_drain               = bdrv_qed_drain,
+    .bdrv_co_drain            = bdrv_qed_co_drain,
 };
 
 static void bdrv_qed_init(void)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a150791f0f..b14340d751 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -314,7 +314,7 @@ struct BlockDriver {
      * Drain and stop any internal sources of requests in the driver, and
      * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
      */
-    void (*bdrv_drain)(BlockDriverState *bs);
+    void coroutine_fn (*bdrv_co_drain)(BlockDriverState *bs);
 
     void (*bdrv_add_child)(BlockDriverState *parent, BlockDriverState *child,
                            Error **errp);
-- 
2.13.0

* [Qemu-devel] [PATCH 09/11] qed: protect table cache with CoMutex
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (7 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 10/11] sheepdog: add queue_lock Paolo Bonzini
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

This makes the driver thread-safe.  The CoMutex is dropped temporarily
while accessing the data clusters or the backing file.
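
Concretely, the rule is that the L1/L2 tables and the L2 cache are only
touched with table_lock held, while data-cluster and backing-file I/O runs
with the lock dropped.  A condensed sketch of one write step (the function
name is invented; the calls and fields are the ones used in this patch):

    static coroutine_fn int example_qed_write_step(QEDAIOCB *acb)
    {
        BDRVQEDState *s = acb_to_s(acb);
        int ret;

        /* entered with table_lock held */
        qemu_co_mutex_unlock(&s->table_lock);
        ret = qed_aio_write_main(acb);        /* data cluster I/O, unlocked */
        qemu_co_mutex_lock(&s->table_lock);
        if (ret < 0) {
            return ret;
        }
        /* metadata update, back under table_lock */
        return qed_aio_write_l2_update(acb, acb->cur_cluster);
    }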

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/qed-cluster.c  |   4 +-
 block/qed-l2-cache.c |   6 +++
 block/qed-table.c    |  24 +++++++---
 block/qed.c          | 131 ++++++++++++++++++++++++++++++++++-----------------
 block/qed.h          |  11 +++--
 5 files changed, 121 insertions(+), 55 deletions(-)

diff --git a/block/qed-cluster.c b/block/qed-cluster.c
index 88dc979d8c..fd8722e08f 100644
--- a/block/qed-cluster.c
+++ b/block/qed-cluster.c
@@ -85,6 +85,8 @@ static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
  *
  * On failure QED_CLUSTER_L2 or QED_CLUSTER_L1 is returned for missing L2 or L1
  * table offset, respectively. len is number of contiguous unallocated bytes.
+ *
+ * Called with table_lock held.
  */
 int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
                      size_t *len, uint64_t *img_offset)
@@ -111,7 +113,6 @@ int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
     }
 
     ret = qed_read_l2_table(s, request, l2_offset);
-    qed_acquire(s);
     if (ret) {
         goto out;
     }
@@ -136,6 +137,5 @@ int qed_find_cluster(BDRVQEDState *s, QEDRequest *request, uint64_t pos,
 
 out:
     *img_offset = offset;
-    qed_release(s);
     return ret;
 }
diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
index 5cba794650..b548362398 100644
--- a/block/qed-l2-cache.c
+++ b/block/qed-l2-cache.c
@@ -101,6 +101,8 @@ CachedL2Table *qed_alloc_l2_cache_entry(L2TableCache *l2_cache)
 /**
  * Decrease an entry's reference count and free if necessary when the reference
  * count drops to zero.
+ *
+ * Called with table_lock held.
  */
 void qed_unref_l2_cache_entry(CachedL2Table *entry)
 {
@@ -122,6 +124,8 @@ void qed_unref_l2_cache_entry(CachedL2Table *entry)
  *
  * For a cached entry, this function increases the reference count and returns
  * the entry.
+ *
+ * Called with table_lock held.
  */
 CachedL2Table *qed_find_l2_cache_entry(L2TableCache *l2_cache, uint64_t offset)
 {
@@ -150,6 +154,8 @@ CachedL2Table *qed_find_l2_cache_entry(L2TableCache *l2_cache, uint64_t offset)
  * N.B. This function steals a reference to the l2_table from the caller so the
  * caller must obtain a new reference by issuing a call to
  * qed_find_l2_cache_entry().
+ *
+ * Called with table_lock held.
  */
 void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
 {
diff --git a/block/qed-table.c b/block/qed-table.c
index ebee2c50f0..eead8b0fc7 100644
--- a/block/qed-table.c
+++ b/block/qed-table.c
@@ -18,6 +18,7 @@
 #include "qed.h"
 #include "qemu/bswap.h"
 
+/* Called either from qed_check or with table_lock held.  */
 static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
 {
     QEMUIOVector qiov;
@@ -32,18 +33,22 @@ static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
 
     trace_qed_read_table(s, offset, table);
 
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->table_lock);
+    }
     ret = bdrv_preadv(s->bs->file, offset, &qiov);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->table_lock);
+    }
     if (ret < 0) {
         goto out;
     }
 
     /* Byteswap offsets */
-    qed_acquire(s);
     noffsets = qiov.size / sizeof(uint64_t);
     for (i = 0; i < noffsets; i++) {
         table->offsets[i] = le64_to_cpu(table->offsets[i]);
     }
-    qed_release(s);
 
     ret = 0;
 out:
@@ -61,6 +66,8 @@ out:
  * @index:      Index of first element
  * @n:          Number of elements
  * @flush:      Whether or not to sync to disk
+ *
+ * Called either from qed_check or with table_lock held.
  */
 static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
                            unsigned int index, unsigned int n, bool flush)
@@ -97,16 +104,20 @@ static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
     /* Adjust for offset into table */
     offset += start * sizeof(uint64_t);
 
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->table_lock);
+    }
     ret = bdrv_pwritev(s->bs->file, offset, &qiov);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->table_lock);
+    }
     trace_qed_write_table_cb(s, table, flush, ret);
     if (ret < 0) {
         goto out;
     }
 
     if (flush) {
-        qed_acquire(s);
         ret = bdrv_flush(s->bs);
-        qed_release(s);
         if (ret < 0) {
             goto out;
         }
@@ -123,6 +134,7 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
     return qed_read_table(s, s->header.l1_table_offset, s->l1_table);
 }
 
+/* Called either from qed_check or with table_lock held.  */
 int qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n)
 {
     BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
@@ -136,6 +148,7 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
     return qed_write_l1_table(s, index, n);
 }
 
+/* Called either from qed_check or with table_lock held.  */
 int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
 {
     int ret;
@@ -154,7 +167,6 @@ int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
     BLKDBG_EVENT(s->bs->file, BLKDBG_L2_LOAD);
     ret = qed_read_table(s, offset, request->l2_table->table);
 
-    qed_acquire(s);
     if (ret) {
         /* can't trust loaded L2 table anymore */
         qed_unref_l2_cache_entry(request->l2_table);
@@ -170,7 +182,6 @@ int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
         request->l2_table = qed_find_l2_cache_entry(&s->l2_cache, offset);
         assert(request->l2_table != NULL);
     }
-    qed_release(s);
 
     return ret;
 }
@@ -180,6 +191,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
     return qed_read_l2_table(s, request, offset);
 }
 
+/* Called either from qed_check or with table_lock held.  */
 int qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
                        unsigned int index, unsigned int n, bool flush)
 {
diff --git a/block/qed.c b/block/qed.c
index 990210cd9c..3e93ef7873 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -94,6 +94,8 @@ int qed_write_header_sync(BDRVQEDState *s)
  *
  * This function only updates known header fields in-place and does not affect
  * extra data after the QED header.
+ *
+ * No new allocating reqs can start while this function runs.
  */
 static int qed_write_header(BDRVQEDState *s)
 {
@@ -110,6 +112,8 @@ static int qed_write_header(BDRVQEDState *s)
     QEMUIOVector qiov;
     int ret;
 
+    assert(s->allocating_acb || s->allocating_write_reqs_plugged);
+
     buf = qemu_blockalign(s->bs, len);
     iov = (struct iovec) {
         .iov_base = buf,
@@ -220,6 +224,8 @@ static int qed_read_string(BdrvChild *file, uint64_t offset, size_t n,
  * This function only produces the offset where the new clusters should be
  * written.  It updates BDRVQEDState but does not make any changes to the image
  * file.
+ *
+ * Called with table_lock held.
  */
 static uint64_t qed_alloc_clusters(BDRVQEDState *s, unsigned int n)
 {
@@ -237,6 +243,8 @@ QEDTable *qed_alloc_table(BDRVQEDState *s)
 
 /**
  * Allocate a new zeroed L2 table
+ *
+ * Called with table_lock held.
  */
 static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
 {
@@ -250,19 +258,32 @@ static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
     return l2_table;
 }
 
-static void qed_plug_allocating_write_reqs(BDRVQEDState *s)
+static bool qed_plug_allocating_write_reqs(BDRVQEDState *s)
 {
+    qemu_co_mutex_lock(&s->table_lock);
+
+    /* No reentrancy is allowed.  */
     assert(!s->allocating_write_reqs_plugged);
+    if (s->allocating_acb != NULL) {
+        /* Another allocating write came concurrently.  This cannot happen
+         * from bdrv_qed_co_drain, but it can happen when the timer runs.
+         */
+        qemu_co_mutex_unlock(&s->table_lock);
+        return false;
+    }
 
     s->allocating_write_reqs_plugged = true;
+    qemu_co_mutex_unlock(&s->table_lock);
+    return true;
 }
 
 static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
 {
+    qemu_co_mutex_lock(&s->table_lock);
     assert(s->allocating_write_reqs_plugged);
-
     s->allocating_write_reqs_plugged = false;
-    qemu_co_enter_next(&s->allocating_write_reqs);
+    qemu_co_queue_next(&s->allocating_write_reqs);
+    qemu_co_mutex_unlock(&s->table_lock);
 }
 
 static void qed_need_check_timer_entry(void *opaque)
@@ -270,17 +291,14 @@ static void qed_need_check_timer_entry(void *opaque)
     BDRVQEDState *s = opaque;
     int ret;
 
-    /* The timer should only fire when allocating writes have drained */
-    assert(!s->allocating_acb);
-
     trace_qed_need_check_timer_cb(s);
 
-    qed_acquire(s);
-    qed_plug_allocating_write_reqs(s);
+    if (!qed_plug_allocating_write_reqs(s)) {
+        return;
+    }
 
     /* Ensure writes are on disk before clearing flag */
     ret = bdrv_co_flush(s->bs->file->bs);
-    qed_release(s);
     if (ret < 0) {
         goto out;
     }
@@ -302,16 +320,6 @@ static void qed_need_check_timer_cb(void *opaque)
     qemu_coroutine_enter(co);
 }
 
-void qed_acquire(BDRVQEDState *s)
-{
-    aio_context_acquire(bdrv_get_aio_context(s->bs));
-}
-
-void qed_release(BDRVQEDState *s)
-{
-    aio_context_release(bdrv_get_aio_context(s->bs));
-}
-
 static void qed_start_need_check_timer(BDRVQEDState *s)
 {
     trace_qed_start_need_check_timer(s);
@@ -373,6 +381,7 @@ static int bdrv_qed_do_open(BlockDriverState *bs, QDict *options, int flags,
     int ret;
 
     s->bs = bs;
+    qemu_co_mutex_init(&s->table_lock);
     qemu_co_queue_init(&s->allocating_write_reqs);
 
     ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header));
@@ -682,6 +691,7 @@ typedef struct {
     BlockDriverState **file;
 } QEDIsAllocatedCB;
 
+/* Called with table_lock held.  */
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
 {
     QEDIsAllocatedCB *cb = opaque;
@@ -729,6 +739,7 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
     uint64_t offset;
     int ret;
 
+    qemu_co_mutex_lock(&s->table_lock);
     ret = qed_find_cluster(s, &request, cb.pos, &len, &offset);
     qed_is_allocated_cb(&cb, ret, offset, len);
 
@@ -736,6 +747,7 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
     assert(cb.status != BDRV_BLOCK_OFFSET_MASK);
 
     qed_unref_l2_cache_entry(request.l2_table);
+    qemu_co_mutex_unlock(&s->table_lock);
 
     return cb.status;
 }
@@ -865,6 +877,8 @@ out:
  *
  * The cluster offset may be an allocated byte offset in the image file, the
  * zero cluster marker, or the unallocated cluster marker.
+ *
+ * Called with table_lock held.
  */
 static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
                                 unsigned int n, uint64_t cluster)
@@ -879,6 +893,7 @@ static void qed_update_l2_table(BDRVQEDState *s, QEDTable *table, int index,
     }
 }
 
+/* Called with table_lock held.  */
 static void qed_aio_complete(QEDAIOCB *acb)
 {
     BDRVQEDState *s = acb_to_s(acb);
@@ -902,7 +917,7 @@ static void qed_aio_complete(QEDAIOCB *acb)
     if (acb == s->allocating_acb) {
         s->allocating_acb = NULL;
         if (!qemu_co_queue_empty(&s->allocating_write_reqs)) {
-            qemu_co_enter_next(&s->allocating_write_reqs);
+            qemu_co_queue_next(&s->allocating_write_reqs);
         } else if (s->header.features & QED_F_NEED_CHECK) {
             qed_start_need_check_timer(s);
         }
@@ -911,6 +926,8 @@ static void qed_aio_complete(QEDAIOCB *acb)
 
 /**
  * Update L1 table with new L2 table offset and write it out
+ *
+ * Called with table_lock held.
  */
 static int qed_aio_write_l1_update(QEDAIOCB *acb)
 {
@@ -939,6 +956,8 @@ static int qed_aio_write_l1_update(QEDAIOCB *acb)
 
 /**
  * Update L2 table with new cluster offsets and write them out
+ *
+ * Called with table_lock held.
  */
 static int qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
 {
@@ -975,6 +994,8 @@ static int qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
 
 /**
  * Write data to the image file
+ *
+ * Called with table_lock *not* held.
  */
 static int qed_aio_write_main(QEDAIOCB *acb)
 {
@@ -990,6 +1011,8 @@ static int qed_aio_write_main(QEDAIOCB *acb)
 
 /**
  * Populate untouched regions of new data cluster
+ *
+ * Called with table_lock held.
  */
 static int qed_aio_write_cow(QEDAIOCB *acb)
 {
@@ -997,6 +1020,8 @@ static int qed_aio_write_cow(QEDAIOCB *acb)
     uint64_t start, len, offset;
     int ret;
 
+    qemu_co_mutex_unlock(&s->table_lock);
+
     /* Populate front untouched region of new data cluster */
     start = qed_start_of_cluster(s, acb->cur_pos);
     len = qed_offset_into_cluster(s, acb->cur_pos);
@@ -1004,7 +1029,7 @@ static int qed_aio_write_cow(QEDAIOCB *acb)
     trace_qed_aio_write_prefill(s, acb, start, len, acb->cur_cluster);
     ret = qed_copy_from_backing_file(s, start, len, acb->cur_cluster);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     /* Populate back untouched region of new data cluster */
@@ -1017,12 +1042,12 @@ static int qed_aio_write_cow(QEDAIOCB *acb)
     trace_qed_aio_write_postfill(s, acb, start, len, offset);
     ret = qed_copy_from_backing_file(s, start, len, offset);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     ret = qed_aio_write_main(acb);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     if (s->bs->backing) {
@@ -1037,12 +1062,11 @@ static int qed_aio_write_cow(QEDAIOCB *acb)
          * cluster and before updating the L2 table.
          */
         ret = bdrv_flush(s->bs->file->bs);
-        if (ret < 0) {
-            return ret;
-        }
     }
 
-    return 0;
+out:
+    qemu_co_mutex_lock(&s->table_lock);
+    return ret;
 }
 
 /**
@@ -1065,6 +1089,8 @@ static bool qed_should_set_need_check(BDRVQEDState *s)
  * @len:        Length in bytes
  *
  * This path is taken when writing to previously unallocated clusters.
+ *
+ * Called with table_lock held.
  */
 static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
 {
@@ -1079,7 +1105,7 @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
     /* Freeze this request if another allocating write is in progress */
     if (s->allocating_acb != acb || s->allocating_write_reqs_plugged) {
         if (s->allocating_acb != NULL) {
-            qemu_co_queue_wait(&s->allocating_write_reqs, NULL);
+            qemu_co_queue_wait(&s->allocating_write_reqs, &s->table_lock);
             assert(s->allocating_acb == NULL);
         }
         s->allocating_acb = acb;
@@ -1126,9 +1152,16 @@ static int qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
  * @len:        Length in bytes
  *
  * This path is taken when writing to already allocated clusters.
+ *
+ * Called with table_lock held.
  */
 static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
 {
+    BDRVQEDState *s = acb_to_s(acb);
+    int r;
+
+    qemu_co_mutex_unlock(&s->table_lock);
+
     /* Allocate buffer for zero writes */
     if (acb->flags & QED_AIOCB_ZERO) {
         struct iovec *iov = acb->qiov->iov;
@@ -1136,7 +1169,8 @@ static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
         if (!iov->iov_base) {
             iov->iov_base = qemu_try_blockalign(acb->bs, iov->iov_len);
             if (iov->iov_base == NULL) {
-                return -ENOMEM;
+                r = -ENOMEM;
+                goto out;
             }
             memset(iov->iov_base, 0, iov->iov_len);
         }
@@ -1146,8 +1180,11 @@ static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
     acb->cur_cluster = offset;
     qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
-    /* Do the actual write */
-    return qed_aio_write_main(acb);
+    /* Do the actual write.  */
+    r = qed_aio_write_main(acb);
+out:
+    qemu_co_mutex_lock(&s->table_lock);
+    return r;
 }
 
 /**
@@ -1158,7 +1195,7 @@ static int qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset, size_t len)
  * @offset:     Cluster offset in bytes
  * @len:        Length in bytes
  *
- * Callback from qed_find_cluster().
+ * Called with table_lock held.
  */
 static int qed_aio_write_data(void *opaque, int ret,
                               uint64_t offset, size_t len)
@@ -1191,13 +1228,16 @@ static int qed_aio_write_data(void *opaque, int ret,
  * @offset:     Cluster offset in bytes
  * @len:        Length in bytes
  *
- * Callback from qed_find_cluster().
+ * Called with table_lock held.
  */
 static int qed_aio_read_data(void *opaque, int ret, uint64_t offset, size_t len)
 {
     QEDAIOCB *acb = opaque;
     BDRVQEDState *s = acb_to_s(acb);
     BlockDriverState *bs = acb->bs;
+    int r;
+
+    qemu_co_mutex_unlock(&s->table_lock);
 
     /* Adjust offset into cluster */
     offset += qed_offset_into_cluster(s, acb->cur_pos);
@@ -1206,21 +1246,22 @@ static int qed_aio_read_data(void *opaque, int ret, uint64_t offset, size_t len)
 
     qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
-    /* Handle zero cluster and backing file reads */
+    /* Handle zero cluster and backing file reads, otherwise read
+     * data cluster directly.
+     */
     if (ret == QED_CLUSTER_ZERO) {
         qemu_iovec_memset(&acb->cur_qiov, 0, 0, acb->cur_qiov.size);
-        return 0;
+        r = 0;
     } else if (ret != QED_CLUSTER_FOUND) {
-        return qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
-                                     &acb->backing_qiov);
+        r = qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
+                                  &acb->backing_qiov);
+    } else {
+        BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
+        r = bdrv_preadv(bs->file, offset, &acb->cur_qiov);
     }
 
-    BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    ret = bdrv_preadv(bs->file, offset, &acb->cur_qiov);
-    if (ret < 0) {
-        return ret;
-    }
-    return 0;
+    qemu_co_mutex_lock(&s->table_lock);
+    return r;
 }
 
 /**
@@ -1233,6 +1276,7 @@ static int qed_aio_next_io(QEDAIOCB *acb)
     size_t len;
     int ret;
 
+    qemu_co_mutex_lock(&s->table_lock);
     while (1) {
         trace_qed_aio_next_io(s, acb, 0, acb->cur_pos + acb->cur_qiov.size);
 
@@ -1272,6 +1316,7 @@ static int qed_aio_next_io(QEDAIOCB *acb)
 
     trace_qed_aio_complete(s, acb, ret);
     qed_aio_complete(acb);
+    qemu_co_mutex_unlock(&s->table_lock);
     return ret;
 }
 
diff --git a/block/qed.h b/block/qed.h
index fb80943c2d..21a338cdd5 100644
--- a/block/qed.h
+++ b/block/qed.h
@@ -151,15 +151,21 @@ typedef struct QEDAIOCB {
 
 typedef struct {
     BlockDriverState *bs;           /* device */
-    uint64_t file_size;             /* length of image file, in bytes */
 
+    /* Written only by an allocating write or the timer handler (the latter
+     * while allocating reqs are plugged).
+     */
     QEDHeader header;               /* always cpu-endian */
+
+    /* Protected by table_lock.  */
+    CoMutex table_lock;
     QEDTable *l1_table;
     L2TableCache l2_cache;          /* l2 table cache */
     uint32_t table_nelems;
     uint32_t l1_shift;
     uint32_t l2_shift;
     uint32_t l2_mask;
+    uint64_t file_size;             /* length of image file, in bytes */
 
     /* Allocating write request queue */
     QEDAIOCB *allocating_acb;
@@ -177,9 +183,6 @@ enum {
     QED_CLUSTER_L1,            /* cluster missing in L1 */
 };
 
-void qed_acquire(BDRVQEDState *s);
-void qed_release(BDRVQEDState *s);
-
 /**
  * Header functions
  */
-- 
2.13.0

* [Qemu-devel] [PATCH 10/11] sheepdog: add queue_lock
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (8 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 09/11] qed: protect table cache with CoMutex Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 11/11] ssh: support I/O from any AioContext Paolo Bonzini
  2017-06-08 12:52 ` [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Stefan Hajnoczi
  11 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/sheepdog.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 5ebf5d9fbb..551db7cd17 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -391,6 +391,7 @@ struct BDRVSheepdogState {
     QLIST_HEAD(inflight_aio_head, AIOReq) inflight_aio_head;
     QLIST_HEAD(failed_aio_head, AIOReq) failed_aio_head;
 
+    CoMutex queue_lock;
     CoQueue overlapping_queue;
     QLIST_HEAD(inflight_aiocb_head, SheepdogAIOCB) inflight_aiocb_head;
 };
@@ -489,7 +490,7 @@ static void wait_for_overlapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB *acb)
 retry:
     QLIST_FOREACH(cb, &s->inflight_aiocb_head, aiocb_siblings) {
         if (AIOCBOverlapping(acb, cb)) {
-            qemu_co_queue_wait(&s->overlapping_queue, NULL);
+            qemu_co_queue_wait(&s->overlapping_queue, &s->queue_lock);
             goto retry;
         }
     }
@@ -526,8 +527,10 @@ static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
         return;
     }
 
+    qemu_co_mutex_lock(&s->queue_lock);
     wait_for_overlapping_aiocb(s, acb);
     QLIST_INSERT_HEAD(&s->inflight_aiocb_head, acb, aiocb_siblings);
+    qemu_co_mutex_unlock(&s->queue_lock);
 }
 
 static SocketAddress *sd_socket_address(const char *path,
@@ -786,6 +789,7 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
      * have to move all the inflight requests to the failed queue before
      * resend_aioreq() is called.
      */
+    qemu_co_mutex_lock(&s->queue_lock);
     QLIST_FOREACH_SAFE(aio_req, &s->inflight_aio_head, aio_siblings, next) {
         QLIST_REMOVE(aio_req, aio_siblings);
         QLIST_INSERT_HEAD(&s->failed_aio_head, aio_req, aio_siblings);
@@ -795,8 +799,11 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
     while (!QLIST_EMPTY(&s->failed_aio_head)) {
         aio_req = QLIST_FIRST(&s->failed_aio_head);
         QLIST_REMOVE(aio_req, aio_siblings);
+        qemu_co_mutex_unlock(&s->queue_lock);
         resend_aioreq(s, aio_req);
+        qemu_co_mutex_lock(&s->queue_lock);
     }
+    qemu_co_mutex_unlock(&s->queue_lock);
 }
 
 /*
@@ -888,7 +895,10 @@ static void coroutine_fn aio_read_response(void *opaque)
     */
     s->co_recv = NULL;
 
+    qemu_co_mutex_lock(&s->queue_lock);
     QLIST_REMOVE(aio_req, aio_siblings);
+    qemu_co_mutex_unlock(&s->queue_lock);
+
     switch (rsp.result) {
     case SD_RES_SUCCESS:
         break;
@@ -1308,7 +1318,9 @@ static void coroutine_fn add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
     uint64_t old_oid = aio_req->base_oid;
     bool create = aio_req->create;
 
+    qemu_co_mutex_lock(&s->queue_lock);
     QLIST_INSERT_HEAD(&s->inflight_aio_head, aio_req, aio_siblings);
+    qemu_co_mutex_unlock(&s->queue_lock);
 
     if (!nr_copies) {
         error_report("bug");
@@ -1679,6 +1691,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
     bs->total_sectors = s->inode.vdi_size / BDRV_SECTOR_SIZE;
     pstrcpy(s->name, sizeof(s->name), vdi);
     qemu_co_mutex_init(&s->lock);
+    qemu_co_mutex_init(&s->queue_lock);
     qemu_co_queue_init(&s->overlapping_queue);
     qemu_opts_del(opts);
     g_free(buf);
@@ -2432,12 +2445,16 @@ static void coroutine_fn sd_co_rw_vector(SheepdogAIOCB *acb)
 
 static void sd_aio_complete(SheepdogAIOCB *acb)
 {
+    BDRVSheepdogState *s;
     if (acb->aiocb_type == AIOCB_FLUSH_CACHE) {
         return;
     }
 
+    s = acb->s;
+    qemu_co_mutex_lock(&s->queue_lock);
     QLIST_REMOVE(acb, aiocb_siblings);
-    qemu_co_queue_restart_all(&acb->s->overlapping_queue);
+    qemu_co_queue_restart_all(&s->overlapping_queue);
+    qemu_co_mutex_unlock(&s->queue_lock);
 }
 
 static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
-- 
2.13.0

* [Qemu-devel] [PATCH 11/11] ssh: support I/O from any AioContext
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (9 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 10/11] sheepdog: add queue_lock Paolo Bonzini
@ 2017-05-31  9:43 ` Paolo Bonzini
  2017-06-08 12:52 ` [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Stefan Hajnoczi
  11 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-05-31  9:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

The coroutine may be restarted in a different AioContext than the one
where the fd handler is registered, causing the fd handler to busy
wait.  Fix this by resetting the handler in restart_coroutine, before
the coroutine is restarted.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
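A condensed view of the resulting wait/wake pattern, for reference while
reading the diff below (sketch only; tracing and error handling omitted):

    typedef struct BDRVSSHRestart {
        BlockDriverState *bs;
        Coroutine *co;
    } BDRVSSHRestart;

    /* Runs in the AioContext where the fd handler was registered. */
    static void restart_coroutine(void *opaque)
    {
        BDRVSSHRestart *restart = opaque;
        BDRVSSHState *s = restart->bs->opaque;

        /* Remove the handler *before* waking the coroutine, which may
         * then continue in a different AioContext. */
        aio_set_fd_handler(bdrv_get_aio_context(restart->bs), s->sock,
                           false, NULL, NULL, NULL, NULL);
        aio_co_wake(restart->co);
    }

    static coroutine_fn void co_yield(BDRVSSHState *s, BlockDriverState *bs)
    {
        IOHandler *rd_handler = NULL, *wr_handler = NULL;
        BDRVSSHRestart restart = { .bs = bs, .co = qemu_coroutine_self() };
        int r = libssh2_session_block_directions(s->session);

        if (r & LIBSSH2_SESSION_BLOCK_INBOUND) {
            rd_handler = restart_coroutine;
        }
        if (r & LIBSSH2_SESSION_BLOCK_OUTBOUND) {
            wr_handler = restart_coroutine;
        }
        aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
                           false, rd_handler, wr_handler, NULL, &restart);
        qemu_coroutine_yield();
        /* No cleanup here: the handler has already removed itself. */
    }
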
 block/ssh.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index 11203fc5a2..b5a35824d7 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -889,13 +889,22 @@ static int ssh_has_zero_init(BlockDriverState *bs)
     return has_zero_init;
 }
 
+typedef struct BDRVSSHRestart {
+    BlockDriverState *bs;
+    Coroutine *co;
+} BDRVSSHRestart;
+
 static void restart_coroutine(void *opaque)
 {
-    Coroutine *co = opaque;
+    BDRVSSHRestart *restart = opaque;
+    BlockDriverState *bs = restart->bs;
+    BDRVSSHState *s = bs->opaque;
+    AioContext *ctx = bdrv_get_aio_context(bs);
 
-    DPRINTF("co=%p", co);
+    DPRINTF("co=%p", restart->co);
+    aio_set_fd_handler(ctx, s->sock, false, NULL, NULL, NULL, NULL);
 
-    aio_co_wake(co);
+    aio_co_wake(restart->co);
 }
 
 /* A non-blocking call returned EAGAIN, so yield, ensuring the
@@ -906,7 +915,10 @@ static coroutine_fn void co_yield(BDRVSSHState *s, BlockDriverState *bs)
 {
     int r;
     IOHandler *rd_handler = NULL, *wr_handler = NULL;
-    Coroutine *co = qemu_coroutine_self();
+    BDRVSSHRestart restart = {
+        .bs = bs,
+        .co = qemu_coroutine_self()
+    };
 
     r = libssh2_session_block_directions(s->session);
 
@@ -921,11 +933,9 @@ static coroutine_fn void co_yield(BDRVSSHState *s, BlockDriverState *bs)
             rd_handler, wr_handler);
 
     aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock,
-                       false, rd_handler, wr_handler, NULL, co);
+                       false, rd_handler, wr_handler, NULL, &restart);
     qemu_coroutine_yield();
     DPRINTF("s->sock=%d - back", s->sock);
-    aio_set_fd_handler(bdrv_get_aio_context(bs), s->sock, false,
-                       NULL, NULL, NULL, NULL);
 }
 
 /* SFTP has a function `libssh2_sftp_seek64' which seeks to a position
-- 
2.13.0


* Re: [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex Paolo Bonzini
@ 2017-05-31 15:18   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 15:18 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/qcow2.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

> diff --git a/block/qcow2.c b/block/qcow2.c
> index 1c2697732b..c9193603b9 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1696,8 +1696,6 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>      ret = 0;
>  
>  fail:
> -    qemu_co_mutex_unlock(&s->lock);
> -
>      while (l2meta != NULL) {
>          QCowL2Meta *next;
>  
> @@ -1711,6 +1709,8 @@ fail:
>          l2meta = next;
>      }
>  
> +    qemu_co_mutex_unlock(&s->lock);
> +
>      qemu_iovec_destroy(&hd_qiov);
>      qemu_vfree(cluster_data);
>      trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade Paolo Bonzini
@ 2017-05-31 20:50   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 20:50 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

[-- Attachment #1: Type: text/plain, Size: 634 bytes --]

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> These functions are more efficient in the presence of contention.
> qemu_co_rwlock_downgrade also guarantees not to block, which may
> be useful in some algorithms too.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/qemu/coroutine.h   | 19 +++++++++++++++++++
>  util/qemu-coroutine-lock.c | 35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>
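
A minimal usage sketch of the two new primitives (illustrative only, not
taken from this series; needs_update()/do_update() are placeholder names):

    static void coroutine_fn update_if_needed(CoRwlock *lock)
    {
        qemu_co_rwlock_rdlock(lock);
        if (needs_update()) {
            /* Upgrading may yield, so re-check the condition afterwards. */
            qemu_co_rwlock_upgrade(lock);
            if (needs_update()) {
                do_update();
            }
            /* Let readers in again without fully dropping the lock;
             * downgrading is guaranteed not to block. */
            qemu_co_rwlock_downgrade(lock);
        }
        /* ... read-side work ... */
        qemu_co_rwlock_unlock(lock);
    }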

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe Paolo Bonzini
@ 2017-05-31 20:55   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 20:55 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> The VirtualBox driver is using a mutex to order all allocating writes,
> but it is not protecting accesses to the bitmap because they implicitly
> happen under the AioContext mutex.  Change this to use a CoRwlock
> explicitly.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/vdi.c | 48 ++++++++++++++++++++++++------------------------
>  1 file changed, 24 insertions(+), 24 deletions(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>
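
Condensed, the scheme described above takes roughly this shape in the
allocating write path (a sketch with placeholder helpers, not the literal
block/vdi.c code):

    qemu_co_rwlock_rdlock(&s->bmap_lock);
    bmap_entry = le32_to_cpu(s->bmap[block_index]);
    if (!VDI_IS_ALLOCATED(bmap_entry)) {
        /* Allocation modifies the in-memory bitmap: upgrade, then re-check
         * because the lock may be dropped while upgrading. */
        qemu_co_rwlock_upgrade(&s->bmap_lock);
        bmap_entry = le32_to_cpu(s->bmap[block_index]);
        if (!VDI_IS_ALLOCATED(bmap_entry)) {
            allocate_block(s, block_index);     /* placeholder */
        }
        qemu_co_rwlock_downgrade(&s->bmap_lock);
    }
    write_data(s, block_index);                 /* placeholder */
    qemu_co_rwlock_unlock(&s->bmap_lock);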

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 04/11] vpc: " Paolo Bonzini
@ 2017-05-31 20:58   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 20:58 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/vpc.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 

> @@ -705,10 +700,13 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
>                 (sector_num << BDRV_SECTOR_BITS);
>      }
>  
> -    offset = get_sector_offset(bs, sector_num, 0);
> +    qemu_co_mutex_lock(&s->lock);
> +
> +    offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);

s/0/false/

(Phooey - your code and my efforts to turn block_status into byte-based
code collide; I'll have to rebase again...)

>          if (nb_sectors == 0) {
>              break;
>          }
> -        offset = get_sector_offset(bs, sector_num, 0);
> +        offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);

s/0/false/

With the correct parameter,
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 05/11] vvfat: make it thread-safe
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 05/11] vvfat: " Paolo Bonzini
@ 2017-05-31 21:09   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 21:09 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/vvfat.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 06/11] nbd: make it thread-safe
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 06/11] nbd: " Paolo Bonzini
@ 2017-05-31 21:20   ` Eric Blake
  0 siblings, 0 replies; 24+ messages in thread
From: Eric Blake @ 2017-05-31 21:20 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz

On 05/31/2017 04:43 AM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/nbd-client.c | 30 +++++++++---------------------
>  1 file changed, 9 insertions(+), 21 deletions(-)
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 1e2952fdae..43e0292ac1 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -114,6 +114,10 @@ static int nbd_co_send_request(BlockDriverState *bs,
>      int rc, ret, i;
>  
>      qemu_co_mutex_lock(&s->send_mutex);
> +    while (s->in_flight == MAX_NBD_REQUESTS) {
> +        qemu_co_queue_wait(&s->free_sema, &s->send_mutex);
> +    }
> +    s->in_flight++;

Nice - if I'm not mistaken, this also solves
https://bugzilla.redhat.com/show_bug.cgi?id=1454582 - you have a while
loop here...

> -static void nbd_coroutine_start(NBDClientSession *s,
> -                                NBDRequest *request)
> -{
> -    /* Poor man semaphore.  The free_sema is locked when no other request
> -     * can be accepted, and unlocked after receiving one reply.  */
> -    if (s->in_flight == MAX_NBD_REQUESTS) {
> -        qemu_co_queue_wait(&s->free_sema, NULL);
> -        assert(s->in_flight < MAX_NBD_REQUESTS);
> -    }

...compared to the old code that only tried once, and could therefore
hit the assertion failure depending on thread load.
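
The matching release side is not quoted above; roughly (sketch, abbreviated
from the rest of the patch), request completion does:

    qemu_co_mutex_lock(&s->send_mutex);
    s->in_flight--;
    qemu_co_queue_next(&s->free_sema);
    qemu_co_mutex_unlock(&s->send_mutex);

so exactly one waiter is woken per completed request, and both the counter
and the wait queue are only touched under send_mutex.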

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



* Re: [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext Paolo Bonzini
@ 2017-06-08 12:46   ` Stefan Hajnoczi
  2017-06-08 12:50     ` Paolo Bonzini
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Hajnoczi @ 2017-06-08 12:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kwolf, famz, qemu-block, mreitz

On Wed, May 31, 2017 at 11:43:27AM +0200, Paolo Bonzini wrote:
> diff --git a/block/qed.c b/block/qed.c
> index 83a0973cfb..990210cd9c 100644
> --- a/block/qed.c
> +++ b/block/qed.c

Does this belong in the next patch?

> @@ -351,7 +351,7 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
>      }
>  }
>  
> -static void bdrv_qed_drain(BlockDriverState *bs)
> +static void coroutine_fn bdrv_qed_co_drain(BlockDriverState *bs)
>  {
>      BDRVQEDState *s = bs->opaque;
>  
> @@ -360,7 +360,7 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>       */
>      if (s->need_check_timer && timer_pending(s->need_check_timer)) {
>          qed_cancel_need_check_timer(s);
> -        qed_need_check_timer_cb(s);
> +        qed_need_check_timer_entry(s);
>      }
>  }
>  
> @@ -1544,7 +1544,7 @@ static BlockDriver bdrv_qed = {
>      .bdrv_check               = bdrv_qed_check,
>      .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
>      .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
> -    .bdrv_drain               = bdrv_qed_drain,
> +    .bdrv_co_drain            = bdrv_qed_co_drain,
>  };
>  
>  static void bdrv_qed_init(void)


* Re: [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext
  2017-06-08 12:46   ` Stefan Hajnoczi
@ 2017-06-08 12:50     ` Paolo Bonzini
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-06-08 12:50 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kwolf, famz, qemu-block, mreitz



On 08/06/2017 14:46, Stefan Hajnoczi wrote:
> On Wed, May 31, 2017 at 11:43:27AM +0200, Paolo Bonzini wrote:
>> diff --git a/block/qed.c b/block/qed.c
>> index 83a0973cfb..990210cd9c 100644
>> --- a/block/qed.c
>> +++ b/block/qed.c
> 
> Does this belong in the next patch?

I don't think so: there is no .bdrv_drain callback anymore, so I have to
switch the only implementation to .bdrv_co_drain.
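
(For readers without the series at hand, the interface change in
include/block/block_int.h is essentially this one-liner, sketched here
rather than quoted from the patch:

    /* struct BlockDriver */
    -    void (*bdrv_drain)(BlockDriverState *bs);
    +    void coroutine_fn (*bdrv_co_drain)(BlockDriverState *bs);

so drivers are now drained from coroutine context instead of through a
synchronous callback.)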

Paolo

>> @@ -351,7 +351,7 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
>>      }
>>  }
>>  
>> -static void bdrv_qed_drain(BlockDriverState *bs)
>> +static void coroutine_fn bdrv_qed_co_drain(BlockDriverState *bs)
>>  {
>>      BDRVQEDState *s = bs->opaque;
>>  
>> @@ -360,7 +360,7 @@ static void bdrv_qed_drain(BlockDriverState *bs)
>>       */
>>      if (s->need_check_timer && timer_pending(s->need_check_timer)) {
>>          qed_cancel_need_check_timer(s);
>> -        qed_need_check_timer_cb(s);
>> +        qed_need_check_timer_entry(s);
>>      }
>>  }
>>  
>> @@ -1544,7 +1544,7 @@ static BlockDriver bdrv_qed = {
>>      .bdrv_check               = bdrv_qed_check,
>>      .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
>>      .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
>> -    .bdrv_drain               = bdrv_qed_drain,
>> +    .bdrv_co_drain            = bdrv_qed_co_drain,
>>  };
>>  
>>  static void bdrv_qed_init(void)


* Re: [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2
  2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
                   ` (10 preceding siblings ...)
  2017-05-31  9:43 ` [Qemu-devel] [PATCH 11/11] ssh: support I/O from any AioContext Paolo Bonzini
@ 2017-06-08 12:52 ` Stefan Hajnoczi
  2017-06-08 12:54   ` Paolo Bonzini
  11 siblings, 1 reply; 24+ messages in thread
From: Stefan Hajnoczi @ 2017-06-08 12:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kwolf, famz, qemu-block, mreitz

On Wed, May 31, 2017 at 11:43:19AM +0200, Paolo Bonzini wrote:
> This part takes care of drivers and devices, making sure that they can
> accept concurrent I/O from multiple AioContext.  Only RFC for now because it
> depends on Kevin's QED conversion to coroutines, which is still in flux.
> 
> The following drivers are thread-safe without using any QemuMutex/CoMutex:
> crypto, gluster, null, rbd, win32-aio.
> 
> The following drivers already use mutexes for everything except possibly
> snapshots, which do not (yet?) need protection: bochs, cloop, dmg, qcow,
> parallels, vhdx, vmdk, curl, iscsi, nfs.
> 
> The following drivers already use mutexes for _almost_ everything: vpc
> (missing get_block_status), vdi (missing bitmap access), vvfat (missing
> commit), nbd (s->in_flight not protected), qcow2 (must call CoQueue APIs
> under CoMutex).  They are fixed by patches 1-6.
> 
> The following drivers must be changed to use CoMutex to protect internal
> data: qed (patches 7-9), sheepdog (patch 10).
> 
> The following driver must be changed to support I/O from any AioContext:
> ssh.  It is fixed by patch 11.
> 
> Paolo
> 
> Paolo Bonzini (11):
>   qcow2: call CoQueue APIs under CoMutex
>   coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
>   vdi: make it thread-safe
>   vpc: make it thread-safe
>   vvfat: make it thread-safe
>   nbd: make it thread-safe
>   qed: move tail of qed_aio_write_main to qed_aio_write_{cow,alloc}
>   block: invoke .bdrv_drain callback in coroutine context and from AioContext
>   qed: protect table cache with CoMutex
>   sheepdog: add queue_lock
>   ssh: support I/O from any AioContext
> 
>  block/io.c                 |  42 +++++++---
>  block/nbd-client.c         |  30 +++----
>  block/qcow2.c              |   4 +-
>  block/qed-cluster.c        |   4 +-
>  block/qed-l2-cache.c       |   6 ++
>  block/qed-table.c          |  24 ++++--
>  block/qed.c                | 195 +++++++++++++++++++++++++++------------------
>  block/qed.h                |  11 ++-
>  block/sheepdog.c           |  21 ++++-
>  block/ssh.c                |  24 ++++--
>  block/vdi.c                |  48 +++++------
>  block/vpc.c                |  20 ++---
>  block/vvfat.c              |   8 +-
>  include/block/block_int.h  |   2 +-
>  include/qemu/coroutine.h   |  19 +++++
>  util/qemu-coroutine-lock.c |  35 ++++++++
>  16 files changed, 326 insertions(+), 167 deletions(-)
> 
> -- 
> 2.13.0
> 
> 

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>


* Re: [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2
  2017-06-08 12:52 ` [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Stefan Hajnoczi
@ 2017-06-08 12:54   ` Paolo Bonzini
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2017-06-08 12:54 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, kwolf, famz, qemu-block, mreitz



On 08/06/2017 14:52, Stefan Hajnoczi wrote:
> On Wed, May 31, 2017 at 11:43:19AM +0200, Paolo Bonzini wrote:
>> This part takes care of drivers and devices, making sure that they can
>> accept concurrent I/O from multiple AioContext.  Only RFC for now because it
>> depends on Kevin's QED conversion to coroutines, which is still in flux.
>>
>> The following drivers are thread-safe without using any QemuMutex/CoMutex:
>> crypto, gluster, null, rbd, win32-aio.
>>
>> The following drivers already use mutexes for everything except possibly
>> snapshots, which do not (yet?) need protection: bochs, cloop, dmg, qcow,
>> parallels, vhdx, vmdk, curl, iscsi, nfs.
>>
>> The following drivers already use mutexes for _almost_ everything: vpc
>> (missing get_block_status), vdi (missing bitmap access), vvfat (missing
>> commit), nbd (s->in_flight not protected), qcow2 (must call CoQueue APIs
>> under CoMutex).  They are fixed by patches 1-6.
>>
>> The following drivers must be changed to use CoMutex to protect internal
>> data: qed (patches 7-9), sheepdog (patch 10).
>>
>> The following driver must be changed to support I/O from any AioContext:
>> ssh.  It is fixed by patch 11.
>>
>> Paolo
>>
>> Paolo Bonzini (11):
>>   qcow2: call CoQueue APIs under CoMutex
>>   coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade
>>   vdi: make it thread-safe
>>   vpc: make it thread-safe
>>   vvfat: make it thread-safe
>>   nbd: make it thread-safe
>>   qed: move tail of qed_aio_write_main to qed_aio_write_{cow,alloc}
>>   block: invoke .bdrv_drain callback in coroutine context and from AioContext
>>   qed: protect table cache with CoMutex
>>   sheepdog: add queue_lock
>>   ssh: support I/O from any AioContext
>>
>>  block/io.c                 |  42 +++++++---
>>  block/nbd-client.c         |  30 +++----
>>  block/qcow2.c              |   4 +-
>>  block/qed-cluster.c        |   4 +-
>>  block/qed-l2-cache.c       |   6 ++
>>  block/qed-table.c          |  24 ++++--
>>  block/qed.c                | 195 +++++++++++++++++++++++++++------------------
>>  block/qed.h                |  11 ++-
>>  block/sheepdog.c           |  21 ++++-
>>  block/ssh.c                |  24 ++++--
>>  block/vdi.c                |  48 +++++------
>>  block/vpc.c                |  20 ++---
>>  block/vvfat.c              |   8 +-
>>  include/block/block_int.h  |   2 +-
>>  include/qemu/coroutine.h   |  19 +++++
>>  util/qemu-coroutine-lock.c |  35 ++++++++
>>  16 files changed, 326 insertions(+), 167 deletions(-)
>>
>> -- 
>> 2.13.0
> 
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Thanks, I'll repost once QED is converted to coroutines, and send part 3
when I have some time.

Paolo


* Re: [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe
  2017-06-29 13:27 ` [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe Paolo Bonzini
@ 2017-07-10 11:32   ` Fam Zheng
  0 siblings, 0 replies; 24+ messages in thread
From: Fam Zheng @ 2017-07-10 11:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, kwolf, qemu-block, stefanha

On Thu, 06/29 15:27, Paolo Bonzini wrote:
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/vpc.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/block/vpc.c b/block/vpc.c
> index 4240ba9d1c..0ff686540a 100644
> --- a/block/vpc.c
> +++ b/block/vpc.c
> @@ -496,12 +496,6 @@ static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
>      return block_offset;
>  }
>  
> -static inline int64_t get_sector_offset(BlockDriverState *bs,
> -                                        int64_t sector_num, bool write)
> -{
> -    return get_image_offset(bs, sector_num * BDRV_SECTOR_SIZE, write);
> -}
> -
>  /*
>   * Writes the footer to the end of the image file. This is needed when the
>   * file grows as it overwrites the old footer
> @@ -696,6 +690,7 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
>      VHDFooter *footer = (VHDFooter*) s->footer_buf;
>      int64_t start, offset;
>      bool allocated;
> +    int64_t ret;
>      int n;
>  
>      if (be32_to_cpu(footer->type) == VHD_FIXED) {
> @@ -705,10 +700,13 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
>                 (sector_num << BDRV_SECTOR_BITS);
>      }
>  
> -    offset = get_sector_offset(bs, sector_num, 0);
> +    qemu_co_mutex_lock(&s->lock);
> +
> +    offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);

You used false instead of 0 in v1, which was better, so this is a regression. :)

Can be fixed when applying.

>      start = offset;
>      allocated = (offset != -1);
>      *pnum = 0;
> +    ret = 0;
>  
>      do {
>          /* All sectors in a block are contiguous (without using the bitmap) */
> @@ -723,15 +721,17 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
>           * sectors since there is always a bitmap in between. */
>          if (allocated) {
>              *file = bs->file->bs;
> -            return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
> +            ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
> +            break;
>          }
>          if (nb_sectors == 0) {
>              break;
>          }
> -        offset = get_sector_offset(bs, sector_num, 0);
> +        offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);

Ditto.

>      } while (offset == -1);
>  
> -    return 0;
> +    qemu_co_mutex_unlock(&s->lock);
> +    return ret;
>  }
>  
>  /*
> -- 
> 2.13.0
> 
> 
> 

Fam


* [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe
  2017-06-29 13:27 [Qemu-devel] [PATCH v2 " Paolo Bonzini
@ 2017-06-29 13:27 ` Paolo Bonzini
  2017-07-10 11:32   ` Fam Zheng
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2017-06-29 13:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha, qemu-block, kwolf

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/vpc.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/block/vpc.c b/block/vpc.c
index 4240ba9d1c..0ff686540a 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -496,12 +496,6 @@ static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
     return block_offset;
 }
 
-static inline int64_t get_sector_offset(BlockDriverState *bs,
-                                        int64_t sector_num, bool write)
-{
-    return get_image_offset(bs, sector_num * BDRV_SECTOR_SIZE, write);
-}
-
 /*
  * Writes the footer to the end of the image file. This is needed when the
  * file grows as it overwrites the old footer
@@ -696,6 +690,7 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
     VHDFooter *footer = (VHDFooter*) s->footer_buf;
     int64_t start, offset;
     bool allocated;
+    int64_t ret;
     int n;
 
     if (be32_to_cpu(footer->type) == VHD_FIXED) {
@@ -705,10 +700,13 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
                (sector_num << BDRV_SECTOR_BITS);
     }
 
-    offset = get_sector_offset(bs, sector_num, 0);
+    qemu_co_mutex_lock(&s->lock);
+
+    offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);
     start = offset;
     allocated = (offset != -1);
     *pnum = 0;
+    ret = 0;
 
     do {
         /* All sectors in a block are contiguous (without using the bitmap) */
@@ -723,15 +721,17 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
          * sectors since there is always a bitmap in between. */
         if (allocated) {
             *file = bs->file->bs;
-            return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
+            ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
+            break;
         }
         if (nb_sectors == 0) {
             break;
         }
-        offset = get_sector_offset(bs, sector_num, 0);
+        offset = get_image_offset(bs, sector_num << BDRV_SECTOR_BITS, 0);
     } while (offset == -1);
 
-    return 0;
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
 }
 
 /*
-- 
2.13.0


end of thread

Thread overview: 24+ messages
2017-05-31  9:43 [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Paolo Bonzini
2017-05-31  9:43 ` [Qemu-devel] [PATCH 01/11] qcow2: call CoQueue APIs under CoMutex Paolo Bonzini
2017-05-31 15:18   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 02/11] coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade Paolo Bonzini
2017-05-31 20:50   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 03/11] vdi: make it thread-safe Paolo Bonzini
2017-05-31 20:55   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 04/11] vpc: " Paolo Bonzini
2017-05-31 20:58   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 05/11] vvfat: " Paolo Bonzini
2017-05-31 21:09   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 06/11] nbd: " Paolo Bonzini
2017-05-31 21:20   ` Eric Blake
2017-05-31  9:43 ` [Qemu-devel] [PATCH 07/11] qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} Paolo Bonzini
2017-05-31  9:43 ` [Qemu-devel] [PATCH 08/11] block: invoke .bdrv_drain callback in coroutine context and from AioContext Paolo Bonzini
2017-06-08 12:46   ` Stefan Hajnoczi
2017-06-08 12:50     ` Paolo Bonzini
2017-05-31  9:43 ` [Qemu-devel] [PATCH 09/11] qed: protect table cache with CoMutex Paolo Bonzini
2017-05-31  9:43 ` [Qemu-devel] [PATCH 10/11] sheepdog: add queue_lock Paolo Bonzini
2017-05-31  9:43 ` [Qemu-devel] [PATCH 11/11] ssh: support I/O from any AioContext Paolo Bonzini
2017-06-08 12:52 ` [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, part 2 Stefan Hajnoczi
2017-06-08 12:54   ` Paolo Bonzini
2017-06-29 13:27 [Qemu-devel] [PATCH v2 " Paolo Bonzini
2017-06-29 13:27 ` [Qemu-devel] [PATCH 04/11] vpc: make it thread-safe Paolo Bonzini
2017-07-10 11:32   ` Fam Zheng
