* [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes
@ 2015-10-27 14:09 Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 1/5] fifolock: create rfifolock_is_locked helper Denis V. Lunev
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

The following test
    while /bin/true ; do
        virsh snapshot-create rhel7
        sleep 10
        virsh snapshot-delete rhel7 --current
    done
with iothreads enabled on a running VM leads to a lot of trouble: hangs,
asserts, errors.

Though, in general, the HMP snapshot code is terrible. I think it should be
dropped at once and replaced with blockdev transactions code, though that
could not fit into QEMU 2.5/stable at all.

Anyway, I think that constructions like
    assert(aio_context_is_locked(aio_context));
should be widely used to ensure proper locking.
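
For example, a hypothetical call site would look like this (the real
call sites are added in patches 3 and 4):

    static void some_block_helper(BlockDriverState *bs)    /* hypothetical */
    {
        AioContext *aio_context = bdrv_get_aio_context(bs);

        /* the caller must already hold the AioContext lock */
        assert(aio_context_is_locked(aio_context));

        bdrv_drain(bs);    /* any work that requires the lock */
    }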

Changes from v1:
- aio-context locking added
- comment is rewritten

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>

Denis V. Lunev (4):
  fifolock: create rfifolock_is_locked helper
  aio_context: create aio_context_is_locked helper
  io: add locking constraints check into bdrv_drain to ensure locking
  migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm

Pavel Butsykin (1):
  virtio: sync the dataplane vring state to the virtqueue before
    virtio_save

 async.c                  | 5 +++++
 block/io.c               | 5 ++++-
 block/snapshot.c         | 5 +++++
 hw/block/virtio-blk.c    | 5 +++++
 hw/scsi/virtio-scsi.c    | 5 +++++
 include/block/aio.h      | 3 +++
 include/qemu/rfifolock.h | 1 +
 migration/savevm.c       | 7 +++++++
 util/rfifolock.c         | 9 +++++++--
 9 files changed, 42 insertions(+), 3 deletions(-)

-- 
2.1.4


* [Qemu-devel] [PATCH 1/5] fifolock: create rfifolock_is_locked helper
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
@ 2015-10-27 14:09 ` Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 2/5] aio_context: create aio_context_is_locked helper Denis V. Lunev
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

This helper is necessary to be able to check locking constraints.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/rfifolock.h | 1 +
 util/rfifolock.c         | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/qemu/rfifolock.h b/include/qemu/rfifolock.h
index b23ab53..2c00c4b 100644
--- a/include/qemu/rfifolock.h
+++ b/include/qemu/rfifolock.h
@@ -50,5 +50,6 @@ void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque);
 void rfifolock_destroy(RFifoLock *r);
 void rfifolock_lock(RFifoLock *r);
 void rfifolock_unlock(RFifoLock *r);
+int  rfifolock_is_locked(RFifoLock *r);
 
 #endif /* QEMU_RFIFOLOCK_H */
diff --git a/util/rfifolock.c b/util/rfifolock.c
index afbf748..8ac58cb 100644
--- a/util/rfifolock.c
+++ b/util/rfifolock.c
@@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
     /* Take a ticket */
     unsigned int ticket = r->tail++;
 
-    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
+    if (r->nesting > 0 && rfifolock_is_locked(r)) {
         r->tail--; /* put ticket back, we're nesting */
     } else {
         while (ticket != r->head) {
@@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
 {
     qemu_mutex_lock(&r->lock);
     assert(r->nesting > 0);
-    assert(qemu_thread_is_self(&r->owner_thread));
+    assert(rfifolock_is_locked(r));
     if (--r->nesting == 0) {
         r->head++;
         qemu_cond_broadcast(&r->cond);
     }
     qemu_mutex_unlock(&r->lock);
 }
+
+int rfifolock_is_locked(RFifoLock *r)
+{
+    return qemu_thread_is_self(&r->owner_thread);
+}
-- 
2.1.4


* [Qemu-devel] [PATCH 2/5] aio_context: create aio_context_is_locked helper
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 1/5] fifolock: create rfifolock_is_locked helper Denis V. Lunev
@ 2015-10-27 14:09 ` Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 3/5] io: add locking constraints check into bdrv_drain to ensure locking Denis V. Lunev
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

This helper is necessary to be able to check locking constraints.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 async.c             | 5 +++++
 include/block/aio.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/async.c b/async.c
index bdc64a3..4a9250e 100644
--- a/async.c
+++ b/async.c
@@ -361,3 +361,8 @@ void aio_context_release(AioContext *ctx)
 {
     rfifolock_unlock(&ctx->lock);
 }
+
+int aio_context_is_locked(AioContext *ctx)
+{
+    return rfifolock_is_locked(&ctx->lock);
+}
diff --git a/include/block/aio.h b/include/block/aio.h
index bcc7d43..d20c7b8 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -166,6 +166,9 @@ void aio_context_acquire(AioContext *ctx);
 /* Relinquish ownership of the AioContext. */
 void aio_context_release(AioContext *ctx);
 
+/* Check that AioContext is locked by the caller. */
+int aio_context_is_locked(AioContext *ctx);
+
 /**
  * aio_bh_new: Allocate a new bottom half structure.
  *
-- 
2.1.4


* [Qemu-devel] [PATCH 3/5] io: add locking constraints check into bdrv_drain to ensure locking
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 1/5] fifolock: create rfifolock_is_locked helper Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 2/5] aio_context: create aio_context_is_locked helper Denis V. Lunev
@ 2015-10-27 14:09 ` Denis V. Lunev
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm Denis V. Lunev
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

as described in the comment of the function

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 block/io.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 5ac6256..2e98d45 100644
--- a/block/io.c
+++ b/block/io.c
@@ -247,12 +247,15 @@ bool bdrv_requests_pending(BlockDriverState *bs)
 void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    assert(aio_context_is_locked(aio_context));
 
     while (busy) {
         /* Keep iterating */
          bdrv_flush_io_queue(bs);
          busy = bdrv_requests_pending(bs);
-         busy |= aio_poll(bdrv_get_aio_context(bs), busy);
+         busy |= aio_poll(aio_context, busy);
     }
 }
 
-- 
2.1.4


* [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
                   ` (2 preceding siblings ...)
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 3/5] io: add locking constraints check into bdrv_drain to ensure locking Denis V. Lunev
@ 2015-10-27 14:09 ` Denis V. Lunev
  2015-10-27 18:12   ` Paolo Bonzini
  2015-10-28 10:11   ` Juan Quintela
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 5/5] virtio: sync the dataplane vring state to the virtqueue before virtio_save Denis V. Lunev
  2015-10-27 18:41 ` [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Paolo Bonzini
  5 siblings, 2 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Juan Quintela, qemu-devel, qemu-stable, Stefan Hajnoczi,
	Paolo Bonzini, Amit Shah, Denis V. Lunev

aio_context should be locked in the same way as is done during QMP
snapshot creation; otherwise there are a lot of possible problems
if native AIO mode is enabled for the disk.

- the command can hang (HMP thread) with missed wakeup (the operation is
  actually complete)
    io_submit
    ioq_submit
    laio_submit
    raw_aio_submit
    raw_aio_readv
    bdrv_co_io_em
    bdrv_co_readv_em
    bdrv_aligned_preadv
    bdrv_co_do_preadv
    bdrv_co_do_readv
    bdrv_co_readv
    qcow2_co_readv
    bdrv_aligned_preadv
    bdrv_co_do_pwritev
    bdrv_rw_co_entry

- QEMU can assert in coroutine re-enter
    __GI_abort
    qemu_coroutine_enter
    bdrv_co_io_em_complete
    qemu_laio_process_completion
    qemu_laio_completion_bh
    aio_bh_poll
    aio_dispatch
    aio_poll
    iothread_run

The AioContext lock is recursive. Thus nested locking should not be a problem.
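
For illustration, the nesting added by this patch pairs up like this
(a sketch; hmp_savevm takes the outer lock and the snapshot delete
helper nests inside it):

    aio_context_acquire(aio_context);   /* outer: hmp_savevm */
    aio_context_acquire(aio_context);   /* nested: bdrv_snapshot_delete_by_id_or_name */
    aio_context_release(aio_context);   /* nesting count drops, lock still held */
    aio_context_release(aio_context);   /* lock actually released */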

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
 block/snapshot.c   | 5 +++++
 migration/savevm.c | 7 +++++++
 2 files changed, 12 insertions(+)

diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..f6fa17a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
 {
     int ret;
     Error *local_err = NULL;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    aio_context_acquire(aio_context);
 
     ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
     if (ret == -ENOENT || ret == -EINVAL) {
@@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
         ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
     }
 
+    aio_context_release(aio_context);
+
     if (ret < 0) {
         error_propagate(errp, local_err);
     }
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..83d2efa 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1289,6 +1289,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
     struct tm tm;
     const char *name = qdict_get_try_str(qdict, "name");
     Error *local_err = NULL;
+    AioContext *aio_context;
 
     /* Verify if there is a device that doesn't support snapshots and is writable */
     bs = NULL;
@@ -1320,6 +1321,9 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
     }
     vm_stop(RUN_STATE_SAVE_VM);
 
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
     memset(sn, 0, sizeof(*sn));
 
     /* fill auxiliary fields */
@@ -1378,6 +1382,8 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
     }
 
  the_end:
+    aio_context_release(aio_context);
+
     if (saved_vm_running) {
         vm_start();
     }


* [Qemu-devel] [PATCH 5/5] virtio: sync the dataplane vring state to the virtqueue before virtio_save
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
                   ` (3 preceding siblings ...)
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm Denis V. Lunev
@ 2015-10-27 14:09 ` Denis V. Lunev
  2015-10-27 18:41 ` [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Paolo Bonzini
  5 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 14:09 UTC (permalink / raw)
  Cc: Kevin Wolf, Pavel Butsykin, Michael S. Tsirkin, qemu-devel,
	qemu-stable, Stefan Hajnoczi, Paolo Bonzini, Denis V. Lunev

From: Pavel Butsykin <pbutsykin@virtuozzo.com>

When creating a snapshot with the dataplane enabled, the snapshot file does
not get the actual state of the virtqueue, because the current state is
stored in VirtIOBlockDataPlane. Therefore, before saving a snapshot, we need
to sync the dataplane vring state to the virtqueue. The dataplane will
resume its work at the next virtqueue notification.

When such a snapshot is loaded with loadvm, we get the message:
VQ 0 size 0x80 Guest index 0x15f5 inconsistent with Host index 0x0:
    delta 0x15f5
error while loading state for instance 0x0 of device
    '0000:00:08.0/virtio-blk'
Error -1 while loading VM state

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/block/virtio-blk.c | 5 +++++
 hw/scsi/virtio-scsi.c | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8beb26b..89ab72a 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -798,6 +798,11 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 static void virtio_blk_save(QEMUFile *f, void *opaque)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    VirtIOBlock *s = VIRTIO_BLK(vdev);
+
+    if (s->dataplane) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
 
     virtio_save(vdev, f);
 }
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 20885fb..8ad3257 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -665,6 +665,11 @@ static void virtio_scsi_reset(VirtIODevice *vdev)
 static void virtio_scsi_save(QEMUFile *f, void *opaque)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+    VirtIOSCSI *s = VIRTIO_SCSI(vdev);
+
+    if (s->dataplane_started) {
+        virtio_scsi_dataplane_stop(s);
+    }
     virtio_save(vdev, f);
 }
 
-- 
2.1.4


* Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm Denis V. Lunev
@ 2015-10-27 18:12   ` Paolo Bonzini
  2015-10-27 18:23     ` Denis V. Lunev
  2015-10-28 10:11   ` Juan Quintela
  1 sibling, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-10-27 18:12 UTC (permalink / raw)
  To: Denis V. Lunev
  Cc: Amit Shah, qemu-stable, qemu-devel, Stefan Hajnoczi, Juan Quintela



On 27/10/2015 15:09, Denis V. Lunev wrote:
> aio_context should be locked in the same way as is done during QMP
> snapshot creation; otherwise there are a lot of possible problems
> if native AIO mode is enabled for the disk.
> 
> - the command can hang (HMP thread) with missed wakeup (the operation is
>   actually complete)
>     io_submit
>     ioq_submit
>     laio_submit
>     raw_aio_submit
>     raw_aio_readv
>     bdrv_co_io_em
>     bdrv_co_readv_em
>     bdrv_aligned_preadv
>     bdrv_co_do_preadv
>     bdrv_co_do_readv
>     bdrv_co_readv
>     qcow2_co_readv
>     bdrv_aligned_preadv
>     bdrv_co_do_pwritev
>     bdrv_rw_co_entry
> 
> - QEMU can assert in coroutine re-enter
>     __GI_abort
>     qemu_coroutine_enter
>     bdrv_co_io_em_complete
>     qemu_laio_process_completion
>     qemu_laio_completion_bh
>     aio_bh_poll
>     aio_dispatch
>     aio_poll
>     iothread_run
> 
> The AioContext lock is recursive. Thus nested locking should not be a problem.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>
> ---
>  block/snapshot.c   | 5 +++++
>  migration/savevm.c | 7 +++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 89500f2..f6fa17a 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>  {
>      int ret;
>      Error *local_err = NULL;
> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +    aio_context_acquire(aio_context);
>  
>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>      if (ret == -ENOENT || ret == -EINVAL) {
> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>      }
>  
> +    aio_context_release(aio_context);

Why here and not in hmp_delvm, for consistency?

The call from hmp_savevm is already protected.

Thanks for fixing the bug!

Paolo

>      if (ret < 0) {
>          error_propagate(errp, local_err);
>      }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dbcc39a..83d2efa 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1289,6 +1289,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      struct tm tm;
>      const char *name = qdict_get_try_str(qdict, "name");
>      Error *local_err = NULL;
> +    AioContext *aio_context;
>  
>      /* Verify if there is a device that doesn't support snapshots and is writable */
>      bs = NULL;
> @@ -1320,6 +1321,9 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      }
>      vm_stop(RUN_STATE_SAVE_VM);
>  
> +    aio_context = bdrv_get_aio_context(bs);
> +    aio_context_acquire(aio_context);
> +
>      memset(sn, 0, sizeof(*sn));
>  
>      /* fill auxiliary fields */
> @@ -1378,6 +1382,8 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      }
>  
>   the_end:
> +    aio_context_release(aio_context);
> +
>      if (saved_vm_running) {
>          vm_start();
>      }
> 
> 


* Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
  2015-10-27 18:12   ` Paolo Bonzini
@ 2015-10-27 18:23     ` Denis V. Lunev
  0 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 18:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Amit Shah, qemu-stable, qemu-devel, Stefan Hajnoczi, Juan Quintela

On 10/27/2015 09:12 PM, Paolo Bonzini wrote:
>
> On 27/10/2015 15:09, Denis V. Lunev wrote:
>> aio_context should be locked in the same way as is done during QMP
>> snapshot creation; otherwise there are a lot of possible problems
>> if native AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>>    actually complete)
>>      io_submit
>>      ioq_submit
>>      laio_submit
>>      raw_aio_submit
>>      raw_aio_readv
>>      bdrv_co_io_em
>>      bdrv_co_readv_em
>>      bdrv_aligned_preadv
>>      bdrv_co_do_preadv
>>      bdrv_co_do_readv
>>      bdrv_co_readv
>>      qcow2_co_readv
>>      bdrv_aligned_preadv
>>      bdrv_co_do_pwritev
>>      bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>>      __GI_abort
>>      qemu_coroutine_enter
>>      bdrv_co_io_em_complete
>>      qemu_laio_process_completion
>>      qemu_laio_completion_bh
>>      aio_bh_poll
>>      aio_dispatch
>>      aio_poll
>>      iothread_run
>>
>> The AioContext lock is recursive. Thus nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Amit Shah <amit.shah@redhat.com>
>> ---
>>   block/snapshot.c   | 5 +++++
>>   migration/savevm.c | 7 +++++++
>>   2 files changed, 12 insertions(+)
>>
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index 89500f2..f6fa17a 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>   {
>>       int ret;
>>       Error *local_err = NULL;
>> +    AioContext *aio_context = bdrv_get_aio_context(bs);
>> +
>> +    aio_context_acquire(aio_context);
>>   
>>       ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>>       if (ret == -ENOENT || ret == -EINVAL) {
>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>           ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>>       }
>>   
>> +    aio_context_release(aio_context);
> Why here and not in hmp_delvm, for consistency?
>
> The call from hmp_savevm is already protected.
>
> Thanks for fixing the bug!
>
> Paolo

The situation is more difficult. There are several disks in the VM.
One disk is used for state saving (protected in savevm),
and there are several disks touched via

static int del_existing_snapshots(Monitor *mon, const char *name)
     while ((bs = bdrv_next(bs))) {
         if (bdrv_can_snapshot(bs) &&
             bdrv_snapshot_find(bs, snapshot, name) >= 0) {
             bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
         }
     }

in savevm, and similar-looking code in delvm with a similar cycle
implemented differently.

This patchset looks like the minimum needed to kludge the situation enough.

The true fix would be to drop this code in favour of blockdev
transactions. At least that is my opinion. Though I cannot do
this at this stage; it will take a lot of time.

Den


* Re: [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes
  2015-10-27 14:09 [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Denis V. Lunev
                   ` (4 preceding siblings ...)
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 5/5] virtio: sync the dataplane vring state to the virtqueue before virtio_save Denis V. Lunev
@ 2015-10-27 18:41 ` Paolo Bonzini
  2015-10-27 19:05   ` Denis V. Lunev
  5 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-10-27 18:41 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi, qemu-stable

On 27/10/2015 15:09, Denis V. Lunev wrote:
> The following test
>     while /bin/true ; do
>         virsh snapshot-create rhel7
>         sleep 10
>         virsh snapshot-delete rhel7 --current
>     done
> with iothreads enabled on a running VM leads to a lot of trouble: hangs,
> asserts, errors.
> 
> Though, in general, the HMP snapshot code is terrible. I think it should be
> dropped at once and replaced with blockdev transactions code, though that
> could not fit into QEMU 2.5/stable at all.
> 
> Anyway, I think that constructions like
>     assert(aio_context_is_locked(aio_context));
> should be widely used to ensure proper locking.
> 
> Changes from v1:
> - aio-context locking added
> - comment is rewritten
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>

For patches 4-5:

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

For patches 1-3 I'm not sure, because we will remove RFifoLock
relatively soon and regular pthread recursive mutexes do not have an
equivalent of rfifolock_is_locked.

Paolo


* Re: [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes
  2015-10-27 18:41 ` [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes Paolo Bonzini
@ 2015-10-27 19:05   ` Denis V. Lunev
  2015-10-27 23:22     ` Denis V. Lunev
  0 siblings, 1 reply; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 19:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Stefan Hajnoczi, qemu-stable

On 10/27/2015 09:41 PM, Paolo Bonzini wrote:
> On 27/10/2015 15:09, Denis V. Lunev wrote:
>> The following test
>>      while /bin/true ; do
>>          virsh snapshot-create rhel7
>>          sleep 10
>>          virsh snapshot-delete rhel7 --current
>>      done
>> with iothreads enabled on a running VM leads to a lot of trouble: hangs,
>> asserts, errors.
>>
>> Though, in general, the HMP snapshot code is terrible. I think it should be
>> dropped at once and replaced with blockdev transactions code, though that
>> could not fit into QEMU 2.5/stable at all.
>>
>> Anyway, I think that constructions like
>>      assert(aio_context_is_locked(aio_context));
>> should be widely used to ensure proper locking.
>>
>> Changes from v1:
>> - aio-context locking added
>> - comment is rewritten
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
> For patches 4-5:
>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
>
> For patches 1-3 I'm not sure, because we will remove RFifoLock
> relatively soon and regular pthread recursive mutexes do not have an
> equivalent of rfifolock_is_locked.
>
> Paolo

This does not break anything in the future.

Yes, RFifoLock will go away, but aio_context_is_locked will
survive, much like such helpers do in the kernel code. We can either have
a plain pthread_mutex_trylock/unlock at first, or we can
have additional stubs for Linux with checks like this

(gdb)  p *(pthread_mutex_t*)0x6015a0
$3  =  {
   __data  =  {
     __lock  =  2,
     __count  =  0,
     __owner  =  12276,   <==  LWP12276  is Thread 3
     __nusers  =  1,
     __kind  =  0,        <==  non-recursive
     __spins  =  0,
     __list  =  {
       __prev  =  0x0,
       __next  =  0x0
     }
   },
   __size  =  "\002\000\000\000\000\000\000\000\364/\000\000\001", '\000' <repeats 26 times>,
   __align  =  2
}

in debug mode. Yes, they rely on the internal representation,
but they are useful.
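
For instance, a Linux-only debug stub might look like this (it knowingly
relies on glibc's internal NPTL layout, so it is non-portable by design):

    #include <pthread.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* debug only: NPTL keeps the kernel TID of the holder in __owner */
    static int mutex_held_by_me(pthread_mutex_t *m)
    {
        return m->__data.__owner == (pid_t)syscall(SYS_gettid);
    }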

This assert was VERY useful for me. I presume that there are
a LOT of similar places in the code, in different functions,
where the aio_context lock was not acquired and there was no
way to ensure consistency.

Den


* Re: [Qemu-devel] [PATCH v2 0/5] dataplane snapshot fixes
  2015-10-27 19:05   ` Denis V. Lunev
@ 2015-10-27 23:22     ` Denis V. Lunev
  0 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-27 23:22 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Stefan Hajnoczi, qemu-stable

On 10/27/2015 10:05 PM, Denis V. Lunev wrote:
> On 10/27/2015 09:41 PM, Paolo Bonzini wrote:
>> On 27/10/2015 15:09, Denis V. Lunev wrote:
>>> The following test
>>>      while /bin/true ; do
>>>          virsh snapshot-create rhel7
>>>          sleep 10
>>>          virsh snapshot-delete rhel7 --current
>>>      done
>>> with iothreads enabled on a running VM leads to a lot of trouble:
>>> hangs, asserts, errors.
>>>
>>> Though, in general, the HMP snapshot code is terrible. I think it
>>> should be dropped at once and replaced with blockdev transactions
>>> code, though that could not fit into QEMU 2.5/stable at all.
>>>
>>> Anyway, I think that constructions like
>>>      assert(aio_context_is_locked(aio_context));
>>> should be widely used to ensure proper locking.
>>>
>>> Changes from v1:
>>> - aio-context locking added
>>> - comment is rewritten
>>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> For patches 4-5:
>>
>> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>> For patches 1-3 I'm not sure, because we will remove RFifoLock
>> relatively soon and regular pthread recursive mutexes do not have an
>> equivalent of rfifolock_is_locked.
>>
>> Paolo
>
> This does not break anything in the future.
>
> Yes, RFifoLock will go away, but aio_context_is_locked will
> survive, much like such helpers do in the kernel code. We can either have
> a plain pthread_mutex_trylock/unlock at first, or we can
> have additional stubs for Linux with checks like this
>
> (gdb)  p *(pthread_mutex_t*)0x6015a0
> $3  =  {
>   __data  =  {
>     __lock  =  2,
>     __count  =  0,
>     __owner  =  12276,   <==  LWP12276  is Thread 3
>     __nusers  =  1,
>     __kind  =  0,        <==  non-recursive
>     __spins  =  0,
>     __list  =  {
>       __prev  =  0x0,
>       __next  =  0x0
>     }
>   },
>   __size  =  "\002\000\000\000\000\000\000\000\364/\000\000\001", '\000' <repeats 26 times>,
>   __align  =  2
> }
>
> in debug mode. Yes, they rely on the internal representation,
> but they are useful.
>
> This assert was VERY useful for me. I presume that there are
> a LOT of similar places in the code, in different functions,
> where the aio_context lock was not acquired and there was no
> way to ensure consistency.
>
> Den

there is a sane, fast, cross-platform approach :)

struct AioContextLock {
    pthread_t owner;            /* valid only while count > 0 */
    pthread_mutex_t mutex;      /* initialized as PTHREAD_MUTEX_RECURSIVE */
    int count;
};

void aio_context_lock(struct AioContextLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    lock->owner = pthread_self();
    lock->count++;
}

void aio_context_unlock(struct AioContextLock *lock)
{
    --lock->count;              /* owner is stale once count reaches 0 */
    pthread_mutex_unlock(&lock->mutex);
}

int aio_context_is_locked(struct AioContextLock *lock)
{
    if (pthread_mutex_trylock(&lock->mutex) == 0) {
        /* the mutex is recursive, so trylock succeeds both when it
         * was free and when we already hold it */
        int own_by_us = lock->count > 0 &&
                        pthread_equal(lock->owner, pthread_self());
        pthread_mutex_unlock(&lock->mutex);
        return own_by_us;
    }
    /* the lock is taken by another thread */
    return 0;
}
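
The matching init, assuming a recursive mutex attribute (which the
nesting count above relies on), would be something like:

void aio_context_lock_init(struct AioContextLock *lock)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&lock->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    lock->count = 0;            /* no owner recorded until the first lock */
}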

Den


* Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
  2015-10-27 14:09 ` [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm Denis V. Lunev
  2015-10-27 18:12   ` Paolo Bonzini
@ 2015-10-28 10:11   ` Juan Quintela
  2015-10-28 10:38     ` Denis V. Lunev
  1 sibling, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2015-10-28 10:11 UTC (permalink / raw)
  To: Denis V. Lunev
  Cc: Amit Shah, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

"Denis V. Lunev" <den@openvz.org> wrote:
> aio_context should be locked in the same way as is done during QMP
> snapshot creation; otherwise there are a lot of possible problems
> if native AIO mode is enabled for the disk.
>
> - the command can hang (HMP thread) with missed wakeup (the operation is
>   actually complete)
>     io_submit
>     ioq_submit
>     laio_submit
>     raw_aio_submit
>     raw_aio_readv
>     bdrv_co_io_em
>     bdrv_co_readv_em
>     bdrv_aligned_preadv
>     bdrv_co_do_preadv
>     bdrv_co_do_readv
>     bdrv_co_readv
>     qcow2_co_readv
>     bdrv_aligned_preadv
>     bdrv_co_do_pwritev
>     bdrv_rw_co_entry
>
> - QEMU can assert in coroutine re-enter
>     __GI_abort
>     qemu_coroutine_enter
>     bdrv_co_io_em_complete
>     qemu_laio_process_completion
>     qemu_laio_completion_bh
>     aio_bh_poll
>     aio_dispatch
>     aio_poll
>     iothread_run
>
> The AioContext lock is recursive. Thus nested locking should not be a problem.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>
> ---
>  block/snapshot.c   | 5 +++++
>  migration/savevm.c | 7 +++++++
>  2 files changed, 12 insertions(+)



Reviewed-by: Juan Quintela <quintela@redhat.com>

But once there, I can't understand why migration has to know about
aio_contexts at all.

I *think* that it would be a good idea to "hide" the
aio_context_acquire(aio_context) inside qemu_fopen_bdrv() (yes, it is
still in migration/*, but you get the idea).  But I don't propose it,
because we don't have qemu_fclose_bdrv().  Yes, we could add an
aio_context inside QemuFile and release it on qemu_fclose(), but I
guess this needs more thought yet.

BTW, now that I have your attention: why is this needed in hmp_savevm()
but not needed in load_vmstate()?  We are also using
qemu_fopen_bdrv() there.  Is it because we are only reading?  Just curious
about the reason, or whether we are missing something there.

Thanks, Juan.


>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 89500f2..f6fa17a 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>  {
>      int ret;
>      Error *local_err = NULL;
> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +    aio_context_acquire(aio_context);
>  
>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>      if (ret == -ENOENT || ret == -EINVAL) {
> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>      }
>  
> +    aio_context_release(aio_context);
> +
>      if (ret < 0) {
>          error_propagate(errp, local_err);
>      }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dbcc39a..83d2efa 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1289,6 +1289,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      struct tm tm;
>      const char *name = qdict_get_try_str(qdict, "name");
>      Error *local_err = NULL;
> +    AioContext *aio_context;
>  
>      /* Verify if there is a device that doesn't support snapshots and is writable */
>      bs = NULL;
> @@ -1320,6 +1321,9 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      }
>      vm_stop(RUN_STATE_SAVE_VM);
>  
> +    aio_context = bdrv_get_aio_context(bs);
> +    aio_context_acquire(aio_context);
> +
>      memset(sn, 0, sizeof(*sn));
>  
>      /* fill auxiliary fields */
> @@ -1378,6 +1382,8 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>      }
>  
>   the_end:
> +    aio_context_release(aio_context);
> +
>      if (saved_vm_running) {
>          vm_start();
>      }


* Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
  2015-10-28 10:11   ` Juan Quintela
@ 2015-10-28 10:38     ` Denis V. Lunev
  0 siblings, 0 replies; 13+ messages in thread
From: Denis V. Lunev @ 2015-10-28 10:38 UTC (permalink / raw)
  To: quintela
  Cc: Amit Shah, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

On 10/28/2015 01:11 PM, Juan Quintela wrote:
> "Denis V. Lunev" <den@openvz.org> wrote:
>> aio_context should be locked in the same way as is done during QMP
>> snapshot creation; otherwise there are a lot of possible problems
>> if native AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>>    actually complete)
>>      io_submit
>>      ioq_submit
>>      laio_submit
>>      raw_aio_submit
>>      raw_aio_readv
>>      bdrv_co_io_em
>>      bdrv_co_readv_em
>>      bdrv_aligned_preadv
>>      bdrv_co_do_preadv
>>      bdrv_co_do_readv
>>      bdrv_co_readv
>>      qcow2_co_readv
>>      bdrv_aligned_preadv
>>      bdrv_co_do_pwritev
>>      bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>>      __GI_abort
>>      qemu_coroutine_enter
>>      bdrv_co_io_em_complete
>>      qemu_laio_process_completion
>>      qemu_laio_completion_bh
>>      aio_bh_poll
>>      aio_dispatch
>>      aio_poll
>>      iothread_run
>>
>> The AioContext lock is recursive. Thus nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Amit Shah <amit.shah@redhat.com>
>> ---
>>   block/snapshot.c   | 5 +++++
>>   migration/savevm.c | 7 +++++++
>>   2 files changed, 12 insertions(+)
>
>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
>
> But once there, I can't understand why migration has to know about
> aio_contexts at all.
>
> I *think* that it would be a good idea to "hide" the
> aio_context_acquire(aio_context) inside qemu_fopen_bdrv() (yes, it is
> still in migration/*, but you get the idea).  But I don't propose it,
> because we don't have qemu_fclose_bdrv().  Yes, we could add an
> aio_context inside QemuFile and release it on qemu_fclose(), but I
> guess this needs more thought yet.
>
> BTW, now that I have your attention: why is this needed in hmp_savevm()
> but not needed in load_vmstate()?  We are also using
> qemu_fopen_bdrv() there.  Is it because we are only reading?  Just curious
> about the reason, or whether we are missing something there.
>
> Thanks, Juan.

I think that the race is still there (I have checked this several times,
though fewer times than create/delete snapshot), but the window is
seriously reduced due to the bdrv_drain_all at the beginning.

In general you are right. But in this case we are almost doomed: any single
read/write operation can be executed in the iothread only. Maybe I have
missed something in this puzzle.

OK. What about a bdrv_fclose callback and a similar (new) callback for open,
which would be called through qemu_fopen_bdrv for the block driver only?

Den

