* [Qemu-devel] [PATCH v3 0/4] dataplane snapshot fixes
@ 2015-10-28 15:01 Denis V. Lunev
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper Denis V. Lunev
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:01 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

Running this test loop
    while /bin/true ; do
        virsh snapshot-create rhel7
        sleep 10
        virsh snapshot-delete rhel7 --current
    done
on a VM with iothreads enabled leads to a lot of trouble: hangs,
asserts, errors.

Anyway, I think that a construction like
    assert(aio_context_is_locked(aio_context));
should be widely used to ensure proper locking.
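
For example, patch 3 adds exactly this kind of check to bdrv_drain:

    void bdrv_drain(BlockDriverState *bs)
    {
        AioContext *aio_context = bdrv_get_aio_context(bs);

        assert(aio_context_is_locked(aio_context));
        ...
    }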

Changes from v2:
- dropped patch 5 as it is already merged
- changed the locking scheme in patch 4 as suggested by Juan

Changes from v1:
- aio-context locking added
- comment is rewritten

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>

  fifolock: create rfifolock_is_locked helper
  aio_context: create aio_context_is_locked helper
  io: add locking constraints check into bdrv_drain to ensure locking
  migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm

 async.c                  |  5 +++++
 block/io.c               |  5 ++++-
 block/snapshot.c         |  5 +++++
 include/block/aio.h      |  3 +++
 include/qemu/rfifolock.h |  1 +
 migration/savevm.c       | 18 +++++++++++++++---
 util/rfifolock.c         |  9 +++++++--
 7 files changed, 40 insertions(+), 6 deletions(-)

-- 
2.1.4


* [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-10-28 15:01 [Qemu-devel] [PATCH v3 0/4] dataplane snapshot fixes Denis V. Lunev
@ 2015-10-28 15:01 ` Denis V. Lunev
  2015-10-30 15:41   ` Stefan Hajnoczi
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 2/4] aio_context: create aio_context_is_locked helper Denis V. Lunev
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:01 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

This helper is necessary to ensure locking constraints.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 include/qemu/rfifolock.h | 1 +
 util/rfifolock.c         | 9 +++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/qemu/rfifolock.h b/include/qemu/rfifolock.h
index b23ab53..2c00c4b 100644
--- a/include/qemu/rfifolock.h
+++ b/include/qemu/rfifolock.h
@@ -50,5 +50,6 @@ void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque);
 void rfifolock_destroy(RFifoLock *r);
 void rfifolock_lock(RFifoLock *r);
 void rfifolock_unlock(RFifoLock *r);
+int  rfifolock_is_locked(RFifoLock *r);
 
 #endif /* QEMU_RFIFOLOCK_H */
diff --git a/util/rfifolock.c b/util/rfifolock.c
index afbf748..8ac58cb 100644
--- a/util/rfifolock.c
+++ b/util/rfifolock.c
@@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
     /* Take a ticket */
     unsigned int ticket = r->tail++;
 
-    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
+    if (r->nesting > 0 && rfifolock_is_locked(r)) {
         r->tail--; /* put ticket back, we're nesting */
     } else {
         while (ticket != r->head) {
@@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
 {
     qemu_mutex_lock(&r->lock);
     assert(r->nesting > 0);
-    assert(qemu_thread_is_self(&r->owner_thread));
+    assert(rfifolock_is_locked(r));
     if (--r->nesting == 0) {
         r->head++;
         qemu_cond_broadcast(&r->cond);
     }
     qemu_mutex_unlock(&r->lock);
 }
+
+int rfifolock_is_locked(RFifoLock *r)
+{
+    return qemu_thread_is_self(&r->owner_thread);
+}
-- 
2.1.4


* [Qemu-devel] [PATCH 2/4] aio_context: create aio_context_is_locked helper
  2015-10-28 15:01 [Qemu-devel] [PATCH v3 0/4] dataplane snapshot fixes Denis V. Lunev
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper Denis V. Lunev
@ 2015-10-28 15:01 ` Denis V. Lunev
  2015-10-30 15:42   ` Stefan Hajnoczi
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 3/4] io: add locking constraints check into bdrv_drain to ensure locking Denis V. Lunev
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code Denis V. Lunev
  3 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:01 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

This helper is necessary to ensure locking constraints.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 async.c             | 5 +++++
 include/block/aio.h | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/async.c b/async.c
index bdc64a3..4a9250e 100644
--- a/async.c
+++ b/async.c
@@ -361,3 +361,8 @@ void aio_context_release(AioContext *ctx)
 {
     rfifolock_unlock(&ctx->lock);
 }
+
+int aio_context_is_locked(AioContext *ctx)
+{
+    return rfifolock_is_locked(&ctx->lock);
+}
diff --git a/include/block/aio.h b/include/block/aio.h
index bcc7d43..d20c7b8 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -166,6 +166,9 @@ void aio_context_acquire(AioContext *ctx);
 /* Relinquish ownership of the AioContext. */
 void aio_context_release(AioContext *ctx);
 
+/* Check that AioContext is locked by the caller. */
+int aio_context_is_locked(AioContext *ctx);
+
 /**
  * aio_bh_new: Allocate a new bottom half structure.
  *
-- 
2.1.4


* [Qemu-devel] [PATCH 3/4] io: add locking constraints check into bdrv_drain to ensure locking
  2015-10-28 15:01 [Qemu-devel] [PATCH v3 0/4] dataplane snapshot fixes Denis V. Lunev
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper Denis V. Lunev
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 2/4] aio_context: create aio_context_is_locked helper Denis V. Lunev
@ 2015-10-28 15:01 ` Denis V. Lunev
  2015-10-30 15:43   ` Stefan Hajnoczi
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code Denis V. Lunev
  3 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:01 UTC (permalink / raw)
  Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

as described in the function's comment

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
 block/io.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 5ac6256..2e98d45 100644
--- a/block/io.c
+++ b/block/io.c
@@ -247,12 +247,15 @@ bool bdrv_requests_pending(BlockDriverState *bs)
 void bdrv_drain(BlockDriverState *bs)
 {
     bool busy = true;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    assert(aio_context_is_locked(aio_context));
 
     while (busy) {
         /* Keep iterating */
          bdrv_flush_io_queue(bs);
          busy = bdrv_requests_pending(bs);
-         busy |= aio_poll(bdrv_get_aio_context(bs), busy);
+         busy |= aio_poll(aio_context, busy);
     }
 }
 
-- 
2.1.4


* [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-10-28 15:01 [Qemu-devel] [PATCH v3 0/4] dataplane snapshot fixes Denis V. Lunev
                   ` (2 preceding siblings ...)
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 3/4] io: add locking constraints check into bdrv_drain to ensure locking Denis V. Lunev
@ 2015-10-28 15:01 ` Denis V. Lunev
  2015-10-28 15:33   ` Juan Quintela
  2015-10-30 15:52   ` Stefan Hajnoczi
  3 siblings, 2 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:01 UTC (permalink / raw)
  Cc: Juan Quintela, qemu-devel, qemu-stable, Stefan Hajnoczi,
	Paolo Bonzini, Amit Shah, Denis V. Lunev

aio_context should be locked in a similar way to what was done in QMP
snapshot creation; otherwise there are a lot of possible troubles if
native AIO mode is enabled for the disk.

- the command can hang (HMP thread) with missed wakeup (the operation is
  actually complete)
    io_submit
    ioq_submit
    laio_submit
    raw_aio_submit
    raw_aio_readv
    bdrv_co_io_em
    bdrv_co_readv_em
    bdrv_aligned_preadv
    bdrv_co_do_preadv
    bdrv_co_do_readv
    bdrv_co_readv
    qcow2_co_readv
    bdrv_aligned_preadv
    bdrv_co_do_pwritev
    bdrv_rw_co_entry

- QEMU can assert in coroutine re-enter
    __GI_abort
    qemu_coroutine_enter
    bdrv_co_io_em_complete
    qemu_laio_process_completion
    qemu_laio_completion_bh
    aio_bh_poll
    aio_dispatch
    aio_poll
    iothread_run

qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
together with block drivers. This change should affect only HMP snapshot
operations.

The AioContext lock is recursive, so nested locking should not be a problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
 block/snapshot.c   |  5 +++++
 migration/savevm.c | 18 +++++++++++++++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..f6fa17a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
 {
     int ret;
     Error *local_err = NULL;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    aio_context_acquire(aio_context);
 
     ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
     if (ret == -ENOENT || ret == -EINVAL) {
@@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
         ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
     }
 
+    aio_context_release(aio_context);
+
     if (ret < 0) {
         error_propagate(errp, local_err);
     }
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..1653f56 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -153,7 +153,11 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
 
 static int bdrv_fclose(void *opaque)
 {
-    return bdrv_flush(opaque);
+    BlockDriverState *bs = (BlockDriverState *)opaque;
+    int ret = bdrv_flush(bs);
+
+    aio_context_release(bdrv_get_aio_context(bs));
+    return ret;
 }
 
 static const QEMUFileOps bdrv_read_ops = {
@@ -169,10 +173,18 @@ static const QEMUFileOps bdrv_write_ops = {
 
 static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int is_writable)
 {
+    QEMUFile *file;
+
     if (is_writable) {
-        return qemu_fopen_ops(bs, &bdrv_write_ops);
+        file = qemu_fopen_ops(bs, &bdrv_write_ops);
+    } else {
+        file = qemu_fopen_ops(bs, &bdrv_read_ops);
+    }
+
+    if (file != NULL) {
+        aio_context_acquire(bdrv_get_aio_context(bs));
     }
-    return qemu_fopen_ops(bs, &bdrv_read_ops);
+    return file;
 }
 
 
-- 
2.1.4


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code Denis V. Lunev
@ 2015-10-28 15:33   ` Juan Quintela
  2015-10-28 15:57     ` Denis V. Lunev
  2015-10-30 15:52   ` Stefan Hajnoczi
  1 sibling, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2015-10-28 15:33 UTC (permalink / raw)
  To: Denis V. Lunev
  Cc: Amit Shah, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

"Denis V. Lunev" <den@openvz.org> wrote:
> aio_context should be locked in a similar way to what was done in QMP
> snapshot creation; otherwise there are a lot of possible troubles if
> native AIO mode is enabled for the disk.
>
> - the command can hang (HMP thread) with missed wakeup (the operation is
>   actually complete)
>     io_submit
>     ioq_submit
>     laio_submit
>     raw_aio_submit
>     raw_aio_readv
>     bdrv_co_io_em
>     bdrv_co_readv_em
>     bdrv_aligned_preadv
>     bdrv_co_do_preadv
>     bdrv_co_do_readv
>     bdrv_co_readv
>     qcow2_co_readv
>     bdrv_aligned_preadv
>     bdrv_co_do_pwritev
>     bdrv_rw_co_entry
>
> - QEMU can assert in coroutine re-enter
>     __GI_abort
>     qemu_coroutine_enter
>     bdrv_co_io_em_complete
>     qemu_laio_process_completion
>     qemu_laio_completion_bh
>     aio_bh_poll
>     aio_dispatch
>     aio_poll
>     iothread_run
>
> qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
> together with block drivers. This change should affect only HMP snapshot
> operations.
>
> The AioContext lock is recursive, so nested locking should not be a problem.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Should this one go through the block layer?  I guess the block
layer, but otherwise I will take it.


> -    return bdrv_flush(opaque);

> +    BlockDriverState *bs = (BlockDriverState *)opaque;

Cast not needed.

BlockDriverState *bs = opaque;

is even better.

Thanks, Juan.


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-10-28 15:33   ` Juan Quintela
@ 2015-10-28 15:57     ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-28 15:57 UTC (permalink / raw)
  To: quintela
  Cc: Amit Shah, Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

On 10/28/2015 06:33 PM, Juan Quintela wrote:
> "Denis V. Lunev" <den@openvz.org> wrote:
>> aio_context should be locked in a similar way to what was done in QMP
>> snapshot creation; otherwise there are a lot of possible troubles if
>> native AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>>    actually complete)
>>      io_submit
>>      ioq_submit
>>      laio_submit
>>      raw_aio_submit
>>      raw_aio_readv
>>      bdrv_co_io_em
>>      bdrv_co_readv_em
>>      bdrv_aligned_preadv
>>      bdrv_co_do_preadv
>>      bdrv_co_do_readv
>>      bdrv_co_readv
>>      qcow2_co_readv
>>      bdrv_aligned_preadv
>>      bdrv_co_do_pwritev
>>      bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>>      __GI_abort
>>      qemu_coroutine_enter
>>      bdrv_co_io_em_complete
>>      qemu_laio_process_completion
>>      qemu_laio_completion_bh
>>      aio_bh_poll
>>      aio_dispatch
>>      aio_poll
>>      iothread_run
>>
>> qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
>> together with block drivers. This change should affect only HMP snapshot
>> operations.
>>
>> The AioContext lock is recursive, so nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Amit Shah <amit.shah@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
>
> Should this one go through the block layer?  I guess that the block
> layer, but otherwise, I will get it.

let's wait for Stefan's opinion :)

Either way would be good for me, but I want to have
the previous patches from the set committed too. They
should definitely flow through the block tree, thus
the block tree would be better.

Anyway, I can resend patches 1-3
if you take patch 4 through your queue.

Den


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper Denis V. Lunev
@ 2015-10-30 15:41   ` Stefan Hajnoczi
  2015-10-30 20:30     ` Denis V. Lunev
  2015-11-01 13:55     ` Denis V. Lunev
  0 siblings, 2 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-10-30 15:41 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Paolo Bonzini, qemu-devel, qemu-stable

On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
> +int  rfifolock_is_locked(RFifoLock *r);

Please use bool instead of int.

> diff --git a/util/rfifolock.c b/util/rfifolock.c
> index afbf748..8ac58cb 100644
> --- a/util/rfifolock.c
> +++ b/util/rfifolock.c
> @@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
>      /* Take a ticket */
>      unsigned int ticket = r->tail++;
>  
> -    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
> +    if (r->nesting > 0 && rfifolock_is_locked(r)) {
>          r->tail--; /* put ticket back, we're nesting */
>      } else {
>          while (ticket != r->head) {
> @@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
>  {
>      qemu_mutex_lock(&r->lock);
>      assert(r->nesting > 0);
> -    assert(qemu_thread_is_self(&r->owner_thread));
> +    assert(rfifolock_is_locked(r));
>      if (--r->nesting == 0) {
>          r->head++;
>          qemu_cond_broadcast(&r->cond);
>      }
>      qemu_mutex_unlock(&r->lock);
>  }
> +
> +int rfifolock_is_locked(RFifoLock *r)
> +{
> +    return qemu_thread_is_self(&r->owner_thread);
> +}

The function name confused me since "does the current thread hold the
lock?" != "does anyone currently hold the lock?".

I suggest:

bool rfifolock_held_by_current_thread(RFifoLock *r) {
    return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
}

Then the r->nesting > 0 testing can also be dropped by callers, which is
good since rfifolock_is_locked() does not return a meaningful result
when r->nesting == 0.


* Re: [Qemu-devel] [PATCH 2/4] aio_context: create aio_context_is_locked helper
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 2/4] aio_context: create aio_context_is_locked helper Denis V. Lunev
@ 2015-10-30 15:42   ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-10-30 15:42 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Paolo Bonzini, qemu-devel, qemu-stable

On Wed, Oct 28, 2015 at 06:01:03PM +0300, Denis V. Lunev wrote:
> This helper is necessary to ensure locking constraints.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  async.c             | 5 +++++
>  include/block/aio.h | 3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/async.c b/async.c
> index bdc64a3..4a9250e 100644
> --- a/async.c
> +++ b/async.c
> @@ -361,3 +361,8 @@ void aio_context_release(AioContext *ctx)
>  {
>      rfifolock_unlock(&ctx->lock);
>  }
> +
> +int aio_context_is_locked(AioContext *ctx)

s/int/bool/

The name aio_context_acquired_by_current_thread() would be clearer (but
longer) than aio_context_is_locked().


* Re: [Qemu-devel] [PATCH 3/4] io: add locking constraints check into bdrv_drain to ensure locking
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 3/4] io: add locking constraints check into bdrv_drain to ensure locking Denis V. Lunev
@ 2015-10-30 15:43   ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-10-30 15:43 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Paolo Bonzini, qemu-devel, qemu-stable

On Wed, Oct 28, 2015 at 06:01:04PM +0300, Denis V. Lunev wrote:
> as described in the function's comment
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/io.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 5ac6256..2e98d45 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -247,12 +247,15 @@ bool bdrv_requests_pending(BlockDriverState *bs)
>  void bdrv_drain(BlockDriverState *bs)
>  {
>      bool busy = true;
> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +    assert(aio_context_is_locked(aio_context));

Cool, thanks!  This is a useful assertion to have.


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-10-28 15:01 ` [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code Denis V. Lunev
  2015-10-28 15:33   ` Juan Quintela
@ 2015-10-30 15:52   ` Stefan Hajnoczi
  2015-11-03 14:48     ` Juan Quintela
  1 sibling, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-10-30 15:52 UTC (permalink / raw)
  To: Denis V. Lunev
  Cc: Amit Shah, Paolo Bonzini, Juan Quintela, qemu-devel, qemu-stable

On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 89500f2..f6fa17a 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>  {
>      int ret;
>      Error *local_err = NULL;
> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +    aio_context_acquire(aio_context);
>  
>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>      if (ret == -ENOENT || ret == -EINVAL) {
> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>      }
>  
> +    aio_context_release(aio_context);
> +
>      if (ret < 0) {
>          error_propagate(errp, local_err);
>      }

Please make the caller acquire the AioContext instead of modifying
bdrv_snapshot_delete_by_id_or_name() because no other functions in this
file acquire AioContext and the API should be consistent.

There's no harm in recursive locking but it is hard to write correct
code if related functions differ in whether or not they acquire the
AioContext.  Either all of them should acquire AioContext or none of
them.
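
For example (just a sketch; bs, name and err stand for the caller's own
variables):

    AioContext *aio_context = bdrv_get_aio_context(bs);

    aio_context_acquire(aio_context);
    bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
    aio_context_release(aio_context);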


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-10-30 15:41   ` Stefan Hajnoczi
@ 2015-10-30 20:30     ` Denis V. Lunev
  2015-11-02 13:10       ` Stefan Hajnoczi
  2015-11-01 13:55     ` Denis V. Lunev
  1 sibling, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-10-30 20:30 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, qemu-stable

On 10/30/2015 06:41 PM, Stefan Hajnoczi wrote:
> On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
>> +int  rfifolock_is_locked(RFifoLock *r);
> Please use bool instead of int.
>
>> diff --git a/util/rfifolock.c b/util/rfifolock.c
>> index afbf748..8ac58cb 100644
>> --- a/util/rfifolock.c
>> +++ b/util/rfifolock.c
>> @@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
>>       /* Take a ticket */
>>       unsigned int ticket = r->tail++;
>>   
>> -    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
>> +    if (r->nesting > 0 && rfifolock_is_locked(r)) {
>>           r->tail--; /* put ticket back, we're nesting */
>>       } else {
>>           while (ticket != r->head) {
>> @@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
>>   {
>>       qemu_mutex_lock(&r->lock);
>>       assert(r->nesting > 0);
>> -    assert(qemu_thread_is_self(&r->owner_thread));
>> +    assert(rfifolock_is_locked(r));
>>       if (--r->nesting == 0) {
>>           r->head++;
>>           qemu_cond_broadcast(&r->cond);
>>       }
>>       qemu_mutex_unlock(&r->lock);
>>   }
>> +
>> +int rfifolock_is_locked(RFifoLock *r)
>> +{
>> +    return qemu_thread_is_self(&r->owner_thread);
>> +}
> The function name confused me since "does the current thread hold the
> lock?" != "does anyone currently hold the lock?".
>
> I suggest:
>
> bool rfifolock_held_by_current_thread(RFifoLock *r) {
>      return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> }
>
> Then the r->nesting > 0 testing can also be dropped by callers, which is
> good since rfifolock_is_locked() does not return a meaningful result
> when r->nesting == 0.
what about

rfifolock_is_owner() ?

Den


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-10-30 15:41   ` Stefan Hajnoczi
  2015-10-30 20:30     ` Denis V. Lunev
@ 2015-11-01 13:55     ` Denis V. Lunev
  2015-11-02 13:12       ` Stefan Hajnoczi
  1 sibling, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-11-01 13:55 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, qemu-stable

On 10/30/2015 06:41 PM, Stefan Hajnoczi wrote:
> On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
>> +int  rfifolock_is_locked(RFifoLock *r);
> Please use bool instead of int.
>
>> diff --git a/util/rfifolock.c b/util/rfifolock.c
>> index afbf748..8ac58cb 100644
>> --- a/util/rfifolock.c
>> +++ b/util/rfifolock.c
>> @@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
>>       /* Take a ticket */
>>       unsigned int ticket = r->tail++;
>>   
>> -    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
>> +    if (r->nesting > 0 && rfifolock_is_locked(r)) {
>>           r->tail--; /* put ticket back, we're nesting */
>>       } else {
>>           while (ticket != r->head) {
>> @@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
>>   {
>>       qemu_mutex_lock(&r->lock);
>>       assert(r->nesting > 0);
>> -    assert(qemu_thread_is_self(&r->owner_thread));
>> +    assert(rfifolock_is_locked(r));
>>       if (--r->nesting == 0) {
>>           r->head++;
>>           qemu_cond_broadcast(&r->cond);
>>       }
>>       qemu_mutex_unlock(&r->lock);
>>   }
>> +
>> +int rfifolock_is_locked(RFifoLock *r)
>> +{
>> +    return qemu_thread_is_self(&r->owner_thread);
>> +}
> The function name confused me since "does the current thread hold the
> lock?" != "does anyone currently hold the lock?".
>
> I suggest:
>
> bool rfifolock_held_by_current_thread(RFifoLock *r) {
>      return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> }
>
> Then the r->nesting > 0 testing can also be dropped by callers, which is
> good since rfifolock_is_locked() does not return a meaningful result
> when r->nesting == 0.
Actually the function is broken in its current state:
     aio_context_acquire()
     aio_context_release()
     aio_context_is_locked() will return true.
The problem is that the owner thread is not reset on lock release.

With your proposal the function becomes racy if called from another
thread. I need to think a bit here.

Maybe we'll need two versions - locked and unlocked - or something
else should be added. I am still in the process of inventing :)

Den


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-10-30 20:30     ` Denis V. Lunev
@ 2015-11-02 13:10       ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-11-02 13:10 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

On Fri, Oct 30, 2015 at 11:30:04PM +0300, Denis V. Lunev wrote:
> On 10/30/2015 06:41 PM, Stefan Hajnoczi wrote:
> >On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
> >>+int  rfifolock_is_locked(RFifoLock *r);
> >Please use bool instead of int.
> >
> >>diff --git a/util/rfifolock.c b/util/rfifolock.c
> >>index afbf748..8ac58cb 100644
> >>--- a/util/rfifolock.c
> >>+++ b/util/rfifolock.c
> >>@@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
> >>      /* Take a ticket */
> >>      unsigned int ticket = r->tail++;
> >>-    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
> >>+    if (r->nesting > 0 && rfifolock_is_locked(r)) {
> >>          r->tail--; /* put ticket back, we're nesting */
> >>      } else {
> >>          while (ticket != r->head) {
> >>@@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
> >>  {
> >>      qemu_mutex_lock(&r->lock);
> >>      assert(r->nesting > 0);
> >>-    assert(qemu_thread_is_self(&r->owner_thread));
> >>+    assert(rfifolock_is_locked(r));
> >>      if (--r->nesting == 0) {
> >>          r->head++;
> >>          qemu_cond_broadcast(&r->cond);
> >>      }
> >>      qemu_mutex_unlock(&r->lock);
> >>  }
> >>+
> >>+int rfifolock_is_locked(RFifoLock *r)
> >>+{
> >>+    return qemu_thread_is_self(&r->owner_thread);
> >>+}
> >The function name confused me since "does the current thread hold the
> >lock?" != "does anyone currently hold the lock?".
> >
> >I suggest:
> >
> >bool rfifolock_held_by_current_thread(RFifoLock *r) {
> >     return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> >}
> >
> >Then the r->nesting > 0 testing can also be dropped by callers, which is
> >good since rfifolock_is_locked() does not return a meaningful result
> >when r->nesting == 0.
> what about
> 
> rfifolock_is_owner() ?

Great!

Stefan


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-11-01 13:55     ` Denis V. Lunev
@ 2015-11-02 13:12       ` Stefan Hajnoczi
  2015-11-02 13:39         ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-11-02 13:12 UTC (permalink / raw)
  To: Denis V. Lunev; +Cc: Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

On Sun, Nov 01, 2015 at 04:55:51PM +0300, Denis V. Lunev wrote:
> On 10/30/2015 06:41 PM, Stefan Hajnoczi wrote:
> >On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
> >>+int  rfifolock_is_locked(RFifoLock *r);
> >Please use bool instead of int.
> >
> >>diff --git a/util/rfifolock.c b/util/rfifolock.c
> >>index afbf748..8ac58cb 100644
> >>--- a/util/rfifolock.c
> >>+++ b/util/rfifolock.c
> >>@@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
> >>      /* Take a ticket */
> >>      unsigned int ticket = r->tail++;
> >>-    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
> >>+    if (r->nesting > 0 && rfifolock_is_locked(r)) {
> >>          r->tail--; /* put ticket back, we're nesting */
> >>      } else {
> >>          while (ticket != r->head) {
> >>@@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
> >>  {
> >>      qemu_mutex_lock(&r->lock);
> >>      assert(r->nesting > 0);
> >>-    assert(qemu_thread_is_self(&r->owner_thread));
> >>+    assert(rfifolock_is_locked(r));
> >>      if (--r->nesting == 0) {
> >>          r->head++;
> >>          qemu_cond_broadcast(&r->cond);
> >>      }
> >>      qemu_mutex_unlock(&r->lock);
> >>  }
> >>+
> >>+int rfifolock_is_locked(RFifoLock *r)
> >>+{
> >>+    return qemu_thread_is_self(&r->owner_thread);
> >>+}
> >The function name confused me since "does the current thread hold the
> >lock?" != "does anyone currently hold the lock?".
> >
> >I suggest:
> >
> >bool rfifolock_held_by_current_thread(RFifoLock *r) {
> >     return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> >}
> >
> >Then the r->nesting > 0 testing can also be dropped by callers, which is
> >good since rfifolock_is_locked() does not return a meaningful result
> >when r->nesting == 0.
> Actually the function is broken in its current state:
>     aio_context_acquire()
>     aio_context_release()
>     aio_context_is_locked() will return true.
> The problem is that the owner thread is not reset on lock release.

The owner thread field is only valid when nesting > 0.

> With your proposal the function becomes racy if called from another
> thread. I need to think a bit here.

This is thread-safe:

bool owner;

qemu_mutex_lock(&r->lock);
owner = r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
qemu_mutex_unlock(&r->lock);

return owner;


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-11-02 13:12       ` Stefan Hajnoczi
@ 2015-11-02 13:39         ` Denis V. Lunev
  2015-11-02 13:55           ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Denis V. Lunev @ 2015-11-02 13:39 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, Stefan Hajnoczi, qemu-stable

On 11/02/2015 04:12 PM, Stefan Hajnoczi wrote:
> On Sun, Nov 01, 2015 at 04:55:51PM +0300, Denis V. Lunev wrote:
>> On 10/30/2015 06:41 PM, Stefan Hajnoczi wrote:
>>> On Wed, Oct 28, 2015 at 06:01:02PM +0300, Denis V. Lunev wrote:
>>>> +int  rfifolock_is_locked(RFifoLock *r);
>>> Please use bool instead of int.
>>>
>>>> diff --git a/util/rfifolock.c b/util/rfifolock.c
>>>> index afbf748..8ac58cb 100644
>>>> --- a/util/rfifolock.c
>>>> +++ b/util/rfifolock.c
>>>> @@ -48,7 +48,7 @@ void rfifolock_lock(RFifoLock *r)
>>>>       /* Take a ticket */
>>>>       unsigned int ticket = r->tail++;
>>>> -    if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
>>>> +    if (r->nesting > 0 && rfifolock_is_locked(r)) {
>>>>           r->tail--; /* put ticket back, we're nesting */
>>>>       } else {
>>>>           while (ticket != r->head) {
>>>> @@ -69,10 +69,15 @@ void rfifolock_unlock(RFifoLock *r)
>>>>   {
>>>>       qemu_mutex_lock(&r->lock);
>>>>       assert(r->nesting > 0);
>>>> -    assert(qemu_thread_is_self(&r->owner_thread));
>>>> +    assert(rfifolock_is_locked(r));
>>>>       if (--r->nesting == 0) {
>>>>           r->head++;
>>>>           qemu_cond_broadcast(&r->cond);
>>>>       }
>>>>       qemu_mutex_unlock(&r->lock);
>>>>   }
>>>> +
>>>> +int rfifolock_is_locked(RFifoLock *r)
>>>> +{
>>>> +    return qemu_thread_is_self(&r->owner_thread);
>>>> +}
>>> The function name confused me since "does the current thread hold the
>>> lock?" != "does anyone currently hold the lock?".
>>>
>>> I suggest:
>>>
>>> bool rfifolock_held_by_current_thread(RFifoLock *r) {
>>>      return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
>>> }
>>>
>>> Then the r->nesting > 0 testing can also be dropped by callers, which is
>>> good since rfifolock_is_locked() does not return a meaningful result
>>> when r->nesting == 0.
>> Actually the function is broken in its current state:
>>      aio_context_acquire()
>>      aio_context_release()
>>      aio_context_is_locked() will return true.
>> The problem is that the owner thread is not reset on lock release.
> The owner thread field is only valid when nesting > 0.
>
>> With your proposal the function becomes racy if called from another
>> thread. I need to think a bit here.
> This is thread-safe:
>
> bool owner;
>
> qemu_mutex_lock(&r->lock);
> owner = r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> qemu_mutex_unlock(&r->lock);
>
> return owner;
yep, I know.

But I do not want to take the lock just for the check.
IMHO it would be better to

@@ -68,11 +68,16 @@ void rfifolock_lock(RFifoLock *r)
  void rfifolock_unlock(RFifoLock *r)
  {
      qemu_mutex_lock(&r->lock);
-    assert(r->nesting > 0);
-    assert(qemu_thread_is_self(&r->owner_thread));
+    assert(rfifolock_is_owner(r));
      if (--r->nesting == 0) {
+        qemu_thread_clear(&r->owner_thread);
          r->head++;
          qemu_cond_broadcast(&r->cond);
      }
      qemu_mutex_unlock(&r->lock);
  }
+
+bool rfifolock_is_owner(RFifoLock *r)
+{
+    return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
+}

which does not require the lock and is thread safe.

Den


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-11-02 13:39         ` Denis V. Lunev
@ 2015-11-02 13:55           ` Paolo Bonzini
  2015-11-02 14:02             ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2015-11-02 13:55 UTC (permalink / raw)
  To: Denis V. Lunev, Stefan Hajnoczi; +Cc: qemu-devel, Stefan Hajnoczi, qemu-stable



On 02/11/2015 14:39, Denis V. Lunev wrote:
>> This is thread-safe:
>>
>> bool owner;
>>
>> qemu_mutex_lock(&r->lock);
>> owner = r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
>> qemu_mutex_unlock(&r->lock);
>>
>> return owner;
> yep, I know.
> 
> But I do not want to take the lock just for the check.

You can use a trylock.  If it fails, it is definitely safe to return false.
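
Something like this (just a sketch of the trylock variant, using the
rfifolock_is_owner name proposed earlier):

bool rfifolock_is_owner(RFifoLock *r)
{
    bool owner = false;

    if (qemu_mutex_trylock(&r->lock) == 0) {
        owner = r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
        qemu_mutex_unlock(&r->lock);
    }
    /* if the trylock failed, returning false is safe as noted above */
    return owner;
}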

> IMHO it would be better to
> 
> @@ -68,11 +68,16 @@ void rfifolock_lock(RFifoLock *r)
>  void rfifolock_unlock(RFifoLock *r)
>  {
>      qemu_mutex_lock(&r->lock);
> -    assert(r->nesting > 0);
> -    assert(qemu_thread_is_self(&r->owner_thread));
> +    assert(rfifolock_is_owner(r));
>      if (--r->nesting == 0) {
> +        qemu_thread_clear(&r->owner_thread);
>          r->head++;
>          qemu_cond_broadcast(&r->cond);
>      }
>      qemu_mutex_unlock(&r->lock);
>  }
> +
> +bool rfifolock_is_owner(RFifoLock *r)
> +{
> +    return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
> +}
> 
> which does not require the lock and is thread safe.

I think it requires memory barriers though.  But it can be simplified:
if you clear owner_thread, you do not need to check r->nesting in
rfifolock_is_owner.

Clearing owner_thread can be done with a simple memset.
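
For instance (a sketch):

    if (--r->nesting == 0) {
        memset(&r->owner_thread, 0, sizeof(r->owner_thread));
        r->head++;
        qemu_cond_broadcast(&r->cond);
    }

and rfifolock_is_owner() then reduces to the plain qemu_thread_is_self()
check.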

Paolo


* Re: [Qemu-devel] [PATCH 1/4] fifolock: create rfifolock_is_locked helper
  2015-11-02 13:55           ` Paolo Bonzini
@ 2015-11-02 14:02             ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-11-02 14:02 UTC (permalink / raw)
  To: Paolo Bonzini, Stefan Hajnoczi; +Cc: qemu-devel, Stefan Hajnoczi, qemu-stable

On 11/02/2015 04:55 PM, Paolo Bonzini wrote:
>
> On 02/11/2015 14:39, Denis V. Lunev wrote:
>>> This is thread-safe:
>>>
>>> bool owner;
>>>
>>> qemu_mutex_lock(&r->lock);
>>> owner = r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
>>> qemu_mutex_unlock(&r->lock);
>>>
>>> return owner;
>> yep, I know.
>>
>> But I do not want to take the lock just for the check.
> You can use a trylock.  If it fails, it is definitely safe to return false.
>
>> IMHO it would be better to
>>
>> @@ -68,11 +68,16 @@ void rfifolock_lock(RFifoLock *r)
>>   void rfifolock_unlock(RFifoLock *r)
>>   {
>>       qemu_mutex_lock(&r->lock);
>> -    assert(r->nesting > 0);
>> -    assert(qemu_thread_is_self(&r->owner_thread));
>> +    assert(rfifolock_is_owner(r));
>>       if (--r->nesting == 0) {
>> +        qemu_thread_clear(&r->owner_thread);
>>           r->head++;
>>           qemu_cond_broadcast(&r->cond);
>>       }
>>       qemu_mutex_unlock(&r->lock);
>>   }
>> +
>> +bool rfifolock_is_owner(RFifoLock *r)
>> +{
>> +    return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
>> +}
>>
>> which does not require the lock and is thread safe.
> I think it requires memory barriers though.  But it can be simplified:
> if you clear owner_thread, you do not need to check r->nesting in
> rfifolock_is_owner.
>
> Clearing owner_thread can be done with a simple memset.
This does not require a memory barrier, as the call to qemu_cond_broadcast
will do the job just fine.

The check for ownership is actually enough, as:
- the owner thread could be set only in the current thread and
  cleared only in the same current thread. This is not racy at all :)
- the nesting count check is moved here for convenience only, by request
  of Stefan; it is not required at all to make a decision after
  clearing the thread.

OK, I can use memset for sure if it will be accepted :)
This was my first idea.

Den


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-10-30 15:52   ` Stefan Hajnoczi
@ 2015-11-03 14:48     ` Juan Quintela
  2015-11-03 15:30       ` Stefan Hajnoczi
  0 siblings, 1 reply; 21+ messages in thread
From: Juan Quintela @ 2015-11-03 14:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Amit Shah, Denis V. Lunev, Paolo Bonzini, qemu-devel, qemu-stable

Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index 89500f2..f6fa17a 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>  {
>>      int ret;
>>      Error *local_err = NULL;
>> +    AioContext *aio_context = bdrv_get_aio_context(bs);
>> +
>> +    aio_context_acquire(aio_context);
>>  
>>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>>      if (ret == -ENOENT || ret == -EINVAL) {
>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>>      }
>>  
>> +    aio_context_release(aio_context);
>> +
>>      if (ret < 0) {
>>          error_propagate(errp, local_err);
>>      }
>
> Please make the caller acquire the AioContext instead of modifying
> bdrv_snapshot_delete_by_id_or_name() because no other functions in this
> file acquire AioContext and the API should be consistent.

That is wrong (TM).  No other functions in migration/* know what an
aiocontext is, and they are fine, thanks O:-)

So, I guess we would have to get some other function exported from the
block layer, with the aiocontext taken?

The code ends up being like this:


     while ((bs = bdrv_next(bs))) {
         if (bdrv_can_snapshot(bs) &&
             bdrv_snapshot_find(bs, snapshot, name) >= 0) {
             AioContext *ctx = bdrv_get_aio_context(bs);

             aio_context_acquire(ctx);
             bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
             aio_context_release(ctx);
         .... some error handling here ...
    }


As discussed on irc, we need to get some function exported from the
block layer that does this.

I am sure that I don't understand the difference between hmp_delvm()
and del_existing_snapshots().

>
> There's no harm in recursive locking but it is hard to write correct
> code if related functions differ in whether or not they acquire the
> AioContext.  Either all of them should acquire AioContext or none of
> them.

I don't like recursive locking, but that is a different question,
altogether.

Denis, on irc Stefan says that new locking is not valid either, so
working from there.

Thanks, Juan.


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-11-03 14:48     ` Juan Quintela
@ 2015-11-03 15:30       ` Stefan Hajnoczi
  2015-11-03 15:36         ` Denis V. Lunev
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2015-11-03 15:30 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Amit Shah, Denis V. Lunev, Paolo Bonzini, qemu-devel, qemu-stable

On Tue, Nov 03, 2015 at 03:48:07PM +0100, Juan Quintela wrote:
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
> >> diff --git a/block/snapshot.c b/block/snapshot.c
> >> index 89500f2..f6fa17a 100644
> >> --- a/block/snapshot.c
> >> +++ b/block/snapshot.c
> >> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
> >>  {
> >>      int ret;
> >>      Error *local_err = NULL;
> >> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> >> +
> >> +    aio_context_acquire(aio_context);
> >>  
> >>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
> >>      if (ret == -ENOENT || ret == -EINVAL) {
> >> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
> >>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
> >>      }
> >>  
> >> +    aio_context_release(aio_context);
> >> +
> >>      if (ret < 0) {
> >>          error_propagate(errp, local_err);
> >>      }
> >
> > Please make the caller acquire the AioContext instead of modifying
> > bdrv_snapshot_delete_by_id_or_name() because no other functions in this
> > file acquire AioContext and the API should be consistent.
> 
> That is wrong (TM).  No other functions in migration/* know what an
> aiocontext is, and they are fine, thanks O:-)

To clarify my comment:

APIs should have a consistent locking strategy.  Either all of the
block/snapshot.c public functions should take the lock or none of them
should.

With an inconsistent locking strategy it's really hard to review code
and ensure it is correct because you need to look up for each function
whether or not it takes the lock internally.

> So, I guess we would have to get some other function exported from the
> block layer, with the aiocontext taken?
> 
> The code ends up being like this:
> 
> 
>      while ((bs = bdrv_next(bs))) {
>          if (bdrv_can_snapshot(bs) &&
>              bdrv_snapshot_find(bs, snapshot, name) >= 0) {
>              AioContext *ctx = bdrv_get_aio_context(bs);
> 
>              aio_context_acquire(ctx);
>              bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
>             aio_context_release(ctx);
>          .... some error handling here ...
>     }
> 
> 
> As discussed on irc, we need to get some function exported from the
> block layer that does this.
> 
> I am sure that I don't understand the difference between hmp_delvm()
> and del_existing_snapshots().

On IRC I commented when you posted this code because there's a bug:

bdrv_can_snapshot() and bdrv_snapshot_find() must be called with
AioContext acquired.  So the function should actually be:

while ((bs = bdrv_next(bs))) {
    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);
    if (bdrv_can_snapshot(bs) &&
        ...

    aio_context_release(ctx);
}

Stefan


* Re: [Qemu-devel] [PATCH 4/4] migration: add missed aio_context_acquire into HMP snapshot code
  2015-11-03 15:30       ` Stefan Hajnoczi
@ 2015-11-03 15:36         ` Denis V. Lunev
  0 siblings, 0 replies; 21+ messages in thread
From: Denis V. Lunev @ 2015-11-03 15:36 UTC (permalink / raw)
  To: Stefan Hajnoczi, Juan Quintela
  Cc: Amit Shah, Paolo Bonzini, qemu-devel, qemu-stable

On 11/03/2015 06:30 PM, Stefan Hajnoczi wrote:
> On Tue, Nov 03, 2015 at 03:48:07PM +0100, Juan Quintela wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
>>>> diff --git a/block/snapshot.c b/block/snapshot.c
>>>> index 89500f2..f6fa17a 100644
>>>> --- a/block/snapshot.c
>>>> +++ b/block/snapshot.c
>>>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>>>   {
>>>>       int ret;
>>>>       Error *local_err = NULL;
>>>> +    AioContext *aio_context = bdrv_get_aio_context(bs);
>>>> +
>>>> +    aio_context_acquire(aio_context);
>>>>   
>>>>       ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>>>>       if (ret == -ENOENT || ret == -EINVAL) {
>>>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>>>           ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>>>>       }
>>>>   
>>>> +    aio_context_release(aio_context);
>>>> +
>>>>       if (ret < 0) {
>>>>           error_propagate(errp, local_err);
>>>>       }
>>> Please make the caller acquire the AioContext instead of modifying
>>> bdrv_snapshot_delete_by_id_or_name() because no other functions in this
>>> file acquire AioContext and the API should be consistent.
>> That is wrong (TM).  No other functions in migration/* know what an
>> aiocontext is, and they are fine, thanks O:-)
> To clarify my comment:
>
> APIs should have a consistent locking strategy.  Either all of the
> block/snapshot.c public functions should take the lock or none of them
> should.
>
> With an inconsistent locking strategy it's really hard to review code
> and ensure it is correct because you need to look up for each function
> whether or not it takes the lock internally.
>
>> So, I guess we would have to get some other function exported from the
>> block layer, with the aiocontext taken?
>>
>> Code ends being like this:
>>
>>
>>       while ((bs = bdrv_next(bs))) {
>>           if (bdrv_can_snapshot(bs) &&
>>               bdrv_snapshot_find(bs, snapshot, name) >= 0) {
>>               AioContext *ctx = bdrv_get_aio_context(bs);
>>
>>               aio_context_acquire(ctx);
>>               bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
>>              aio_context_release(ctx);
>>           .... some error handling here ...
>>      }
>>
>>
>> As discussed on irc, we need to get some function exported from the
>> block layer that does this.
>>
>> I am sure that I don't understand the differences between hmp_devlvm()
>> and del_existing_snapshots().
> On IRC I commented when you posted this code because there's a bug:
>
> bdrv_can_snapshot() and bdrv_snapshot_find() must be called with
> AioContext acquired.  So the function should actually be:
>
> while ((bs = bdrv_next(bs))) {
>      AioContext *ctx = bdrv_get_aio_context(bs);
>
>      aio_context_acquire(ctx);
>      if (bdrv_can_snapshot(bs) &&
>          ...
>
>      aio_context_release(ctx);
> }
This is not that necessary after patch 6, which adds
a guard into the direct AIO submission code.

Anyway, I would tend to agree that locking
MUST be consistent and it MUST be ENFORCED
in proper places.

Truly speaking, we must place assert(aio_context_is_owner())
at the beginning of each externally visible function in the
block layer, though this is too late for 2.5.

Den
