[Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive

* [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
@ 2019-05-23 10:54 Roman Kagan
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work Roman Kagan
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Roman Kagan @ 2019-05-23 10:54 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

I came across the following AB-BA deadlock:

    vCPU thread                             main thread
    -----------                             -----------
async_safe_run_on_cpu(self,
                      async_synic_update)
...                                         [cpu hot-add]
process_queued_cpu_work()
  qemu_mutex_unlock_iothread()
                                            [grab BQL]
  start_exclusive()                         cpu_list_add()
  async_synic_update()                        finish_safe_work()
    qemu_mutex_lock_iothread()                  cpu_exec_start()

ATM async_synic_update seems to be the only async safe work item that
grabs BQL.  However it isn't quite obvious that it shouldn't; in the
past there were more examples of this (e.g.
memory_region_do_invalidate_mmio_ptr).

It looks like the problem is generally in the lack of the nesting rule
for cpu-exclusive sections against BQL, so I thought I would try to
address that.  This patchset is my feeble attempt at this; I'm not sure
I fully comprehend all the consequences (rather, I'm sure I don't) hence
RFC.
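
To make the lock state in the diagram concrete, here is a minimal sketch
that collapses both sides into one thread and uses trylock so that it
returns instead of hanging.  The two pthread mutexes and the function
name are hypothetical stand-ins; the real exclusive section is built on
qemu_cpu_list_lock, pending_cpus and condition variables, not on a
single mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: not QEMU's real BQL or exclusive-section state. */
static pthread_mutex_t model_bql  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_excl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Reproduce the lock state from the diagram and report whether the second
 * acquisition on each side would block.  trylock is used instead of lock
 * so that the model returns rather than deadlocks.
 */
bool model_abba_is_stuck(void)
{
    /* vCPU side: start_exclusive() has succeeded. */
    pthread_mutex_lock(&model_excl);
    /* main side: BQL grabbed for cpu hot-add. */
    pthread_mutex_lock(&model_bql);

    /* vCPU side: async_synic_update() now wants the BQL. */
    bool vcpu_blocked = pthread_mutex_trylock(&model_bql) == EBUSY;
    /* main side: cpu_exec_start() now waits for the exclusive section. */
    bool main_blocked = pthread_mutex_trylock(&model_excl) == EBUSY;

    pthread_mutex_unlock(&model_bql);
    pthread_mutex_unlock(&model_excl);
    return vcpu_blocked && main_blocked;
}
```

Each side holds the resource the other one needs, so neither can make
progress; that is the situation a fixed nesting order rules out.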

Roman Kagan (2):
  cpus-common: nuke finish_safe_work
  cpus-common: assert BQL nesting within cpu-exclusive sections

 cpus-common.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

-- 
2.21.0

* [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work
  2019-05-23 10:54 [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Roman Kagan
@ 2019-05-23 10:54 ` Roman Kagan
  2019-06-24 10:58   ` Alex Bennée
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 2/2] cpus-common: assert BQL nesting within cpu-exclusive sections Roman Kagan
  2019-05-23 11:31 ` [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Alex Bennée
  2 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-05-23 10:54 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
with the following motivation:

  Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
  qemu_cpu_list_lock: together with a call to exclusive_idle (via
  cpu_exec_start/end) in cpu_list_add, this protects exclusive work
  against concurrent CPU addition and removal.

However, it seems to be redundant, because the cpu-exclusive
infrastructure provides sufficient protection against the newly added
CPU starting execution while the cpu-exclusive work is running, and the
aforementioned traversing of the cpu list is protected by
qemu_cpu_list_lock.

Besides, this appears to be the only place where the cpu-exclusive
section is entered with the BQL taken, which has been found to trigger
AB-BA deadlock as follows:

    vCPU thread                             main thread
    -----------                             -----------
async_safe_run_on_cpu(self,
                      async_synic_update)
...                                         [cpu hot-add]
process_queued_cpu_work()
  qemu_mutex_unlock_iothread()
                                            [grab BQL]
  start_exclusive()                         cpu_list_add()
  async_synic_update()                        finish_safe_work()
    qemu_mutex_lock_iothread()                  cpu_exec_start()

So remove it.  This paves the way to establishing a strict nesting rule
of never entering the exclusive section with the BQL taken.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
 cpus-common.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 3ca58c64e8..023cfebfa3 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
     return cpu_index;
 }
 
-static void finish_safe_work(CPUState *cpu)
-{
-    cpu_exec_start(cpu);
-    cpu_exec_end(cpu);
-}
-
 void cpu_list_add(CPUState *cpu)
 {
     qemu_mutex_lock(&qemu_cpu_list_lock);
@@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
     }
     QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
     qemu_mutex_unlock(&qemu_cpu_list_lock);
-
-    finish_safe_work(cpu);
 }
 
 void cpu_list_remove(CPUState *cpu)
-- 
2.21.0


* [Qemu-devel] [RFC PATCH 2/2] cpus-common: assert BQL nesting within cpu-exclusive sections
  2019-05-23 10:54 [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Roman Kagan
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work Roman Kagan
@ 2019-05-23 10:54 ` Roman Kagan
  2019-05-23 11:31 ` [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Alex Bennée
  2 siblings, 0 replies; 12+ messages in thread
From: Roman Kagan @ 2019-05-23 10:54 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel

Assert that the cpu-exclusive sections are never entered/left with the
BQL taken.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
 cpus-common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/cpus-common.c b/cpus-common.c
index 023cfebfa3..9aa75fe1ba 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -174,6 +174,7 @@ void start_exclusive(void)
     CPUState *other_cpu;
     int running_cpus;
 
+    assert(!qemu_mutex_iothread_locked());
     qemu_mutex_lock(&qemu_cpu_list_lock);
     exclusive_idle();
 
@@ -205,6 +206,7 @@ void start_exclusive(void)
 /* Finish an exclusive operation.  */
 void end_exclusive(void)
 {
+    assert(!qemu_mutex_iothread_locked());
     qemu_mutex_lock(&qemu_cpu_list_lock);
     atomic_set(&pending_cpus, 0);
     qemu_cond_broadcast(&exclusive_resume);
@@ -214,6 +216,7 @@ void end_exclusive(void)
 /* Wait for exclusive ops to finish, and begin cpu execution.  */
 void cpu_exec_start(CPUState *cpu)
 {
+    assert(!qemu_mutex_iothread_locked());
     atomic_set(&cpu->running, true);
 
     /* Write cpu->running before reading pending_cpus.  */
@@ -255,6 +258,7 @@ void cpu_exec_start(CPUState *cpu)
 /* Mark cpu as not executing, and release pending exclusive ops.  */
 void cpu_exec_end(CPUState *cpu)
 {
+    assert(!qemu_mutex_iothread_locked());
     atomic_set(&cpu->running, false);
 
     /* Write cpu->running before reading pending_cpus.  */
-- 
2.21.0
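
The rule these assertions enforce can be modelled in isolation.  The
following is a minimal sketch, assuming a per-thread flag in place of
qemu_mutex_iothread_locked(); every model_* name is hypothetical, and
the functions return false at the point where the patched code would
hit its assert() rather than aborting.

```c
#include <stdbool.h>

/* Hypothetical per-thread model state; QEMU tracks this differently. */
static _Thread_local bool model_bql_locked;
static _Thread_local bool model_in_exclusive;

void model_lock_bql(void)   { model_bql_locked = true; }
void model_unlock_bql(void) { model_bql_locked = false; }

bool model_in_exclusive_section(void) { return model_in_exclusive; }

/* Returns false where the patched start_exclusive() would assert. */
bool model_start_exclusive(void)
{
    if (model_bql_locked) {
        return false;           /* BQL held on entry: rule violated */
    }
    model_in_exclusive = true;
    return true;
}

/* Returns false where the patched end_exclusive() would assert. */
bool model_end_exclusive(void)
{
    if (model_bql_locked) {
        return false;           /* BQL held on exit: rule violated */
    }
    model_in_exclusive = false;
    return true;
}
```

In this model, taking the BQL inside the exclusive section is fine as
long as it is dropped again before the section ends; entering or
leaving the section with the BQL held is what gets rejected.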


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-05-23 10:54 [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Roman Kagan
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work Roman Kagan
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 2/2] cpus-common: assert BQL nesting within cpu-exclusive sections Roman Kagan
@ 2019-05-23 11:31 ` Alex Bennée
  2019-05-27 11:05   ` Roman Kagan
  2 siblings, 1 reply; 12+ messages in thread
From: Alex Bennée @ 2019-05-23 11:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, cota, richard.henderson


Roman Kagan <rkagan@virtuozzo.com> writes:

> I came across the following AB-BA deadlock:
>
>     vCPU thread                             main thread
>     -----------                             -----------
> async_safe_run_on_cpu(self,
>                       async_synic_update)
> ...                                         [cpu hot-add]
> process_queued_cpu_work()
>   qemu_mutex_unlock_iothread()
>                                             [grab BQL]
>   start_exclusive()                         cpu_list_add()
>   async_synic_update()                        finish_safe_work()
>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>
> ATM async_synic_update seems to be the only async safe work item that
> grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> past there were more examples of this (e.g.
> memory_region_do_invalidate_mmio_ptr).
>
> It looks like the problem is generally in the lack of the nesting rule
> for cpu-exclusive sections against BQL, so I thought I would try to
> address that.  This patchset is my feeble attempt at this; I'm not sure
> I fully comprehend all the consequences (rather, I'm sure I don't) hence
> RFC.

Hmm I think this is an area touched by:

  Subject: [PATCH v7 00/73] per-CPU locks
  Date: Mon,  4 Mar 2019 13:17:00 -0500
  Message-Id: <20190304181813.8075-1-cota@braap.org>

which has stalled on its path into the tree. Last time I checked it
explicitly handled the concept of work that needed the BQL and work that
didn't.

How do you trigger your deadlock? Just hot-plugging CPUs?

>
> Roman Kagan (2):
>   cpus-common: nuke finish_safe_work
>   cpus-common: assert BQL nesting within cpu-exclusive sections
>
>  cpus-common.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-05-23 11:31 ` [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Alex Bennée
@ 2019-05-27 11:05   ` Roman Kagan
  2019-06-06 13:22     ` Roman Kagan
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-05-27 11:05 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, cota, richard.henderson, qemu-devel

On Thu, May 23, 2019 at 12:31:16PM +0100, Alex Bennée wrote:
> 
> Roman Kagan <rkagan@virtuozzo.com> writes:
> 
> > I came across the following AB-BA deadlock:
> >
> >     vCPU thread                             main thread
> >     -----------                             -----------
> > async_safe_run_on_cpu(self,
> >                       async_synic_update)
> > ...                                         [cpu hot-add]
> > process_queued_cpu_work()
> >   qemu_mutex_unlock_iothread()
> >                                             [grab BQL]
> >   start_exclusive()                         cpu_list_add()
> >   async_synic_update()                        finish_safe_work()
> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> >
> > ATM async_synic_update seems to be the only async safe work item that
> > grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> > past there were more examples of this (e.g.
> > memory_region_do_invalidate_mmio_ptr).
> >
> > It looks like the problem is generally in the lack of the nesting rule
> > for cpu-exclusive sections against BQL, so I thought I would try to
> > address that.  This patchset is my feeble attempt at this; I'm not sure
> > I fully comprehend all the consequences (rather, I'm sure I don't) hence
> > RFC.
> 
> Hmm I think this is an area touched by:
> 
>   Subject: [PATCH v7 00/73] per-CPU locks
>   Date: Mon,  4 Mar 2019 13:17:00 -0500
>   Message-Id: <20190304181813.8075-1-cota@braap.org>
> 
> which has stalled on its path into the tree. Last time I checked it
> explicitly handled the concept of work that needed the BQL and work that
> didn't.

I'm still trying to get my head around that patchset, but it looks like
it changes nothing with regard to cpu-exclusive sections and safe work,
so it doesn't make the problem go away.

> How do you trigger your deadlock? Just hot-plugging CPUs?

Yes.  The window is pretty narrow, so I only saw it once, although this
test (where the VMs are started and stopped and the CPUs are plugged in
and out) has been in our test loop for quite a while (probably 2+ years).

Roman.


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-05-27 11:05   ` Roman Kagan
@ 2019-06-06 13:22     ` Roman Kagan
  2019-06-21 12:49       ` Roman Kagan
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-06-06 13:22 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, cota, richard.henderson, qemu-devel

On Mon, May 27, 2019 at 11:05:38AM +0000, Roman Kagan wrote:
> On Thu, May 23, 2019 at 12:31:16PM +0100, Alex Bennée wrote:
> > 
> > Roman Kagan <rkagan@virtuozzo.com> writes:
> > 
> > > I came across the following AB-BA deadlock:
> > >
> > >     vCPU thread                             main thread
> > >     -----------                             -----------
> > > async_safe_run_on_cpu(self,
> > >                       async_synic_update)
> > > ...                                         [cpu hot-add]
> > > process_queued_cpu_work()
> > >   qemu_mutex_unlock_iothread()
> > >                                             [grab BQL]
> > >   start_exclusive()                         cpu_list_add()
> > >   async_synic_update()                        finish_safe_work()
> > >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> > >
> > > ATM async_synic_update seems to be the only async safe work item that
> > > grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> > > past there were more examples of this (e.g.
> > > memory_region_do_invalidate_mmio_ptr).
> > >
> > > It looks like the problem is generally in the lack of the nesting rule
> > > for cpu-exclusive sections against BQL, so I thought I would try to
> > > address that.  This patchset is my feeble attempt at this; I'm not sure
> > > I fully comprehend all the consequences (rather, I'm sure I don't) hence
> > > RFC.
> > 
> > Hmm I think this is an area touched by:
> > 
> >   Subject: [PATCH v7 00/73] per-CPU locks
> >   Date: Mon,  4 Mar 2019 13:17:00 -0500
> >   Message-Id: <20190304181813.8075-1-cota@braap.org>
> > 
> > which has stalled on its path into the tree. Last time I checked it
> > explicitly handled the concept of work that needed the BQL and work that
> > didn't.
> 
> I'm still trying to get my head around that patchset, but it looks like
> it changes nothing with regard to cpu-exclusive sections and safe work,
> so it doesn't make the problem go away.
> 
> > How do you trigger your deadlock? Just hot-plugging CPUs?
> 
> Yes.  The window is pretty narrow, so I only saw it once, although this
> test (where the VMs are started and stopped and the CPUs are plugged in
> and out) has been in our test loop for quite a while (probably 2+ years).
> 
> Roman.

ping?


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-06-06 13:22     ` Roman Kagan
@ 2019-06-21 12:49       ` Roman Kagan
  2019-08-05 12:47         ` Roman Kagan
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-06-21 12:49 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, cota, richard.henderson, qemu-devel

On Thu, Jun 06, 2019 at 01:22:33PM +0000, Roman Kagan wrote:
> On Mon, May 27, 2019 at 11:05:38AM +0000, Roman Kagan wrote:
> > On Thu, May 23, 2019 at 12:31:16PM +0100, Alex Bennée wrote:
> > > 
> > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > 
> > > > I came across the following AB-BA deadlock:
> > > >
> > > >     vCPU thread                             main thread
> > > >     -----------                             -----------
> > > > async_safe_run_on_cpu(self,
> > > >                       async_synic_update)
> > > > ...                                         [cpu hot-add]
> > > > process_queued_cpu_work()
> > > >   qemu_mutex_unlock_iothread()
> > > >                                             [grab BQL]
> > > >   start_exclusive()                         cpu_list_add()
> > > >   async_synic_update()                        finish_safe_work()
> > > >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> > > >
> > > > ATM async_synic_update seems to be the only async safe work item that
> > > > grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> > > > past there were more examples of this (e.g.
> > > > memory_region_do_invalidate_mmio_ptr).
> > > >
> > > > It looks like the problem is generally in the lack of the nesting rule
> > > > for cpu-exclusive sections against BQL, so I thought I would try to
> > > > address that.  This patchset is my feeble attempt at this; I'm not sure
> > > > I fully comprehend all the consequences (rather, I'm sure I don't) hence
> > > > RFC.
> > > 
> > > Hmm I think this is an area touched by:
> > > 
> > >   Subject: [PATCH v7 00/73] per-CPU locks
> > >   Date: Mon,  4 Mar 2019 13:17:00 -0500
> > >   Message-Id: <20190304181813.8075-1-cota@braap.org>
> > > 
> > > which has stalled on its path into the tree. Last time I checked it
> > > explicitly handled the concept of work that needed the BQL and work that
> > > didn't.
> > 
> > I'm still trying to get my head around that patchset, but it looks like
> > it changes nothing with regard to cpu-exclusive sections and safe work,
> > so it doesn't make the problem go away.
> > 
> > > How do you trigger your deadlock? Just hot-plugging CPUs?
> > 
> > Yes.  The window is pretty narrow, so I only saw it once, although this
> > test (where the VMs are started and stopped and the CPUs are plugged in
> > and out) has been in our test loop for quite a while (probably 2+ years).
> > 
> > Roman.
> 
> ping?

ping?


* Re: [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work
  2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work Roman Kagan
@ 2019-06-24 10:58   ` Alex Bennée
  2019-06-24 11:50     ` Roman Kagan
  0 siblings, 1 reply; 12+ messages in thread
From: Alex Bennée @ 2019-06-24 10:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini


Roman Kagan <rkagan@virtuozzo.com> writes:

> It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> with the following motivation:

I can't find this commit in my tree.

>
>   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
>   qemu_cpu_list_lock: together with a call to exclusive_idle (via
>   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
>   against concurrent CPU addition and removal.
>
> However, it seems to be redundant, because the cpu-exclusive
> infrastructure provides sufficient protection against the newly added
> CPU starting execution while the cpu-exclusive work is running, and the
> aforementioned traversing of the cpu list is protected by
> qemu_cpu_list_lock.
>
> Besides, this appears to be the only place where the cpu-exclusive
> section is entered with the BQL taken, which has been found to trigger
> AB-BA deadlock as follows:
>
>     vCPU thread                             main thread
>     -----------                             -----------
> async_safe_run_on_cpu(self,
>                       async_synic_update)
> ...                                         [cpu hot-add]
> process_queued_cpu_work()
>   qemu_mutex_unlock_iothread()
>                                             [grab BQL]
>   start_exclusive()                         cpu_list_add()
>   async_synic_update()                        finish_safe_work()
>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>
> So remove it.  This paves the way to establishing a strict nesting rule
> of never entering the exclusive section with the BQL taken.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
>  cpus-common.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/cpus-common.c b/cpus-common.c
> index 3ca58c64e8..023cfebfa3 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>      return cpu_index;
>  }
>
> -static void finish_safe_work(CPUState *cpu)
> -{
> -    cpu_exec_start(cpu);
> -    cpu_exec_end(cpu);
> -}
> -

This makes sense to me intellectually but I'm worried I've missed the
reason for it being introduced. Without finish_safe_work we have to wait
for the actual vCPU thread function to acquire and release the BQL and
enter its first cpu_exec_start().

I guess I'd be happier if we had a hotplug test where we could stress
test the operation and be sure we've not just moved the deadlock
somewhere else.

>  void cpu_list_add(CPUState *cpu)
>  {
>      qemu_mutex_lock(&qemu_cpu_list_lock);
> @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
>      }
>      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
>      qemu_mutex_unlock(&qemu_cpu_list_lock);
> -
> -    finish_safe_work(cpu);
>  }
>
>  void cpu_list_remove(CPUState *cpu)


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work
  2019-06-24 10:58   ` Alex Bennée
@ 2019-06-24 11:50     ` Roman Kagan
  2019-06-24 12:43       ` Alex Bennée
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-06-24 11:50 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, qemu-devel

On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
> Roman Kagan <rkagan@virtuozzo.com> writes:
> 
> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> > with the following motivation:
> 
> I can't find this commit in my tree.

OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
sorry.

> 
> >
> >   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
> >   qemu_cpu_list_lock: together with a call to exclusive_idle (via
> >   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
> >   against concurrent CPU addition and removal.
> >
> > However, it seems to be redundant, because the cpu-exclusive
> > infrastructure provides sufficient protection against the newly added
> > CPU starting execution while the cpu-exclusive work is running, and the
> > aforementioned traversing of the cpu list is protected by
> > qemu_cpu_list_lock.
> >
> > Besides, this appears to be the only place where the cpu-exclusive
> > section is entered with the BQL taken, which has been found to trigger
> > AB-BA deadlock as follows:
> >
> >     vCPU thread                             main thread
> >     -----------                             -----------
> > async_safe_run_on_cpu(self,
> >                       async_synic_update)
> > ...                                         [cpu hot-add]
> > process_queued_cpu_work()
> >   qemu_mutex_unlock_iothread()
> >                                             [grab BQL]
> >   start_exclusive()                         cpu_list_add()
> >   async_synic_update()                        finish_safe_work()
> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> >
> > So remove it.  This paves the way to establishing a strict nesting rule
> > of never entering the exclusive section with the BQL taken.
> >
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> >  cpus-common.c | 8 --------
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 3ca58c64e8..023cfebfa3 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
> >      return cpu_index;
> >  }
> >
> > -static void finish_safe_work(CPUState *cpu)
> > -{
> > -    cpu_exec_start(cpu);
> > -    cpu_exec_end(cpu);
> > -}
> > -
> 
> This makes sense to me intellectually but I'm worried I've missed the
> reason for it being introduced. Without finish_safe_work we have to wait
> for the actual vCPU thread function to acquire and release the BQL and
> enter its first cpu_exec_start().
> 
> I guess I'd be happier if we had a hotplug test where we could stress
> test the operation and be sure we've not just moved the deadlock
> somewhere else.

Me too.  Unfortunately I haven't managed to come up with a way to do
this test.  One of the race participants, the safe work in a vCPU
thread, happens in response to an MSR write by the guest.  ATM there's
no way to do it without an actual guest running.  I'll see whether I
can make a VM test for it, using a Linux guest and its /dev/cpu/*/msr.

Thanks,
Roman.
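
For reference, the guest-side half of such a test could look roughly
like the sketch below.  It assumes the Linux msr driver's usual
interface, where an 8-byte write at file offset N writes MSR number N;
the helper name msr_write is hypothetical, and which MSR index to use
to provoke the safe work is deliberately left to the caller.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `value` to MSR `index` through an msr-style device node such as
 * /dev/cpu/0/msr (needs the msr driver and sufficient privileges).
 * Returns 0 on success, -1 on any failure.
 */
int msr_write(const char *path, uint32_t index, uint64_t value)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        return -1;
    }

    /* The file offset selects the MSR; the payload is the 64-bit value. */
    ssize_t n = pwrite(fd, &value, sizeof(value), (off_t)index);

    close(fd);
    return n == (ssize_t)sizeof(value) ? 0 : -1;
}
```

A test would call this in a loop inside the guest while CPUs are being
hot-plugged from the host side, so that the two halves of the race get
a chance to overlap.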

> 
> >  void cpu_list_add(CPUState *cpu)
> >  {
> >      qemu_mutex_lock(&qemu_cpu_list_lock);
> > @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
> >      }
> >      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
> >      qemu_mutex_unlock(&qemu_cpu_list_lock);
> > -
> > -    finish_safe_work(cpu);
> >  }
> >
> >  void cpu_list_remove(CPUState *cpu)
> 
> 
> --
> Alex Bennée
> 


* Re: [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work
  2019-06-24 11:50     ` Roman Kagan
@ 2019-06-24 12:43       ` Alex Bennée
  0 siblings, 0 replies; 12+ messages in thread
From: Alex Bennée @ 2019-06-24 12:43 UTC (permalink / raw)
  To: Roman Kagan; +Cc: Paolo Bonzini, qemu-devel


Roman Kagan <rkagan@virtuozzo.com> writes:

> On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
>> Roman Kagan <rkagan@virtuozzo.com> writes:
>>
>> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
>> > with the following motivation:
>>
>> I can't find this commit in my tree.
>
> OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
> sorry.
>
>>
>> >
>> >   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
>> >   qemu_cpu_list_lock: together with a call to exclusive_idle (via
>> >   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
>> >   against concurrent CPU addition and removal.
>> >
>> > However, it seems to be redundant, because the cpu-exclusive
>> > infrastructure provides sufficient protection against the newly added
>> > CPU starting execution while the cpu-exclusive work is running, and the
>> > aforementioned traversing of the cpu list is protected by
>> > qemu_cpu_list_lock.
>> >
>> > Besides, this appears to be the only place where the cpu-exclusive
>> > section is entered with the BQL taken, which has been found to trigger
>> > AB-BA deadlock as follows:
>> >
>> >     vCPU thread                             main thread
>> >     -----------                             -----------
>> > async_safe_run_on_cpu(self,
>> >                       async_synic_update)
>> > ...                                         [cpu hot-add]
>> > process_queued_cpu_work()
>> >   qemu_mutex_unlock_iothread()
>> >                                             [grab BQL]
>> >   start_exclusive()                         cpu_list_add()
>> >   async_synic_update()                        finish_safe_work()
>> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
>> >
>> > So remove it.  This paves the way to establishing a strict nesting rule
>> > of never entering the exclusive section with the BQL taken.
>> >
>> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
>> > ---
>> >  cpus-common.c | 8 --------
>> >  1 file changed, 8 deletions(-)
>> >
>> > diff --git a/cpus-common.c b/cpus-common.c
>> > index 3ca58c64e8..023cfebfa3 100644
>> > --- a/cpus-common.c
>> > +++ b/cpus-common.c
>> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>> >      return cpu_index;
>> >  }
>> >
>> > -static void finish_safe_work(CPUState *cpu)
>> > -{
>> > -    cpu_exec_start(cpu);
>> > -    cpu_exec_end(cpu);
>> > -}
>> > -
>>
>> This makes sense to me intellectually but I'm worried I've missed the
>> reason for it being introduced. Without finish_safe_work we have to wait
>> for the actual vCPU thread function to acquire and release the BQL and
>> enter its first cpu_exec_start().
>>
>> I guess I'd be happier if we had a hotplug test where we could stress
>> test the operation and be sure we've not just moved the deadlock
>> somewhere else.
>
> Me too.  Unfortunately I haven't managed to come up with a way to do
> this test.  One of the race participants, the safe work in a vCPU
> thread, happens in response to an MSR write by the guest.  ATM there's
> no way to do it without an actual guest running.  I'll see whether I
> can make a VM test for it, using a Linux guest and its /dev/cpu/*/msr.

Depending on how much machinery is required to trigger this, we could
add a system mode test. However, there isn't much point if it requires
duplicating the entire guest hotplug stack. It may be easier to trigger
on ARM - the PSCI sequence isn't overly complicated to deal with, but I
don't know what the impact of MSIs is.


--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-06-21 12:49       ` Roman Kagan
@ 2019-08-05 12:47         ` Roman Kagan
  2019-08-05 15:56           ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Roman Kagan @ 2019-08-05 12:47 UTC (permalink / raw)
  To: Alex Bennée; +Cc: Paolo Bonzini, cota, richard.henderson, qemu-devel

On Fri, Jun 21, 2019 at 12:49:07PM +0000, Roman Kagan wrote:
> On Thu, Jun 06, 2019 at 01:22:33PM +0000, Roman Kagan wrote:
> > On Mon, May 27, 2019 at 11:05:38AM +0000, Roman Kagan wrote:
> > > On Thu, May 23, 2019 at 12:31:16PM +0100, Alex Bennée wrote:
> > > > 
> > > > Roman Kagan <rkagan@virtuozzo.com> writes:
> > > > 
> > > > > I came across the following AB-BA deadlock:
> > > > >
> > > > >     vCPU thread                             main thread
> > > > >     -----------                             -----------
> > > > > async_safe_run_on_cpu(self,
> > > > >                       async_synic_update)
> > > > > ...                                         [cpu hot-add]
> > > > > process_queued_cpu_work()
> > > > >   qemu_mutex_unlock_iothread()
> > > > >                                             [grab BQL]
> > > > >   start_exclusive()                         cpu_list_add()
> > > > >   async_synic_update()                        finish_safe_work()
> > > > >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> > > > >
> > > > > ATM async_synic_update seems to be the only async safe work item that
> > > > > grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> > > > > past there were more examples of this (e.g.
> > > > > memory_region_do_invalidate_mmio_ptr).
> > > > >
> > > > > It looks like the problem is generally in the lack of the nesting rule
> > > > > for cpu-exclusive sections against BQL, so I thought I would try to
> > > > > address that.  This patchset is my feeble attempt at this; I'm not sure
> > > > > I fully comprehend all the consequences (rather, I'm sure I don't) hence
> > > > > RFC.
> > > > 
> > > > Hmm I think this is an area touched by:
> > > > 
> > > >   Subject: [PATCH v7 00/73] per-CPU locks
> > > >   Date: Mon,  4 Mar 2019 13:17:00 -0500
> > > >   Message-Id: <20190304181813.8075-1-cota@braap.org>
> > > > 
> > > > which has stalled on its path into the tree. Last time I checked it
> > > > explicitly handled the concept of work that needed the BQL and work that
> > > > didn't.
> > > 
> > > I'm still trying to get my head around that patchset, but it looks like
> > > it changes nothing with regard to cpu-exclusive sections and safe work,
> > > so it doesn't make the problem go away.
> > > 
> > > > How do you trigger your deadlock? Just hot-plugging CPUs?
> > > 
> > > Yes.  The window is pretty narrow, so I only saw it once, although this
> > > test (where the VMs are started and stopped and the CPUs are plugged in
> > > and out) has been in our test loop for quite a while (probably 2+ years).
> > > 
> > > Roman.
> > 
> > ping?
> 
> ping?

ping?


* Re: [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive
  2019-08-05 12:47         ` Roman Kagan
@ 2019-08-05 15:56           ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2019-08-05 15:56 UTC (permalink / raw)
  To: Roman Kagan, Alex Bennée, Paolo Bonzini, cota,
	richard.henderson, qemu-devel

On 05/08/19 14:47, Roman Kagan wrote:
> On Fri, Jun 21, 2019 at 12:49:07PM +0000, Roman Kagan wrote:
>> On Thu, Jun 06, 2019 at 01:22:33PM +0000, Roman Kagan wrote:
>>> On Mon, May 27, 2019 at 11:05:38AM +0000, Roman Kagan wrote:
>>>> On Thu, May 23, 2019 at 12:31:16PM +0100, Alex Bennée wrote:
>>>>>
>>>>> Roman Kagan <rkagan@virtuozzo.com> writes:
>>>>>
>>>>>> I came across the following AB-BA deadlock:
>>>>>>
>>>>>>     vCPU thread                             main thread
>>>>>>     -----------                             -----------
>>>>>> async_safe_run_on_cpu(self,
>>>>>>                       async_synic_update)
>>>>>> ...                                         [cpu hot-add]
>>>>>> process_queued_cpu_work()
>>>>>>   qemu_mutex_unlock_iothread()
>>>>>>                                             [grab BQL]
>>>>>>   start_exclusive()                         cpu_list_add()
>>>>>>   async_synic_update()                        finish_safe_work()
>>>>>>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>>>>>>
>>>>>> ATM async_synic_update seems to be the only async safe work item that
>>>>>> grabs BQL.  However it isn't quite obvious that it shouldn't; in the
>>>>>> past there were more examples of this (e.g.
>>>>>> memory_region_do_invalidate_mmio_ptr).
>>>>>>
>>>>>> It looks like the problem is generally in the lack of the nesting rule
>>>>>> for cpu-exclusive sections against BQL, so I thought I would try to
>>>>>> address that.  This patchset is my feeble attempt at this; I'm not sure
>>>>>> I fully comprehend all the consequences (rather, I'm sure I don't) hence
>>>>>> RFC.
>>>>>
>>>>> Hmm I think this is an area touched by:
>>>>>
>>>>>   Subject: [PATCH v7 00/73] per-CPU locks
>>>>>   Date: Mon,  4 Mar 2019 13:17:00 -0500
>>>>>   Message-Id: <20190304181813.8075-1-cota@braap.org>
>>>>>
>>>>> which has stalled on its path into the tree. Last time I checked it
>>>>> explicitly handled the concept of work that needed the BQL and work that
>>>>> didn't.
>>>>
>>>> I'm still trying to get my head around that patchset, but it looks like
>>>> it changes nothing with regard to cpu-exclusive sections and safe work,
>>>> so it doesn't make the problem go away.
>>>>
>>>>> How do you trigger your deadlock? Just hot-plugging CPUs?
>>>>
>>>> Yes.  The window is pretty narrow, so I only saw it once, although this
>>>> test (where the VMs are started and stopped and the CPUs are plugged in
>>>> and out) has been in our test loop for quite a while (probably 2+ years).
>>>>
>>>> Roman.
>>>
>>> ping?
>>
>> ping?
> 
> ping?
> 

Queued for 4.2.

Paolo


Thread overview: 12+ messages
2019-05-23 10:54 [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Roman Kagan
2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 1/2] cpus-common: nuke finish_safe_work Roman Kagan
2019-06-24 10:58   ` Alex Bennée
2019-06-24 11:50     ` Roman Kagan
2019-06-24 12:43       ` Alex Bennée
2019-05-23 10:54 ` [Qemu-devel] [RFC PATCH 2/2] cpus-common: assert BQL nesting within cpu-exclusive sections Roman Kagan
2019-05-23 11:31 ` [Qemu-devel] [RFC PATCH 0/2] establish nesting rule of BQL vs cpu-exclusive Alex Bennée
2019-05-27 11:05   ` Roman Kagan
2019-06-06 13:22     ` Roman Kagan
2019-06-21 12:49       ` Roman Kagan
2019-08-05 12:47         ` Roman Kagan
2019-08-05 15:56           ` Paolo Bonzini