All of lore.kernel.org
 help / color / mirror / Atom feed
* pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
@ 2019-03-27 12:36 Sebastian Andrzej Siewior
  2019-03-27 16:37 ` [Qemu-ppc] " Cédric Le Goater
  0 siblings, 1 reply; 23+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-03-27 12:36 UTC (permalink / raw)
  To: linuxppc-dev, qemu-ppc; +Cc: tglx, David Gibson, Paul Mackerras

With qemu-system-ppc64le -machine pseries -smp 4 I get:

|#  chrt 1 hackbench
|Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
|Each sender will pass 100 messages of 100 bytes
| Oops: Exception in kernel mode, sig: 4 [#1]
| LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
| Modules linked in:
| CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
| NIP:  c000000000046978 LR: c000000000046a38 CTR: c0000000000b0150
| REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
| MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR: 42000874  XER: 00000000
| CFAR: c000000000046a34 IRQMASK: 1
| GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00 0000000028000000
…
| NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
| LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
| Call Trace:
| [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0 (unreliable)
| [c0000001fffebba0] [c0000000000b0170] smp_pseries_cause_ipi+0x20/0x70
| [c0000001fffebbd0] [c00000000004b02c] arch_send_call_function_single_ipi+0x8c/0xa0
| [c0000001fffebbf0] [c0000000001de600] irq_work_queue_on+0xe0/0x130
| [c0000001fffebc30] [c0000000001340c8] rto_push_irq_work_func+0xc8/0x120
…
| Instruction dump:
| 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000 3929ffff
| 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020 3c4c00a2 38425080
| ---[ end trace eb842b544538cbdf ]---

and I was wondering whether this is a qemu bug or the kernel is using an
opcode it should rather not. If I skip doorbell_try_core_ipi() in
smp_pseries_cause_ipi() then there is no crash. The comment says "POWER9
should not use this handler" so…

(This is QEMU emulator version 3.1.0 (Debian 1:3.1+dfsg-6))

Sebastian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-27 12:36 pseries on qemu-system-ppc64le crashes in doorbell_core_ipi() Sebastian Andrzej Siewior
@ 2019-03-27 16:37 ` Cédric Le Goater
  2019-03-27 16:51   ` Cédric Le Goater
  0 siblings, 1 reply; 23+ messages in thread
From: Cédric Le Goater @ 2019-03-27 16:37 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linuxppc-dev, qemu-ppc
  Cc: tglx, Paul Mackerras, David Gibson

On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
> With qemu-system-ppc64le -machine pseries -smp 4 I get:
> 
> |#  chrt 1 hackbench
> |Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
> |Each sender will pass 100 messages of 100 bytes
> | Oops: Exception in kernel mode, sig: 4 [#1]
> | LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
> | Modules linked in:
> | CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
> | NIP:  c000000000046978 LR: c000000000046a38 CTR: c0000000000b0150
> | REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
> | MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR: 42000874  XER: 00000000
> | CFAR: c000000000046a34 IRQMASK: 1
> | GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00 0000000028000000
> …
> | NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
> | LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
> | Call Trace:
> | [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0 (unreliable)
> | [c0000001fffebba0] [c0000000000b0170] smp_pseries_cause_ipi+0x20/0x70
> | [c0000001fffebbd0] [c00000000004b02c] arch_send_call_function_single_ipi+0x8c/0xa0
> | [c0000001fffebbf0] [c0000000001de600] irq_work_queue_on+0xe0/0x130
> | [c0000001fffebc30] [c0000000001340c8] rto_push_irq_work_func+0xc8/0x120
> …
> | Instruction dump:
> | 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000 3929ffff
> | 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020 3c4c00a2 38425080
> | ---[ end trace eb842b544538cbdf ]---
> 
> and I was wondering whether this is a qemu bug or the kernel is using an
> opcode it should rather not. If I skip doorbell_try_core_ipi() in
> smp_pseries_cause_ipi() then there is no crash. The comment says "POWER9
> should not use this handler" so…

I would say Linux is using a msgsndp instruction which is not implemented
in QEMU TCG. But why have we started using dbells in Linux ? 

C. 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-27 16:37 ` [Qemu-ppc] " Cédric Le Goater
@ 2019-03-27 16:51   ` Cédric Le Goater
  2019-03-29  5:20     ` Suraj Jitindar Singh
  0 siblings, 1 reply; 23+ messages in thread
From: Cédric Le Goater @ 2019-03-27 16:51 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linuxppc-dev, qemu-ppc
  Cc: tglx, Paul Mackerras, Nicholas Piggin, David Gibson

On 3/27/19 5:37 PM, Cédric Le Goater wrote:
> On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
>> With qemu-system-ppc64le -machine pseries -smp 4 I get:
>>
>> |#  chrt 1 hackbench
>> |Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
>> |Each sender will pass 100 messages of 100 bytes
>> | Oops: Exception in kernel mode, sig: 4 [#1]
>> | LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
>> | Modules linked in:
>> | CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
>> | NIP:  c000000000046978 LR: c000000000046a38 CTR: c0000000000b0150
>> | REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
>> | MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR: 42000874  XER: 00000000
>> | CFAR: c000000000046a34 IRQMASK: 1
>> | GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00 0000000028000000
>> …
>> | NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
>> | LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
>> | Call Trace:
>> | [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0 (unreliable)
>> | [c0000001fffebba0] [c0000000000b0170] smp_pseries_cause_ipi+0x20/0x70
>> | [c0000001fffebbd0] [c00000000004b02c] arch_send_call_function_single_ipi+0x8c/0xa0
>> | [c0000001fffebbf0] [c0000000001de600] irq_work_queue_on+0xe0/0x130
>> | [c0000001fffebc30] [c0000000001340c8] rto_push_irq_work_func+0xc8/0x120
>> …
>> | Instruction dump:
>> | 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000 3929ffff
>> | 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020 3c4c00a2 38425080
>> | ---[ end trace eb842b544538cbdf ]---
>>
>> and I was wondering whether this is a qemu bug or the kernel is using an
>> opcode it should rather not. If I skip doorbell_try_core_ipi() in
>> smp_pseries_cause_ipi() then there is no crash. The comment says "POWER9
>> should not use this handler" so…
> 
> I would say Linux is using a msgsndp instruction which is not implemented
> in QEMU TCG. But why have we started using dbells in Linux ? 

ah. It seems arch_local_irq_restore() / replay_interrupt() generated
some interrupt.

C.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-27 16:51   ` Cédric Le Goater
@ 2019-03-29  5:20     ` Suraj Jitindar Singh
  2019-03-29  8:32       ` Sebastian Andrzej Siewior
  2019-03-29  9:13       ` Nicholas Piggin
  0 siblings, 2 replies; 23+ messages in thread
From: Suraj Jitindar Singh @ 2019-03-29  5:20 UTC (permalink / raw)
  To: Cédric Le Goater, Sebastian Andrzej Siewior, linuxppc-dev, qemu-ppc
  Cc: tglx, Paul Mackerras, Nicholas Piggin, David Gibson

On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:
> On 3/27/19 5:37 PM, Cédric Le Goater wrote:
> > On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
> > > With qemu-system-ppc64le -machine pseries -smp 4 I get:
> > > 
> > > > #  chrt 1 hackbench
> > > > Running in process mode with 10 groups using 40 file
> > > > descriptors each (== 400 tasks)
> > > > Each sender will pass 100 messages of 100 bytes
> > > > Oops: Exception in kernel mode, sig: 4 [#1]
> > > > LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
> > > > Modules linked in:
> > > > CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
> > > > NIP:  c000000000046978 LR: c000000000046a38 CTR:
> > > > c0000000000b0150
> > > > REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
> > > > MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR:
> > > > 42000874  XER: 00000000
> > > > CFAR: c000000000046a34 IRQMASK: 1
> > > > GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00
> > > > 0000000028000000
> > > 
> > > …
> > > > NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
> > > > LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
> > > > Call Trace:
> > > > [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0
> > > > (unreliable)
> > > > [c0000001fffebba0] [c0000000000b0170]
> > > > smp_pseries_cause_ipi+0x20/0x70
> > > > [c0000001fffebbd0] [c00000000004b02c]
> > > > arch_send_call_function_single_ipi+0x8c/0xa0
> > > > [c0000001fffebbf0] [c0000000001de600]
> > > > irq_work_queue_on+0xe0/0x130
> > > > [c0000001fffebc30] [c0000000001340c8]
> > > > rto_push_irq_work_func+0xc8/0x120
> > > 
> > > …
> > > > Instruction dump:
> > > > 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000
> > > > 3929ffff
> > > > 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020
> > > > 3c4c00a2 38425080
> > > > ---[ end trace eb842b544538cbdf ]---
> > > 
> > > and I was wondering whether this is a qemu bug or the kernel is
> > > using an
> > > opcode it should rather not. If I skip doorbell_try_core_ipi() in
> > > smp_pseries_cause_ipi() then there is no crash. The comment says
> > > "POWER9
> > > should not use this handler" so…
> > 
> > I would say Linux is using a msgsndp instruction which is not
> > implemented
> > in QEMU TCG. But why have we started using dbells in Linux ? 

Yeah the kernel must have used msgsndp which isn't implemented for TCG
yet. We use doorbells in linux but only for threads which are on the
same core.
And when I try to construct a situation with more than 1 thread per
core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1
thread/core on a pseries machine".

So I wonder why the guest thinks it can use msgsndp...

> 
> ah. It seems arch_local_irq_restore() / replay_interrupt() generated
> some interrupt.
> 
> C.
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-29  5:20     ` Suraj Jitindar Singh
@ 2019-03-29  8:32       ` Sebastian Andrzej Siewior
  2019-03-29  9:13       ` Nicholas Piggin
  1 sibling, 0 replies; 23+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-03-29  8:32 UTC (permalink / raw)
  To: Suraj Jitindar Singh
  Cc: Nicholas Piggin, qemu-ppc, Cédric Le Goater, Paul Mackerras,
	tglx, linuxppc-dev, David Gibson

On 2019-03-29 16:20:51 [+1100], Suraj Jitindar Singh wrote:
> 
> Yeah the kernel must have used msgsndp which isn't implemented for TCG
> yet. We use doorbells in linux but only for threads which are on the
> same core.

It is msgsndp as per instruction decode.

> And when I try to construct a situation with more than 1 thread per
> core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1
> thread/core on a pseries machine".
>
> So I wonder why the guest thinks it can use msgsndp...

I see
|# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
|0
|1
|2
|3

so this works all well. The problem is when a CPU sends an IPI to itself
then linux's doorbell_try_core_ipi() considers this as a valid sibling
and here we have the msgsndp.

Sebastian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-29  5:20     ` Suraj Jitindar Singh
  2019-03-29  8:32       ` Sebastian Andrzej Siewior
@ 2019-03-29  9:13       ` Nicholas Piggin
  2019-03-29 15:31         ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2019-03-29  9:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Cédric Le Goater, linuxppc-dev,
	qemu-ppc, Suraj Jitindar Singh
  Cc: tglx, Paul Mackerras, Steven Rostedt, David Gibson

Suraj Jitindar Singh's on March 29, 2019 3:20 pm:
> On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:
>> On 3/27/19 5:37 PM, Cédric Le Goater wrote:
>> > On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:
>> > > With qemu-system-ppc64le -machine pseries -smp 4 I get:
>> > > 
>> > > > #  chrt 1 hackbench
>> > > > Running in process mode with 10 groups using 40 file
>> > > > descriptors each (== 400 tasks)
>> > > > Each sender will pass 100 messages of 100 bytes
>> > > > Oops: Exception in kernel mode, sig: 4 [#1]
>> > > > LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
>> > > > Modules linked in:
>> > > > CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
>> > > > NIP:  c000000000046978 LR: c000000000046a38 CTR:
>> > > > c0000000000b0150
>> > > > REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
>> > > > MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR:
>> > > > 42000874  XER: 00000000
>> > > > CFAR: c000000000046a34 IRQMASK: 1
>> > > > GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00
>> > > > 0000000028000000
>> > > 
>> > > …
>> > > > NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
>> > > > LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
>> > > > Call Trace:
>> > > > [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0
>> > > > (unreliable)
>> > > > [c0000001fffebba0] [c0000000000b0170]
>> > > > smp_pseries_cause_ipi+0x20/0x70
>> > > > [c0000001fffebbd0] [c00000000004b02c]
>> > > > arch_send_call_function_single_ipi+0x8c/0xa0
>> > > > [c0000001fffebbf0] [c0000000001de600]
>> > > > irq_work_queue_on+0xe0/0x130
>> > > > [c0000001fffebc30] [c0000000001340c8]
>> > > > rto_push_irq_work_func+0xc8/0x120
>> > > 
>> > > …
>> > > > Instruction dump:
>> > > > 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000
>> > > > 3929ffff
>> > > > 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020
>> > > > 3c4c00a2 38425080
>> > > > ---[ end trace eb842b544538cbdf ]---

This is unusual and causing powerpc code to crash because the rt
scheduler is telling irq_work_queue_on to queue work on this CPU.
Is that something allowed? There's no warnings in there but it must
be a rarely tested path, would it be better to ban it?

Steven is this queue_work_on to self by design?

>> > > 
>> > > and I was wondering whether this is a qemu bug or the kernel is
>> > > using an
>> > > opcode it should rather not. If I skip doorbell_try_core_ipi() in
>> > > smp_pseries_cause_ipi() then there is no crash. The comment says
>> > > "POWER9
>> > > should not use this handler" so…
>> > 
>> > I would say Linux is using a msgsndp instruction which is not
>> > implemented
>> > in QEMU TCG. But why have we started using dbells in Linux ? 
> 
> Yeah the kernel must have used msgsndp which isn't implemented for TCG
> yet. We use doorbells in linux but only for threads which are on the
> same core.
> And when I try to construct a situation with more than 1 thread per
> core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1
> thread/core on a pseries machine".
> 
> So I wonder why the guest thinks it can use msgsndp...

IPI to self evidently. Under TCG it really should implement the
instruction or remove the DBELL feature.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-29  9:13       ` Nicholas Piggin
@ 2019-03-29 15:31         ` Steven Rostedt
  2019-03-30  3:10           ` Nicholas Piggin
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2019-03-29 15:31 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Peter Zijlstra, Sebastian Andrzej Siewior, qemu-ppc,
	Cédric Le Goater, Suraj Jitindar Singh, Paul Mackerras,
	tglx, linuxppc-dev, David Gibson

[ Added Peter ]

On Fri, 29 Mar 2019 19:13:55 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Suraj Jitindar Singh's on March 29, 2019 3:20 pm:
> > On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:  
> >> On 3/27/19 5:37 PM, Cédric Le Goater wrote:  
> >> > On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:  
> >> > > With qemu-system-ppc64le -machine pseries -smp 4 I get:
> >> > >   
> >> > > > #  chrt 1 hackbench
> >> > > > Running in process mode with 10 groups using 40 file
> >> > > > descriptors each (== 400 tasks)
> >> > > > Each sender will pass 100 messages of 100 bytes
> >> > > > Oops: Exception in kernel mode, sig: 4 [#1]
> >> > > > LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
> >> > > > Modules linked in:
> >> > > > CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
> >> > > > NIP:  c000000000046978 LR: c000000000046a38 CTR:
> >> > > > c0000000000b0150
> >> > > > REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
> >> > > > MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR:
> >> > > > 42000874  XER: 00000000
> >> > > > CFAR: c000000000046a34 IRQMASK: 1
> >> > > > GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00
> >> > > > 0000000028000000  
> >> > > 
> >> > > …  
> >> > > > NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
> >> > > > LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
> >> > > > Call Trace:
> >> > > > [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0
> >> > > > (unreliable)
> >> > > > [c0000001fffebba0] [c0000000000b0170]
> >> > > > smp_pseries_cause_ipi+0x20/0x70
> >> > > > [c0000001fffebbd0] [c00000000004b02c]
> >> > > > arch_send_call_function_single_ipi+0x8c/0xa0
> >> > > > [c0000001fffebbf0] [c0000000001de600]
> >> > > > irq_work_queue_on+0xe0/0x130
> >> > > > [c0000001fffebc30] [c0000000001340c8]
> >> > > > rto_push_irq_work_func+0xc8/0x120  
> >> > > 
> >> > > …  
> >> > > > Instruction dump:
> >> > > > 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000
> >> > > > 3929ffff
> >> > > > 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020
> >> > > > 3c4c00a2 38425080
> >> > > > ---[ end trace eb842b544538cbdf ]---  
> 
> This is unusual and causing powerpc code to crash because the rt
> scheduler is telling irq_work_queue_on to queue work on this CPU.
> Is that something allowed? There's no warnings in there but it must
> be a rarely tested path, would it be better to ban it?
> 
> Steven is this queue_work_on to self by design?

I just figured that a irq_work_queue_on() would default to just
irq_work_queue() if cpu == smp_processor_id().

I'm sure this case has been tested (on x86), as I stressed this quite a
bit, and all it takes is to have an RT task queued on the CPU
processing the current "push" logic when it's not holding the rq locks
and read that the current CPU now has an overloaded RT task.

If calling irq_work_queue_on() is really bad to do to the same CPU,
then we can work on changing the logic to do something different if the
CPU returned from rto_next_cpu() is the current CPU.

Or would it be possible to just change irq_work_queue_on() to call
irq_work_queue() if cpu == smp_processor_id()?

-- Steve


> 
> >> > > 
> >> > > and I was wondering whether this is a qemu bug or the kernel is
> >> > > using an
> >> > > opcode it should rather not. If I skip doorbell_try_core_ipi() in
> >> > > smp_pseries_cause_ipi() then there is no crash. The comment says
> >> > > "POWER9
> >> > > should not use this handler" so…  
> >> > 
> >> > I would say Linux is using a msgsndp instruction which is not
> >> > implemented
> >> > in QEMU TCG. But why have we started using dbells in Linux ?   
> > 
> > Yeah the kernel must have used msgsndp which isn't implemented for TCG
> > yet. We use doorbells in linux but only for threads which are on the
> > same core.
> > And when I try to construct a situation with more than 1 thread per
> > core (e.g. -smp 4,threads=4), I get "TCG cannot support more than 1
> > thread/core on a pseries machine".
> > 
> > So I wonder why the guest thinks it can use msgsndp...  
> 
> IPI to self evidently. Under TCG it really should implement the
> instruction or remove the DBELL feature.
> 
> Thanks,
> Nick
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-29 15:31         ` Steven Rostedt
@ 2019-03-30  3:10           ` Nicholas Piggin
  2019-04-01  8:38             ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2019-03-30  3:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, Sebastian Andrzej Siewior, Paul Mackerras,
	Cédric Le Goater, Suraj Jitindar Singh, qemu-ppc, tglx,
	linuxppc-dev,
	David
	 Gibson

Steven Rostedt's on March 30, 2019 1:31 am:
> [ Added Peter ]
> 
> On Fri, 29 Mar 2019 19:13:55 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
>> Suraj Jitindar Singh's on March 29, 2019 3:20 pm:
>> > On Wed, 2019-03-27 at 17:51 +0100, Cédric Le Goater wrote:  
>> >> On 3/27/19 5:37 PM, Cédric Le Goater wrote:  
>> >> > On 3/27/19 1:36 PM, Sebastian Andrzej Siewior wrote:  
>> >> > > With qemu-system-ppc64le -machine pseries -smp 4 I get:
>> >> > >   
>> >> > > > #  chrt 1 hackbench
>> >> > > > Running in process mode with 10 groups using 40 file
>> >> > > > descriptors each (== 400 tasks)
>> >> > > > Each sender will pass 100 messages of 100 bytes
>> >> > > > Oops: Exception in kernel mode, sig: 4 [#1]
>> >> > > > LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA pSeries
>> >> > > > Modules linked in:
>> >> > > > CPU: 0 PID: 629 Comm: hackbench Not tainted 5.1.0-rc2 #71
>> >> > > > NIP:  c000000000046978 LR: c000000000046a38 CTR:
>> >> > > > c0000000000b0150
>> >> > > > REGS: c0000001fffeb8e0 TRAP: 0700   Not tainted  (5.1.0-rc2)
>> >> > > > MSR:  8000000000089033 <SF,EE,ME,IR,DR,RI,LE>  CR:
>> >> > > > 42000874  XER: 00000000
>> >> > > > CFAR: c000000000046a34 IRQMASK: 1
>> >> > > > GPR00: c0000000000b0170 c0000001fffebb70 c000000000a6ba00
>> >> > > > 0000000028000000  
>> >> > > 
>> >> > > …  
>> >> > > > NIP [c000000000046978] doorbell_core_ipi+0x28/0x30
>> >> > > > LR [c000000000046a38] doorbell_try_core_ipi+0xb8/0xf0
>> >> > > > Call Trace:
>> >> > > > [c0000001fffebb70] [c0000001fffebba0] 0xc0000001fffebba0
>> >> > > > (unreliable)
>> >> > > > [c0000001fffebba0] [c0000000000b0170]
>> >> > > > smp_pseries_cause_ipi+0x20/0x70
>> >> > > > [c0000001fffebbd0] [c00000000004b02c]
>> >> > > > arch_send_call_function_single_ipi+0x8c/0xa0
>> >> > > > [c0000001fffebbf0] [c0000000001de600]
>> >> > > > irq_work_queue_on+0xe0/0x130
>> >> > > > [c0000001fffebc30] [c0000000001340c8]
>> >> > > > rto_push_irq_work_func+0xc8/0x120  
>> >> > > 
>> >> > > …  
>> >> > > > Instruction dump:
>> >> > > > 60000000 60000000 3c4c00a2 384250b0 3d220009 392949c8 81290000
>> >> > > > 3929ffff
>> >> > > > 7d231838 7c0004ac 5463017e 64632800 <7c00191c> 4e800020
>> >> > > > 3c4c00a2 38425080
>> >> > > > ---[ end trace eb842b544538cbdf ]---  
>> 
>> This is unusual and causing powerpc code to crash because the rt
>> scheduler is telling irq_work_queue_on to queue work on this CPU.
>> Is that something allowed? There's no warnings in there but it must
>> be a rarely tested path, would it be better to ban it?
>> 
>> Steven is this queue_work_on to self by design?
> 
> I just figured that a irq_work_queue_on() would default to just
> irq_work_queue() if cpu == smp_processor_id().

Yeah it actually doesn't, it raises a CALL_FUNCTION IPI on the
current CPU. Maybe it should do that.

> 
> I'm sure this case has been tested (on x86), as I stressed this quite a
> bit, and all it takes is to have an RT task queued on the CPU
> processing the current "push" logic when it's not holding the rq locks
> and read that the current CPU now has an overloaded RT task.
> 
> If calling irq_work_queue_on() is really bad to do to the same CPU,
> then we can work on changing the logic to do something different if the
> CPU returned from rto_next_cpu() is the current CPU.

I don't actually know that it is bad. It does seem to violate the
principle of least surprise for arch code though.

From this bug report, I'm guessing it's pretty rare case to hit so
I would worry about subtle bugs. And I just cc'ed you in case your
code was actually not intending to do this.

> Or would it be possible to just change irq_work_queue_on() to call
> irq_work_queue() if cpu == smp_processor_id()?

Something like this?

kernel/irq_work: Do not raise an IPI when queueing work on the local CPU

The QEMU powerpc/pseries machine model was not expecting a self-IPI,
and it may be a bit surprising thing to do, so have irq_work_queue_on
do local queueing when target is current CPU.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 kernel/irq_work.c | 78 ++++++++++++++++++++++++++---------------------
 1 file changed, 43 insertions(+), 35 deletions(-)

diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 6b7cdf17ccf8..f0e539d0f879 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -56,61 +56,69 @@ void __weak arch_irq_work_raise(void)
 	 */
 }
 
-/*
- * Enqueue the irq_work @work on @cpu unless it's already pending
- * somewhere.
- *
- * Can be re-enqueued while the callback is still in progress.
- */
-bool irq_work_queue_on(struct irq_work *work, int cpu)
+/* Enqueue on current CPU, work must already be claimed and preempt disabled */
+static void __irq_work_queue(struct irq_work *work)
 {
-	/* All work should have been flushed before going offline */
-	WARN_ON_ONCE(cpu_is_offline(cpu));
-
-#ifdef CONFIG_SMP
-
-	/* Arch remote IPI send/receive backend aren't NMI safe */
-	WARN_ON_ONCE(in_nmi());
+	/* If the work is "lazy", handle it from next tick if any */
+	if (work->flags & IRQ_WORK_LAZY) {
+		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
+		    tick_nohz_tick_stopped())
+			arch_irq_work_raise();
+	} else {
+		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
+			arch_irq_work_raise();
+	}
+}
 
+/* Enqueue the irq work @work on the current CPU */
+bool irq_work_queue(struct irq_work *work)
+{
 	/* Only queue if not already pending */
 	if (!irq_work_claim(work))
 		return false;
 
-	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
-		arch_send_call_function_single_ipi(cpu);
-
-#else /* #ifdef CONFIG_SMP */
-	irq_work_queue(work);
-#endif /* #else #ifdef CONFIG_SMP */
+	/* Queue the entry and raise the IPI if needed. */
+	preempt_disable();
+	__irq_work_queue(work);
+	preempt_enable();
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(irq_work_queue);
 
-/* Enqueue the irq work @work on the current CPU */
-bool irq_work_queue(struct irq_work *work)
+/*
+ * Enqueue the irq_work @work on @cpu unless it's already pending
+ * somewhere.
+ *
+ * Can be re-enqueued while the callback is still in progress.
+ */
+bool irq_work_queue_on(struct irq_work *work, int cpu)
 {
+#ifndef CONFIG_SMP
+	return irq_work_queue(work);
+
+#else /* #ifndef CONFIG_SMP */
+	/* All work should have been flushed before going offline */
+	WARN_ON_ONCE(cpu_is_offline(cpu));
+
 	/* Only queue if not already pending */
 	if (!irq_work_claim(work))
 		return false;
 
-	/* Queue the entry and raise the IPI if needed. */
 	preempt_disable();
-
-	/* If the work is "lazy", handle it from next tick if any */
-	if (work->flags & IRQ_WORK_LAZY) {
-		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
-		    tick_nohz_tick_stopped())
-			arch_irq_work_raise();
-	} else {
-		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
-			arch_irq_work_raise();
-	}
-
+	if (cpu != smp_processor_id()) {
+		/* Arch remote IPI send/receive backend aren't NMI safe */
+		WARN_ON_ONCE(in_nmi());
+		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
+			arch_send_call_function_single_ipi(cpu);
+	} else
+		__irq_work_queue(work);
 	preempt_enable();
 
 	return true;
+#endif /* #else #ifndef CONFIG_SMP */
 }
-EXPORT_SYMBOL_GPL(irq_work_queue);
+
 
 bool irq_work_needs_cpu(void)
 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-03-30  3:10           ` Nicholas Piggin
@ 2019-04-01  8:38             ` Peter Zijlstra
  2019-04-04 16:25               ` Nicholas Piggin
  2019-04-06  0:06               ` Frederic Weisbecker
  0 siblings, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2019-04-01  8:38 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Sebastian Andrzej Siewior, Steven Rostedt, Paul Mackerras,
	Cédric Le Goater, Suraj Jitindar Singh, Frederic Weisbecker,
	qemu-ppc, tglx, linuxppc-dev, David? Gibson


+ fweisbec, who did the remote bits

On Sat, Mar 30, 2019 at 01:10:28PM +1000, Nicholas Piggin wrote:
> Something like this?
> 
> kernel/irq_work: Do not raise an IPI when queueing work on the local CPU
> 
> The QEMU powerpc/pseries machine model was not expecting a self-IPI,
> and it may be a bit surprising thing to do, so have irq_work_queue_on
> do local queueing when target is current CPU.

This seems OK to me.

> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> ---
>  kernel/irq_work.c | 78 ++++++++++++++++++++++++++---------------------
>  1 file changed, 43 insertions(+), 35 deletions(-)
> 
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 6b7cdf17ccf8..f0e539d0f879 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -56,61 +56,69 @@ void __weak arch_irq_work_raise(void)
>  	 */
>  }
>  
> -/*
> - * Enqueue the irq_work @work on @cpu unless it's already pending
> - * somewhere.
> - *
> - * Can be re-enqueued while the callback is still in progress.
> - */
> -bool irq_work_queue_on(struct irq_work *work, int cpu)
> +/* Enqueue on current CPU, work must already be claimed and preempt disabled */
> +static void __irq_work_queue(struct irq_work *work)
>  {
> -	/* All work should have been flushed before going offline */
> -	WARN_ON_ONCE(cpu_is_offline(cpu));
> -
> -#ifdef CONFIG_SMP
> -
> -	/* Arch remote IPI send/receive backend aren't NMI safe */
> -	WARN_ON_ONCE(in_nmi());
> +	/* If the work is "lazy", handle it from next tick if any */
> +	if (work->flags & IRQ_WORK_LAZY) {
> +		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
> +		    tick_nohz_tick_stopped())
> +			arch_irq_work_raise();
> +	} else {
> +		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
> +			arch_irq_work_raise();
> +	}
> +}
>  
> +/* Enqueue the irq work @work on the current CPU */
> +bool irq_work_queue(struct irq_work *work)
> +{
>  	/* Only queue if not already pending */
>  	if (!irq_work_claim(work))
>  		return false;
>  
> -	if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
> -		arch_send_call_function_single_ipi(cpu);
> -
> -#else /* #ifdef CONFIG_SMP */
> -	irq_work_queue(work);
> -#endif /* #else #ifdef CONFIG_SMP */
> +	/* Queue the entry and raise the IPI if needed. */
> +	preempt_disable();
> +	__irq_work_queue(work);
> +	preempt_enable();
>  
>  	return true;
>  }
> +EXPORT_SYMBOL_GPL(irq_work_queue);
>  
> -/* Enqueue the irq work @work on the current CPU */
> -bool irq_work_queue(struct irq_work *work)
> +/*
> + * Enqueue the irq_work @work on @cpu unless it's already pending
> + * somewhere.
> + *
> + * Can be re-enqueued while the callback is still in progress.
> + */
> +bool irq_work_queue_on(struct irq_work *work, int cpu)
>  {
> +#ifndef CONFIG_SMP
> +	return irq_work_queue(work);
> +
> +#else /* #ifndef CONFIG_SMP */
> +	/* All work should have been flushed before going offline */
> +	WARN_ON_ONCE(cpu_is_offline(cpu));
> +
>  	/* Only queue if not already pending */
>  	if (!irq_work_claim(work))
>  		return false;
>  
> -	/* Queue the entry and raise the IPI if needed. */
>  	preempt_disable();
> -
> -	/* If the work is "lazy", handle it from next tick if any */
> -	if (work->flags & IRQ_WORK_LAZY) {
> -		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
> -		    tick_nohz_tick_stopped())
> -			arch_irq_work_raise();
> -	} else {
> -		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
> -			arch_irq_work_raise();
> -	}
> -
> +	if (cpu != smp_processor_id()) {
> +		/* Arch remote IPI send/receive backend aren't NMI safe */
> +		WARN_ON_ONCE(in_nmi());
> +		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
> +			arch_send_call_function_single_ipi(cpu);
> +	} else
> +		__irq_work_queue(work);
>  	preempt_enable();
>  
>  	return true;
> +#endif /* #else #ifndef CONFIG_SMP */
>  }
> -EXPORT_SYMBOL_GPL(irq_work_queue);
> +
>  
>  bool irq_work_needs_cpu(void)
>  {
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-04-01  8:38             ` Peter Zijlstra
@ 2019-04-04 16:25               ` Nicholas Piggin
  2019-04-05 14:47                 ` Sebastian Andrzej Siewior
  2019-04-06  0:06               ` Frederic Weisbecker
  1 sibling, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2019-04-04 16:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sebastian Andrzej Siewior, Steven Rostedt, Paul Mackerras,
	Cédric Le Goater, Suraj Jitindar Singh, Frederic Weisbecker,
	qemu-ppc, tglx, linuxppc-dev, David? Gibson

Peter Zijlstra's on April 1, 2019 6:38 pm:
> 
> + fweisbec, who did the remote bits
> 
> On Sat, Mar 30, 2019 at 01:10:28PM +1000, Nicholas Piggin wrote:
>> Something like this?
>> 
>> kernel/irq_work: Do not raise an IPI when queueing work on the local CPU
>> 
>> The QEMU powerpc/pseries machine model was not expecting a self-IPI,
>> and it may be a bit surprising thing to do, so have irq_work_queue_on
>> do local queueing when target is current CPU.
> 
> This seems OK to me.
> 
>> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Thanks for taking a look.

Sebastian, are you able to test if this patch solves your problem?

Thanks,
Nick


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-04-04 16:25               ` Nicholas Piggin
@ 2019-04-05 14:47                 ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 23+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-05 14:47 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Peter Zijlstra, Frederic Weisbecker, Steven Rostedt,
	Paul Mackerras, Cédric Le Goater, Suraj Jitindar Singh,
	qemu-ppc, tglx, linuxppc-dev, David? Gibson

On 2019-04-05 02:25:44 [+1000], Nicholas Piggin wrote:
> Sebastian, are you able to test if this patch solves your problem?

yes, it does.

Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> Thanks,
> Nick

Sebastian

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-04-01  8:38             ` Peter Zijlstra
  2019-04-04 16:25               ` Nicholas Piggin
@ 2019-04-06  0:06               ` Frederic Weisbecker
  2019-04-09  9:25                 ` Nicholas Piggin
  1 sibling, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2019-04-06  0:06 UTC (permalink / raw)
  To: Peter Zijlstra, Nicholas Piggin
  Cc: Sebastian Andrzej Siewior, Steven Rostedt, Paul Mackerras,
	Cédric Le Goater, Suraj Jitindar Singh, Frederic Weisbecker,
	qemu-ppc, tglx, linuxppc-dev, David? Gibson

On Mon, Apr 01, 2019 at 10:38:27AM +0200, Peter Zijlstra wrote:
> 
> + fweisbec, who did the remote bits
> 
> On Sat, Mar 30, 2019 at 01:10:28PM +1000, Nicholas Piggin wrote:
> > diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> > index 6b7cdf17ccf8..f0e539d0f879 100644
> > --- a/kernel/irq_work.c
> > +++ b/kernel/irq_work.c
> > -/* Enqueue the irq work @work on the current CPU */
> > -bool irq_work_queue(struct irq_work *work)
> > +/*
> > + * Enqueue the irq_work @work on @cpu unless it's already pending
> > + * somewhere.
> > + *
> > + * Can be re-enqueued while the callback is still in progress.
> > + */
> > +bool irq_work_queue_on(struct irq_work *work, int cpu)
> >  {
> > +#ifndef CONFIG_SMP
> > +	return irq_work_queue(work);
> > +

I'd suggest to use "if (!IS_ENABLED(CONFIG_SMP))" here to avoid the large
ifdeffery.

> > +#else /* #ifndef CONFIG_SMP */
> > +	/* All work should have been flushed before going offline */
> > +	WARN_ON_ONCE(cpu_is_offline(cpu));
> > +
> >  	/* Only queue if not already pending */
> >  	if (!irq_work_claim(work))
> >  		return false;
> >  
> > -	/* Queue the entry and raise the IPI if needed. */
> >  	preempt_disable();
> > -
> > -	/* If the work is "lazy", handle it from next tick if any */
> > -	if (work->flags & IRQ_WORK_LAZY) {
> > -		if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) &&
> > -		    tick_nohz_tick_stopped())
> > -			arch_irq_work_raise();
> > -	} else {
> > -		if (llist_add(&work->llnode, this_cpu_ptr(&raised_list)))
> > -			arch_irq_work_raise();
> > -	}
> > -
> > +	if (cpu != smp_processor_id()) {
> > +		/* Arch remote IPI send/receive backend aren't NMI safe */
> > +		WARN_ON_ONCE(in_nmi());
> > +		if (llist_add(&work->llnode, &per_cpu(raised_list, cpu)))
> > +			arch_send_call_function_single_ipi(cpu);
> > +	} else
> > +		__irq_work_queue(work);

Also perhaps rename __irq_work_queue() to irq_work_queue_local() to make it
instantly clearer to reviewers.

Other than those cosmetic changes,

  Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-04-06  0:06               ` Frederic Weisbecker
@ 2019-04-09  9:25                 ` Nicholas Piggin
  2019-12-19 10:41                   ` Jason A. Donenfeld
  0 siblings, 1 reply; 23+ messages in thread
From: Nicholas Piggin @ 2019-04-09  9:25 UTC (permalink / raw)
  To: Frederic Weisbecker, Peter Zijlstra
  Cc: Sebastian Andrzej Siewior, Steven Rostedt, Paul Mackerras,
	Cédric Le Goater, Suraj Jitindar Singh, Frederic Weisbecker,
	qemu-ppc, tglx, linuxppc-dev, David? Gibson

Frederic Weisbecker's on April 6, 2019 10:06 am:
> On Mon, Apr 01, 2019 at 10:38:27AM +0200, Peter Zijlstra wrote:
>> 
>> + fweisbec, who did the remote bits
>> 
>> On Sat, Mar 30, 2019 at 01:10:28PM +1000, Nicholas Piggin wrote:
>> > diff --git a/kernel/irq_work.c b/kernel/irq_work.c
>> > index 6b7cdf17ccf8..f0e539d0f879 100644
>> > --- a/kernel/irq_work.c
>> > +++ b/kernel/irq_work.c
>> > -/* Enqueue the irq work @work on the current CPU */
>> > -bool irq_work_queue(struct irq_work *work)
>> > +/*
>> > + * Enqueue the irq_work @work on @cpu unless it's already pending
>> > + * somewhere.
>> > + *
>> > + * Can be re-enqueued while the callback is still in progress.
>> > + */
>> > +bool irq_work_queue_on(struct irq_work *work, int cpu)
>> >  {
>> > +#ifndef CONFIG_SMP
>> > +	return irq_work_queue(work);
>> > +
> 
> I'd suggest to use "if (!IS_ENABLED(CONFIG_SMP))" here to avoid the large
> ifdeffery.

Sadly you can't do it because arch_send_call_function_single_ipi is
not defined for !SMP. I made your suggested name change though (with
the __ prefix because work needs to be claimed and preempt disabled).

Thanks,
Nick

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-04-09  9:25                 ` Nicholas Piggin
@ 2019-12-19 10:41                   ` Jason A. Donenfeld
  2019-12-19 11:13                     ` Sebastian Andrzej Siewior
  2019-12-19 12:45                     ` Michael Ellerman
  0 siblings, 2 replies; 23+ messages in thread
From: Jason A. Donenfeld @ 2019-12-19 10:41 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, qemu-ppc, linuxppc-dev
  Cc: Peter Zijlstra, Frederic Weisbecker, Frederic Weisbecker,
	Steven Rostedt, Paul Mackerras, Nicholas Piggin,
	Suraj Jitindar Singh, tglx, David? Gibson, Cédric Le Goater

Hi folks,

I'm actually still experiencing this sporadically in the WireGuard test 
suite, which you can see being run on https://build.wireguard.com/ . 
About 50% of the time the powerpc64 build will fail at a place like this:

[   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
[   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
[   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
[   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: 
c00000000007ed80
[   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
[   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> 
CR: 48008288  XER: 00000000
[   65.150352] CFAR: c0000000000332bc IRQMASK: 1
[   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 
0000000000000001
[   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 
000000000f509000
[   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 
c00000000ff24f00
[   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 
0000000000000000
[   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 
00000000008c98b0
[   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 
0000000000e6c100
[   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 
c000000000a50998
[   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 
0000000000000000
[   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
[   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
[   65.151913] Call Trace:
[   65.152109] [c000000000a47c00] [c0000000000c7c9c] 
_nohz_idle_balance+0xbc/0x300 (unreliable)
[   65.152370] [c000000000a47c30] [c000000000036508] 
smp_send_reschedule+0x98/0xb0
[   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
[   65.152962] [c000000000a47ca0] [c0000000000c86d8] 
newidle_balance+0x4e8/0x500
[   65.153213] [c000000000a47d20] [c0000000000c8788] 
pick_next_task_fair+0x48/0x3a0
[   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
[   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
[   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
[   65.154121] [c000000000a47e70] [c0000000000c0e94] 
cpu_startup_entry+0x34/0x40
[   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
[   65.154414] [c000000000a47ee0] [c000000000500004] 
start_kernel+0x568/0x594
[   65.154585] [c000000000a47f90] [c00000000000a7cc] 
start_here_common+0x1c/0x330
[   65.154854] Instruction dump:
[   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 
81290000 3929ffff
[   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 
812a0000 3929ffff
[   65.156155] ---[ end trace 6180d12e268ffdaf ]---
[   65.185452]
[   66.187490] Kernel panic - not syncing: Fatal exception

This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.

I'm not totally sure what's going on here. I'm emulating a pseries, and 
using that with qemu's pseries model, and I see that selecting the 
pseries forces the selection of 'config PPC_DOORBELL' (twice in the same 
section, actually). Then inside the kernel there appears to be some 
runtime CPU check for doorbell support. Is this a case in which QEMU is 
advertising doorbell support that TCG doesn't have? Or is something else 
happening here?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 10:41                   ` Jason A. Donenfeld
@ 2019-12-19 11:13                     ` Sebastian Andrzej Siewior
  2019-12-19 11:19                       ` Jason A. Donenfeld
  2019-12-19 12:45                     ` Michael Ellerman
  1 sibling, 1 reply; 23+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-12-19 11:13 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Peter Zijlstra, Frederic Weisbecker, Frederic Weisbecker,
	Steven Rostedt, qemu-ppc, Nicholas Piggin, Suraj Jitindar Singh,
	Paul Mackerras, tglx, David? Gibson, linuxppc-dev,
	Cédric Le Goater

On 2019-12-19 11:41:21 [+0100], Jason A. Donenfeld wrote:
> Hi folks,
Hi,

so this should duct tape it:

diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index f17ff1200eaae..ec044bdf362a1 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -60,16 +60,8 @@ void doorbell_core_ipi(int cpu)
  */
 int doorbell_try_core_ipi(int cpu)
 {
-	int this_cpu = get_cpu();
 	int ret = 0;
 
-	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
-		doorbell_core_ipi(cpu);
-		ret = 1;
-	}
-
-	put_cpu();
-
 	return ret;
 }
 

> This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.

Interesting. I didn't get v5.4 to boot a while ago and didn't have the
time to look into it.

> I'm not totally sure what's going on here. I'm emulating a pseries, and
> using that with qemu's pseries model, and I see that selecting the pseries
> forces the selection of 'config PPC_DOORBELL' (twice in the same section,
> actually). Then inside the kernel there appears to be some runtime CPU check
> for doorbell support. Is this a case in which QEMU is advertising doorbell
> support that TCG doesn't have? Or is something else happening here?

Based on my understanding is that the doorbell feature is part of the
architecture. It can be used to signal other siblings on the same CPU.
qemu TCG doesn't support that and does not allow to announce multiple
siblings on the same CPU. However, the kernel uses this interface if it
tries to send an interrupt to itself (the same CPU) so everything
matches.
Last time I run into this, the interface was change so the kernel das
not send an IPI to itself. This changed now for another function…

> Thanks,
> Jason

Sebastian

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 11:13                     ` Sebastian Andrzej Siewior
@ 2019-12-19 11:19                       ` Jason A. Donenfeld
  0 siblings, 0 replies; 23+ messages in thread
From: Jason A. Donenfeld @ 2019-12-19 11:19 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Peter Zijlstra, Frederic Weisbecker, Frederic Weisbecker,
	Steven Rostedt, qemu-ppc, Nicholas Piggin, Suraj Jitindar Singh,
	Paul Mackerras, Thomas Gleixner, David? Gibson, linuxppc-dev,
	Cédric Le Goater

On Thu, Dec 19, 2019 at 12:13 PM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> Based on my understanding is that the doorbell feature is part of the
> architecture. It can be used to signal other siblings on the same CPU.
> qemu TCG doesn't support that and does not allow to announce multiple
> siblings on the same CPU. However, the kernel uses this interface if it
> tries to send an interrupt to itself (the same CPU) so everything
> matches.
> Last time I run into this, the interface was change so the kernel das
> not send an IPI to itself. This changed now for another function…

One way of fixing this is to just "not use the feature", as you seem
to be suggesting.

But actually shouldn't there be some CPU feature detection available?
Something like -- QEMU advertises to the kernel that it supports or
doesn't support doorbells, and the kernel then avoids those paths if
the CPU feature flag isn't present?

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 10:41                   ` Jason A. Donenfeld
  2019-12-19 11:13                     ` Sebastian Andrzej Siewior
@ 2019-12-19 12:45                     ` Michael Ellerman
  2019-12-19 13:08                       ` Cédric Le Goater
  2019-12-20  0:53                       ` Jason A. Donenfeld
  1 sibling, 2 replies; 23+ messages in thread
From: Michael Ellerman @ 2019-12-19 12:45 UTC (permalink / raw)
  To: Jason A. Donenfeld, Sebastian Andrzej Siewior, qemu-ppc, linuxppc-dev
  Cc: Peter Zijlstra, Frederic Weisbecker, Frederic Weisbecker,
	Nicholas Piggin, Cédric Le Goater, Paul Mackerras,
	Steven Rostedt, Suraj Jitindar Singh, tglx, David? Gibson

"Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> Hi folks,
>
> I'm actually still experiencing this sporadically in the WireGuard test 
> suite, which you can see being run on https://build.wireguard.com/ . 

Fancy dashboard you got there :)

> About 50% of the time the powerpc64 build will fail at a place like this:
>
> [   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
> [   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
> [   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
> [   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
> [   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
> [   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > CR: 48008288  XER: 00000000
> [   65.150352] CFAR: c0000000000332bc IRQMASK: 1
> [   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
> [   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
> [   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
> [   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
> [   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
> [   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
> [   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
> [   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
> [   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
> [   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
> [   65.151913] Call Trace:
> [   65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
> [   65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
> [   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
> [   65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
> [   65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
> [   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
> [   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
> [   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
> [   65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
> [   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
> [   65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
> [   65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
> [   65.154854] Instruction dump:
> [   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
> [   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
                                                      ^
Again the faulting instruction there is "msgsndp r8"

> [   65.156155] ---[ end trace 6180d12e268ffdaf ]---
> [   65.185452]
> [   66.187490] Kernel panic - not syncing: Fatal exception
>
> This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.
>
> I'm not totally sure what's going on here. I'm emulating a pseries, and 
> using that with qemu's pseries model, and I see that selecting the 
> pseries forces the selection of 'config PPC_DOORBELL' (twice in the same 
> section, actually).

Noted.

> Then inside the kernel there appears to be some runtime CPU check for
> doorbell support.

Not really. The kernel looks at the CPU revision (PVR) and decides that
it has doorbell support.

> Is this a case in which QEMU is advertising doorbell support that TCG
> doesn't have? Or is something else happening here?

It's a gap in the emulation I guess. qemu doesn't emulate msgsndp, but
it really should because that's a supported instruction since Power8.

I suspect msgsndp wasn't implemented for TCG because TCG doesn't support
more than one thread per core, and you can only send doorbells to other
threads in the same core, and therefore there is no reason to ever use
msgsndp.

That's the message Suraj mentioned up thread, eg:

  $ qemu-system-ppc64 -nographic -vga none -M pseries -smp 2,threads=2 -cpu POWER8 -kernel build~/vmlinux
  qemu-system-ppc64: TCG cannot support more than 1 thread/core on a pseries machine


But I guess we've hit another case of a CPU sending itself an IPI, and
the way the sibling masks are done, CPUs are siblings of themselves, so
the sibling test passes, eg:

int doorbell_try_core_ipi(int cpu)
{
	int this_cpu = get_cpu();
	int ret = 0;

	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
		doorbell_core_ipi(cpu);



In which case this patch should fix it.

diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
index f17ff1200eaa..e45cb9bba193 100644
--- a/arch/powerpc/kernel/dbell.c
+++ b/arch/powerpc/kernel/dbell.c
@@ -63,7 +63,7 @@ int doorbell_try_core_ipi(int cpu)
 	int this_cpu = get_cpu();
 	int ret = 0;
 
-	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
+	if (cpu != this_cpu && cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
 		doorbell_core_ipi(cpu);
 		ret = 1;
 	}


The other option would be we disable CPU_FTR_DBELL if we detect we're
running under TCG. But I'm not sure we have a particularly clean way to
detect that.

cheers

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 12:45                     ` Michael Ellerman
@ 2019-12-19 13:08                       ` Cédric Le Goater
  2019-12-20  0:22                         ` David? Gibson
  2019-12-20 11:32                         ` Jason A. Donenfeld
  2019-12-20  0:53                       ` Jason A. Donenfeld
  1 sibling, 2 replies; 23+ messages in thread
From: Cédric Le Goater @ 2019-12-19 13:08 UTC (permalink / raw)
  To: Michael Ellerman, Jason A. Donenfeld, Sebastian Andrzej Siewior,
	qemu-ppc, linuxppc-dev
  Cc: Peter Zijlstra, Frederic Weisbecker, Frederic Weisbecker,
	Nicholas Piggin, Paul Mackerras, Steven Rostedt,
	Suraj Jitindar Singh, tglx, David? Gibson

On 19/12/2019 13:45, Michael Ellerman wrote:
> "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
>> Hi folks,
>>
>> I'm actually still experiencing this sporadically in the WireGuard test 
>> suite, which you can see being run on https://build.wireguard.com/ . 
> 
> Fancy dashboard you got there :)
> 
>> About 50% of the time the powerpc64 build will fail at a place like this:
>>
>> [   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
>> [   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
>> [   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
>> [   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
>> [   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
>> [   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > CR: 48008288  XER: 00000000
>> [   65.150352] CFAR: c0000000000332bc IRQMASK: 1
>> [   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
>> [   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
>> [   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
>> [   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
>> [   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
>> [   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
>> [   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
>> [   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
>> [   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
>> [   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
>> [   65.151913] Call Trace:
>> [   65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
>> [   65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
>> [   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
>> [   65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
>> [   65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
>> [   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
>> [   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
>> [   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
>> [   65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
>> [   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
>> [   65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
>> [   65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
>> [   65.154854] Instruction dump:
>> [   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
>> [   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
>                                                       ^
> Again the faulting instruction there is "msgsndp r8"
> 
>> [   65.156155] ---[ end trace 6180d12e268ffdaf ]---
>> [   65.185452]
>> [   66.187490] Kernel panic - not syncing: Fatal exception
>>
>> This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.
>>
>> I'm not totally sure what's going on here. I'm emulating a pseries, and 
>> using that with qemu's pseries model, and I see that selecting the 
>> pseries forces the selection of 'config PPC_DOORBELL' (twice in the same 
>> section, actually).
> 
> Noted.
> 
>> Then inside the kernel there appears to be some runtime CPU check for
>> doorbell support.
> 
> Not really. The kernel looks at the CPU revision (PVR) and decides that
> it has doorbell support.
> 
>> Is this a case in which QEMU is advertising doorbell support that TCG
>> doesn't have? Or is something else happening here?
> 
> It's a gap in the emulation I guess. qemu doesn't emulate msgsndp, but
> it really should because that's a supported instruction since Power8.

There is a patch for msgsndp in my tree you could try : 

  https://github.com/legoater/qemu/tree/powernv-5.0

Currently being reviewed. I have to address some remarks from David before
it can be merged.

> I suspect msgsndp wasn't implemented for TCG because TCG doesn't support
> more than one thread per core, and you can only send doorbells to other
> threads in the same core, and therefore there is no reason to ever use
> msgsndp.

There is a need now with KVM emulation under TCG, but, yes, QEMU still lacks
SMT support.

> That's the message Suraj mentioned up thread, eg:
> 
>   $ qemu-system-ppc64 -nographic -vga none -M pseries -smp 2,threads=2 -cpu POWER8 -kernel build~/vmlinux
>   qemu-system-ppc64: TCG cannot support more than 1 thread/core on a pseries machine
> 
> 
> But I guess we've hit another case of a CPU sending itself an IPI, and
> the way the sibling masks are done, CPUs are siblings of themselves, so
> the sibling test passes, eg:
> 
> int doorbell_try_core_ipi(int cpu)
> {
> 	int this_cpu = get_cpu();
> 	int ret = 0;
> 
> 	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> 		doorbell_core_ipi(cpu);
> 
> 
> 
> In which case this patch should fix it.
> 
> diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
> index f17ff1200eaa..e45cb9bba193 100644
> --- a/arch/powerpc/kernel/dbell.c
> +++ b/arch/powerpc/kernel/dbell.c
> @@ -63,7 +63,7 @@ int doorbell_try_core_ipi(int cpu)
>  	int this_cpu = get_cpu();
>  	int ret = 0;
>  
> -	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> +	if (cpu != this_cpu && cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>  		doorbell_core_ipi(cpu);
>  		ret = 1;
>  	}
> 
> 
> The other option would be we disable CPU_FTR_DBELL if we detect we're
> running under TCG. But I'm not sure we have a particularly clean way to
> detect that.

does the pseries kernel support cpufeatures in the DT ?

Cheers,

C.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 13:08                       ` Cédric Le Goater
@ 2019-12-20  0:22                         ` David? Gibson
  2019-12-20 11:32                         ` Jason A. Donenfeld
  1 sibling, 0 replies; 23+ messages in thread
From: David? Gibson @ 2019-12-20  0:22 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Jason A. Donenfeld, Peter Zijlstra, Sebastian Andrzej Siewior,
	Steven Rostedt, qemu-ppc, Nicholas Piggin, Suraj Jitindar Singh,
	Frederic Weisbecker, Paul Mackerras, tglx, linuxppc-dev,
	Frederic Weisbecker

[-- Attachment #1: Type: text/plain, Size: 6776 bytes --]

On Thu, Dec 19, 2019 at 02:08:29PM +0100, Cédric Le Goater wrote:
> On 19/12/2019 13:45, Michael Ellerman wrote:
> > "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> >> Hi folks,
> >>
> >> I'm actually still experiencing this sporadically in the WireGuard test 
> >> suite, which you can see being run on https://build.wireguard.com/ . 
> > 
> > Fancy dashboard you got there :)
> > 
> >> About 50% of the time the powerpc64 build will fail at a place like this:
> >>
> >> [   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
> >> [   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
> >> [   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
> >> [   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
> >> [   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
> >> [   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > CR: 48008288  XER: 00000000
> >> [   65.150352] CFAR: c0000000000332bc IRQMASK: 1
> >> [   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
> >> [   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
> >> [   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
> >> [   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
> >> [   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
> >> [   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
> >> [   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
> >> [   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
> >> [   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
> >> [   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
> >> [   65.151913] Call Trace:
> >> [   65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
> >> [   65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
> >> [   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
> >> [   65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
> >> [   65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
> >> [   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
> >> [   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
> >> [   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
> >> [   65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
> >> [   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
> >> [   65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
> >> [   65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
> >> [   65.154854] Instruction dump:
> >> [   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
> >> [   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
> >                                                       ^
> > Again the faulting instruction there is "msgsndp r8"
> > 
> >> [   65.156155] ---[ end trace 6180d12e268ffdaf ]---
> >> [   65.185452]
> >> [   66.187490] Kernel panic - not syncing: Fatal exception
> >>
> >> This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.
> >>
> >> I'm not totally sure what's going on here. I'm emulating a pseries, and 
> >> using that with qemu's pseries model, and I see that selecting the 
> >> pseries forces the selection of 'config PPC_DOORBELL' (twice in the same 
> >> section, actually).
> > 
> > Noted.
> > 
> >> Then inside the kernel there appears to be some runtime CPU check for
> >> doorbell support.
> > 
> > Not really. The kernel looks at the CPU revision (PVR) and decides that
> > it has doorbell support.
> > 
> >> Is this a case in which QEMU is advertising doorbell support that TCG
> >> doesn't have? Or is something else happening here?
> > 
> > It's a gap in the emulation I guess. qemu doesn't emulate msgsndp, but
> > it really should because that's a supported instruction since Power8.
> 
> There is a patch for msgsndp in my tree you could try : 
> 
>   https://github.com/legoater/qemu/tree/powernv-5.0
> 
> Currently being reviewed. I have to address some remarks from David before
> it can be merged.

Right.  It needs some polish, but I expect we'll have this merged in
the not too distant future.

> > I suspect msgsndp wasn't implemented for TCG because TCG doesn't support
> > more than one thread per core, and you can only send doorbells to other
> > threads in the same core, and therefore there is no reason to ever use
> > msgsndp.
> 
> There is a need now with KVM emulation under TCG, but, yes, QEMU still lacks
> SMT support.
> 
> > That's the message Suraj mentioned up thread, eg:
> > 
> >   $ qemu-system-ppc64 -nographic -vga none -M pseries -smp 2,threads=2 -cpu POWER8 -kernel build~/vmlinux
> >   qemu-system-ppc64: TCG cannot support more than 1 thread/core on a pseries machine
> > 
> > 
> > But I guess we've hit another case of a CPU sending itself an IPI, and
> > the way the sibling masks are done, CPUs are siblings of themselves, so
> > the sibling test passes, eg:
> > 
> > int doorbell_try_core_ipi(int cpu)
> > {
> > 	int this_cpu = get_cpu();
> > 	int ret = 0;
> > 
> > 	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> > 		doorbell_core_ipi(cpu);
> > 
> > 
> > 
> > In which case this patch should fix it.
> > 
> > diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
> > index f17ff1200eaa..e45cb9bba193 100644
> > --- a/arch/powerpc/kernel/dbell.c
> > +++ b/arch/powerpc/kernel/dbell.c
> > @@ -63,7 +63,7 @@ int doorbell_try_core_ipi(int cpu)
> >  	int this_cpu = get_cpu();
> >  	int ret = 0;
> >  
> > -	if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> > +	if (cpu != this_cpu && cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> >  		doorbell_core_ipi(cpu);
> >  		ret = 1;
> >  	}
> > 
> > 
> > The other option would be we disable CPU_FTR_DBELL if we detect we're
> > running under TCG. But I'm not sure we have a particularly clean way to
> > detect that.
> 
> does the pseries kernel support cpufeatures in the DT ?
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 12:45                     ` Michael Ellerman
  2019-12-19 13:08                       ` Cédric Le Goater
@ 2019-12-20  0:53                       ` Jason A. Donenfeld
  1 sibling, 0 replies; 23+ messages in thread
From: Jason A. Donenfeld @ 2019-12-20  0:53 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Peter Zijlstra, Frederic Weisbecker, Sebastian Andrzej Siewior,
	Steven Rostedt, Cédric Le Goater, qemu-ppc, Nicholas Piggin,
	Suraj Jitindar Singh, Paul Mackerras, Thomas Gleixner,
	linuxppc-dev, Frederic Weisbecker, David? Gibson

On Thu, Dec 19, 2019 at 1:52 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> diff --git a/arch/powerpc/kernel/dbell.c b/arch/powerpc/kernel/dbell.c
> index f17ff1200eaa..e45cb9bba193 100644
> --- a/arch/powerpc/kernel/dbell.c
> +++ b/arch/powerpc/kernel/dbell.c
> @@ -63,7 +63,7 @@ int doorbell_try_core_ipi(int cpu)
>         int this_cpu = get_cpu();
>         int ret = 0;
>
> -       if (cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
> +       if (cpu != this_cpu && cpumask_test_cpu(cpu, cpu_sibling_mask(this_cpu))) {
>                 doorbell_core_ipi(cpu);
>                 ret = 1;
>         }

I realize the best solution is that nice powernv branch that will
eventually be merged for qemu/tcg. But, maybe the above would be a
decent idea to upstream? It seems like that's a case of a superfluous
doorbell?

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-19 13:08                       ` Cédric Le Goater
  2019-12-20  0:22                         ` David? Gibson
@ 2019-12-20 11:32                         ` Jason A. Donenfeld
  2019-12-20 12:21                           ` David? Gibson
  2019-12-20 15:59                           ` Cédric Le Goater
  1 sibling, 2 replies; 23+ messages in thread
From: Jason A. Donenfeld @ 2019-12-20 11:32 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Peter Zijlstra, Sebastian Andrzej Siewior, Steven Rostedt,
	qemu-ppc, Nicholas Piggin, Suraj Jitindar Singh,
	Frederic Weisbecker, Paul Mackerras, Thomas Gleixner,
	linuxppc-dev, Frederic Weisbecker, David? Gibson

On Thu, Dec 19, 2019 at 2:08 PM Cédric Le Goater <clg@kaod.org> wrote:>
> There is a patch for msgsndp in my tree you could try :
>
>   https://github.com/legoater/qemu/tree/powernv-5.0
>
> Currently being reviewed. I have to address some remarks from David before
> it can be merged.

Thanks. I've cherry-picked
https://github.com/legoater/qemu/commit/910c9ea5ecc and can confirm
that I no longer receive the crashes. QEMU 5.1 or 5.0.1 release, I
guess?

Jason

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-20 11:32                         ` Jason A. Donenfeld
@ 2019-12-20 12:21                           ` David? Gibson
  2019-12-20 15:59                           ` Cédric Le Goater
  1 sibling, 0 replies; 23+ messages in thread
From: David? Gibson @ 2019-12-20 12:21 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Peter Zijlstra, Sebastian Andrzej Siewior, Nicholas Piggin,
	Steven Rostedt, qemu-ppc, Cédric Le Goater,
	Suraj Jitindar Singh, Frederic Weisbecker, Paul Mackerras,
	Thomas Gleixner, linuxppc-dev, Frederic Weisbecker

[-- Attachment #1: Type: text/plain, Size: 839 bytes --]

On Fri, Dec 20, 2019 at 12:32:06PM +0100, Jason A. Donenfeld wrote:
> On Thu, Dec 19, 2019 at 2:08 PM Cédric Le Goater <clg@kaod.org> wrote:>
> > There is a patch for msgsndp in my tree you could try :
> >
> >   https://github.com/legoater/qemu/tree/powernv-5.0
> >
> > Currently being reviewed. I have to address some remarks from David before
> > it can be merged.
> 
> Thanks. I've cherry-picked
> https://github.com/legoater/qemu/commit/910c9ea5ecc and can confirm
> that I no longer receive the crashes. QEMU 5.1 or 5.0.1 release, I
> guess?

Unless the revision takes much longer than I expect, I'd anticipate it
in 5.0.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
  2019-12-20 11:32                         ` Jason A. Donenfeld
  2019-12-20 12:21                           ` David? Gibson
@ 2019-12-20 15:59                           ` Cédric Le Goater
  1 sibling, 0 replies; 23+ messages in thread
From: Cédric Le Goater @ 2019-12-20 15:59 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Peter Zijlstra, Sebastian Andrzej Siewior, Steven Rostedt,
	qemu-ppc, Nicholas Piggin, Suraj Jitindar Singh,
	Frederic Weisbecker, Paul Mackerras, Thomas Gleixner,
	linuxppc-dev, Frederic Weisbecker, David? Gibson

On 12/20/19 12:32 PM, Jason A. Donenfeld wrote:
> On Thu, Dec 19, 2019 at 2:08 PM Cédric Le Goater <clg@kaod.org> wrote:>
>> There is a patch for msgsndp in my tree you could try :
>>
>>   https://github.com/legoater/qemu/tree/powernv-5.0
>>
>> Currently being reviewed. I have to address some remarks from David before
>> it can be merged.
> 
> Thanks. I've cherry-picked
> https://github.com/legoater/qemu/commit/910c9ea5ecc and can confirm
> that I no longer receive the crashes. QEMU 5.1 or 5.0.1 release, I
> guess?

QEMU 5.1 I would say. That's enough time to address the comments.

Michael's patch to drop self IPIs is also interesting for Linux. I wonder
how we can reach that point. 

Thanks,

C. 


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-12-20 23:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-27 12:36 pseries on qemu-system-ppc64le crashes in doorbell_core_ipi() Sebastian Andrzej Siewior
2019-03-27 16:37 ` [Qemu-ppc] " Cédric Le Goater
2019-03-27 16:51   ` Cédric Le Goater
2019-03-29  5:20     ` Suraj Jitindar Singh
2019-03-29  8:32       ` Sebastian Andrzej Siewior
2019-03-29  9:13       ` Nicholas Piggin
2019-03-29 15:31         ` Steven Rostedt
2019-03-30  3:10           ` Nicholas Piggin
2019-04-01  8:38             ` Peter Zijlstra
2019-04-04 16:25               ` Nicholas Piggin
2019-04-05 14:47                 ` Sebastian Andrzej Siewior
2019-04-06  0:06               ` Frederic Weisbecker
2019-04-09  9:25                 ` Nicholas Piggin
2019-12-19 10:41                   ` Jason A. Donenfeld
2019-12-19 11:13                     ` Sebastian Andrzej Siewior
2019-12-19 11:19                       ` Jason A. Donenfeld
2019-12-19 12:45                     ` Michael Ellerman
2019-12-19 13:08                       ` Cédric Le Goater
2019-12-20  0:22                         ` David? Gibson
2019-12-20 11:32                         ` Jason A. Donenfeld
2019-12-20 12:21                           ` David? Gibson
2019-12-20 15:59                           ` Cédric Le Goater
2019-12-20  0:53                       ` Jason A. Donenfeld

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.