* [PATCH 0/1] Uses the system workqueue as fallback
@ 2017-12-21 15:44 Jose Ricardo Ziviani
  2017-12-21 15:44 ` [PATCH 1/1] powerpc/pseries: Use the system workqueue as fallback to hotplug workqueue Jose Ricardo Ziviani
  0 siblings, 1 reply; 4+ messages in thread
From: Jose Ricardo Ziviani @ 2017-12-21 15:44 UTC
  To: linuxppc-dev; +Cc: mpe, david, benh

To avoid a kernel panic when memory is hotplugged early in the boot process
(at a point where the kernel can already handle IRQs), this patch uses the
system workqueue as a fallback to the hotplug workqueue.
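
The fallback described above can be modelled in plain userspace C. This is
only a sketch under stated assumptions, not the kernel code: struct workqueue,
system_wq, dedicated_wq, hp_wq, late_init() and queue_event() are hypothetical
names invented for illustration, and the "queue" merely counts items.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel workqueue; it only counts queued items. */
struct workqueue {
    int pending;
};

struct workqueue system_wq;      /* models the always-available system queue */
struct workqueue dedicated_wq;   /* backing storage for the dedicated queue */
struct workqueue *hp_wq = NULL;  /* dedicated queue pointer, NULL until late init */

/* Models the late init step that makes the dedicated queue available. */
void late_init(void)
{
    hp_wq = &dedicated_wq;
}

/*
 * Models the dispatch decision: use the dedicated queue when it exists,
 * otherwise fall back to the system queue. Returns the queue that was used.
 */
struct workqueue *queue_event(void)
{
    struct workqueue *wq = (hp_wq != NULL) ? hp_wq : &system_wq;

    assert(wq != NULL);  /* one of the two queues is always usable here */
    wq->pending++;
    return wq;
}
```

In this model, an event queued before late_init() lands on system_wq, and an
event queued afterwards lands on dedicated_wq.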

With this patch I can no longer reproduce the problem, and the memory is
successfully plugged at any stage of the boot process.

Thank you

Error scenario:

Booting Linux via __start() @ 0x0000000002000000 ...
[    0.000000] Detected Power 8 processor 
[    0.000000] Warning: Processor - this hardware has not undergone testing by Red Hat and might not be certified. Please consult https://hardware.redhat.com for certified hardware.
 -> smp_release_cpus()
spinning_secondaries = 3
 <- smp_release_cpus()
Linux ppc64le
#1 SMP Wed Nov 2
[    0.021319] Unable to handle kernel paging request for data at address 0x00000100
[    0.021379] Faulting instruction address: 0xc00000000015c420
[    0.021423] Oops: Kernel access of bad area, sig: 11 [#1]
[    0.021457] LE SMP NR_CPUS=2048 NUMA pSeries
[    0.021493] Modules linked in:
[    0.021522] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-9.el7a.ppc64le #1
[    0.021572] task: c00000047bb80000 task.stack: c00000047bbc0000
[    0.021615] NIP:  c00000000015c420 LR: c00000000015cae4 CTR: 0000000000000000
[    0.021666] REGS: c00000047ffeb920 TRAP: 0380   Not tainted  (4.14.0-9.el7a.ppc64le)
[    0.021715] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000042  XER: 20000000
[    0.021769] CFAR: c00000000015cae0 SOFTE: 0 
[    0.021769] GPR00: c00000000015cae4 c00000047ffebba0 c0000000014ca600 0000000000000800 
[    0.021769] GPR04: 0000000000000000 c00000047e1e5000 00000000000001a0 c0000000000e19e0 
[    0.021769] GPR08: 0000000fffffffe1 0000000000000000 0000000fffffffe0 0000000002001001 
[    0.021769] GPR12: c0000000000e0ea0 c000000007ac0000 c00000000000d0b8 0000000000000000 
[    0.021769] GPR16: 0000000000000000 c00000047e1e5000 0000000000000000 0000000000000000 
[    0.021769] GPR20: 0000000000000000 0000000000000001 0000000000000002 0000000000000015 
[    0.021769] GPR24: c0000001fdc075b8 0000000000000001 c0000001fdc07400 0000000000000000 
[    0.021769] GPR28: 0000000000000800 0000000000000000 0000000000000000 0000000000000800 
[    0.022196] NIP [c00000000015c420] __queue_work+0x80/0x690
[    0.022231] LR [c00000000015cae4] queue_work_on+0xb4/0xf0
[    0.022264] Call Trace:
[    0.022283] [c00000047ffebba0] [c00000000017d948] ttwu_do_wakeup+0x228/0x290 (unreliable)
[    0.022334] [c00000047ffebc90] [c00000000015cae4] queue_work_on+0xb4/0xf0
[    0.022377] [c00000047ffebcd0] [c0000000000e36d0] queue_hotplug_event+0xe0/0x160
[    0.022428] [c00000047ffebd20] [c0000000000e0fe0] ras_hotplug_interrupt+0x140/0x160
[    0.022480] [c00000047ffebdb0] [c0000000001d0a20] __handle_irq_event_percpu+0xa0/0x330
...
[    1.024963] Kernel panic - not syncing: Fatal exception in interrupt
[    1.027080] Rebooting in 10 seconds..

Test case 1: Hotplug during the boot process, after the hotplug wq
initialization

[    0.554391] rtas_flash: no firmware flash support
[    0.554464] >>>>>>>>>>>>>>>> [devlog pseries_hp_wq] ALLOCed
[    0.555021] Initialise system trusted keyrings
...
...
Welcome to Red Hat Enterprise Linux Server 7.4 (Maipo) dracut-033-502.el7 (Initramfs)!
...
[  OK  ] Started dracut cmdline hook.
         Starting dracut pre-udev hook...
(qemu) object_add memory-backend-ram,id=mem1,size=10G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
[    0.754432] >>>>>>>>>>>>>>>> [devlog pseries_hp_wq] 0xfec52400L
[    0.765465] pseries-hotplug-mem: Attempting to hot-add 40 LMB(s) at index 80000010
[    0.765710] radix-mmu: Mapped 0xc000000100000000-0xc000000110000000 with 2.00 MiB pages
...
(qemu) info memory-devices
Memory device [dimm]: "dimm1"
  addr: 0x100000000
  slot: 0
  node: 0
  size: 10737418240
  memdev: /objects/mem1
  hotplugged: true
  hotpluggable: true

Test case 2: Hotplug during the boot process, before the hotplug wq
initialization

[    0.028103] NET: Registered protocol family 1
[    0.028745] Unpacking initramfs...
(qemu) object_add memory-backend-ram,id=mem1,size=10G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
[    0.407070] >>>>>>>>>>>>>>>> [devlog pseries_hp_wq] 0x0 (using system wq)
[    0.407420] pseries-hotplug-mem: Attempting to hot-add 40 LMB(s) at index 80000010
[    0.407749] radix-mmu: Mapped 0xc000000100000000-0xc000000110000000 with 2.00 MiB pages
...
[    0.627554] rtas_flash: no firmware flash support
[    0.627674] >>>>>>>>>>>>>>>> [devlog pseries_hp_wq] ALLOCed
[    0.628451] Initialise system trusted keyrings
...
(qemu) info memory-devices
Memory device [dimm]: "dimm1"
  addr: 0x100000000
  slot: 0
  node: 0
  size: 10737418240
  memdev: /objects/mem1
  hotplugged: true
  hotpluggable: true

Jose Ricardo Ziviani (1):
  powerpc/pseries: Use the system workqueue as fallback to hotplug
    workqueue

 arch/powerpc/platforms/pseries/dlpar.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.14.1


* [PATCH 1/1] powerpc/pseries: Use the system workqueue as fallback to hotplug workqueue
  2017-12-21 15:44 [PATCH 0/1] Uses the system workqueue as fallback Jose Ricardo Ziviani
@ 2017-12-21 15:44 ` Jose Ricardo Ziviani
  2017-12-22  0:54   ` David Gibson
  0 siblings, 1 reply; 4+ messages in thread
From: Jose Ricardo Ziviani @ 2017-12-21 15:44 UTC
  To: linuxppc-dev; +Cc: mpe, david, benh

The hotplug engine uses its own workqueue to handle IRQ requests. The
problem is that this workqueue is initialized relatively late in the
boot process.

Thus, once the kernel is ready to handle IRQ requests (after the system
workqueue is initialized), there is a window in which any hotplug issued
by the client results in a kernel panic. That window lasts until the
hotplug workqueue is initialized.

It would be good to have the hotplug workqueue initialized as early as
the system workqueue, but I don't think that is possible. So this patch
uses the system workqueue as a fallback to handle such IRQs.

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/pseries/dlpar.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 6e35780c5962..0474aa14b5f6 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -399,7 +399,15 @@ void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog,
 		work->errlog = hp_errlog_copy;
 		work->hp_completion = hotplug_done;
 		work->rc = rc;
-		queue_work(pseries_hp_wq, (struct work_struct *)work);
+
+		/* The hotplug workqueue may happen to be NULL at the moment
+		 * this code is executed, during the boot phase. So, in this
+		 * scenario, we can fallback to the system workqueue.
+		 */
+		if (unlikely(pseries_hp_wq == NULL))
+			schedule_work((struct work_struct *)work);
+		else
+			queue_work(pseries_hp_wq, (struct work_struct *)work);
 	} else {
 		*rc = -ENOMEM;
 		kfree(hp_errlog_copy);
-- 
2.14.1


* Re: [PATCH 1/1] powerpc/pseries: Use the system workqueue as fallback to hotplug workqueue
  2017-12-21 15:44 ` [PATCH 1/1] powerpc/pseries: Use the system workqueue as fallback to hotplug workqueue Jose Ricardo Ziviani
@ 2017-12-22  0:54   ` David Gibson
  2017-12-22 14:19     ` joserz
  0 siblings, 1 reply; 4+ messages in thread
From: David Gibson @ 2017-12-22  0:54 UTC
  To: Jose Ricardo Ziviani; +Cc: linuxppc-dev, mpe, benh


On Thu, Dec 21, 2017 at 01:44:48PM -0200, Jose Ricardo Ziviani wrote:
> The hotplug engine uses its own workqueue to handle IRQ requests, the
> problem is that such workqueue is initialized not so early in the boot
> process.
> 
> Thus, when the kernel is ready to handle IRQ requests, after the system
> workqueue is initialized, we have a timeframe where any hotplug issued
> by the client will result in a kernel panic. That timeframe goes until
> the hotplug workqueue is initialized.
> 
> It would be good to have the hotplug workqueue initialized as soon as
> the system workqueue but I don't think it is possible. So, this patch
> uses the system workqueue as a fallback to handle such IRQs.
> 
> Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>

I don't think this is the right approach.

It seems to me the bug is that the hotplug interrupt is registered in
init_ras_IRQ(), before the work queue is initialized in
pseries_dlpar_init().  We need to correct that ordering.
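
The ordering constraint described above can be illustrated with a small
userspace C model. This is a sketch under stated assumptions, not the kernel
fix and not the eventual v2: init_workqueue(), register_hotplug_irq(),
deliver_hotplug_irq() and init_in_order() are hypothetical names invented
here; they are not the kernel's init_ras_IRQ() or pseries_dlpar_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel workqueue; it only counts queued items. */
struct workqueue {
    int pending;
};

struct workqueue hp_wq_storage;  /* backing storage for the hotplug queue */
struct workqueue *hp_wq = NULL;  /* NULL until init_workqueue() has run */
bool irq_registered = false;     /* true once the handler may be invoked */

/* Step 1: create the queue the handler depends on. Returns 0 on success. */
int init_workqueue(void)
{
    hp_wq = &hp_wq_storage;
    return 0;
}

/* Step 2: register the handler; refuse (-1) if its queue does not exist yet. */
int register_hotplug_irq(void)
{
    if (hp_wq == NULL)
        return -1;
    irq_registered = true;
    return 0;
}

/* Models an incoming interrupt: 0 if work was queued, -1 if not delivered. */
int deliver_hotplug_irq(void)
{
    if (!irq_registered)
        return -1;          /* no handler registered yet, nothing runs */
    assert(hp_wq != NULL);  /* holds because registration follows queue init */
    hp_wq->pending++;
    return 0;
}

/* Correct ordering: workqueue first, interrupt registration second. */
int init_in_order(void)
{
    if (init_workqueue() != 0)
        return -1;
    return register_hotplug_irq();
}
```

With this ordering the handler can never observe a NULL queue, so no fallback
path is needed in the handler itself.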

> ---
>  arch/powerpc/platforms/pseries/dlpar.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> index 6e35780c5962..0474aa14b5f6 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -399,7 +399,15 @@ void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog,
>  		work->errlog = hp_errlog_copy;
>  		work->hp_completion = hotplug_done;
>  		work->rc = rc;
> -		queue_work(pseries_hp_wq, (struct work_struct *)work);
> +
> +		/* The hotplug workqueue may happen to be NULL at the moment
> +		 * this code is executed, during the boot phase. So, in this
> +		 * scenario, we can fallback to the system workqueue.
> +		 */
> +		if (unlikely(pseries_hp_wq == NULL))
> +			schedule_work((struct work_struct *)work);
> +		else
> +			queue_work(pseries_hp_wq, (struct work_struct *)work);
>  	} else {
>  		*rc = -ENOMEM;
>  		kfree(hp_errlog_copy);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [PATCH 1/1] powerpc/pseries: Use the system workqueue as fallback to hotplug workqueue
  2017-12-22  0:54   ` David Gibson
@ 2017-12-22 14:19     ` joserz
  0 siblings, 0 replies; 4+ messages in thread
From: joserz @ 2017-12-22 14:19 UTC
  To: David Gibson; +Cc: benh, linuxppc-dev

On Fri, Dec 22, 2017 at 11:54:10AM +1100, David Gibson wrote:
> On Thu, Dec 21, 2017 at 01:44:48PM -0200, Jose Ricardo Ziviani wrote:
> > The hotplug engine uses its own workqueue to handle IRQ requests, the
> > problem is that such workqueue is initialized not so early in the boot
> > process.
> > 
> > Thus, when the kernel is ready to handle IRQ requests, after the system
> > workqueue is initialized, we have a timeframe where any hotplug issued
> > by the client will result in a kernel panic. That timeframe goes until
> > the hotplug workqueue is initialized.
> > 
> > It would be good to have the hotplug workqueue initialized as soon as
> > the system workqueue but I don't think it is possible. So, this patch
> > uses the system workqueue as a fallback to handle such IRQs.
> > 
> > Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
> 
> I don't think this is the right approach.
> 
> It seems to me the bug is that the hotplug interrupt is registered in
> init_ras_IRQ(), before the work queue is initialized in
> pseries_dlpar_init().  We need to correct that ordering.
> 

Oh, this makes much more sense. I'm going to run some tests and then send
a v2.

Thanks for reviewing it!

> > ---
> >  arch/powerpc/platforms/pseries/dlpar.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> > index 6e35780c5962..0474aa14b5f6 100644
> > --- a/arch/powerpc/platforms/pseries/dlpar.c
> > +++ b/arch/powerpc/platforms/pseries/dlpar.c
> > @@ -399,7 +399,15 @@ void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog,
> >  		work->errlog = hp_errlog_copy;
> >  		work->hp_completion = hotplug_done;
> >  		work->rc = rc;
> > -		queue_work(pseries_hp_wq, (struct work_struct *)work);
> > +
> > +		/* The hotplug workqueue may happen to be NULL at the moment
> > +		 * this code is executed, during the boot phase. So, in this
> > +		 * scenario, we can fallback to the system workqueue.
> > +		 */
> > +		if (unlikely(pseries_hp_wq == NULL))
> > +			schedule_work((struct work_struct *)work);
> > +		else
> > +			queue_work(pseries_hp_wq, (struct work_struct *)work);
> >  	} else {
> >  		*rc = -ENOMEM;
> >  		kfree(hp_errlog_copy);
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

