linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] powerpc/xive: fix CPU hot unplug
@ 2017-09-23  8:26 Cédric Le Goater
  2017-09-23  8:26 ` [PATCH 1/2] powerpc/xive: fix IPI reset Cédric Le Goater
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-09-23  8:26 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Ellerman, Benjamin Herrenschmidt, David Gibson,
	Cédric Le Goater

Hi,

Here are a couple of small fixes to support CPU hot unplug. There are
still some issues to be investigated as, in some occasions, after a
couple of plug and unplug, the cpu which was removed receives a 'lost'
interrupt. This showed to be the decrementer under QEMU.

Nevertheless, these patches are required and provide a significant
improvement to support CPU removal.

Tested under a phyp and a XIVE QEMU model for pseries.

Thanks,

C.

Cédric Le Goater (2):
  powerpc/xive: fix IPI reset
  powerpc/xive: fix cpu removal

 arch/powerpc/sysdev/xive/common.c | 8 ++++++++
 arch/powerpc/sysdev/xive/spapr.c  | 4 ++++
 2 files changed, 12 insertions(+)

-- 
2.13.5

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] powerpc/xive: fix IPI reset
  2017-09-23  8:26 [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
@ 2017-09-23  8:26 ` Cédric Le Goater
  2017-09-23  8:26 ` [PATCH 2/2] powerpc/xive: fix cpu removal Cédric Le Goater
  2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-09-23  8:26 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Ellerman, Benjamin Herrenschmidt, David Gibson,
	Cédric Le Goater

When resetting an IPI, hw_ipi should also be set to zero.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/spapr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c
index f24a70bc6855..d9c4c9366049 100644
--- a/arch/powerpc/sysdev/xive/spapr.c
+++ b/arch/powerpc/sysdev/xive/spapr.c
@@ -431,7 +431,11 @@ static int xive_spapr_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 
 static void xive_spapr_put_ipi(unsigned int cpu, struct xive_cpu *xc)
 {
+	if (!xc->hw_ipi)
+		return;
+
 	xive_irq_bitmap_free(xc->hw_ipi);
+	xc->hw_ipi = 0;
 }
 #endif /* CONFIG_SMP */
 
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] powerpc/xive: fix cpu removal
  2017-09-23  8:26 [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2017-09-23  8:26 ` [PATCH 1/2] powerpc/xive: fix IPI reset Cédric Le Goater
@ 2017-09-23  8:26 ` Cédric Le Goater
  2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-09-23  8:26 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Ellerman, Benjamin Herrenschmidt, David Gibson,
	Cédric Le Goater

When a CPU is hot unplugged, the IPI interrupt source should be
deconfigured and the event queue cleared.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 arch/powerpc/sysdev/xive/common.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 1c087ed7427f..b9440ac7a3c1 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1398,6 +1398,14 @@ void xive_teardown_cpu(void)
 
 	if (xive_ops->teardown_cpu)
 		xive_ops->teardown_cpu(cpu, xc);
+
+#ifdef CONFIG_SMP
+	/* Get rid of IPI */
+	xive_cleanup_cpu_ipi(cpu, xc);
+#endif
+
+	/* Disable and free the queues */
+	xive_cleanup_cpu_queues(cpu, xc);
 }
 
 void xive_kexec_teardown_cpu(int secondary)
-- 
2.13.5

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-09-23  8:26 [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2017-09-23  8:26 ` [PATCH 1/2] powerpc/xive: fix IPI reset Cédric Le Goater
  2017-09-23  8:26 ` [PATCH 2/2] powerpc/xive: fix cpu removal Cédric Le Goater
@ 2017-10-02 16:27 ` Cédric Le Goater
  2017-10-02 16:52   ` Benjamin Herrenschmidt
                     ` (2 more replies)
  2 siblings, 3 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-02 16:27 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Michael Ellerman, Benjamin Herrenschmidt, David Gibson

On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
> Hi,
> 
> Here are a couple of small fixes to support CPU hot unplug. There are
> still some issues to be investigated as, in some occasions, after a
> couple of plug and unplug, the cpu which was removed receives a 'lost'
> interrupt. This showed to be the decrementer under QEMU.

So this seems to be a QEMU issue only which can be solved by 
removing the DEE bit from the LPCR on P9 processor when the CPU 
is stopped in rtas. PECE3 bit on P8 processors. 

I think these patches are valuable fixes for 4.14. The first 
is trivial and the second touches the common xive part but it
is only called on the pseries platform.  

Could you please take a look ? 

Thanks,

C. 

> Nevertheless, these patches are required and provide a significant
> improvement to support CPU removal.
> 
> Tested under a phyp and a XIVE QEMU model for pseries.
> 
> Thanks,
> 
> C.
> 
> Cédric Le Goater (2):
>   powerpc/xive: fix IPI reset
>   powerpc/xive: fix cpu removal
> 
>  arch/powerpc/sysdev/xive/common.c | 8 ++++++++
>  arch/powerpc/sysdev/xive/spapr.c  | 4 ++++
>  2 files changed, 12 insertions(+)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
@ 2017-10-02 16:52   ` Benjamin Herrenschmidt
  2017-10-02 17:53     ` Cédric Le Goater
  2017-10-03  3:36   ` David Gibson
  2017-10-03 11:23   ` Michael Ellerman
  2 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2017-10-02 16:52 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev; +Cc: Michael Ellerman, David Gibson

On Mon, 2017-10-02 at 18:27 +0200, Cédric Le Goater wrote:
> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
> > Hi,
> > 
> > Here are a couple of small fixes to support CPU hot unplug. There are
> > still some issues to be investigated as, in some occasions, after a
> > couple of plug and unplug, the cpu which was removed receives a 'lost'
> > interrupt. This showed to be the decrementer under QEMU.
> 
> So this seems to be a QEMU issue only which can be solved by 
> removing the DEE bit from the LPCR on P9 processor when the CPU 
> is stopped in rtas. PECE3 bit on P8 processors. 

It should be the same bit no ?

> I think these patches are valuable fixes for 4.14. The first 
> is trivial and the second touches the common xive part but it
> is only called on the pseries platform.  
> 
> Could you please take a look ? 
> 
> Thanks,
> 
> C. 
> 
> > Nevertheless, these patches are required and provide a significant
> > improvement to support CPU removal.
> > 
> > Tested under a phyp and a XIVE QEMU model for pseries.
> > 
> > Thanks,
> > 
> > C.
> > 
> > Cédric Le Goater (2):
> >   powerpc/xive: fix IPI reset
> >   powerpc/xive: fix cpu removal
> > 
> >  arch/powerpc/sysdev/xive/common.c | 8 ++++++++
> >  arch/powerpc/sysdev/xive/spapr.c  | 4 ++++
> >  2 files changed, 12 insertions(+)
> > 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-02 16:52   ` Benjamin Herrenschmidt
@ 2017-10-02 17:53     ` Cédric Le Goater
  0 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-02 17:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, linuxppc-dev; +Cc: Michael Ellerman, David Gibson

On 10/02/2017 06:52 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2017-10-02 at 18:27 +0200, Cédric Le Goater wrote:
>> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
>>> Hi,
>>>
>>> Here are a couple of small fixes to support CPU hot unplug. There are
>>> still some issues to be investigated as, in some occasions, after a
>>> couple of plug and unplug, the cpu which was removed receives a 'lost'
>>> interrupt. This showed to be the decrementer under QEMU.
>>
>> So this seems to be a QEMU issue only which can be solved by 
>> removing the DEE bit from the LPCR on P9 processor when the CPU 
>> is stopped in rtas. PECE3 bit on P8 processors. 
> 
> It should be the same bit no ?

yes and it is for the QEMU side of the world. 

C.
 
>> I think these patches are valuable fixes for 4.14. The first 
>> is trivial and the second touches the common xive part but it
>> is only called on the pseries platform.  
>>
>> Could you please take a look ? 
>>
>> Thanks,
>>
>> C. 
>>
>>> Nevertheless, these patches are required and provide a significant
>>> improvement to support CPU removal.
>>>
>>> Tested under a phyp and a XIVE QEMU model for pseries.
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>> Cédric Le Goater (2):
>>>   powerpc/xive: fix IPI reset
>>>   powerpc/xive: fix cpu removal
>>>
>>>  arch/powerpc/sysdev/xive/common.c | 8 ++++++++
>>>  arch/powerpc/sysdev/xive/spapr.c  | 4 ++++
>>>  2 files changed, 12 insertions(+)
>>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2017-10-02 16:52   ` Benjamin Herrenschmidt
@ 2017-10-03  3:36   ` David Gibson
  2017-10-03  6:24     ` Cédric Le Goater
  2017-10-03 11:23   ` Michael Ellerman
  2 siblings, 1 reply; 15+ messages in thread
From: David Gibson @ 2017-10-03  3:36 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

On Mon, Oct 02, 2017 at 06:27:20PM +0200, Cédric Le Goater wrote:
> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
> > Hi,
> > 
> > Here are a couple of small fixes to support CPU hot unplug. There are
> > still some issues to be investigated as, in some occasions, after a
> > couple of plug and unplug, the cpu which was removed receives a 'lost'
> > interrupt. This showed to be the decrementer under QEMU.
> 
> So this seems to be a QEMU issue only which can be solved by 
> removing the DEE bit from the LPCR on P9 processor when the CPU 
> is stopped in rtas. PECE3 bit on P8 processors. 
> 
> I think these patches are valuable fixes for 4.14. The first 
> is trivial and the second touches the common xive part but it
> is only called on the pseries platform.  
> 
> Could you please take a look ?

Sorry, I think I've missed something here.

Is there a qemu bug involved in this?  Has there been a patch sent
that I didn't spot?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03  3:36   ` David Gibson
@ 2017-10-03  6:24     ` Cédric Le Goater
  2017-10-03  6:58       ` David Gibson
  0 siblings, 1 reply; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-03  6:24 UTC (permalink / raw)
  To: David Gibson; +Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt

On 10/03/2017 05:36 AM, David Gibson wrote:
> On Mon, Oct 02, 2017 at 06:27:20PM +0200, Cédric Le Goater wrote:
>> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
>>> Hi,
>>>
>>> Here are a couple of small fixes to support CPU hot unplug. There are
>>> still some issues to be investigated as, in some occasions, after a
>>> couple of plug and unplug, the cpu which was removed receives a 'lost'
>>> interrupt. This showed to be the decrementer under QEMU.
>>
>> So this seems to be a QEMU issue only which can be solved by 
>> removing the DEE bit from the LPCR on P9 processor when the CPU 
>> is stopped in rtas. PECE3 bit on P8 processors. 
>>
>> I think these patches are valuable fixes for 4.14. The first 
>> is trivial and the second touches the common xive part but it
>> is only called on the pseries platform.  
>>
>> Could you please take a look ?
> 
> Sorry, I think I've missed something here.
> 
> Is there a qemu bug involved in this?  Has there been a patch sent
> that I didn't spot?


No, not yet, but I will today probably. something like below to stop
the decrementer when a CPU is stopped:

	--- qemu.git.orig/hw/ppc/spapr_rtas.c
	+++ qemu.git/hw/ppc/spapr_rtas.c
	@@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *c
	         kvm_cpu_synchronize_state(cs);
	 
	         env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
	+
	+        /* Enable DECR interrupt */
	+        if (env->mmu_model == POWERPC_MMU_3_00) {
	+            env->spr[SPR_LPCR] |= LPCR_DEE;
	+        } else {
	+            /* P7 and P8 both have same bit for DECR */
	+            env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
	+        }
	+
	         env->nip = start;
	         env->gpr[3] = r3;
	         cs->halted = 0;
	@@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *c
	      * no need to bother with specific bits, we just clear it.
	      */
	     env->msr = 0;
	+
	+    if (env->mmu_model == POWERPC_MMU_3_00) {
	+        env->spr[SPR_LPCR] &= ~LPCR_DEE;
	+    } else {
	+        /* P7 and P8 both have same bit for DECR */
	+        env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
	+    }
	 }
	 
	 static inline int sysparm_st(target_ulong addr, target_ulong len,
	
I haven't yet because I fail to understand why the decrementer is not 
interrupting the dying CPU under xics as it is the case under XIVE.

Also I am not sure this hack is of any use :

    /*
     * While stopping a CPU, the guest calls H_CPPR which
     * effectively disables interrupts on XICS level.
     * However decrementer interrupts in TCG can still
     * wake the CPU up so here we disable interrupts in MSR
     * as well.
     * As rtas_start_cpu() resets the whole MSR anyway, there is
     * no need to bother with specific bits, we just clear it.
     */
    env->msr = 0;

and the different CPU states are confusing. Nikunj already to a look
at this when trying to fix the TCG reboot. Anyway, the QEMU patch 
should (re)start a thread. This is not the place to discuss.

Thanks,

C.  

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03  6:24     ` Cédric Le Goater
@ 2017-10-03  6:58       ` David Gibson
  2017-10-03  8:22         ` Benjamin Herrenschmidt
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: David Gibson @ 2017-10-03  6:58 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt, nikunj

[-- Attachment #1: Type: text/plain, Size: 3994 bytes --]

On Tue, Oct 03, 2017 at 08:24:07AM +0200, Cédric Le Goater wrote:
> On 10/03/2017 05:36 AM, David Gibson wrote:
> > On Mon, Oct 02, 2017 at 06:27:20PM +0200, Cédric Le Goater wrote:
> >> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
> >>> Hi,
> >>>
> >>> Here are a couple of small fixes to support CPU hot unplug. There are
> >>> still some issues to be investigated as, in some occasions, after a
> >>> couple of plug and unplug, the cpu which was removed receives a 'lost'
> >>> interrupt. This showed to be the decrementer under QEMU.
> >>
> >> So this seems to be a QEMU issue only which can be solved by 
> >> removing the DEE bit from the LPCR on P9 processor when the CPU 
> >> is stopped in rtas. PECE3 bit on P8 processors. 
> >>
> >> I think these patches are valuable fixes for 4.14. The first 
> >> is trivial and the second touches the common xive part but it
> >> is only called on the pseries platform.  
> >>
> >> Could you please take a look ?
> > 
> > Sorry, I think I've missed something here.
> > 
> > Is there a qemu bug involved in this?  Has there been a patch sent
> > that I didn't spot?
> 
> 
> No, not yet, but I will today probably. something like below to stop
> the decrementer when a CPU is stopped:
> 
> 	--- qemu.git.orig/hw/ppc/spapr_rtas.c
> 	+++ qemu.git/hw/ppc/spapr_rtas.c
> 	@@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *c
> 	         kvm_cpu_synchronize_state(cs);
> 	 
> 	         env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
> 	+
> 	+        /* Enable DECR interrupt */
> 	+        if (env->mmu_model == POWERPC_MMU_3_00) {
> 	+            env->spr[SPR_LPCR] |= LPCR_DEE;
> 	+        } else {
> 	+            /* P7 and P8 both have same bit for DECR */
> 	+            env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
> 	+        }
> 	+
> 	         env->nip = start;
> 	         env->gpr[3] = r3;
> 	         cs->halted = 0;
> 	@@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *c
> 	      * no need to bother with specific bits, we just clear it.
> 	      */
> 	     env->msr = 0;
> 	+
> 	+    if (env->mmu_model == POWERPC_MMU_3_00) {
> 	+        env->spr[SPR_LPCR] &= ~LPCR_DEE;
> 	+    } else {
> 	+        /* P7 and P8 both have same bit for DECR */
> 	+        env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> 	+    }
> 	 }
> 	 
> 	 static inline int sysparm_st(target_ulong addr, target_ulong len,
> 	
> I haven't yet because I fail to understand why the decrementer is not 
> interrupting the dying CPU under xics as it is the case under XIVE.

Oh.. ok.  This sounds very similar to the problem Nikunj hit under TCG
with decrementer interrupts waking up a supposedly dead CPU.  He had a
couple of proposed fixes, but we got bogged down trying to work out
why  (with TCG at least) it only seemed to bite after a system_reset,
and not on initial boot up.

> Also I am not sure this hack is of any use :
> 
>     /*
>      * While stopping a CPU, the guest calls H_CPPR which
>      * effectively disables interrupts on XICS level.
>      * However decrementer interrupts in TCG can still
>      * wake the CPU up so here we disable interrupts in MSR
>      * as well.
>      * As rtas_start_cpu() resets the whole MSR anyway, there is
>      * no need to bother with specific bits, we just clear it.
>      */
>     env->msr = 0;

Ok.. why do you think this isn't of use?  I'm pretty sure this is
necessary for the TCG case, since MSR is checked in cpu_has_work(),
which could otherwise wake up the "dead" cpu.

> and the different CPU states are confusing. Nikunj already to a look
> at this when trying to fix the TCG reboot. Anyway, the QEMU patch 
> should (re)start a thread. This is not the place to discuss.
> 
> Thanks,
> 
> C.  
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03  6:58       ` David Gibson
@ 2017-10-03  8:22         ` Benjamin Herrenschmidt
  2017-10-04 14:48         ` Cédric Le Goater
  2017-10-05  3:29         ` Nikunj A Dadhania
  2 siblings, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2017-10-03  8:22 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater
  Cc: linuxppc-dev, Michael Ellerman, nikunj

On Tue, 2017-10-03 at 17:58 +1100, David Gibson wrote:
> 
> Ok.. why do you think this isn't of use?  I'm pretty sure this is
> necessary for the TCG case, since MSR is checked in cpu_has_work(),
> which could otherwise wake up the "dead" cpu.

Ony if it's not in a PM state, in that case we check the corresponding
LPCR:PECE* bit. At least on P7 and later.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
  2017-10-02 16:52   ` Benjamin Herrenschmidt
  2017-10-03  3:36   ` David Gibson
@ 2017-10-03 11:23   ` Michael Ellerman
  2017-10-03 12:58     ` Cédric Le Goater
  2 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2017-10-03 11:23 UTC (permalink / raw)
  To: Cédric Le Goater, linuxppc-dev; +Cc: Benjamin Herrenschmidt, David Gibson

C=C3=A9dric Le Goater <clg@kaod.org> writes:

> On 09/23/2017 10:26 AM, C=C3=A9dric Le Goater wrote:
>> Hi,
>>=20
>> Here are a couple of small fixes to support CPU hot unplug. There are
>> still some issues to be investigated as, in some occasions, after a
>> couple of plug and unplug, the cpu which was removed receives a 'lost'
>> interrupt. This showed to be the decrementer under QEMU.
>
> So this seems to be a QEMU issue only which can be solved by=20
> removing the DEE bit from the LPCR on P9 processor when the CPU=20
> is stopped in rtas. PECE3 bit on P8 processors.=20
>
> I think these patches are valuable fixes for 4.14. The first=20
> is trivial and the second touches the common xive part but it
> is only called on the pseries platform.=20=20
>
> Could you please take a look ?=20

I can.

You didn't give me much indication in the change logs of what the
failure mode is if I _don't_ have the fixes, which makes it hard for me
to assess the severity. Can you flesh out the ".. or else" case in the
change logs?

And should both be tagged?

  Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interr=
upt controller")

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03 11:23   ` Michael Ellerman
@ 2017-10-03 12:58     ` Cédric Le Goater
  0 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-03 12:58 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: Benjamin Herrenschmidt, David Gibson

Hello Michael,

On 10/03/2017 01:23 PM, Michael Ellerman wrote:
> Cédric Le Goater <clg@kaod.org> writes:
> 
>> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
>>> Hi,
>>>
>>> Here are a couple of small fixes to support CPU hot unplug. There are
>>> still some issues to be investigated as, in some occasions, after a
>>> couple of plug and unplug, the cpu which was removed receives a 'lost'
>>> interrupt. This showed to be the decrementer under QEMU.
>>
>> So this seems to be a QEMU issue only which can be solved by 
>> removing the DEE bit from the LPCR on P9 processor when the CPU 
>> is stopped in rtas. PECE3 bit on P8 processors. 
>>
>> I think these patches are valuable fixes for 4.14. The first 
>> is trivial and the second touches the common xive part but it
>> is only called on the pseries platform.  
>>
>> Could you please take a look ? 
> 
> I can.
> 
> You didn't give me much indication in the change logs of what the
> failure mode is if I _don't_ have the fixes, which makes it hard for me
> to assess the severity.
Support for CPU removal came recently on PowerVM and was not
tested when the patchset was sent. A couple of cleanups of the 
XIVE internal structures are missing resulting in a kernel crash 
when the CPU is released. 

There are still some corner cases when stressing the lpar with 
plug/unplug loops. Investigation in progress.  

> Can you flesh out the ".. or else" case in the change logs?

OK. I will improve patch 2 changelog.

> And should both be tagged?
> 
>   Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
yes I think so. These patches are too small to be tagged as 
adding hotplug support.

Thanks,

C. 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03  6:58       ` David Gibson
  2017-10-03  8:22         ` Benjamin Herrenschmidt
@ 2017-10-04 14:48         ` Cédric Le Goater
  2017-10-05  3:29         ` Nikunj A Dadhania
  2 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-04 14:48 UTC (permalink / raw)
  To: David Gibson
  Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt, nikunj

On 10/03/2017 08:58 AM, David Gibson wrote:
> On Tue, Oct 03, 2017 at 08:24:07AM +0200, Cédric Le Goater wrote:
>> On 10/03/2017 05:36 AM, David Gibson wrote:
>>> On Mon, Oct 02, 2017 at 06:27:20PM +0200, Cédric Le Goater wrote:
>>>> On 09/23/2017 10:26 AM, Cédric Le Goater wrote:
>>>>> Hi,
>>>>>
>>>>> Here are a couple of small fixes to support CPU hot unplug. There are
>>>>> still some issues to be investigated as, in some occasions, after a
>>>>> couple of plug and unplug, the cpu which was removed receives a 'lost'
>>>>> interrupt. This showed to be the decrementer under QEMU.
>>>>
>>>> So this seems to be a QEMU issue only which can be solved by 
>>>> removing the DEE bit from the LPCR on P9 processor when the CPU 
>>>> is stopped in rtas. PECE3 bit on P8 processors. 
>>>>
>>>> I think these patches are valuable fixes for 4.14. The first 
>>>> is trivial and the second touches the common xive part but it
>>>> is only called on the pseries platform.  
>>>>
>>>> Could you please take a look ?
>>>
>>> Sorry, I think I've missed something here.
>>>
>>> Is there a qemu bug involved in this?  Has there been a patch sent
>>> that I didn't spot?
>>
>>
>> No, not yet, but I will today probably. something like below to stop
>> the decrementer when a CPU is stopped:
>>
>> 	--- qemu.git.orig/hw/ppc/spapr_rtas.c
>> 	+++ qemu.git/hw/ppc/spapr_rtas.c
>> 	@@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *c
>> 	         kvm_cpu_synchronize_state(cs);
>> 	 
>> 	         env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
>> 	+
>> 	+        /* Enable DECR interrupt */
>> 	+        if (env->mmu_model == POWERPC_MMU_3_00) {
>> 	+            env->spr[SPR_LPCR] |= LPCR_DEE;
>> 	+        } else {
>> 	+            /* P7 and P8 both have same bit for DECR */
>> 	+            env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
>> 	+        }
>> 	+
>> 	         env->nip = start;
>> 	         env->gpr[3] = r3;
>> 	         cs->halted = 0;
>> 	@@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *c
>> 	      * no need to bother with specific bits, we just clear it.
>> 	      */
>> 	     env->msr = 0;
>> 	+
>> 	+    if (env->mmu_model == POWERPC_MMU_3_00) {
>> 	+        env->spr[SPR_LPCR] &= ~LPCR_DEE;
>> 	+    } else {
>> 	+        /* P7 and P8 both have same bit for DECR */
>> 	+        env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
>> 	+    }
>> 	 }
>> 	 
>> 	 static inline int sysparm_st(target_ulong addr, target_ulong len,
>> 	
>> I haven't yet because I fail to understand why the decrementer is not 
>> interrupting the dying CPU under xics as it is the case under XIVE.
> 
> Oh.. ok.  This sounds very similar to the problem Nikunj hit under TCG
> with decrementer interrupts waking up a supposedly dead CPU.  He had a
> couple of proposed fixes, but we got bogged down trying to work out
> why  (with TCG at least) it only seemed to bite after a system_reset,
> and not on initial boot up.

yes. It would be nice to fix the reset under TCG though. May be this is
related. 

>> Also I am not sure this hack is of any use :
>>
>>     /*
>>      * While stopping a CPU, the guest calls H_CPPR which
>>      * effectively disables interrupts on XICS level.
>>      * However decrementer interrupts in TCG can still
>>      * wake the CPU up so here we disable interrupts in MSR
>>      * as well.
>>      * As rtas_start_cpu() resets the whole MSR anyway, there is
>>      * no need to bother with specific bits, we just clear it.
>>      */
>>     env->msr = 0;
> 
> Ok.. why do you think this isn't of use?  I'm pretty sure this is
> necessary for the TCG case, since MSR is checked in cpu_has_work(),
> which could otherwise wake up the "dead" cpu.

well, no, when the CPU is stopped with the 'stop-self' RTAS call, one of 
the CPU states is switched to 1 (cs->halted=1). In cpu_has_work(), this 
is a branch in which we don't check the MSR, only pending hardware 
interrupts are checked with their LPCR enablement bit.

So if the DECR timer fires after 'stop-self' is called (cs->halted=1) and 
before it is really stopped (cs->stop=1), the nearly-dead CPU will have 
some work to do and the guest will crash. This case happens very frequently 
when the P9 XIVE exploitation mode is activated but it does not without, 
when using the XICS mode. In XICS mode, the DECR is occasionally fired but 
after cs->stop=1, so no work is to be done.
 
The patch above fixes the problem but I don't understand why this works 
with XICS. My feeling is that there is a race somewhere and 

	env->msr = 0;

is just a useless workaround, in this case at least. 

C.


> 
>> and the different CPU states are confusing. Nikunj already to a look
>> at this when trying to fix the TCG reboot. Anyway, the QEMU patch 
>> should (re)start a thread. This is not the place to discuss.
>>
>> Thanks,
>>
>> C.  
>>
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-03  6:58       ` David Gibson
  2017-10-03  8:22         ` Benjamin Herrenschmidt
  2017-10-04 14:48         ` Cédric Le Goater
@ 2017-10-05  3:29         ` Nikunj A Dadhania
  2017-10-05  8:18           ` Cédric Le Goater
  2 siblings, 1 reply; 15+ messages in thread
From: Nikunj A Dadhania @ 2017-10-05  3:29 UTC (permalink / raw)
  To: David Gibson, Cédric Le Goater
  Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt

David Gibson <david@gibson.dropbear.id.au> writes:

> On Tue, Oct 03, 2017 at 08:24:07AM +0200, C=C3=A9dric Le Goater wrote:
>> On 10/03/2017 05:36 AM, David Gibson wrote:
>> > On Mon, Oct 02, 2017 at 06:27:20PM +0200, C=C3=A9dric Le Goater wrote:
>> >> On 09/23/2017 10:26 AM, C=C3=A9dric Le Goater wrote:
>> >>> Hi,
>> >>>
>> >>> Here are a couple of small fixes to support CPU hot unplug. There are
>> >>> still some issues to be investigated as, in some occasions, after a
>> >>> couple of plug and unplug, the cpu which was removed receives a 'los=
t'
>> >>> interrupt. This showed to be the decrementer under QEMU.
>> >>
>> >> So this seems to be a QEMU issue only which can be solved by=20
>> >> removing the DEE bit from the LPCR on P9 processor when the CPU=20
>> >> is stopped in rtas. PECE3 bit on P8 processors.=20
>> >>
>> >> I think these patches are valuable fixes for 4.14. The first=20
>> >> is trivial and the second touches the common xive part but it
>> >> is only called on the pseries platform.=20=20
>> >>
>> >> Could you please take a look ?
>> >=20
>> > Sorry, I think I've missed something here.
>> >=20
>> > Is there a qemu bug involved in this?  Has there been a patch sent
>> > that I didn't spot?
>>=20
>>=20
>> No, not yet, but I will today probably. something like below to stop
>> the decrementer when a CPU is stopped:
>>=20
>> 	--- qemu.git.orig/hw/ppc/spapr_rtas.c
>> 	+++ qemu.git/hw/ppc/spapr_rtas.c
>> 	@@ -174,6 +174,15 @@ static void rtas_start_cpu(PowerPCCPU *c
>> 	         kvm_cpu_synchronize_state(cs);
>>=20=09=20
>> 	         env->msr =3D (1ULL << MSR_SF) | (1ULL << MSR_ME);
>> 	+
>> 	+        /* Enable DECR interrupt */
>> 	+        if (env->mmu_model =3D=3D POWERPC_MMU_3_00) {
>> 	+            env->spr[SPR_LPCR] |=3D LPCR_DEE;
>> 	+        } else {
>> 	+            /* P7 and P8 both have same bit for DECR */
>> 	+            env->spr[SPR_LPCR] |=3D LPCR_P8_PECE3;
>> 	+        }
>> 	+
>> 	         env->nip =3D start;
>> 	         env->gpr[3] =3D r3;
>> 	         cs->halted =3D 0;
>> 	@@ -210,6 +219,13 @@ static void rtas_stop_self(PowerPCCPU *c
>> 	      * no need to bother with specific bits, we just clear it.
>> 	      */
>> 	     env->msr =3D 0;
>> 	+
>> 	+    if (env->mmu_model =3D=3D POWERPC_MMU_3_00) {
>> 	+        env->spr[SPR_LPCR] &=3D ~LPCR_DEE;
>> 	+    } else {
>> 	+        /* P7 and P8 both have same bit for DECR */
>> 	+        env->spr[SPR_LPCR] &=3D ~LPCR_P8_PECE3;
>> 	+    }
>> 	 }
>>=20=09=20
>> 	 static inline int sysparm_st(target_ulong addr, target_ulong len,
>>=20=09
>> I haven't yet because I fail to understand why the decrementer is not=20
>> interrupting the dying CPU under xics as it is the case under XIVE.
>
> Oh.. ok.  This sounds very similar to the problem Nikunj hit under TCG
> with decrementer interrupts waking up a supposedly dead CPU.  He had a
> couple of proposed fixes, but we got bogged down trying to work out
> why  (with TCG at least).

Yeah, I wasnt able to get to the exact reason for that.

Regards
Nikunj

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/2] powerpc/xive: fix CPU hot unplug
  2017-10-05  3:29         ` Nikunj A Dadhania
@ 2017-10-05  8:18           ` Cédric Le Goater
  0 siblings, 0 replies; 15+ messages in thread
From: Cédric Le Goater @ 2017-10-05  8:18 UTC (permalink / raw)
  To: Nikunj A Dadhania, David Gibson
  Cc: linuxppc-dev, Michael Ellerman, Benjamin Herrenschmidt

>>> I haven't yet because I fail to understand why the decrementer is not 
>>> interrupting the dying CPU under xics as it is the case under XIVE.
>>
>> Oh.. ok.  This sounds very similar to the problem Nikunj hit under TCG
>> with decrementer interrupts waking up a supposedly dead CPU.  He had a
>> couple of proposed fixes, but we got bogged down trying to work out
>> why  (with TCG at least).
> 
> Yeah, I wasnt able to get to the exact reason for that.

Yes. Tracking all the CPU states is a nightmare.

 * @running: #true if CPU is currently running (lockless).
 * @has_waiter: #true if a CPU is currently waiting for the cpu_exec_end;
 * valid under cpu_list_lock.
 * @created: Indicates whether the CPU thread has been successfully created.
 * @interrupt_request: Indicates a pending interrupt request.
 * @halted: Nonzero if the CPU is in suspended state.
 * @stop: Indicates a pending stop request.
 * @stopped: Indicates the CPU has been artificially stopped.
 * @unplug: Indicates a pending CPU unplug request.


I will spend some more time to understand why XICS is not behaving 
the same. This is really time consuming ...

C.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-10-05  9:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-23  8:26 [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
2017-09-23  8:26 ` [PATCH 1/2] powerpc/xive: fix IPI reset Cédric Le Goater
2017-09-23  8:26 ` [PATCH 2/2] powerpc/xive: fix cpu removal Cédric Le Goater
2017-10-02 16:27 ` [PATCH 0/2] powerpc/xive: fix CPU hot unplug Cédric Le Goater
2017-10-02 16:52   ` Benjamin Herrenschmidt
2017-10-02 17:53     ` Cédric Le Goater
2017-10-03  3:36   ` David Gibson
2017-10-03  6:24     ` Cédric Le Goater
2017-10-03  6:58       ` David Gibson
2017-10-03  8:22         ` Benjamin Herrenschmidt
2017-10-04 14:48         ` Cédric Le Goater
2017-10-05  3:29         ` Nikunj A Dadhania
2017-10-05  8:18           ` Cédric Le Goater
2017-10-03 11:23   ` Michael Ellerman
2017-10-03 12:58     ` Cédric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).