All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
@ 2020-05-07 11:48 Nicholas Piggin
  2020-05-07 13:51 ` David Gibson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Nicholas Piggin @ 2020-05-07 11:48 UTC (permalink / raw)
  To: qemu-ppc; +Cc: David Gibson, qemu-devel, Nicholas Piggin, Cédric Le Goater

Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
SRR1 setting wrong for sresets that hit outside of power-save states.

Fix this, better documenting the source for the bit definitions.

Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
Cc: Cédric Le Goater <clg@kaod.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---

Thanks to Cedric for pointing out concerns with a previous MCE patch
that unearthed this as well. Linux does not actually care what these
SRR1[42:45] bits look like for non-powersave sresets, but we should
follow documented behaviour as far as possible.

 hw/ppc/pnv.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index a3b7a8d0ff..1b4748ce6d 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg)
 
     cpu_synchronize_state(cs);
     ppc_cpu_do_system_reset(cs);
-    /*
-     * SRR1[42:45] is set to 0100 which the ISA defines as implementation
-     * dependent. POWER processors use this for xscom triggered interrupts,
-     * which come from the BMC or NMI IPIs.
-     */
-    env->spr[SPR_SRR1] |= PPC_BIT(43);
+    if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) {
+        /*
+	 * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the
+	 * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100
+	 * (PPC_BIT(43)).
+	 */
+        if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) {
+            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");
+            env->spr[SPR_SRR1] |= PPC_BIT(43);
+        }
+    } else {
+        /*
+	 * For non-powersave system resets, SRR1[42:45] are defined to be
+	 * implementation-dependent. The POWER9 User Manual specifies that
+	 * an external (SCOM driven, which may come from a BMC nmi command or
+	 * another CPU requesting a NMI IPI) system reset exception should be
+	 * 0b0010 (PPC_BIT(44)).
+         */
+        env->spr[SPR_SRR1] |= PPC_BIT(44);
+    }
 }
 
 static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-07 11:48 [PATCH] ppc/pnv: Fix NMI system reset SRR1 value Nicholas Piggin
@ 2020-05-07 13:51 ` David Gibson
  2020-05-08  8:43   ` Greg Kurz
  2020-05-07 17:14 ` Cédric Le Goater
  2020-05-08  3:43 ` no-reply
  2 siblings, 1 reply; 8+ messages in thread
From: David Gibson @ 2020-05-07 13:51 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: qemu-ppc, qemu-devel, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2669 bytes --]

On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote:
> Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> SRR1 setting wrong for sresets that hit outside of power-save states.
> 
> Fix this, better documenting the source for the bit definitions.
> 
> Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> Cc: Cédric Le Goater <clg@kaod.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied to ppc-for-5.1, thanks.
> ---
> 
> Thanks to Cedric for pointing out concerns with a previous MCE patch
> that unearthed this as well. Linux does not actually care what these
> SRR1[42:45] bits look like for non-powersave sresets, but we should
> follow documented behaviour as far as possible.
> 
>  hw/ppc/pnv.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index a3b7a8d0ff..1b4748ce6d 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg)
>  
>      cpu_synchronize_state(cs);
>      ppc_cpu_do_system_reset(cs);
> -    /*
> -     * SRR1[42:45] is set to 0100 which the ISA defines as implementation
> -     * dependent. POWER processors use this for xscom triggered interrupts,
> -     * which come from the BMC or NMI IPIs.
> -     */
> -    env->spr[SPR_SRR1] |= PPC_BIT(43);
> +    if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) {
> +        /*
> +	 * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the
> +	 * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100
> +	 * (PPC_BIT(43)).
> +	 */
> +        if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) {
> +            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");
> +            env->spr[SPR_SRR1] |= PPC_BIT(43);
> +        }
> +    } else {
> +        /*
> +	 * For non-powersave system resets, SRR1[42:45] are defined to be
> +	 * implementation-dependent. The POWER9 User Manual specifies that
> +	 * an external (SCOM driven, which may come from a BMC nmi command or
> +	 * another CPU requesting a NMI IPI) system reset exception should be
> +	 * 0b0010 (PPC_BIT(44)).
> +         */
> +        env->spr[SPR_SRR1] |= PPC_BIT(44);
> +    }
>  }
>  
>  static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-07 11:48 [PATCH] ppc/pnv: Fix NMI system reset SRR1 value Nicholas Piggin
  2020-05-07 13:51 ` David Gibson
@ 2020-05-07 17:14 ` Cédric Le Goater
  2020-05-08  3:43   ` Nicholas Piggin
  2020-05-08  3:43 ` no-reply
  2 siblings, 1 reply; 8+ messages in thread
From: Cédric Le Goater @ 2020-05-07 17:14 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, David Gibson

On 5/7/20 1:48 PM, Nicholas Piggin wrote:
> Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> SRR1 setting wrong for sresets that hit outside of power-save states.
> 
> Fix this, better documenting the source for the bit definitions.
> 
> Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> Cc: Cédric Le Goater <clg@kaod.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

We should introduce some defines like the SRR1_WAKE ones in Linux and 
cleanup powerpc_reset_wakeup(). This function uses cryptic values. 
That can be done later on as a followup.

Reviewed-by: Cédric Le Goater <clg@kaod.org>

> ---
> 
> Thanks to Cedric for pointing out concerns with a previous MCE patch
> that unearthed this as well. Linux does not actually care what these
> SRR1[42:45] bits look like for non-powersave sresets, but we should
> follow documented behaviour as far as possible.

We should introduce some defines like the SRR1_WAKE ones in Linux and 
cleanup powerpc_reset_wakeup(). This function uses cryptic values. 
That can be done later on as a followup.


I am currently after a bug which results in a CPU hard lockup because 
of a pending interrupt. It occurs on a SMP PowerNV machine when it is 
stressed with IO, such as scp of a big file. 

I am suspecting more and more an issue with an interrupt being handled 
when the CPU is coming out of idle. I haven't seen anything wrong in
the models. Unless this maybe :

    /* Pretend to be returning from doze always as we don't lose state */
    *msr |= (0x1ull << (63 - 47));

I am not sure how in sync it is with PSSCR.


Thanks, 

C.

>  hw/ppc/pnv.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index a3b7a8d0ff..1b4748ce6d 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg)
>  
>      cpu_synchronize_state(cs);
>      ppc_cpu_do_system_reset(cs);
> -    /*
> -     * SRR1[42:45] is set to 0100 which the ISA defines as implementation
> -     * dependent. POWER processors use this for xscom triggered interrupts,
> -     * which come from the BMC or NMI IPIs.
> -     */
> -    env->spr[SPR_SRR1] |= PPC_BIT(43);
> +    if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) {
> +        /*
> +	 * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the
> +	 * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100
> +	 * (PPC_BIT(43)).
> +	 */
> +        if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) {
> +            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");
> +            env->spr[SPR_SRR1] |= PPC_BIT(43);
> +        }
> +    } else {
> +        /*
> +	 * For non-powersave system resets, SRR1[42:45] are defined to be
> +	 * implementation-dependent. The POWER9 User Manual specifies that
> +	 * an external (SCOM driven, which may come from a BMC nmi command or
> +	 * another CPU requesting a NMI IPI) system reset exception should be
> +	 * 0b0010 (PPC_BIT(44)).
> +         */
> +        env->spr[SPR_SRR1] |= PPC_BIT(44);
> +    }
>  }
>  
>  static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-07 11:48 [PATCH] ppc/pnv: Fix NMI system reset SRR1 value Nicholas Piggin
  2020-05-07 13:51 ` David Gibson
  2020-05-07 17:14 ` Cédric Le Goater
@ 2020-05-08  3:43 ` no-reply
  2 siblings, 0 replies; 8+ messages in thread
From: no-reply @ 2020-05-08  3:43 UTC (permalink / raw)
  To: npiggin; +Cc: clg, qemu-ppc, qemu-devel, npiggin, david

Patchew URL: https://patchew.org/QEMU/20200507114824.788942-1-npiggin@gmail.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200507114824.788942-1-npiggin@gmail.com
Subject: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
fbe72cb ppc/pnv: Fix NMI system reset SRR1 value

=== OUTPUT BEGIN ===
ERROR: code indent should never use tabs
#35: FILE: hw/ppc/pnv.c:1991:
+^I * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the$

ERROR: code indent should never use tabs
#36: FILE: hw/ppc/pnv.c:1992:
+^I * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100$

ERROR: code indent should never use tabs
#37: FILE: hw/ppc/pnv.c:1993:
+^I * (PPC_BIT(43)).$

ERROR: code indent should never use tabs
#38: FILE: hw/ppc/pnv.c:1994:
+^I */$

ERROR: line over 90 characters
#40: FILE: hw/ppc/pnv.c:1996:
+            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");

ERROR: code indent should never use tabs
#45: FILE: hw/ppc/pnv.c:2001:
+^I * For non-powersave system resets, SRR1[42:45] are defined to be$

ERROR: code indent should never use tabs
#46: FILE: hw/ppc/pnv.c:2002:
+^I * implementation-dependent. The POWER9 User Manual specifies that$

ERROR: code indent should never use tabs
#47: FILE: hw/ppc/pnv.c:2003:
+^I * an external (SCOM driven, which may come from a BMC nmi command or$

ERROR: code indent should never use tabs
#48: FILE: hw/ppc/pnv.c:2004:
+^I * another CPU requesting a NMI IPI) system reset exception should be$

ERROR: code indent should never use tabs
#49: FILE: hw/ppc/pnv.c:2005:
+^I * 0b0010 (PPC_BIT(44)).$

total: 10 errors, 0 warnings, 32 lines checked

Commit fbe72cb9d465 (ppc/pnv: Fix NMI system reset SRR1 value) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200507114824.788942-1-npiggin@gmail.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-07 17:14 ` Cédric Le Goater
@ 2020-05-08  3:43   ` Nicholas Piggin
  2020-05-08 14:05     ` Cédric Le Goater
  0 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2020-05-08  3:43 UTC (permalink / raw)
  To: Cédric Le Goater, qemu-ppc; +Cc: qemu-devel, David Gibson

Excerpts from Cédric Le Goater's message of May 8, 2020 3:14 am:
> On 5/7/20 1:48 PM, Nicholas Piggin wrote:
>> Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
>> SRR1 setting wrong for sresets that hit outside of power-save states.
>> 
>> Fix this, better documenting the source for the bit definitions.
>> 
>> Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
>> Cc: Cédric Le Goater <clg@kaod.org>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> 
> We should introduce some defines like the SRR1_WAKE ones in Linux and 
> cleanup powerpc_reset_wakeup(). This function uses cryptic values. 
> That can be done later on as a followup.
> 
> Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks.

>> ---
>> 
>> Thanks to Cedric for pointing out concerns with a previous MCE patch
>> that unearthed this as well. Linux does not actually care what these
>> SRR1[42:45] bits look like for non-powersave sresets, but we should
>> follow documented behaviour as far as possible.
> 
> We should introduce some defines like the SRR1_WAKE ones in Linux and 
> cleanup powerpc_reset_wakeup(). This function uses cryptic values. 
> That can be done later on as a followup.
> 
> 
> I am currently after a bug which results in a CPU hard lockup because 
> of a pending interrupt. It occurs on a SMP PowerNV machine when it is 
> stressed with IO, such as scp of a big file. 
> 
> I am suspecting more and more an issue with an interrupt being handled 
> when the CPU is coming out of idle. I haven't seen anything wrong in

So you can't hit it when booting Linux with powersave=off?

Do we model stop with EC=0 properly? Looks like helper_pminsn seems to
be doing the right thing there.

> the models. Unless this maybe :
> 
>     /* Pretend to be returning from doze always as we don't lose state */
>     *msr |= (0x1ull << (63 - 47));
> 
> I am not sure how in sync it is with PSSCR.

That should be okay, the hardware can always enter a shallower state 
than was asked for. Linux will handle it. For testing purpose, we could
model deeper states by scribbling on registers and indicating state loss.

Aide from SRR1 sleep state value, Linux uses the SRR1 wake reason value 
to run the interrupt handler, but even if we got SRR1 wrong, Linux 
eventually enables MSR[EE] so the interrupt should get replayed then 
(this is what Linux used to do until we added the wake-reason processing 
for improved performance).

But we do appear to get those right in powerpc_reset_wakeup().

Thanks,
Nick


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-07 13:51 ` David Gibson
@ 2020-05-08  8:43   ` Greg Kurz
  2020-05-11  1:30     ` David Gibson
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Kurz @ 2020-05-08  8:43 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-ppc, qemu-devel, Nicholas Piggin, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2916 bytes --]

On Thu, 7 May 2020 23:51:54 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote:
> > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the

Please note that the culprit patch was merged with a different SHA1:

https://git.qemu.org/?p=qemu.git;a=commit;h=01b552b05b0f21f8ff57a508f7ad26f7abbcd123

> > SRR1 setting wrong for sresets that hit outside of power-save states.
> > 
> > Fix this, better documenting the source for the bit definitions.
> > 
> > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the

Fixes: 01b552b05b0f ("ppc/pnv: Add support for NMI interface")

> > Cc: Cédric Le Goater <clg@kaod.org>
> > Cc: David Gibson <david@gibson.dropbear.id.au>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> 
> Applied to ppc-for-5.1, thanks.
> > ---
> > 
> > Thanks to Cedric for pointing out concerns with a previous MCE patch
> > that unearthed this as well. Linux does not actually care what these
> > SRR1[42:45] bits look like for non-powersave sresets, but we should
> > follow documented behaviour as far as possible.
> > 
> >  hw/ppc/pnv.c | 26 ++++++++++++++++++++------
> >  1 file changed, 20 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > index a3b7a8d0ff..1b4748ce6d 100644
> > --- a/hw/ppc/pnv.c
> > +++ b/hw/ppc/pnv.c
> > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg)
> >  
> >      cpu_synchronize_state(cs);
> >      ppc_cpu_do_system_reset(cs);
> > -    /*
> > -     * SRR1[42:45] is set to 0100 which the ISA defines as implementation
> > -     * dependent. POWER processors use this for xscom triggered interrupts,
> > -     * which come from the BMC or NMI IPIs.
> > -     */
> > -    env->spr[SPR_SRR1] |= PPC_BIT(43);
> > +    if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) {
> > +        /*
> > +	 * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the
> > +	 * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100
> > +	 * (PPC_BIT(43)).
> > +	 */
> > +        if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) {
> > +            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");
> > +            env->spr[SPR_SRR1] |= PPC_BIT(43);
> > +        }
> > +    } else {
> > +        /*
> > +	 * For non-powersave system resets, SRR1[42:45] are defined to be
> > +	 * implementation-dependent. The POWER9 User Manual specifies that
> > +	 * an external (SCOM driven, which may come from a BMC nmi command or
> > +	 * another CPU requesting a NMI IPI) system reset exception should be
> > +	 * 0b0010 (PPC_BIT(44)).
> > +         */
> > +        env->spr[SPR_SRR1] |= PPC_BIT(44);
> > +    }
> >  }
> >  
> >  static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-08  3:43   ` Nicholas Piggin
@ 2020-05-08 14:05     ` Cédric Le Goater
  0 siblings, 0 replies; 8+ messages in thread
From: Cédric Le Goater @ 2020-05-08 14:05 UTC (permalink / raw)
  To: Nicholas Piggin, qemu-ppc; +Cc: qemu-devel, David Gibson

>> of a pending interrupt. It occurs on a SMP PowerNV machine when it is 
>> stressed with IO, such as scp of a big file. 
>>
>> I am suspecting more and more an issue with an interrupt being handled 
>> when the CPU is coming out of idle. I haven't seen anything wrong in
> 
> So you can't hit it when booting Linux with powersave=off?

no. I uploaded 32GB steadily at 3.0MB/s on a smp 2 machine. 

When powersave is on, a P8 or P9 machine will miss an interrupt quite 
quickly. This assert can catch a symptom of the failure  :

    @@ -75,6 +83,9 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_
         if (level) {
             env->pending_interrupts |= 1 << n_IRQ;
             cpu_interrupt(cs, CPU_INTERRUPT_HARD);
    +        if (!(env->pending_interrupts & (1 << n_IRQ))) {
    +            g_assert_not_reached();
    +        }
         } else {
             env->pending_interrupts &= ~(1 << n_IRQ);
             if (env->pending_interrupts == 0) {

env->pending_interrupts is reseted in ppc_set_irq() setting it. I think 
it is the CPU handling the external IO interrupt which is kicked to wake 
up in cpu_interrupt(). The IRQ level goes out of sync with what the device 
expects and things go bad very quickly after. 

But this is post mortem. I need to find the right spot where to put an 
assert() to analyze. But, adding too much traces closes the window ...

> Do we model stop with EC=0 properly? Looks like helper_pminsn seems to
> be doing the right thing there.

Yes. It seems so. The CPUs enter nap and come out with PACA_IRQ_EE set.

>> the models. Unless this maybe :
>>
>>     /* Pretend to be returning from doze always as we don't lose state */
>>     *msr |= (0x1ull << (63 - 47));
>>
>> I am not sure how in sync it is with PSSCR.
> 
> That should be okay, the hardware can always enter a shallower state 
> than was asked for. Linux will handle it. For testing purpose, we could
> model deeper states by scribbling on registers and indicating state loss.
> 
> Aide from SRR1 sleep state value, Linux uses the SRR1 wake reason value 
> to run the interrupt handler, but even if we got SRR1 wrong, Linux 
> eventually enables MSR[EE] so the interrupt should get replayed then 
> (this is what Linux used to do until we added the wake-reason processing 
> for improved performance).
> 
> But we do appear to get those right in powerpc_reset_wakeup().

yes. Still digging.

Thanks,

C.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value
  2020-05-08  8:43   ` Greg Kurz
@ 2020-05-11  1:30     ` David Gibson
  0 siblings, 0 replies; 8+ messages in thread
From: David Gibson @ 2020-05-11  1:30 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-ppc, qemu-devel, Nicholas Piggin, Cédric Le Goater

[-- Attachment #1: Type: text/plain, Size: 3352 bytes --]

On Fri, May 08, 2020 at 10:43:05AM +0200, Greg Kurz wrote:
> On Thu, 7 May 2020 23:51:54 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote:
> > > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> 
> Please note that the culprit patch was merged with a different SHA1:
> 
> https://git.qemu.org/?p=qemu.git;a=commit;h=01b552b05b0f21f8ff57a508f7ad26f7abbcd123
> 
> > > SRR1 setting wrong for sresets that hit outside of power-save states.
> > > 
> > > Fix this, better documenting the source for the bit definitions.
> > > 
> > > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the
> 
> Fixes: 01b552b05b0f ("ppc/pnv: Add support for NMI interface")

Updated in my tree, thanks.

> 
> > > Cc: Cédric Le Goater <clg@kaod.org>
> > > Cc: David Gibson <david@gibson.dropbear.id.au>
> > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > 
> > Applied to ppc-for-5.1, thanks.
> > > ---
> > > 
> > > Thanks to Cedric for pointing out concerns with a previous MCE patch
> > > that unearthed this as well. Linux does not actually care what these
> > > SRR1[42:45] bits look like for non-powersave sresets, but we should
> > > follow documented behaviour as far as possible.
> > > 
> > >  hw/ppc/pnv.c | 26 ++++++++++++++++++++------
> > >  1 file changed, 20 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> > > index a3b7a8d0ff..1b4748ce6d 100644
> > > --- a/hw/ppc/pnv.c
> > > +++ b/hw/ppc/pnv.c
> > > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg)
> > >  
> > >      cpu_synchronize_state(cs);
> > >      ppc_cpu_do_system_reset(cs);
> > > -    /*
> > > -     * SRR1[42:45] is set to 0100 which the ISA defines as implementation
> > > -     * dependent. POWER processors use this for xscom triggered interrupts,
> > > -     * which come from the BMC or NMI IPIs.
> > > -     */
> > > -    env->spr[SPR_SRR1] |= PPC_BIT(43);
> > > +    if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) {
> > > +        /*
> > > +	 * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the
> > > +	 * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100
> > > +	 * (PPC_BIT(43)).
> > > +	 */
> > > +        if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) {
> > > +            warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason");
> > > +            env->spr[SPR_SRR1] |= PPC_BIT(43);
> > > +        }
> > > +    } else {
> > > +        /*
> > > +	 * For non-powersave system resets, SRR1[42:45] are defined to be
> > > +	 * implementation-dependent. The POWER9 User Manual specifies that
> > > +	 * an external (SCOM driven, which may come from a BMC nmi command or
> > > +	 * another CPU requesting a NMI IPI) system reset exception should be
> > > +	 * 0b0010 (PPC_BIT(44)).
> > > +         */
> > > +        env->spr[SPR_SRR1] |= PPC_BIT(44);
> > > +    }
> > >  }
> > >  
> > >  static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-11  1:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-07 11:48 [PATCH] ppc/pnv: Fix NMI system reset SRR1 value Nicholas Piggin
2020-05-07 13:51 ` David Gibson
2020-05-08  8:43   ` Greg Kurz
2020-05-11  1:30     ` David Gibson
2020-05-07 17:14 ` Cédric Le Goater
2020-05-08  3:43   ` Nicholas Piggin
2020-05-08 14:05     ` Cédric Le Goater
2020-05-08  3:43 ` no-reply

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.