* [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
@ 2020-04-01  0:00 Leonardo Bras
  2020-04-01  3:07   ` kbuild test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-01  0:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Enrico Weigelt, Leonardo Bras, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, peterz
  Cc: linuxppc-dev, linux-kernel

During a crash, there is a chance that the CPUs handling the NMI IPI
are holding a spin_lock. If one of these spin_locks is needed by
crashing_cpu, it will cause a deadlock (rtas.lock and printk's
logbuf_lock as of today).

This is a problem if the system has kdump set up: if it crashes for
any reason, the kdump may not be saved for crash analysis.

After the NMI IPI is sent to all other CPUs, force-unlock all
spinlocks needed to finish the crash routine.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>

---
Changes from v2:
- Instead of skipping spinlocks, unlock the needed ones.

Changes from v1:
- Exported variable
---
 arch/powerpc/kexec/crash.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index d488311efab1..8d63fca3242c 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -24,6 +24,7 @@
 #include <asm/smp.h>
 #include <asm/setjmp.h>
 #include <asm/debug.h>
+#include <asm/rtas.h>
 
 /*
  * The primary CPU waits a while for all secondary CPUs to enter. This is to
@@ -49,6 +50,8 @@ static int time_to_dump;
  */
 int crash_wake_offline;
 
+extern raw_spinlock_t logbuf_lock;
+
 #define CRASH_HANDLER_MAX 3
 /* List of shutdown handles */
 static crash_shutdown_t crash_shutdown_handles[CRASH_HANDLER_MAX];
@@ -129,6 +132,13 @@ static void crash_kexec_prepare_cpus(int cpu)
 	/* Would it be better to replace the trap vector here? */
 
 	if (atomic_read(&cpus_in_crash) >= ncpus) {
+		/*
+		 * At this point no other CPU is running, and some of them may
+		 * have been interrupted while holding one of the locks needed
+		 * to complete crashing. Free them so there is no deadlock.
+		 */
+		arch_spin_unlock(&logbuf_lock.raw_lock);
+		arch_spin_unlock(&rtas.lock);
 		printk(KERN_EMERG "IPI complete\n");
 		return;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-01  0:00 [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash Leonardo Bras
  2020-04-01  3:07   ` kbuild test robot
@ 2020-04-01  3:07   ` kbuild test robot
  2020-04-02 11:28 ` Michael Ellerman
  2 siblings, 0 replies; 28+ messages in thread
From: kbuild test robot @ 2020-04-01  3:07 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: kbuild-all, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Enrico Weigelt, Leonardo Bras, Alexios Zavras,
	Thomas Gleixner, Greg Kroah-Hartman, Christophe Leroy, peterz,
	linuxppc-dev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1455 bytes --]

Hi Leonardo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on paulus-powerpc/kvm-ppc-next linus/master linux/master v5.6 next-20200331]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Leonardo-Bras/ppc-crash-Reset-spinlocks-during-crash/20200401-091600
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-rhel-kconfig (attached as .config)
compiler: powerpc64le-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   powerpc64le-linux-ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed in section `.gnu.hash'
>> powerpc64le-linux-ld: arch/powerpc/kexec/crash.o:(.toc+0x0): undefined reference to `rtas'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 15339 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-01  0:00 [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash Leonardo Bras
@ 2020-04-01  9:26   ` Peter Zijlstra
  2020-04-01  9:26   ` Peter Zijlstra
  2020-04-02 11:28 ` Michael Ellerman
  2 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2020-04-01  9:26 UTC (permalink / raw)
  To: Leonardo Bras
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Enrico Weigelt, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, linuxppc-dev, linux-kernel

On Tue, Mar 31, 2020 at 09:00:21PM -0300, Leonardo Bras wrote:
> During a crash, there is chance that the cpus that handle the NMI IPI
> are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> will cause a deadlock. (rtas.lock and printk logbuf_lock as of today)
> 
> This is a problem if the system has kdump set up, given if it crashes
> for any reason kdump may not be saved for crash analysis.
> 
> After NMI IPI is sent to all other cpus, force unlock all spinlocks
> needed for finishing crash routine.
> 
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>

> @@ -129,6 +132,13 @@ static void crash_kexec_prepare_cpus(int cpu)
>  	/* Would it be better to replace the trap vector here? */
>  
>  	if (atomic_read(&cpus_in_crash) >= ncpus) {
> +		/*
> +		 * At this point no other CPU is running, and some of them may
> +		 * have been interrupted while holding one of the locks needed
> +		 * to complete crashing. Free them so there is no deadlock.
> +		 */
> +		arch_spin_unlock(&logbuf_lock.raw_lock);
> +		arch_spin_unlock(&rtas.lock);
>  		printk(KERN_EMERG "IPI complete\n");
>  		return;
>  	}

You might want to add a note to your asm/spinlock.h that you rely on
spin_unlock() unconditionally clearing a lock.

This isn't naturally true for all lock implementations. Consider ticket
locks, doing a surplus unlock will wreck your lock state in that case.
So anybody poking at the powerpc spinlock implementation had better know
you rely on this.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-01  9:26   ` Peter Zijlstra
@ 2020-04-01 23:53     ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-01 23:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Enrico Weigelt, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, linuxppc-dev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

Hello Peter, 

On Wed, 2020-04-01 at 11:26 +0200, Peter Zijlstra wrote:
> You might want to add a note to your asm/spinlock.h that you rely on
> spin_unlock() unconditionally clearing a lock.
> 
> This isn't naturally true for all lock implementations. Consider ticket
> locks, doing a surplus unlock will wreck your lock state in that case.
> So anybody poking at the powerpc spinlock implementation had better know
> you rely on this.

Good idea. I will add this to my changes and generate a v4.

Thank you,

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-01  0:00 [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash Leonardo Bras
  2020-04-01  3:07   ` kbuild test robot
  2020-04-01  9:26   ` Peter Zijlstra
@ 2020-04-02 11:28 ` Michael Ellerman
  2020-04-03  0:37   ` Leonardo Bras
  2020-04-06 18:46   ` Leonardo Bras
  2 siblings, 2 replies; 28+ messages in thread
From: Michael Ellerman @ 2020-04-02 11:28 UTC (permalink / raw)
  To: Leonardo Bras, Benjamin Herrenschmidt, Paul Mackerras,
	Enrico Weigelt, Leonardo Bras, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, peterz
  Cc: linuxppc-dev, linux-kernel

Leonardo Bras <leonardo@linux.ibm.com> writes:
> During a crash, there is chance that the cpus that handle the NMI IPI
> are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> will cause a deadlock. (rtas.lock and printk logbuf_lock as of today)
>
> This is a problem if the system has kdump set up, given if it crashes
> for any reason kdump may not be saved for crash analysis.
>
> After NMI IPI is sent to all other cpus, force unlock all spinlocks
> needed for finishing crash routine.

I'm not convinced this is the right approach.

Busting locks is risky, it could easily cause a crash if data structures
are left in some inconsistent state.

I think we need to make this code more careful about what it's doing.
There's a clue at the top of default_machine_crash_shutdown(), which
calls crash_kexec_prepare_cpus():

	 * This function is only called after the system
	 * has panicked or is otherwise in a critical state.
	 * The minimum amount of code to allow a kexec'd kernel
	 * to run successfully needs to happen here.


You said the "IPI complete" message was the cause of one lockup:

  #0  arch_spin_lock 
  #1  do_raw_spin_lock 
  #2  __raw_spin_lock 
  #3  _raw_spin_lock 
  #4  vprintk_emit 
  #5  vprintk_func
  #7  crash_kexec_prepare_cpus 
  #8  default_machine_crash_shutdown
  #9  machine_crash_shutdown 
  #10 __crash_kexec
  #11 crash_kexec
  #12 oops_end

TBH I think we could just drop that printk() entirely.

Or we could tell printk() that we're in NMI context so that it uses the
percpu buffers.

We should probably do the latter anyway, in case there's any other code
we call that inadvertently calls printk().


The RTAS trace you sent was:

  #0 arch_spin_lock
  #1  lock_rtas () 
  #2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
  #3  ics_rtas_mask_real_irq (hw_irq=4100) 
  #4  machine_kexec_mask_interrupts
  #5  default_machine_crash_shutdown
  #6  machine_crash_shutdown 
  #7  __crash_kexec
  #8  crash_kexec
  #9  oops_end


Which doesn't make it clear who holds the RTAS lock. We really shouldn't
be crashing while holding the RTAS lock, but I guess it could happen.
Can you get a full backtrace?


PAPR says we are not allowed to have multiple CPUs calling RTAS at once,
except for a very small list of RTAS calls. So if we bust the RTAS lock
there's a risk we violate that part of PAPR and crash even harder.

Also it's not specific to kdump, we can't even get through a normal
reboot if we crash with the RTAS lock held.

Anyway here's a patch with some ideas. That allows me to get from a
crash with the RTAS lock held through kdump into the 2nd kernel. But it
only works if it's the crashing CPU that holds the RTAS lock.

cheers

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c5fa251b8950..44ce74966d60 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -25,6 +25,7 @@
 #include <linux/reboot.h>
 #include <linux/syscalls.h>
 
+#include <asm/debugfs.h>
 #include <asm/prom.h>
 #include <asm/rtas.h>
 #include <asm/hvcall.h>
@@ -65,6 +66,8 @@ unsigned long rtas_rmo_buf;
 void (*rtas_flash_term_hook)(int);
 EXPORT_SYMBOL(rtas_flash_term_hook);
 
+static int rtas_lock_holder = -1;
+
 /* RTAS use home made raw locking instead of spin_lock_irqsave
  * because those can be called from within really nasty contexts
  * such as having the timebase stopped which would lockup with
@@ -76,7 +79,20 @@ static unsigned long lock_rtas(void)
 
 	local_irq_save(flags);
 	preempt_disable();
-	arch_spin_lock(&rtas.lock);
+
+	if (!arch_spin_trylock(&rtas.lock)) {
+		// Couldn't get the lock, do we already hold it?
+		if (rtas_lock_holder == smp_processor_id())
+			// Yes, so we would have deadlocked on ourself. Assume
+			// we're crashing and continue on hopefully ...
+			return flags;
+
+		// No, wait on the lock
+		arch_spin_lock(&rtas.lock);
+	}
+
+	rtas_lock_holder = smp_processor_id();
+
 	return flags;
 }
 
@@ -85,6 +101,8 @@ static void unlock_rtas(unsigned long flags)
 	arch_spin_unlock(&rtas.lock);
 	local_irq_restore(flags);
 	preempt_enable();
+
+	rtas_lock_holder = -1;
 }
 
 /*
@@ -1263,3 +1281,24 @@ void rtas_take_timebase(void)
 	timebase = 0;
 	arch_spin_unlock(&timebase_lock);
 }
+
+static int rtas_crash_set(void *data, u64 val)
+{
+	printk("%s: Taking RTAS lock and then crashing ...\n", __func__);
+	lock_rtas();
+
+	*((volatile int *) 0) = 0;
+
+	return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(fops_rtas_crash, NULL, rtas_crash_set, "%llu\n");
+
+static __init int rtas_crash_debugfs_init(void)
+{
+	debugfs_create_file_unsafe("crash_in_rtas", 0200,
+				   powerpc_debugfs_root, NULL,
+				   &fops_rtas_crash);
+	return 0;
+}
+device_initcall(rtas_crash_debugfs_init);
diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index d488311efab1..4c52cb58e889 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -15,6 +15,7 @@
 #include <linux/crash_dump.h>
 #include <linux/delay.h>
 #include <linux/irq.h>
+#include <linux/printk.h>
 #include <linux/types.h>
 
 #include <asm/processor.h>
@@ -311,6 +312,8 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
 	unsigned int i;
 	int (*old_handler)(struct pt_regs *regs);
 
+	printk_nmi_enter();
+
 	/*
 	 * This function is only called after the system
 	 * has panicked or is otherwise in a critical state.



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-02 11:28 ` Michael Ellerman
@ 2020-04-03  0:37   ` Leonardo Bras
  2020-04-03  6:41       ` Nicholas Piggin
  2020-04-06 18:46   ` Leonardo Bras
  1 sibling, 1 reply; 28+ messages in thread
From: Leonardo Bras @ 2020-04-03  0:37 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Enrico Weigelt, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, peterz
  Cc: linuxppc-dev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7133 bytes --]

On Thu, 2020-04-02 at 22:28 +1100, Michael Ellerman wrote:
> Leonardo Bras <leonardo@linux.ibm.com> writes:
> > During a crash, there is chance that the cpus that handle the NMI IPI
> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> > will cause a deadlock. (rtas.lock and printk logbuf_lock as of today)
> > 
> > This is a problem if the system has kdump set up, given if it crashes
> > for any reason kdump may not be saved for crash analysis.
> > 
> > After NMI IPI is sent to all other cpus, force unlock all spinlocks
> > needed for finishing crash routine.
> 
> I'm not convinced this is the right approach.

Me neither. I think it's a very hacky solution, but I couldn't think of
anything better at the time.

> Busting locks is risky, it could easily cause a crash if data structures
> are left in some inconsistent state.
> 
> I think we need to make this code more careful about what it's doing.
> There's a clue at the top of default_machine_crash_shutdown(), which
> calls crash_kexec_prepare_cpus():
> 
> 	 * This function is only called after the system
> 	 * has panicked or is otherwise in a critical state.
> 	 * The minimum amount of code to allow a kexec'd kernel
> 	 * to run successfully needs to happen here.
> 
> 
> You said the "IPI complete" message was the cause of one lockup:
> 
>   #0  arch_spin_lock 
>   #1  do_raw_spin_lock 
>   #2  __raw_spin_lock 
>   #3  _raw_spin_lock 
>   #4  vprintk_emit 
>   #5  vprintk_func
>   #7  crash_kexec_prepare_cpus 
>   #8  default_machine_crash_shutdown
>   #9  machine_crash_shutdown 
>   #10 __crash_kexec
>   #11 crash_kexec
>   #12 oops_end
> 
> TBH I think we could just drop that printk() entirely.
> 
> Or we could tell printk() that we're in NMI context so that it uses the
> percpu buffers.
> 
> We should probably do the latter anyway, in case there's any other code
> we call that inadvertently calls printk().
> 

I was not aware of using per-cpu buffers in printk. It may be a nice
solution.

There is another printk call there:
printk("kexec: Starting switchover sequence.\n");
in default_machine_kexec().

Two printk and one rtas call: it's all I could see using a spinlock
after IPI Complete.

> 
> The RTAS trace you sent was:
> 
>   #0 arch_spin_lock
>   #1  lock_rtas () 
>   #2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
>   #3  ics_rtas_mask_real_irq (hw_irq=4100) 
>   #4  machine_kexec_mask_interrupts
>   #5  default_machine_crash_shutdown
>   #6  machine_crash_shutdown 
>   #7  __crash_kexec
>   #8  crash_kexec
>   #9  oops_end
> 
> 
> Which doesn't make it clear who holds the RTAS lock. We really shouldn't
> be crashing while holding the RTAS lock, but I guess it could happen.
> Can you get a full backtrace?
> 

Oh, all traces are from the thread that called the crash, by writing
'c' to sysrq. That is what I am using to reproduce.

#10 bad_page_fault
#11 handle_page_fault
#12 __handle_sysrq (key=99, check_mask=false) 
#13 write_sysrq_trigger 
#14 proc_reg_write
#15 __vfs_write
#16 vfs_write
#17 SYSC_write
#18 SyS_write
#19 system_call

> 
> PAPR says we are not allowed to have multiple CPUs calling RTAS at once,
> except for a very small list of RTAS calls. So if we bust the RTAS lock
> there's a risk we violate that part of PAPR and crash even harder.

Interesting, I was not aware.

> 
> Also it's not specific to kdump, we can't even get through a normal
> reboot if we crash with the RTAS lock held.
> 
> Anyway here's a patch with some ideas. That allows me to get from a
> crash with the RTAS lock held through kdump into the 2nd kernel. But it
> only works if it's the crashing CPU that holds the RTAS lock.
> 

Nice idea. 
But my test environment is just triggering a crash from sysrq, so I
think it would not improve the result, given that this thread is
probably not holding the lock by the time.

I noticed that when rtas is locked, irqs and preemption are also
disabled.

Should the IPI send by crash be able to interrupt a thread with
disabled irqs?

Best regards,
Leonardo Bras


> cheers
> 
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index c5fa251b8950..44ce74966d60 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -25,6 +25,7 @@
>  #include <linux/reboot.h>
>  #include <linux/syscalls.h>
>  
> +#include <asm/debugfs.h>
>  #include <asm/prom.h>
>  #include <asm/rtas.h>
>  #include <asm/hvcall.h>
> @@ -65,6 +66,8 @@ unsigned long rtas_rmo_buf;
>  void (*rtas_flash_term_hook)(int);
>  EXPORT_SYMBOL(rtas_flash_term_hook);
>  
> +static int rtas_lock_holder = -1;
> +
>  /* RTAS use home made raw locking instead of spin_lock_irqsave
>   * because those can be called from within really nasty contexts
>   * such as having the timebase stopped which would lockup with
> @@ -76,7 +79,20 @@ static unsigned long lock_rtas(void)
>  
>  	local_irq_save(flags);
>  	preempt_disable();
> -	arch_spin_lock(&rtas.lock);
> +
> +	if (!arch_spin_trylock(&rtas.lock)) {
> +		// Couldn't get the lock, do we already hold it?
> +		if (rtas_lock_holder == smp_processor_id())
> +			// Yes, so we would have deadlocked on ourself. Assume
> +			// we're crashing and continue on hopefully ...
> +			return flags;
> +
> +		// No, wait on the lock
> +		arch_spin_lock(&rtas.lock);
> +	}
> +
> +	rtas_lock_holder = smp_processor_id();
> +
>  	return flags;
>  }
>  
> @@ -85,6 +101,8 @@ static void unlock_rtas(unsigned long flags)
>  	arch_spin_unlock(&rtas.lock);
>  	local_irq_restore(flags);
>  	preempt_enable();
> +
> +	rtas_lock_holder = -1;
>  }
>  
>  /*
> @@ -1263,3 +1281,24 @@ void rtas_take_timebase(void)
>  	timebase = 0;
>  	arch_spin_unlock(&timebase_lock);
>  }
> +
> +static int rtas_crash_set(void *data, u64 val)
> +{
> +	printk("%s: Taking RTAS lock and then crashing ...\n", __func__);
> +	lock_rtas();
> +
> +	*((volatile int *) 0) = 0;
> +
> +	return 0;
> +}
> +
> +DEFINE_DEBUGFS_ATTRIBUTE(fops_rtas_crash, NULL, rtas_crash_set, "%llu\n");
> +
> +static __init int rtas_crash_debugfs_init(void)
> +{
> +	debugfs_create_file_unsafe("crash_in_rtas", 0200,
> +				   powerpc_debugfs_root, NULL,
> +				   &fops_rtas_crash);
> +	return 0;
> +}
> +device_initcall(rtas_crash_debugfs_init);
> diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
> index d488311efab1..4c52cb58e889 100644
> --- a/arch/powerpc/kexec/crash.c
> +++ b/arch/powerpc/kexec/crash.c
> @@ -15,6 +15,7 @@
>  #include <linux/crash_dump.h>
>  #include <linux/delay.h>
>  #include <linux/irq.h>
> +#include <linux/printk.h>
>  #include <linux/types.h>
>  
>  #include <asm/processor.h>
> @@ -311,6 +312,8 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
>  	unsigned int i;
>  	int (*old_handler)(struct pt_regs *regs);
>  
> +	printk_nmi_enter();
> +
>  	/*
>  	 * This function is only called after the system
>  	 * has panicked or is otherwise in a critical state.
> 
> 

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-03  0:37   ` Leonardo Bras
@ 2020-04-03  6:41       ` Nicholas Piggin
  0 siblings, 0 replies; 28+ messages in thread
From: Nicholas Piggin @ 2020-04-03  6:41 UTC (permalink / raw)
  To: Alexios Zavras, Benjamin Herrenschmidt, Christophe Leroy,
	Greg Kroah-Hartman,
	Enrico
	 Weigelt, Leonardo Bras, Michael Ellerman, Paul Mackerras,
	peterz,
	Thomas
	 Gleixner
  Cc: linux-kernel, linuxppc-dev

Leonardo Bras's on April 3, 2020 10:37 am:
> On Thu, 2020-04-02 at 22:28 +1100, Michael Ellerman wrote:
>> Leonardo Bras <leonardo@linux.ibm.com> writes:
>> > During a crash, there is chance that the cpus that handle the NMI IPI
>> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
>> > will cause a deadlock. (rtas.lock and printk logbuf_lock as of today)
>> > 
>> > This is a problem if the system has kdump set up, given if it crashes
>> > for any reason kdump may not be saved for crash analysis.
>> > 
>> > After NMI IPI is sent to all other cpus, force unlock all spinlocks
>> > needed for finishing crash routine.
>> 
>> I'm not convinced this is the right approach.
> 
> Me neither. I think it's a very hacky solution, but I couldn't think of
> anything better at the time.
> 
>> Busting locks is risky, it could easily cause a crash if data structures
>> are left in some inconsistent state.
>> 
>> I think we need to make this code more careful about what it's doing.
>> There's a clue at the top of default_machine_crash_shutdown(), which
>> calls crash_kexec_prepare_cpus():
>> 
>> 	 * This function is only called after the system
>> 	 * has panicked or is otherwise in a critical state.
>> 	 * The minimum amount of code to allow a kexec'd kernel
>> 	 * to run successfully needs to happen here.
>> 
>> 
>> You said the "IPI complete" message was the cause of one lockup:
>> 
>>   #0  arch_spin_lock 
>>   #1  do_raw_spin_lock 
>>   #2  __raw_spin_lock 
>>   #3  _raw_spin_lock 
>>   #4  vprintk_emit 
>>   #5  vprintk_func
>>   #7  crash_kexec_prepare_cpus 
>>   #8  default_machine_crash_shutdown
>>   #9  machine_crash_shutdown 
>>   #10 __crash_kexec
>>   #11 crash_kexec
>>   #12 oops_end
>> 
>> TBH I think we could just drop that printk() entirely.
>> 
>> Or we could tell printk() that we're in NMI context so that it uses the
>> percpu buffers.
>> 
>> We should probably do the latter anyway, in case there's any other code
>> we call that inadvertently calls printk().
>> 
> 
> I was not aware of using per-cpu buffers in printk. It may be a nice
> solution.
> 
> There is another printk call there:
> printk("kexec: Starting switchover sequence.\n");
> in default_machine_kexec().
> 
> Two printk and one rtas call: it's all I could see using a spinlock
> after IPI Complete.
> 
>> 
>> The RTAS trace you sent was:
>> 
>>   #0 arch_spin_lock
>>   #1  lock_rtas () 
>>   #2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
>>   #3  ics_rtas_mask_real_irq (hw_irq=4100) 
>>   #4  machine_kexec_mask_interrupts
>>   #5  default_machine_crash_shutdown
>>   #6  machine_crash_shutdown 
>>   #7  __crash_kexec
>>   #8  crash_kexec
>>   #9  oops_end
>> 
>> 
>> Which doesn't make it clear who holds the RTAS lock. We really shouldn't
>> be crashing while holding the RTAS lock, but I guess it could happen.
>> Can you get a full backtrace?
>> 
> 
> Oh, all traces are from the thread that called the crash, by writing
> 'c' to sysrq. That is what I am using to reproduce.
> 
> #10 bad_page_fault
> #11 handle_page_fault
> #12 __handle_sysrq (key=99, check_mask=false) 
> #13 write_sysrq_trigger 
> #14 proc_reg_write
> #15 __vfs_write
> #16 vfs_write
> #17 SYSC_write
> #18 SyS_write
> #19 system_call
> 
>> 
>> PAPR says we are not allowed to have multiple CPUs calling RTAS at once,
>> except for a very small list of RTAS calls. So if we bust the RTAS lock
>> there's a risk we violate that part of PAPR and crash even harder.
> 
> Interesting, I was not aware.
> 
>> 
>> Also it's not specific to kdump, we can't even get through a normal
>> reboot if we crash with the RTAS lock held.
>> 
>> Anyway here's a patch with some ideas. That allows me to get from a
>> crash with the RTAS lock held through kdump into the 2nd kernel. But it
>> only works if it's the crashing CPU that holds the RTAS lock.
>> 
> 
> Nice idea. 
> But my test environment is just triggering a crash from sysrq, so I
> think it would not improve the result, given that this thread is
> probably not holding the lock at that point.

Crash paths should not take that RTAS lock, it's a massive pain. I'm 
fixing it for machine check, for other crashes I think it can be removed 
too, it just needs to be unpicked. The good thing with crashing is that 
you can reasonably *know* that you're single threaded, so you can 
usually reason through situations like above.

> I noticed that when rtas is locked, irqs and preemption are also
> disabled.
> 
> Should the IPI sent by crash be able to interrupt a thread with
> disabled irqs?

Yes. It's been a bit painful, but in the long term it means that a CPU 
which hangs with interrupts off can be debugged, and it means we can 
take it offline to crash without risking that it will be clobbering what 
we're doing.

Arguably what I should have done is try a regular IPI first, wait a few 
seconds, then NMI IPI.

A couple of problems with that. Firstly it probably avoids this issue 
you hit almost all the time, so it won't get fixed. So when we really 
need the NMI IPI in the field, it'll still be riddled with deadlocks.

Secondly, sending the IPI first in theory can be more intrusive to the 
state that we want to debug. It uses the currently running stack, paca 
save areas, etc. NMI IPI uses its own stack and save regions so it's a 
little more isolated. Maybe this is only a small advantage but I'd like 
to have it if we can.  

Thanks,
Nick


^ permalink raw reply	[flat|nested] 28+ messages in thread


* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-02 11:28 ` Michael Ellerman
  2020-04-03  0:37   ` Leonardo Bras
@ 2020-04-06 18:46   ` Leonardo Bras
  1 sibling, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-06 18:46 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Enrico Weigelt, Alexios Zavras, Thomas Gleixner,
	Greg Kroah-Hartman, Christophe Leroy, peterz
  Cc: linuxppc-dev, linux-kernel


On Thu, 2020-04-02 at 22:28 +1100, Michael Ellerman wrote:
> Leonardo Bras <leonardo@linux.ibm.com> 
> TBH I think we could just drop that printk() entirely.
> 
> Or we could tell printk() that we're in NMI context so that it uses the
> percpu buffers.
> 
> We should probably do the latter anyway, in case there's any other code
> we call that inadvertently calls printk().

Done:
http://patchwork.ozlabs.org/patch/1266956/

About the RTAS call, it will take more time, because I have to
study it properly.

Thank you,


* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-03  6:41       ` Nicholas Piggin
@ 2020-04-08  2:36         ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-08  2:36 UTC (permalink / raw)
  To: Nicholas Piggin, Alexios Zavras, Benjamin Herrenschmidt,
	Christophe Leroy, Greg Kroah-Hartman, Enrico Weigelt,
	Michael Ellerman, Paul Mackerras, peterz, Thomas Gleixner
  Cc: linux-kernel, linuxppc-dev


Hello Nick, Michael,

On Fri, 2020-04-03 at 16:41 +1000, Nicholas Piggin wrote:
[...]
> > > PAPR says we are not allowed to have multiple CPUs calling RTAS at once,
> > > except for a very small list of RTAS calls. So if we bust the RTAS lock
> > > there's a risk we violate that part of PAPR and crash even harder.
> > 
> > Interesting, I was not aware.
> > 
> > > Also it's not specific to kdump, we can't even get through a normal
> > > reboot if we crash with the RTAS lock held.
> > > 
> > > Anyway here's a patch with some ideas. That allows me to get from a
> > > crash with the RTAS lock held through kdump into the 2nd kernel. But it
> > > only works if it's the crashing CPU that holds the RTAS lock.
> > > 
> > 
> > Nice idea. 
> > But my test environment is just triggering a crash from sysrq, so I
> > think it would not improve the result, given that this thread is
> > probably not holding the lock by the time.
> 
> Crash paths should not take that RTAS lock, it's a massive pain. I'm 
> fixing it for machine check, for other crashes I think it can be removed 
> too, it just needs to be unpicked. The good thing with crashing is that 
> you can reasonably *know* that you're single threaded, so you can 
> usually reason through situations like above.
> 
> > I noticed that when rtas is locked, irqs and preemption are also
> > disabled.
> > 
> > Should the IPI send by crash be able to interrupt a thread with
> > disabled irqs?
> 
> Yes. It's been a bit painful, but in the long term it means that a CPU 
> which hangs with interrupts off can be debugged, and it means we can 
> take it offline to crash without risking that it will be clobbering what 
> we're doing.
> 
> Arguably what I should have done is try a regular IPI first, wait a few 
> seconds, then NMI IPI.
> 
> A couple of problems with that. Firstly it probably avoids this issue 
> you hit almost all the time, so it won't get fixed. So when we really 
> need the NMI IPI in the field, it'll still be riddled with deadlocks.
> 
> Secondly, sending the IPI first in theory can be more intrusive to the 
> state that we want to debug. It uses the currently running stack, paca 
> save areas, ec. NMI IPI uses its own stack and save regions so it's a 
> little more isolated. Maybe this is only a small advantage but I'd like 
> to have it if we can.  
> 
> Thanks,
> Nick


I think the printk issue is solved (sent a patch on that), now what is
missing is the rtas call spinlock.

I noticed that rtas.lock is taken in machine_kexec_mask_interrupts(),
which happens after crashing the other CPUs and getting into
real mode.

The following RTAS calls are made for each IRQ with a valid interrupt descriptor:
ibm,int-off : Reset mask bit for that interrupt
ibm,set_xive : Set XIVE priority to 0xff

From what I could understand, these RTAS calls are there to put the next
kexec kernel (the kdump kernel) in a safer environment, so I don't think
it's safe to just remove them.
 
(See commit d6c1a9081080c6c4658acf2a06d851feb2855933)

On the other hand, busting the rtas.lock could be dangerous, because
it's code we can't control.

According to LoPAR, for both of these RTAS calls, we have:

For the PowerPC External Interrupt option: The call must be reentrant
to the number of processors on the platform.
For the PowerPC External Interrupt option: The argument call buffer for
each simultaneous call must be physically unique.

Which I think means these RTAS calls can be made simultaneously.
Would that mean that busting rtas.lock for these calls is safe?

Best regards,
Leonardo Bras



* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-08  2:36         ` Leonardo Bras
@ 2020-04-08 12:21           ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2020-04-08 12:21 UTC (permalink / raw)
  To: Leonardo Bras, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, Paul Mackerras, peterz, Thomas Gleixner
  Cc: linux-kernel, linuxppc-dev

Leonardo Bras <leonardo@linux.ibm.com> writes:
> Hello Nick, Michael,
>
> On Fri, 2020-04-03 at 16:41 +1000, Nicholas Piggin wrote:
> [...]
>> > > PAPR says we are not allowed to have multiple CPUs calling RTAS at once,
>> > > except for a very small list of RTAS calls. So if we bust the RTAS lock
>> > > there's a risk we violate that part of PAPR and crash even harder.
>> > 
>> > Interesting, I was not aware.
>> > 
>> > > Also it's not specific to kdump, we can't even get through a normal
>> > > reboot if we crash with the RTAS lock held.
>> > > 
>> > > Anyway here's a patch with some ideas. That allows me to get from a
>> > > crash with the RTAS lock held through kdump into the 2nd kernel. But it
>> > > only works if it's the crashing CPU that holds the RTAS lock.
>> > > 
>> > 
>> > Nice idea. 
>> > But my test environment is just triggering a crash from sysrq, so I
>> > think it would not improve the result, given that this thread is
>> > probably not holding the lock by the time.
>> 
>> Crash paths should not take that RTAS lock, it's a massive pain. I'm 
>> fixing it for machine check, for other crashes I think it can be removed 
>> too, it just needs to be unpicked. The good thing with crashing is that 
>> you can reasonably *know* that you're single threaded, so you can 
>> usually reason through situations like above.
>> 
>> > I noticed that when rtas is locked, irqs and preemption are also
>> > disabled.
>> > 
>> > Should the IPI send by crash be able to interrupt a thread with
>> > disabled irqs?
>> 
>> Yes. It's been a bit painful, but in the long term it means that a CPU 
>> which hangs with interrupts off can be debugged, and it means we can 
>> take it offline to crash without risking that it will be clobbering what 
>> we're doing.
>> 
>> Arguably what I should have done is try a regular IPI first, wait a few 
>> seconds, then NMI IPI.
>> 
>> A couple of problems with that. Firstly it probably avoids this issue 
>> you hit almost all the time, so it won't get fixed. So when we really 
>> need the NMI IPI in the field, it'll still be riddled with deadlocks.
>> 
>> Secondly, sending the IPI first in theory can be more intrusive to the 
>> state that we want to debug. It uses the currently running stack, paca 
>> save areas, ec. NMI IPI uses its own stack and save regions so it's a 
>> little more isolated. Maybe this is only a small advantage but I'd like 
>> to have it if we can.  
>
> I think the printk issue is solved (sent a patch on that), now what is
> missing is the rtas call spinlock.
>
> I noticed that rtas.lock is taken on machine_kexec_mask_interrupts(),
> which happen after crashing the other threads and getting into
> realmode. 
>
> The following rtas are called each IRQ with valid interrupt descriptor:
> ibm,int-off : Reset mask bit for that interrupt
> ibm,set_xive : Set XIVE priority to 0xff
>
> By what I could understand, these rtas calls happen to put the next
> kexec kernel (kdump kernel) in a safer environment, so I think it's not
> safe to just remove them.

Yes.

> (See commit d6c1a9081080c6c4658acf2a06d851feb2855933)

In hindsight the person who wrote that commit was being lazy. We
*should* have made the 2nd kernel robust against the IRQ state being
messed up.

> On the other hand, busting the rtas.lock could be dangerous, because
> it's code we can't control.
>
> According with LoPAR, for both of these rtas-calls, we have:
>
> For the PowerPC External Interrupt option: The call must be reentrant
> to the number of processors on the platform.
> For the PowerPC External Interrupt option: The argument call buffer for
> each simultaneous call must be physically unique.

Oh well spotted. Where is that in the doc?

> Which I think means this rtas-calls can be done simultaneously.

I think so too. I'll read PAPR in the morning and make sure.

> Would it mean that busting the rtas.lock for these calls would be safe?

What would be better is to make those specific calls not take the global
RTAS lock to begin with.

We should be able to just allocate the rtas_args on the stack, it's only
~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
take the global lock.
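
For illustration, a minimal sketch of that approach (kernel-context
fragment, not standalone-buildable; it assumes rtas_token() and
rtas_call_unlocked() as declared in arch/powerpc/include/asm/rtas.h,
and the function name here is hypothetical):

```c
/* Sketch: mask one interrupt without taking the global rtas.lock.
 * The rtas_args buffer is on the caller's stack, so each simultaneous
 * call uses a physically unique argument buffer, as LoPAR requires for
 * the reentrant ibm,int-off / ibm,set-xive calls.
 */
static void crash_rtas_irq_off(unsigned int hw_irq)
{
	struct rtas_args args;			/* ~80 bytes, on stack */
	int token = rtas_token("ibm,int-off");

	if (token == RTAS_UNKNOWN_SERVICE)
		return;

	/* 1 input (the interrupt number), 1 output (status) */
	rtas_call_unlocked(&args, token, 1, 1, hw_irq);
	/* status, if needed, is be32_to_cpu(args.rets[0]) */
}
```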

cheers



* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-08 12:21           ` Michael Ellerman
@ 2020-04-08 16:48             ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-08 16:48 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, Paul Mackerras, peterz, Thomas Gleixner
  Cc: linux-kernel, linuxppc-dev


On Wed, 2020-04-08 at 22:21 +1000, Michael Ellerman wrote:
[...]
> > On the other hand, busting the rtas.lock could be dangerous, because
> > it's code we can't control.
> > 
> > According with LoPAR, for both of these rtas-calls, we have:
> > 
> > For the PowerPC External Interrupt option: The call must be reentrant
> > to the number of processors on the platform.
> > For the PowerPC External Interrupt option: The argument call buffer for
> > each simultaneous call must be physically unique.
> 
> Oh well spotted. Where is that in the doc?

In the current LoPAR available from the OpenPOWER Foundation, it's on
page 170, under '7.3.10.2 ibm,set-xive' and '7.3.10.3 ibm,int-off'.

> > Which I think means this rtas-calls can be done simultaneously.
> 
> I think so too. I'll read PAPR in the morning and make sure.
> 
> > Would it mean that busting the rtas.lock for these calls would be safe?
> 
> What would be better is to make those specific calls not take the global
> RTAS lock to begin with.
> 
> We should be able to just allocate the rtas_args on the stack, it's only
> ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
> take the global lock.

Good idea. I will try getting some work done on this.

Best regards,



* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-08 12:21           ` Michael Ellerman
@ 2020-04-08 18:00             ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-08 18:00 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, Paul Mackerras, peterz, Thomas Gleixner
  Cc: linux-kernel, linuxppc-dev

On Wed, 2020-04-08 at 22:21 +1000, Michael Ellerman wrote:
> We should be able to just allocate the rtas_args on the stack, it's only
> ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
> take the global lock.

At this point, would it be a problem to use kmalloc?

Best regards,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-08 12:21           ` Michael Ellerman
                             ` (2 preceding siblings ...)
  (?)
@ 2020-04-08 22:55           ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-04-08 22:55 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, Paul Mackerras, peterz, Thomas Gleixner
  Cc: linuxppc-dev, linux-kernel

On Wed, 2020-04-08 at 22:21 +1000, Michael Ellerman wrote:
[...]
> > According to LoPAR, for both of these RTAS calls, we have:
> > 
> > For the PowerPC External Interrupt option: The call must be reentrant
> > to the number of processors on the platform.
> > For the PowerPC External Interrupt option: The argument call buffer for
> > each simultaneous call must be physically unique.
> 
> Oh well spotted. Where is that in the doc?
> 
> > Which I think means these RTAS calls can be done simultaneously.
> 
> I think so too. I'll read PAPR in the morning and make sure.
> 
> > Would it mean that busting the rtas.lock for these calls would be safe?
> 
> What would be better is to make those specific calls not take the global
> RTAS lock to begin with.
> 
> We should be able to just allocate the rtas_args on the stack, it's only
> ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
> take the global lock.

Hello Michael,

I did the simplest possible version of this change:
http://patchwork.ozlabs.org/patch/1268371/

Where I create rtas_call_reentrant() and replace rtas_call() with it in
all the places that call "ibm,int-on", "ibm,int-off", "ibm,get-xive"
and "ibm,set-xive".

At first, I was planning to create a function that checks whether the
requested token is one of the above, and automatically chooses between
the common and reentrant versions. But that seemed like unnecessary
overhead, since the current calls are few and very straightforward.
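For the record, the discarded alternative could be sketched like this
(rtas_name_is_reentrant() is a hypothetical helper, not in the posted
patch; a real kernel version would compare rtas_token() values rather
than strings):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: report whether a call name is one of those LoPAR
 * declares reentrant, so a wrapper could pick between rtas_call() and
 * rtas_call_reentrant() automatically. */
static int rtas_name_is_reentrant(const char *name)
{
	static const char * const reentrant[] = {
		"ibm,int-on", "ibm,int-off", "ibm,get-xive", "ibm,set-xive",
	};
	unsigned int i;

	for (i = 0; i < sizeof(reentrant) / sizeof(reentrant[0]); i++)
		if (strcmp(name, reentrant[i]) == 0)
			return 1;
	return 0;
}
```

With only four call sites involved, replacing each rtas_call() directly
avoids paying this lookup on every RTAS call.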

What do you think of this approach?

Best regards,
Leonardo Bras

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-08 12:21           ` Michael Ellerman
@ 2020-04-09  0:27             ` Paul Mackerras
  -1 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2020-04-09  0:27 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Leonardo Bras, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, peterz, Thomas Gleixner, linux-kernel,
	linuxppc-dev

On Wed, Apr 08, 2020 at 10:21:29PM +1000, Michael Ellerman wrote:
> 
> We should be able to just allocate the rtas_args on the stack, it's only
> ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
> take the global lock.

Do we instantiate a 64-bit RTAS these days, or is it still 32-bit?
In the old days we had to make sure the RTAS argument buffer was
below the 4GB point.  If that's still necessary then perhaps putting
rtas_args inside the PACA would be the way to go.

Paul.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-09  0:27             ` Paul Mackerras
@ 2020-05-12  3:48               ` Leonardo Bras
  -1 siblings, 0 replies; 28+ messages in thread
From: Leonardo Bras @ 2020-05-12  3:48 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman
  Cc: peterz, linuxppc-dev, linux-kernel, Nicholas Piggin,
	Alexios Zavras, Greg Kroah-Hartman, Thomas Gleixner,
	Enrico Weigelt

Hello Paul, thanks for the reply!

On Thu, 2020-04-09 at 10:27 +1000, Paul Mackerras wrote:
> On Wed, Apr 08, 2020 at 10:21:29PM +1000, Michael Ellerman wrote:
> > We should be able to just allocate the rtas_args on the stack, it's only
> > ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
> > take the global lock.
> 
> Do we instantiate a 64-bit RTAS these days, or is it still 32-bit?

According to LoPAR, we can use instantiate-rtas or instantiate-rtas-64. 
It looks like we do instantiate-rtas today (grep pointed only to
prom_instantiate_rtas()).

> In the old days we had to make sure the RTAS argument buffer was
> below the 4GB point.  If that's still necessary then perhaps putting
> rtas_args inside the PACA would be the way to go.

Yes, we still need to make sure of this. I will study the PACA further
and try to implement it that way.

Best regards,
Leonardo Bras

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash
  2020-04-09  0:27             ` Paul Mackerras
@ 2020-05-12 10:42               ` Michael Ellerman
  -1 siblings, 0 replies; 28+ messages in thread
From: Michael Ellerman @ 2020-05-12 10:42 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Leonardo Bras, Nicholas Piggin, Alexios Zavras,
	Benjamin Herrenschmidt, Christophe Leroy, Greg Kroah-Hartman,
	Enrico Weigelt, peterz, Thomas Gleixner, linux-kernel,
	linuxppc-dev

Paul Mackerras <paulus@ozlabs.org> writes:
> On Wed, Apr 08, 2020 at 10:21:29PM +1000, Michael Ellerman wrote:
>> 
>> We should be able to just allocate the rtas_args on the stack, it's only
>> ~80 odd bytes. And then we can use rtas_call_unlocked() which doesn't
>> take the global lock.
>
> Do we instantiate a 64-bit RTAS these days, or is it still 32-bit?

No, yes.

> In the old days we had to make sure the RTAS argument buffer was
> below the 4GB point.

Yes you're right, that's still true.

I was thinking we were on the emergency stack, but we may not be.

> If that's still necessary then perhaps putting rtas_args inside the
> PACA would be the way to go.

Yeah I guess. Allocating a struct within the RMO for each CPU is not
that simple compared to just putting it in the paca.

cheers

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2020-05-12 10:44 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01  0:00 [PATCH v3 1/1] ppc/crash: Reset spinlocks during crash Leonardo Bras
2020-04-01  3:07 ` kbuild test robot
2020-04-01  3:07   ` kbuild test robot
2020-04-01  3:07   ` kbuild test robot
2020-04-01  9:26 ` Peter Zijlstra
2020-04-01  9:26   ` Peter Zijlstra
2020-04-01 23:53   ` Leonardo Bras
2020-04-01 23:53     ` Leonardo Bras
2020-04-02 11:28 ` Michael Ellerman
2020-04-03  0:37   ` Leonardo Bras
2020-04-03  6:41     ` Nicholas Piggin
2020-04-03  6:41       ` Nicholas Piggin
2020-04-08  2:36       ` Leonardo Bras
2020-04-08  2:36         ` Leonardo Bras
2020-04-08 12:21         ` Michael Ellerman
2020-04-08 12:21           ` Michael Ellerman
2020-04-08 16:48           ` Leonardo Bras
2020-04-08 16:48             ` Leonardo Bras
2020-04-08 18:00           ` Leonardo Bras
2020-04-08 18:00             ` Leonardo Bras
2020-04-08 22:55           ` Leonardo Bras
2020-04-09  0:27           ` Paul Mackerras
2020-04-09  0:27             ` Paul Mackerras
2020-05-12  3:48             ` Leonardo Bras
2020-05-12  3:48               ` Leonardo Bras
2020-05-12 10:42             ` Michael Ellerman
2020-05-12 10:42               ` Michael Ellerman
2020-04-06 18:46   ` Leonardo Bras
