linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path
@ 2012-01-05 17:35 Seiji Aguchi
  2012-01-05 17:36 ` [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() Seiji Aguchi
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 17:35 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

Hi,

Discussion:
 As Don mentioned in following thread, it would be nice for pstore/kmsg_dump to serialize 
 panic path because they can log messages reliably.

 https://lkml.org/lkml/2011/10/13/427

 This patchset is based on his proposal switching smp_send_stop() from REBOOT_VECTOR to NMI.

Change Log:

 v3 -> v4
   - Add comment for explaining the purpose of WARN_ON() based on Don's comment (patch 2/4)
     https://lkml.org/lkml/2011/12/12/296

   - Skip spin_lock of efi_pstore_write() in panic case based on discussion with Tony (patch 4/4)
     https://lkml.org/lkml/2012/1/3/151

   - Apply this patchset to -next tree instead of linus -tree so that "patch 4/4" makes simple

 v2 -> v3
   - Skip spin_locks in panic case in both kmsg_dump() and pstore_dump() instead of calling spin_lock_init()
     to avoid potential issues due to spin_lock_init()
   - Add WARN_ON() in "in_nmi() and !panic" case into kmsg_dump() so that we trap when someone adds 
     new kmsg_dump() in NMI path in the future
   - Skip subsequent kmsg_dump() function calls to avoid deadlock.

 v1 -> v2
  - Add trylocks to kmsg_dump()/pstore_dump() so that they can work in NMI context.  
  - Divide a patch into two
     First one is just moving kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
     Second one is changing lock operations in kmsg_dump()/pstore_dump()
 v1
  - Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop
  - Bust logbuf_lock of kmsg_dump() in panic path for avoiding deadlock
  - Bust psinfo->buf_lock of pstore_dump() in panic path for avoiding deadlock

Patch Description:
   [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
     - Just move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()

   [RFC][PATCH v4 -next 2/4] Skip spin_locks in panic case and Add WARN_ON()
     - Skip spin_locks in panic case in both kmsg_dump() and pstore_dump()
     - Add WARN_ON() in "in_nmi() and !panic" case into kmsg_dump()

   [RFC][PATCH v4 -next 3/4] Skip subsequent kmsg_dump()
     - Skip subsequent kmsg_dump() function calls in panic path

   [RFC][PATCH v4 -next 4/4] Skip spin_lock of efi_pstore_write() in panic case
     - Skip spin_lock of efi_pstore_write() in panic case to avoid deadlock

Seiji

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-05 17:35 [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path Seiji Aguchi
@ 2012-01-05 17:36 ` Seiji Aguchi
  2012-01-05 19:06   ` Luck, Tony
  2012-01-05 17:38 ` [RFC][PATCH v4 -next 2/4] Skip spin_locks in panic case and Add WARN_ON() Seiji Aguchi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 17:36 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

This patch just moves kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop
for serializing logging process via smp_send_stop.

 Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
 Acked-by: Don Zickus <dzickus@redhat.com>

---
 kernel/panic.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index 80aed44..da585b8 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -108,8 +108,6 @@ void panic(const char *fmt, ...)
 	 */
 	crash_kexec(NULL);
 
-	kmsg_dump(KMSG_DUMP_PANIC);
-
 	/*
 	 * Note smp_send_stop is the usual smp shutdown function, which
 	 * unfortunately means it may not be hardened to work in a panic
@@ -117,6 +115,8 @@ void panic(const char *fmt, ...)
 	 */
 	smp_send_stop();
 
+	kmsg_dump(KMSG_DUMP_PANIC);
+
 	atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
 
 	bust_spinlocks(0);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC][PATCH v4 -next 2/4] Skip spin_locks in panic case and Add WARN_ON()
  2012-01-05 17:35 [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path Seiji Aguchi
  2012-01-05 17:36 ` [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() Seiji Aguchi
@ 2012-01-05 17:38 ` Seiji Aguchi
  2012-01-05 17:39 ` [RFC][PATCH v4 -next 3/4]Skip subsequent kmsg_dump() function calls in panic path Seiji Aguchi
  2012-01-05 17:41 ` [RFC][PATCH v4 -next 4/4] Skip spin_lock of efi_pstore_write() in panic case Seiji Aguchi
  3 siblings, 0 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 17:38 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

This patch does followings.

   - Skip spin_locks in panic case in both kmsg_dump() and pstore_dump() because they are 
     serialized via smp_send_stop

   - Add WARN_ON() in "in_nmi() and !panic" case into kmsg_dump(). Currently, this case never 
     happens because only kmsg_dump(KMSG_DUMP_PANIC) is called in NMI case.
     But if someone adds new kmsg_dump() into NMI path in the future, kmsg_dump() may deadlock. 
     We can trap it and complain with this WARN_ON().
     

 With this patch, kmsg_dump()/pstore_dump() work as follows.
   panic case (KMSG_DUMP_PANIC):
     - don't take lock because they are serialized.

   not panic case (KMSG_DUMP_OOPS/KMSG_DUMP_EMERG/KMSG_DUMP_RESTART/KMSG_DUMP_HALT):
     - take locks normally
 
 Regarding as NMI case,
    - kmsg_dump()/pstore_dump() don't take locks, so deadlock issue will not happen
      because kmsg_dump() is called in just panic case with current implementation.
    - If someone adds new kmsg_dump() into NMI path, WARN_ON() is called.
      So we can trap it and ask to fix it.

 Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>

---
 fs/pstore/platform.c |   16 ++++++----------
 kernel/printk.c      |   20 ++++++++++++++++++--
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 9ec22d3..426e77d 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -90,18 +90,17 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 	int		hsize, ret;
 	unsigned int	part = 1;
 	unsigned long	flags = 0;
-	int		is_locked = 0;
 
 	if (reason < ARRAY_SIZE(reason_str))
 		why = reason_str[reason];
 	else
 		why = "Unknown";
 
-	if (in_nmi()) {
-		is_locked = spin_trylock(&psinfo->buf_lock);
-		if (!is_locked)
-			pr_err("pstore dump routine blocked in NMI, may corrupt error record\n");
-	} else
+	/*
+	 * pstore_dump() is serialized in panic path.
+	 * So, we don't need to take any locks.
+	 */
+	if (reason != KMSG_DUMP_PANIC)
 		spin_lock_irqsave(&psinfo->buf_lock, flags);
 	oopscount++;
 	while (total < kmsg_bytes) {
@@ -131,10 +130,7 @@ static void pstore_dump(struct kmsg_dumper *dumper,
 		total += l1_cpy + l2_cpy;
 		part++;
 	}
-	if (in_nmi()) {
-		if (is_locked)
-			spin_unlock(&psinfo->buf_lock);
-	} else
+	if (reason != KMSG_DUMP_PANIC)
 		spin_unlock_irqrestore(&psinfo->buf_lock, flags);
 }
 
diff --git a/kernel/printk.c b/kernel/printk.c
index 13c0a11..5294426 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1732,13 +1732,29 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 	unsigned long l1, l2;
 	unsigned long flags;
 
+	/*
+	 * Currently, "in_nmi() && reason != KMSG_DUMP_PANIC" case never happens
+	 * because ,in NMI path, only kmsg_dump(KMSG_DUMP_PANIC) is called.
+	 * But if someone adds new kmsg_dump() into NMI path in the future,
+	 * kmsg_dump() may deadlock. We can trap it and complain
+	 * with this WARN_ON().
+	 */
+	WARN_ON(in_nmi() && reason != KMSG_DUMP_PANIC);
+
 	/* Theoretically, the log could move on after we do this, but
 	   there's not a lot we can do about that. The new messages
 	   will overwrite the start of what we dump. */
-	raw_spin_lock_irqsave(&logbuf_lock, flags);
+
+	/*
+	 * kmsg_dump(KMSG_DUMP_PANIC) is serialized.
+	 * So, we don't need to take any locks.
+	 */
+	if (reason != KMSG_DUMP_PANIC)
+		raw_spin_lock_irqsave(&logbuf_lock, flags);
 	end = log_end & LOG_BUF_MASK;
 	chars = logged_chars;
-	raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+	if (reason != KMSG_DUMP_PANIC)
+		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
 
 	if (chars > end) {
 		s1 = log_buf + log_buf_len - chars + end;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC][PATCH v4 -next 3/4]Skip subsequent kmsg_dump() function calls in panic path
  2012-01-05 17:35 [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path Seiji Aguchi
  2012-01-05 17:36 ` [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() Seiji Aguchi
  2012-01-05 17:38 ` [RFC][PATCH v4 -next 2/4] Skip spin_locks in panic case and Add WARN_ON() Seiji Aguchi
@ 2012-01-05 17:39 ` Seiji Aguchi
  2012-01-05 17:41 ` [RFC][PATCH v4 -next 4/4] Skip spin_lock of efi_pstore_write() in panic case Seiji Aguchi
  3 siblings, 0 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 17:39 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

This patch skips subsequent kmsg_dump() function calls in panic path

With this patch, we can avoid deadlock due to the subsequent calls.
Actually, kmsg_dump(KMSG_DUMP_EMREG) is called after kmsg_dump(KMSG_DUMP_PANIC) 
when panic_timeout variable is set.

 Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>

---
 kernel/printk.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 5294426..186d792 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1731,6 +1731,16 @@ void kmsg_dump(enum kmsg_dump_reason reason)
 	const char *s1, *s2;
 	unsigned long l1, l2;
 	unsigned long flags;
+	static bool panicked;
+
+	/*
+	 * kmsg_dump() is skipped because we already got panic log.
+	 */
+	if (panicked)
+		return;
+
+	if (reason == KMSG_DUMP_PANIC)
+		panicked = true;
 
 	/*
 	 * Currently, "in_nmi() && reason != KMSG_DUMP_PANIC" case never happens
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC][PATCH v4 -next 4/4] Skip spin_lock of efi_pstore_write() in panic case
  2012-01-05 17:35 [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path Seiji Aguchi
                   ` (2 preceding siblings ...)
  2012-01-05 17:39 ` [RFC][PATCH v4 -next 3/4]Skip subsequent kmsg_dump() function calls in panic path Seiji Aguchi
@ 2012-01-05 17:41 ` Seiji Aguchi
  3 siblings, 0 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 17:41 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

This patch skips spin_lock of efi_pstore_write() in panic path for these reasons.
 - efi_pstore_write() is serialized via smp_send_stop().
 - SetVariable runtime service is guaranteed to work correctly if you call
   it a second time when you have taken an exception in trying
   to execute them already.

We can avoid deadlock of spin_lock(&efivars->lock) in panic path and get panic log
reliably.


 Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>

---
 drivers/firmware/efivars.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c
index d25599f..657e6f1 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -510,7 +510,15 @@ static int efi_pstore_write(enum pstore_type_id type,
 	sprintf(stub_name, "dump-type%u-%u-", type, part);
 	sprintf(name, "%s%lu", stub_name, get_seconds());
 
-	spin_lock(&efivars->lock);
+	/*
+	 * We don't need to take any locks in panic path for these reasons.
+	 *  - efi_pstore_write() is serialized via smp_send_stop().
+	 *  - SetVariable runtime service is guaranteed to work correctly
+	 *    if you call it a second time when you have taken an exception
+	 *    in trying to execute them already.
+	 */
+	if (reason != KMSG_DUMP_PANIC)
+		spin_lock(&efivars->lock);
 
 	for (i = 0; i < DUMP_NAME_LEN; i++)
 		efi_name[i] = stub_name[i];
@@ -548,7 +556,8 @@ static int efi_pstore_write(enum pstore_type_id type,
 	efivars->ops->set_variable(efi_name, &vendor, PSTORE_EFI_ATTRIBUTES,
 				   size, psi->buf);
 
-	spin_unlock(&efivars->lock);
+	if (reason != KMSG_DUMP_PANIC)
+		spin_unlock(&efivars->lock);
 
 	if (found)
 		efivar_unregister(found);
-- 
1.7.1




^ permalink raw reply related	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-05 17:36 ` [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() Seiji Aguchi
@ 2012-01-05 19:06   ` Luck, Tony
  2012-01-05 20:10     ` Seiji Aguchi
  0 siblings, 1 reply; 24+ messages in thread
From: Luck, Tony @ 2012-01-05 19:06 UTC (permalink / raw)
  To: Seiji Aguchi, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

-	kmsg_dump(KMSG_DUMP_PANIC);
-
 	/*
 	 * Note smp_send_stop is the usual smp shutdown function, which
 	 * unfortunately means it may not be hardened to work in a panic
@@ -117,6 +115,8 @@ void panic(const char *fmt, ...)
 	 */
 	smp_send_stop();
 
+	kmsg_dump(KMSG_DUMP_PANIC);
+

Aren't you worried about the comment about smp_send_stop() not
being hardened to work in a panic situation?

If it does work - we are clearly much better off moving the
kmsg_dump() call down like this. It makes life much simpler
and cleaner to work with just one running cpu.

But if something goes wrong - we might not see the dump at all!

How do we compare these cases and decide that it is better to
trust that smp_send_stop() will return?

-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-05 19:06   ` Luck, Tony
@ 2012-01-05 20:10     ` Seiji Aguchi
  2012-01-05 21:01       ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-05 20:10 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya


>Aren't you worried about the comment about smp_send_stop() not
>being hardened to work in a panic situation?
> 	/*
> 	 * Note smp_send_stop is the usual smp shutdown function, which
> 	 * unfortunately means it may not be hardened to work in a panic

This comment is wrong because Don improved smp_send_stop() by switching REBOOT_VECTOR to NMI.
And his patch has already merged to linux-next tree.

http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=3603a2512f9e69dc87914ba922eb4a0812b21cd6

So, current smp_send_stop() is hardened to work in a panic situation.

I will remove this wrong comment.

Seiji

>-----Original Message-----
>From: Luck, Tony [mailto:tony.luck@intel.com]
>Sent: Thursday, January 05, 2012 2:07 PM
>To: Seiji Aguchi; Don Zickus
>Cc: linux-kernel@vger.kernel.org; Matthew Garrett; Vivek Goyal; Chen, Gong; akpm@linux-foundation.org; Brown, Len;
>'ying.huang@intel.com'; 'ak@linux.intel.com'; 'hughd@chromium.org'; 'mingo@elte.hu'; jmorris@namei.org;
>a.p.zijlstra@chello.nl; namhyung@gmail.com; dle-develop@lists.sourceforge.net; Satoru Moriya
>Subject: RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
>
>-	kmsg_dump(KMSG_DUMP_PANIC);
>-
> 	/*
> 	 * Note smp_send_stop is the usual smp shutdown function, which
> 	 * unfortunately means it may not be hardened to work in a panic
>@@ -117,6 +115,8 @@ void panic(const char *fmt, ...)
> 	 */
> 	smp_send_stop();
>
>+	kmsg_dump(KMSG_DUMP_PANIC);
>+
>
>Aren't you worried about the comment about smp_send_stop() not
>being hardened to work in a panic situation?
>
>If it does work - we are clearly much better off moving the
>kmsg_dump() call down like this. It makes life much simpler
>and cleaner to work with just one running cpu.
>
>But if something goes wrong - we might not see the dump at all!
>
>How do we compare these cases and decide that it is better to
>trust that smp_send_stop() will return?
>
>-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-05 20:10     ` Seiji Aguchi
@ 2012-01-05 21:01       ` Don Zickus
  2012-01-09 17:59         ` Seiji Aguchi
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-01-05 21:01 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Luck, Tony, linux-kernel, Matthew Garrett, Vivek Goyal, Chen,
	Gong, akpm, Brown, Len, 'ying.huang, 'ak, 'hughd,
	'mingo, jmorris, a.p.zijlstra, namhyung, dle-develop,
	Satoru Moriya

On Thu, Jan 05, 2012 at 03:10:25PM -0500, Seiji Aguchi wrote:
> 
> >Aren't you worried about the comment about smp_send_stop() not
> >being hardened to work in a panic situation?
> > 	/*
> > 	 * Note smp_send_stop is the usual smp shutdown function, which
> > 	 * unfortunately means it may not be hardened to work in a panic
> 
> This comment is wrong because Don improved smp_send_stop() by switching REBOOT_VECTOR to NMI.
> And his patch has already merged to linux-next tree.

I only fixed x86.  Who knows what the other arches do..

I don't know how to prove something is hardened other than not seeing any
hangs or false reboots on in that piece of code.

Cheers,
Don

> 
> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=3603a2512f9e69dc87914ba922eb4a0812b21cd6
> 
> So, current smp_send_stop() is hardened to work in a panic situation.
> 
> I will remove this wrong comment.
> 
> Seiji
> 
> >-----Original Message-----
> >From: Luck, Tony [mailto:tony.luck@intel.com]
> >Sent: Thursday, January 05, 2012 2:07 PM
> >To: Seiji Aguchi; Don Zickus
> >Cc: linux-kernel@vger.kernel.org; Matthew Garrett; Vivek Goyal; Chen, Gong; akpm@linux-foundation.org; Brown, Len;
> >'ying.huang@intel.com'; 'ak@linux.intel.com'; 'hughd@chromium.org'; 'mingo@elte.hu'; jmorris@namei.org;
> >a.p.zijlstra@chello.nl; namhyung@gmail.com; dle-develop@lists.sourceforge.net; Satoru Moriya
> >Subject: RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
> >
> >-	kmsg_dump(KMSG_DUMP_PANIC);
> >-
> > 	/*
> > 	 * Note smp_send_stop is the usual smp shutdown function, which
> > 	 * unfortunately means it may not be hardened to work in a panic
> >@@ -117,6 +115,8 @@ void panic(const char *fmt, ...)
> > 	 */
> > 	smp_send_stop();
> >
> >+	kmsg_dump(KMSG_DUMP_PANIC);
> >+
> >
> >Aren't you worried about the comment about smp_send_stop() not
> >being hardened to work in a panic situation?
> >
> >If it does work - we are clearly much better off moving the
> >kmsg_dump() call down like this. It makes life much simpler
> >and cleaner to work with just one running cpu.
> >
> >But if something goes wrong - we might not see the dump at all!
> >
> >How do we compare these cases and decide that it is better to
> >trust that smp_send_stop() will return?
> >
> >-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-05 21:01       ` Don Zickus
@ 2012-01-09 17:59         ` Seiji Aguchi
  2012-01-10  3:06           ` Chen Gong
  0 siblings, 1 reply; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-09 17:59 UTC (permalink / raw)
  To: Don Zickus, Luck, Tony
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

>
>I don't know how to prove something is hardened other than not seeing any
>hangs or false reboots on in that piece of code.
>

I don't know how we can prove reliability of smp_send_stop() of all archs as well.

Another possible solution is replacing smp_send_stop() with 
machine_crash_shutdown().
This is a function call stopping other cpus reliably in panic situation
so that kdump can work .
But, I don't know whether it works in all archs.
We need further discussion with Eric for checking this idea is feasible...


I would like to go forward step by step rather than doing tough work at once.
I think just moving kmsg_dump() below smp_send_stop() is a good start for improving pstore.

Seiji

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-09 17:59         ` Seiji Aguchi
@ 2012-01-10  3:06           ` Chen Gong
  2012-01-10 20:29             ` Seiji Aguchi
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gong @ 2012-01-10  3:06 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Don Zickus, Luck, Tony, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, 'ying.huang,
	'ak, 'hughd, 'mingo, jmorris, a.p.zijlstra, namhyung,
	dle-develop, Satoru Moriya

于 2012/1/10 1:59, Seiji Aguchi 写道:
>>
>> I don't know how to prove something is hardened other than not seeing any
>> hangs or false reboots on in that piece of code.
>>
>
> I don't know how we can prove reliability of smp_send_stop() of all archs as well.
>
> Another possible solution is replacing smp_send_stop() with
> machine_crash_shutdown().
> This is a function call stopping other cpus reliably in panic situation
> so that kdump can work .
> But, I don't know whether it works in all archs.

machine_crash_shutdown is defined in only some Archs so it is obvious that it
can't be used in all platforms, BTW, this function is bracketed by CONFIG_KEXEC,
which means it can't be used without this macro. So it is not suitable for
this scenario.

> We need further discussion with Eric for checking this idea is feasible...
>
>
> I would like to go forward step by step rather than doing tough work at once.
> I think just moving kmsg_dump() below smp_send_stop() is a good start for improving pstore.

I agree with you. How about adding macros or something like WARN_ON(XX_ARCH) or
Kconfig to limit its scope?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-10  3:06           ` Chen Gong
@ 2012-01-10 20:29             ` Seiji Aguchi
  2012-01-11  7:28               ` Chen Gong
  0 siblings, 1 reply; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-10 20:29 UTC (permalink / raw)
  To: Chen Gong, Luck, Tony
  Cc: Don Zickus, linux-kernel, Matthew Garrett, Vivek Goyal, Chen,
	Gong, akpm, Brown, Len, 'ying.huang, 'ak, 'hughd,
	'mingo, jmorris, a.p.zijlstra, namhyung, dle-develop,
	Satoru Moriya

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1921 bytes --]


>I agree with you. How about adding macros or something like WARN_ON(XX_ARCH) or
>Kconfig to limit its scope?

Thank you for giving me your idea.
Your suggestions above will work for me because I'm a x86 user.
If Tony agrees to it, I can update my patch.

But, I'm hesitating to add WARN_ON() or change Kconfig only for specific arch 
because pstore aims for generic interface and this is related to its design.
Also, ramoops is going to use pstore now. It doesn't depend on x86.
I'm worried that ramoops users will complain about this change.

So, I think a reasonable solution at this time is just adding some explanations 
about smp_send_stop() to documentation as follows.

Users can use pstore with their own responsibility and ask developers 
if smp_send_stop() is reliable enough in panic situation on architecture they want to run.

What do you think?

---
 Documentation/ABI/testing/pstore |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/Documentation/ABI/testing/pstore b/Documentation/ABI/testing/pstore
index ff1df4e..5583729 100644
--- a/Documentation/ABI/testing/pstore
+++ b/Documentation/ABI/testing/pstore
@@ -11,6 +11,14 @@ Description:	Generic interface to platform dependent persistent storage.
 		of the console log is captured, but other interesting
 		data can also be saved.
 
+		In case of panic, pstore is invoked after smp_send_stop()
+		,a function call stopping other cpus, so that we can get
+		logs simpler and cleaner with just one running cpu.
+
+		As for x86, smp_send_stop() is reliable enough to work in
+		panic situation. But we are not guaranteed that it works
+		reliably on other architectures.
+
 		# mount -t pstore -o kmsg_bytes=8000 - /dev/pstore
 
 		$ ls -l /dev/pstore
-- 

Seiji
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-10 20:29             ` Seiji Aguchi
@ 2012-01-11  7:28               ` Chen Gong
  2012-01-11 17:25                 ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gong @ 2012-01-11  7:28 UTC (permalink / raw)
  To: Seiji Aguchi
  Cc: Luck, Tony, Don Zickus, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, 'ying.huang,
	'ak, 'hughd, 'mingo, jmorris, a.p.zijlstra, namhyung,
	dle-develop, Satoru Moriya

于 2012/1/11 4:29, Seiji Aguchi 写道:
>
>> I agree with you. How about adding macros or something like WARN_ON(XX_ARCH) or
>> Kconfig to limit its scope?
>
> Thank you for giving me your idea.
> Your suggestions above will work for me because I'm a x86 user.
> If Tony agrees to it, I can update my patch.
>
> But, I'm hesitating to add WARN_ON() or change Kconfig only for specific arch
> because pstore aims for generic interface and this is related to its design.
> Also, ramoops is going to use pstore now. It doesn't depend on x86.
> I'm worried that ramoops users will complain about this change.
>
> So, I think a reasonable solution at this time is just adding some explanations
> about smp_send_stop() to documentation as follows.
>
> Users can use pstore with their own responsibility and ask developers
> if smp_send_stop() is reliable enough in panic situation on architecture they want to run.
>
> What do you think?
>
> ---
>   Documentation/ABI/testing/pstore |    8 ++++++++
>   1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/ABI/testing/pstore b/Documentation/ABI/testing/pstore
> index ff1df4e..5583729 100644
> --- a/Documentation/ABI/testing/pstore
> +++ b/Documentation/ABI/testing/pstore
> @@ -11,6 +11,14 @@ Description:	Generic interface to platform dependent persistent storage.
>   		of the console log is captured, but other interesting
>   		data can also be saved.
>
> +		In case of panic, pstore is invoked after smp_send_stop()
> +		,a function call stopping other cpus, so that we can get
> +		logs simpler and cleaner with just one running cpu.
> +
> +		As for x86, smp_send_stop() is reliable enough to work in
> +		panic situation. But we are not guaranteed that it works
> +		reliably on other architectures.
> +
>   		# mount -t pstore -o kmsg_bytes=8000 - /dev/pstore
>
>   		$ ls -l /dev/pstore

The explanation is great. but In my opinion, I still insist that
a WARN_ON() is necessary. What do you think, Tony and Don?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-11  7:28               ` Chen Gong
@ 2012-01-11 17:25                 ` Don Zickus
  2012-01-11 22:22                   ` Luck, Tony
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-01-11 17:25 UTC (permalink / raw)
  To: Chen Gong
  Cc: Seiji Aguchi, Luck, Tony, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, 'ying.huang,
	'ak, 'hughd, 'mingo, jmorris, a.p.zijlstra, namhyung,
	dle-develop, Satoru Moriya

On Wed, Jan 11, 2012 at 03:28:11PM +0800, Chen Gong wrote:
> 于 2012/1/11 4:29, Seiji Aguchi 写道:
> >
> >>I agree with you. How about adding macros or something like WARN_ON(XX_ARCH) or
> >>Kconfig to limit its scope?
> >
> >Thank you for giving me your idea.
> >Your suggestions above will work for me because I'm a x86 user.
> >If Tony agrees to it, I can update my patch.
> >
> >But, I'm hesitating to add WARN_ON() or change Kconfig only for specific arch
> >because pstore aims for generic interface and this is related to its design.
> >Also, ramoops is going to use pstore now. It doesn't depend on x86.
> >I'm worried that ramoops users will complain about this change.
> >
> >So, I think a reasonable solution at this time is just adding some explanations
> >about smp_send_stop() to documentation as follows.
> >
> >Users can use pstore with their own responsibility and ask developers
> >if smp_send_stop() is reliable enough in panic situation on architecture they want to run.
> >
> >What do you think?
> >
> >---
> >  Documentation/ABI/testing/pstore |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> >
> >diff --git a/Documentation/ABI/testing/pstore b/Documentation/ABI/testing/pstore
> >index ff1df4e..5583729 100644
> >--- a/Documentation/ABI/testing/pstore
> >+++ b/Documentation/ABI/testing/pstore
> >@@ -11,6 +11,14 @@ Description:	Generic interface to platform dependent persistent storage.
> >  		of the console log is captured, but other interesting
> >  		data can also be saved.
> >
> >+		In case of panic, pstore is invoked after smp_send_stop()
> >+		,a function call stopping other cpus, so that we can get
> >+		logs simpler and cleaner with just one running cpu.
> >+
> >+		As for x86, smp_send_stop() is reliable enough to work in
> >+		panic situation. But we are not guaranteed that it works
> >+		reliably on other architectures.
> >+
> >  		# mount -t pstore -o kmsg_bytes=8000 - /dev/pstore
> >
> >  		$ ls -l /dev/pstore
> 
> The explanation is great. but In my opinion, I still insist that
> a WARN_ON() is necessary. What do you think, Tony and Don?

I guess I still don't understand why.  Who uses kmsg_dump besides x86?  It
seems like there was only 3 or 4 subsystems that were registering.

Cheers,
Don


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-11 17:25                 ` Don Zickus
@ 2012-01-11 22:22                   ` Luck, Tony
  2012-01-13 22:50                     ` Seiji Aguchi
       [not found]                     ` <32727E9A83EE9A42A1F0906295A3A77B2C78F49973@USINDEVS01.corp.hds.com>
  0 siblings, 2 replies; 24+ messages in thread
From: Luck, Tony @ 2012-01-11 22:22 UTC (permalink / raw)
  To: Don Zickus, Chen Gong
  Cc: Seiji Aguchi, linux-kernel, Matthew Garrett, Vivek Goyal, Chen,
	Gong, akpm, Brown, Len, 'ying.huang, 'ak, 'hughd,
	'mingo, jmorris, a.p.zijlstra, namhyung, dle-develop,
	Satoru Moriya

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1385 bytes --]

> The explanation is great. but In my opinion, I still insist that
> a WARN_ON() is necessary. What do you think, Tony and Don?

I disagree - a WARN_ON() would clutter the log with a stack trace of
dubious usefulness - possibly displacing something earlier in the
console log from being saved to pstore.

I'm about 80% sold on the idea of moving the call below smp_send_stop().
It does make life simpler for pstore (though not completely pain free,
ignoring locks is safe only in as far as we know that other cpus aren't
actually trying to modify things - there is still some problem if the
other cpu had a device or data structure in some special state when
smp_send_stop() interrupted it). The 20% of me that isn't buying this
still has worries that smp_send_stop() might fail in one of several ways:
1) Fails to actually stop one or more other cpus (this is similar to our
current situation where other cpus may interfere with us saving kmsg in
pstore).
2) Causes another fault, thus recursively entering the panic path.
3) Hangs - causing us to miss saving to pstore.

I don't know what can be done to resolve this - it is hard to make a
100% convincing argument about the execution of any code in the panic
path.

-Tony
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-11 22:22                   ` Luck, Tony
@ 2012-01-13 22:50                     ` Seiji Aguchi
       [not found]                     ` <32727E9A83EE9A42A1F0906295A3A77B2C78F49973@USINDEVS01.corp.hds.com>
  1 sibling, 0 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-13 22:50 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus, Chen Gong
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1603 bytes --]

Tony, 

I understand you are seriously concerned about reliability of pstore.
And I'm in the same position.

But I still suggest to move kmsg_dump() below smp_send_stop().

>The 20% of me that isn't buying this
>still has worries that smp_send_stop() might fail in one of several ways:
>1) Fails to actually stop one or more other cpus (this is similar to our
>current situation where other cpus may interfere with us saving kmsg in
>pstore).

I don't understand this case. Have you ever experienced some specific cases 
failing to stop cpus?

As for x86, this case will never happen if not cpus are broken.

>2) Causes another fault, thus recursively entering the panic path.
>3) Hangs - causing us to miss saving to pstore.

These concerns are not just smp_send_stop().

In panic(), there are some function calls above kmsg_dump(). 
 Ex. dump_stack(), printk(), crash_kexec()....
If they panic/hang, same issues will happen.

So, 2) and 3) are not reasonable reasons for rejecting to move kmsg_dump() below smp_send_stop().

>
>I don't know what can be done to resolve this - it is hard to make a
>100% convincing argument about the execution of any code in the panic
>path.

One of the ways we have confidence is doing more testing.
As for kdump, LKDTM is used for checking regressions of kdump.

If pstore works with LKDTM, we can prove that pstore has minimal reliabliy.
(I don't know if we need additional testing at this time.)

Seiji

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
       [not found]                     ` <32727E9A83EE9A42A1F0906295A3A77B2C78F49973@USINDEVS01.corp.hds.com>
@ 2012-01-19 20:58                       ` Seiji Aguchi
  2012-01-20 17:56                         ` Luck, Tony
  0 siblings, 1 reply; 24+ messages in thread
From: Seiji Aguchi @ 2012-01-19 20:58 UTC (permalink / raw)
  To: Luck, Tony, Don Zickus, Chen Gong
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

Tony,

Do you have any comments?

________________________________________

Tony,

I understand you are seriously concerned about reliability of pstore.
And I'm in the same position.

But I still suggest to move kmsg_dump() below smp_send_stop().

>The 20% of me that isn't buying this
>still has worries that smp_send_stop() might fail in one of several ways:
>1) Fails to actually stop one or more other cpus (this is similar to our
>current situation where other cpus may interfere with us saving kmsg in
>pstore).

I don't understand this case. Have you ever experienced some specific cases
failing to stop cpus?

As for x86, this case will never happen if not cpus are broken.

>2) Causes another fault, thus recursively entering the panic path.
>3) Hangs - causing us to miss saving to pstore.

These concerns are not just smp_send_stop().

In panic(), there are some function calls above kmsg_dump().
 Ex. dump_stack(), printk(), crash_kexec()....
If they panic/hang, same issues will happen.

So, 2) and 3) are not reasonable reasons for rejecting to move kmsg_dump() below smp_send_stop().

>
>I don't know what can be done to resolve this - it is hard to make a
>100% convincing argument about the execution of any code in the panic
>path.

One of the ways we have confidence is doing more testing.
As for kdump, LKDTM is used for checking regressions of kdump.

If pstore works with LKDTM, we can prove that pstore has minimal reliabliy.
(I don't know if we need additional testing at this time.)

Seiji


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-19 20:58                       ` Seiji Aguchi
@ 2012-01-20 17:56                         ` Luck, Tony
  2012-02-03 17:18                           ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Luck, Tony @ 2012-01-20 17:56 UTC (permalink / raw)
  To: Seiji Aguchi, Don Zickus, Chen Gong
  Cc: linux-kernel, Matthew Garrett, Vivek Goyal, Chen, Gong, akpm,
	Brown, Len, 'ying.huang, 'ak, 'hughd, 'mingo,
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

> Do you have any comments?

I'm stuck in because I don't know how assign probabilities to
the failure cases with kmsg_dump() before and after the smp_send_stop().

There's a well documented tendency in humans to stick with the status
quo in such situations.  I'm definitely finding it hard to provide
a positive recommendation (ACK).

So I'll just talk out loud here for a bit in case someone sees
something obviously flawed in my understanding.

Problem statement: We'd like to maximize our chances of saving the
tail of the kernel log when the system goes down. With the current
ordering there is a concern that other cpus will interfere with the
one that is saving the log.

Problems in current code flow:
*) Other cpus might hold locks that we need. Our options are to fail,
   or to "bust" the locks (but busting the locks may lead to other
   problems in the code path - those locks were there for a reason).
   There are only a couple of ways that this could be an issue.
   1) The lock is held because someone is doing some other pstore
   filesystem operation (reading and erasing records). This has a
   very low probability. Normal code flow will have some process harvest
   records from pstore in some /etc/rc.d/* script - this process should
   take much less than a second.
   2) The lock is held because some other kmsg_dump() store is in progress.
   This one seems more worrying - think of an OOPS (or several) right
   before we panic

Problems in proposed code flow:
*) smp_send_stop() fails:
   1) doesn't actually stop other cpus (we are no worse off than before we
   made this change)
   2) doesn't return - so we don't even try to dump to pstore back end. x86
   code has recently been hardened (though I can still imagine a pathological
   case where in a crash the cpu calling this is uncertain of its own
   identity, and somehow manages to stop itself - perhaps we are so screwed up
   in this case that we have no hope anyway)
*) Even if it succeeds - we may still run into problems busting locks because
   even though the cpu that held them isn't executing, the data structures
   or device registers protected by the lock may be in an inconsistent state.
*) If we had just let this other cpus keep running, they'd have finished their
   operation and freed up the problem lock anyway

-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-01-20 17:56                         ` Luck, Tony
@ 2012-02-03 17:18                           ` Don Zickus
  2012-02-03 22:32                             ` Luck, Tony
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-02-03 17:18 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, 'ying.huang,
	'ak, 'hughd, 'mingo, jmorris, a.p.zijlstra, namhyung,
	dle-develop, Satoru Moriya

On Fri, Jan 20, 2012 at 05:56:15PM +0000, Luck, Tony wrote:
> > Do you have any comments?
> 
> I'm stuck in because I don't know how assign probabilities to
> the failure cases with kmsg_dump() before and after the smp_send_stop().
> 
> There's a well documented tendency in humans to stick with the status
> quo in such situations.  I'm definitely finding it hard to provide
> a positive recommendation (ACK).
> 
> So I'll just talk out loud here for a bit in case someone sees
> something obviously flawed in my understanding.

Thanks for trying to think this through.

> 
> Problem statement: We'd like to maximize our chances of saving the
> tail of the kernel log when the system goes down. With the current
> ordering there is a concern that other cpus will interfere with the
> one that is saving the log.
> 
> Problems in current code flow:
> *) Other cpus might hold locks that we need. Our options are to fail,
>    or to "bust" the locks (but busting the locks may lead to other
>    problems in the code path - those locks were there for a reason).
>    There are only a couple of ways that this could be an issue.
>    1) The lock is held because someone is doing some other pstore
>    filesystem operation (reading and erasing records). This has a
>    very low probability. Normal code flow will have some process harvest
>    records from pstore in some /etc/rc.d/* script - this process should
>    take much less than a second.

ok.

>    2) The lock is held because some other kmsg_dump() store is in progress.
>    This one seems more worrying - think of an OOPS (or several) right
>    before we panic

I think Seiji had another patch to address this.

> 
> Problems in proposed code flow:
> *) smp_send_stop() fails:
>    1) doesn't actually stop other cpus (we are no worse off than before we
>    made this change)
>    2) doesn't return - so we don't even try to dump to pstore back end. x86
>    code has recently been hardened (though I can still imagine a pathological
>    case where in a crash the cpu calling this is uncertain of its own
>    identity, and somehow manages to stop itself - perhaps we are so screwed up
>    in this case that we have no hope anyway)

Where in the code do you see that it might not return?  We can also
conceive of a scenario such that pstore or apei code has a bug and oops
the box a second time and we are no better off.  That code has a lot more
churn then the shutdown code, I believe.

> *) Even if it succeeds - we may still run into problems busting locks because
>    even though the cpu that held them isn't executing, the data structures
>    or device registers protected by the lock may be in an inconsistent state.
> *) If we had just let this other cpus keep running, they'd have finished their
>    operation and freed up the problem lock anyway

So this is an interesting concern and it would be nice to have that extra
second to finish things off before breaking the spin lock.  I was trying
to figure out if there was a way to do that.

On x86 I think we can (and maybe others).  I had a thought, most
spinlocks are taken by disabling interrupts (ie spin_lock_irqsave).  By
disabling interrupts you block IPIs.  Originally the shutdown code would
use the REBOOT_IPI to stop other cpus but would fail if some piece of code
on another cpu was spinning forever with irqs disabled.  I modified it to
use NMIs to be more robust.

What if we send the REBOOT_IPI first and let it block for up to a second.
Most code paths that are done with spin_locks will use
spin_lock_irqrestore.  As soon as the interrupts are re-enabled the
REBOOT_IPI comes in and takes the processor.  If after a second the cpu
still is blocking interrupts, just use the NMI as a big hammer to shut it
down.

This would allow the pstore stuff to block shutting things down for a
little bit to finish writing its data out before accepting the IPI (at
least for a second).  Otherwise if it takes more than a second and the NMI
has to come in, we may have to investigate what is going on.

Would that help win you over?

I know that is x86 specific, but other arches might be able to adapt a
similar approach?

Cheers,
Don

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-03 17:18                           ` Don Zickus
@ 2012-02-03 22:32                             ` Luck, Tony
  2012-02-03 22:57                               ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Luck, Tony @ 2012-02-03 22:32 UTC (permalink / raw)
  To: Don Zickus
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, 'ying.huang,
	'ak, 'hughd, 'mingo, jmorris, a.p.zijlstra, namhyung,
	dle-develop, Satoru Moriya

> What if we send the REBOOT_IPI first and let it block for up to a second.
> Most code paths that are done with spin_locks will use
> spin_lock_irqrestore.  As soon as the interrupts are re-enabled the
> REBOOT_IPI comes in and takes the processor.  If after a second the cpu
> still is blocking interrupts, just use the NMI as a big hammer to shut it
> down.

This looks good - it certainly deals with my "if we just let them run
a bit, they'd release the locks" quibble.  One second sounds very
generous - but I'm not going to bikeshed that (so long as it is a total
of one second - not one second per cpu). So the pseudo-code is:

	send_reboot_ipi_to_everyone_else()

	wait_1_second()

	for_each_cpu_that_didnt_respond_to_reboot_ipi {
		hit_that_cpu_with_NMI()
	}

Perhaps a notification printk() if we had to use the NMI hammer?

-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-03 22:32                             ` Luck, Tony
@ 2012-02-03 22:57                               ` Don Zickus
  2012-02-08 20:19                                 ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-02-03 22:57 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len,
	'ying.huang@intel.com', 'ak@linux.intel.com',
	'hughd@chromium.org', 'mingo@elte.hu',
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

On Fri, Feb 03, 2012 at 10:32:31PM +0000, Luck, Tony wrote:
> > What if we send the REBOOT_IPI first and let it block for up to a second.
> > Most code paths that are done with spin_locks will use
> > spin_lock_irqrestore.  As soon as the interrupts are re-enabled the
> > REBOOT_IPI comes in and takes the processor.  If after a second the cpu
> > still is blocking interrupts, just use the NMI as a big hammer to shut it
> > down.
> 
> This looks good - it certainly deals with my "if we just let them run
> a bit, they'd release the locks" quibble.  One second sounds very
> generous - but I'm not going to bikeshed that (so long as it is a total
> of one second - not one second per cpu). So the pseudo-code is:

This is how the stop_cpus is implemented on x86 and the one second comes
from there

arch/x86/kernel/smp.c::native_irq_stop_other_cpus and
native_nmi_stop_other_cpus

> 
> 	send_reboot_ipi_to_everyone_else()
> 
> 	wait_1_second()
> 
> 	for_each_cpu_that_didnt_respond_to_reboot_ipi {
> 		hit_that_cpu_with_NMI()
> 	}
> 
> Perhaps a notification printk() if we had to use the NMI hammer?

Yes.

Again this is for x86, but I guess that is our common case with pstore.

Cheers,
Don

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-03 22:57                               ` Don Zickus
@ 2012-02-08 20:19                                 ` Don Zickus
  2012-02-08 21:28                                   ` Luck, Tony
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-02-08 20:19 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len,
	'ying.huang@intel.com', 'ak@linux.intel.com',
	'hughd@chromium.org', 'mingo@elte.hu',
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

On Fri, Feb 03, 2012 at 05:57:40PM -0500, Don Zickus wrote:
> On Fri, Feb 03, 2012 at 10:32:31PM +0000, Luck, Tony wrote:
> > > What if we send the REBOOT_IPI first and let it block for up to a second.
> > > Most code paths that are done with spin_locks will use
> > > spin_lock_irqrestore.  As soon as the interrupts are re-enabled the
> > > REBOOT_IPI comes in and takes the processor.  If after a second the cpu
> > > still is blocking interrupts, just use the NMI as a big hammer to shut it
> > > down.
> > 
> > This looks good - it certainly deals with my "if we just let them run
> > a bit, they'd release the locks" quibble.  One second sounds very
> > generous - but I'm not going to bikeshed that (so long as it is a total
> > of one second - not one second per cpu). So the pseudo-code is:
> 

Hi Tony,

If I put together a patch to address this would you be willing to let
Seiji move the kmsg_dump to below smp_send_stop()?

Cheers,
Don

> This is how the stop_cpus is implemented on x86 and the one second comes
> from there
> 
> arch/x86/kernel/smp.c::native_irq_stop_other_cpus and
> native_nmi_stop_other_cpus
> 
> > 
> > 	send_reboot_ipi_to_everyone_else()
> > 
> > 	wait_1_second()
> > 
> > 	for_each_cpu_that_didnt_respond_to_reboot_ipi {
> > 		hit_that_cpu_with_NMI()
> > 	}
> > 
> > Perhaps a notification printk() if we had to use the NMI hammer?
> 
> Yes.
> 
> Again this is for x86, but I guess that is our common case with pstore.
> 
> Cheers,
> Don

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-08 20:19                                 ` Don Zickus
@ 2012-02-08 21:28                                   ` Luck, Tony
  2012-02-08 22:48                                     ` Don Zickus
  0 siblings, 1 reply; 24+ messages in thread
From: Luck, Tony @ 2012-02-08 21:28 UTC (permalink / raw)
  To: Don Zickus
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, Huang, Ying,
	'ak@linux.intel.com', 'hughd@chromium.org',
	'mingo@elte.hu',
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

> If I put together a patch to address this would you be willing to let
> Seiji move the kmsg_dump to below smp_send_stop()?

I'd "Ack" it ... not my decision whether to apply it.

-Tony

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-08 21:28                                   ` Luck, Tony
@ 2012-02-08 22:48                                     ` Don Zickus
  2012-02-08 22:56                                       ` Seiji Aguchi
  0 siblings, 1 reply; 24+ messages in thread
From: Don Zickus @ 2012-02-08 22:48 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Seiji Aguchi, Chen Gong, linux-kernel, Matthew Garrett,
	Vivek Goyal, Chen, Gong, akpm, Brown, Len, Huang, Ying,
	'ak@linux.intel.com', 'hughd@chromium.org',
	'mingo@elte.hu',
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

On Wed, Feb 08, 2012 at 09:28:37PM +0000, Luck, Tony wrote:
> > If I put together a patch to address this would you be willing to let
> > Seiji move the kmsg_dump to below smp_send_stop()?
> 
> I'd "Ack" it ... not my decision whether to apply it.

Sure, I understand. But I figured your objection is holding it up to. ;-)

Let me put together a patch for that change.

Cheers,
Don

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()
  2012-02-08 22:48                                     ` Don Zickus
@ 2012-02-08 22:56                                       ` Seiji Aguchi
  0 siblings, 0 replies; 24+ messages in thread
From: Seiji Aguchi @ 2012-02-08 22:56 UTC (permalink / raw)
  To: Don Zickus, Luck, Tony
  Cc: Chen Gong, linux-kernel, Matthew Garrett, Vivek Goyal, Chen,
	Gong, akpm, Brown, Len, Huang, Ying, 'ak@linux.intel.com',
	'hughd@chromium.org', 'mingo@elte.hu',
	jmorris, a.p.zijlstra, namhyung, dle-develop, Satoru Moriya

Tony,

You are the appropriate person to apply this patch.

Seiji

-----Original Message-----
From: Don Zickus [mailto:dzickus@redhat.com] 
Sent: Wednesday, February 08, 2012 5:49 PM
To: Luck, Tony
Cc: Seiji Aguchi; Chen Gong; linux-kernel@vger.kernel.org; Matthew Garrett; Vivek Goyal; Chen, Gong; akpm@linux-foundation.org; Brown, Len; Huang, Ying; 'ak@linux.intel.com'; 'hughd@chromium.org'; 'mingo@elte.hu'; jmorris@namei.org; a.p.zijlstra@chello.nl; namhyung@gmail.com; dle-develop@lists.sourceforge.net; Satoru Moriya
Subject: Re: [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop()

On Wed, Feb 08, 2012 at 09:28:37PM +0000, Luck, Tony wrote:
> > If I put together a patch to address this would you be willing to 
> > let Seiji move the kmsg_dump to below smp_send_stop()?
> 
> I'd "Ack" it ... not my decision whether to apply it.

Sure, I understand. But I figured your objection is holding it up to. ;-)

Let me put together a patch for that change.

Cheers,
Don

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2012-02-08 22:57 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-05 17:35 [RFC][PATCH v4 -next 0/4] Make pstore/kmsg_dump run after stopping other cpus in panic path Seiji Aguchi
2012-01-05 17:36 ` [RFC][PATCH v4 -next 1/4] Move kmsg_dump(KMSG_DUMP_PANIC) below smp_send_stop() Seiji Aguchi
2012-01-05 19:06   ` Luck, Tony
2012-01-05 20:10     ` Seiji Aguchi
2012-01-05 21:01       ` Don Zickus
2012-01-09 17:59         ` Seiji Aguchi
2012-01-10  3:06           ` Chen Gong
2012-01-10 20:29             ` Seiji Aguchi
2012-01-11  7:28               ` Chen Gong
2012-01-11 17:25                 ` Don Zickus
2012-01-11 22:22                   ` Luck, Tony
2012-01-13 22:50                     ` Seiji Aguchi
     [not found]                     ` <32727E9A83EE9A42A1F0906295A3A77B2C78F49973@USINDEVS01.corp.hds.com>
2012-01-19 20:58                       ` Seiji Aguchi
2012-01-20 17:56                         ` Luck, Tony
2012-02-03 17:18                           ` Don Zickus
2012-02-03 22:32                             ` Luck, Tony
2012-02-03 22:57                               ` Don Zickus
2012-02-08 20:19                                 ` Don Zickus
2012-02-08 21:28                                   ` Luck, Tony
2012-02-08 22:48                                     ` Don Zickus
2012-02-08 22:56                                       ` Seiji Aguchi
2012-01-05 17:38 ` [RFC][PATCH v4 -next 2/4] Skip spin_locks in panic case and Add WARN_ON() Seiji Aguchi
2012-01-05 17:39 ` [RFC][PATCH v4 -next 3/4]Skip subsequent kmsg_dump() function calls in panic path Seiji Aguchi
2012-01-05 17:41 ` [RFC][PATCH v4 -next 4/4] Skip spin_lock of efi_pstore_write() in panic case Seiji Aguchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).