All of lore.kernel.org
 help / color / mirror / Atom feed
* pstore dump inside an nmi handler
@ 2011-07-08 20:17 Don Zickus
  2011-07-08 21:40 ` Luck, Tony
  0 siblings, 1 reply; 11+ messages in thread
From: Don Zickus @ 2011-07-08 20:17 UTC (permalink / raw)
  To: tony.luck; +Cc: mjg, linux-kernel

Hi Tony,

I was playing with the APEI EINJ module, injecting errors trying to
capture a GHES record, then panic into a kdump kernel and reboot.

Matthew brought to my attention that pstore should capture an error record
on the panic path using kmsg_dump().  After injecting an error with EINJ,
I went to check to see if there was a pstore entry.  There wasn't.

Playing on another box, I noticed the machine double faulted and didn't
even make it into a kdump kernel.

Upon investigation, I noticed that when a fatal error occurs on the
platform, it will generate an NMI that will be handle by the
ghes_nmi_handler.  This handler calls panic() which calls kmsg_dump()
which calls pstore_dump().

Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
(inside an nmi hander).  This seems to be the root cause of my problems.

I am not familiar enough with pstore to just modify its locking, so I
wanted to ask you.

My first thought was to wrap the mutex_lock with a 'if !in_nmi()', but that
seemed kinda hacky.  Then I was wondering if there was a way to do this
locklessly or atomically because you are only dealing with whole blocks I
think.  I don't know.

Wanted to give you a heads up and seek your thoughts.  I am willing to
hack up some code and test. :-)

Cheers,
Don


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: pstore dump inside an nmi handler
  2011-07-08 20:17 pstore dump inside an nmi handler Don Zickus
@ 2011-07-08 21:40 ` Luck, Tony
  2011-07-08 21:48   ` David Miller
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Luck, Tony @ 2011-07-08 21:40 UTC (permalink / raw)
  To: Don Zickus; +Cc: mjg, linux-kernel

> Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> (inside an nmi hander).  This seems to be the root cause of my problems.

Someone else pointed out that mutex_lock() is a problem here too. They
wondered whether spin_lock_irqsave() would work - or whether pstore
backends were allowed to sleep - to which I said I hoped they didn't,
but wasn't really sure what the future will hold.

So ... ideas (and patches) are most welcome.

-Tony



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-08 21:40 ` Luck, Tony
@ 2011-07-08 21:48   ` David Miller
  2011-07-08 21:49   ` Matthew Garrett
  2011-07-11 21:55   ` Don Zickus
  2 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2011-07-08 21:48 UTC (permalink / raw)
  To: tony.luck; +Cc: dzickus, mjg, linux-kernel

From: "Luck, Tony" <tony.luck@intel.com>
Date: Fri, 8 Jul 2011 14:40:13 -0700

>> Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
>> (inside an nmi hander).  This seems to be the root cause of my problems.
> 
> Someone else pointed out that mutex_lock() is a problem here too. They
> wondered whether spin_lock_irqsave() would work - or whether pstore
> backends were allowed to sleep - to which I said I hoped they didn't,
> but wasn't really sure what the future will hold.
> 
> So ... ideas (and patches) are most welcome.

Probably you'll need to queue things atomically using something
based upon cmpxchg() like the generic ring buffer does.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-08 21:40 ` Luck, Tony
  2011-07-08 21:48   ` David Miller
@ 2011-07-08 21:49   ` Matthew Garrett
  2011-07-08 22:10     ` Luck, Tony
  2011-07-11 15:39     ` Don Zickus
  2011-07-11 21:55   ` Don Zickus
  2 siblings, 2 replies; 11+ messages in thread
From: Matthew Garrett @ 2011-07-08 21:49 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Don Zickus, linux-kernel

On Fri, Jul 08, 2011 at 02:40:13PM -0700, Luck, Tony wrote:
> > Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> > (inside an nmi hander).  This seems to be the root cause of my problems.
> 
> Someone else pointed out that mutex_lock() is a problem here too. They
> wondered whether spin_lock_irqsave() would work - or whether pstore
> backends were allowed to sleep - to which I said I hoped they didn't,
> but wasn't really sure what the future will hold.

EFI can't sleep (at least, not as far as the kernel's concerned), so 
we're safe there. I think it's fair to assume atomicity here - crash 
dumping is a pretty specific situation. Although we may need to think 
about whether pstore should be saving reboot and poweroff in that case.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: pstore dump inside an nmi handler
  2011-07-08 21:49   ` Matthew Garrett
@ 2011-07-08 22:10     ` Luck, Tony
  2011-07-11 15:39     ` Don Zickus
  1 sibling, 0 replies; 11+ messages in thread
From: Luck, Tony @ 2011-07-08 22:10 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Don Zickus, linux-kernel

> EFI can't sleep (at least, not as far as the kernel's concerned), so 
> we're safe there. I think it's fair to assume atomicity here - crash 
> dumping is a pretty specific situation. Although we may need to think 
> about whether pstore should be saving reboot and poweroff in that case.

Hitachi engineers were very vocal about wanting the reboot/poweroff
logs (so they could deal with customers who accidentally rebooted the
wrong machine, and then denied doing so while also logging reports
that the victim machine had spontaneously reset).

-Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-08 21:49   ` Matthew Garrett
  2011-07-08 22:10     ` Luck, Tony
@ 2011-07-11 15:39     ` Don Zickus
  2011-07-13 18:15       ` Luck, Tony
  1 sibling, 1 reply; 11+ messages in thread
From: Don Zickus @ 2011-07-11 15:39 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Luck, Tony, linux-kernel

On Fri, Jul 08, 2011 at 10:49:41PM +0100, Matthew Garrett wrote:
> On Fri, Jul 08, 2011 at 02:40:13PM -0700, Luck, Tony wrote:
> > > Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> > > (inside an nmi hander).  This seems to be the root cause of my problems.
> > 
> > Someone else pointed out that mutex_lock() is a problem here too. They
> > wondered whether spin_lock_irqsave() would work - or whether pstore
> > backends were allowed to sleep - to which I said I hoped they didn't,
> > but wasn't really sure what the future will hold.
> 
> EFI can't sleep (at least, not as far as the kernel's concerned), so 
> we're safe there. I think it's fair to assume atomicity here - crash 
> dumping is a pretty specific situation. Although we may need to think 
> about whether pstore should be saving reboot and poweroff in that case.

Hmm, reading the ERST spec in ACPI4 (which is the backend behind pstore on
x86 boxes), it seems that reading/writing involves a state machine.  The
code doesn't seem to account for this if in the middle of reading an error
record, a system error occurs and you need to write an error record.  I
imagine the state machine would get screwed up and you would lose any info
you tried writing.

But perhaps, if we detect a spinlock is in use, we can bust it and reset
the state machine to solve that problem?

Cheers,
Don

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-08 21:40 ` Luck, Tony
  2011-07-08 21:48   ` David Miller
  2011-07-08 21:49   ` Matthew Garrett
@ 2011-07-11 21:55   ` Don Zickus
  2011-07-12 15:34     ` Don Zickus
  2 siblings, 1 reply; 11+ messages in thread
From: Don Zickus @ 2011-07-11 21:55 UTC (permalink / raw)
  To: Luck, Tony; +Cc: mjg, linux-kernel, ying.huang

On Fri, Jul 08, 2011 at 02:40:13PM -0700, Luck, Tony wrote:
> > Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> > (inside an nmi hander).  This seems to be the root cause of my problems.
> 
> Someone else pointed out that mutex_lock() is a problem here too. They
> wondered whether spin_lock_irqsave() would work - or whether pstore
> backends were allowed to sleep - to which I said I hoped they didn't,
> but wasn't really sure what the future will hold.
> 
> So ... ideas (and patches) are most welcome.

I tested the spin_lock_irqsave thing on my one box where it was failing
and got past my initial problem into kdump.  So that is a positive and I
can post the patch for that.  Though it probably isn't a complete
solution, it is better than a mutex.

However, I have been scratching my head at a follow up problem, which is
when I inject an error which produces an NMI->GHES->panic, the error
record doesn't get stored under pstore (or maybe ERST too).  I do see the
ERST code follow all the correct steps in storing the kmsg_dump logs into
the ERST table.  Just on the reboot, when I mount pstore it isn't there.

When I perform an 'echo c > /proc/sysrq-trigger', it shows up on the
reboot.  Not sure what can be going wrong.  I cc'd Ying with hopes he
might have some thoughts.

Cheers,
Don

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-11 21:55   ` Don Zickus
@ 2011-07-12 15:34     ` Don Zickus
  2011-07-13 16:58       ` Luck, Tony
  0 siblings, 1 reply; 11+ messages in thread
From: Don Zickus @ 2011-07-12 15:34 UTC (permalink / raw)
  To: Luck, Tony; +Cc: mjg, linux-kernel, ying.huang

On Mon, Jul 11, 2011 at 05:55:41PM -0400, Don Zickus wrote:
> On Fri, Jul 08, 2011 at 02:40:13PM -0700, Luck, Tony wrote:
> > > Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> > > (inside an nmi hander).  This seems to be the root cause of my problems.
> > 
> > Someone else pointed out that mutex_lock() is a problem here too. They
> > wondered whether spin_lock_irqsave() would work - or whether pstore
> > backends were allowed to sleep - to which I said I hoped they didn't,
> > but wasn't really sure what the future will hold.
> > 
> > So ... ideas (and patches) are most welcome.
> 
> I tested the spin_lock_irqsave thing on my one box where it was failing
> and got past my initial problem into kdump.  So that is a positive and I
> can post the patch for that.  Though it probably isn't a complete
> solution, it is better than a mutex.
> 
> However, I have been scratching my head at a follow up problem, which is
> when I inject an error which produces an NMI->GHES->panic, the error
> record doesn't get stored under pstore (or maybe ERST too).  I do see the
> ERST code follow all the correct steps in storing the kmsg_dump logs into
> the ERST table.  Just on the reboot, when I mount pstore it isn't there.

Actually, is it expected that the ERST can handle only 8 records?  Also if
you remove those records with pstore mount under /mnt; 'rm -rf
/mnt/dmesg-*', are those records removed immediately or are they cached to
be removed later?  IOW, if a did a 'rm -rf ..' and then an 'echo c >
/proc/sysrq-trigger' immediately after it, would I expect those records to
be removed or not?  Testing shows they are removed on reboot but the later
'echo c > ..' didn't save any new error records. :-/

Cheers,
Don

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: pstore dump inside an nmi handler
  2011-07-12 15:34     ` Don Zickus
@ 2011-07-13 16:58       ` Luck, Tony
  0 siblings, 0 replies; 11+ messages in thread
From: Luck, Tony @ 2011-07-13 16:58 UTC (permalink / raw)
  To: Don Zickus; +Cc: mjg, linux-kernel, Huang, Ying

> Actually, is it expected that the ERST can handle only 8 records?  Also if
> you remove those records with pstore mount under /mnt; 'rm -rf
> /mnt/dmesg-*', are those records removed immediately or are they cached to
> be removed later?  IOW, if a did a 'rm -rf ..' and then an 'echo c >
> /proc/sysrq-trigger' immediately after it, would I expect those records to
> be removed or not?  Testing shows they are removed on reboot but the later
> 'echo c > ..' didn't save any new error records. :-/

It isn't specified how much storage will be provided by ERST.  I tried to
find a limit on the system on which I did the original development, but
got bored at a hundred ... which seemed to be plenty enough since the usual
mode of operation ought to be to copy the records from pstore to some other
storage and remove them from pstore early in boot.

The unlink from pstore is passed immediately to the backing store to
do it's clear/erase operation ... but whether the BIOS implementation
of ERST makes that space immediately available might vary from platform
to platform.

-Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: pstore dump inside an nmi handler
  2011-07-11 15:39     ` Don Zickus
@ 2011-07-13 18:15       ` Luck, Tony
  2011-07-14 13:49         ` Don Zickus
  0 siblings, 1 reply; 11+ messages in thread
From: Luck, Tony @ 2011-07-13 18:15 UTC (permalink / raw)
  To: Don Zickus, Matthew Garrett; +Cc: linux-kernel

> But perhaps, if we detect a spinlock is in use, we can bust it and reset
> the state machine to solve that problem?

The ERST table specifies a series of actions to perform to make
the records get saved to (or read from) the platform persistent
store. These actions will be dependent on the h/w, and also on
the creativity of the BIOS writer who composed the ERST action
table.

Resetting the state machine looks like it might be tricky from
any arbitrary point in the sequence.

-Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: pstore dump inside an nmi handler
  2011-07-13 18:15       ` Luck, Tony
@ 2011-07-14 13:49         ` Don Zickus
  0 siblings, 0 replies; 11+ messages in thread
From: Don Zickus @ 2011-07-14 13:49 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Matthew Garrett, linux-kernel

On Wed, Jul 13, 2011 at 11:15:24AM -0700, Luck, Tony wrote:
> > But perhaps, if we detect a spinlock is in use, we can bust it and reset
> > the state machine to solve that problem?
> 
> The ERST table specifies a series of actions to perform to make
> the records get saved to (or read from) the platform persistent
> store. These actions will be dependent on the h/w, and also on
> the creativity of the BIOS writer who composed the ERST action
> table.
> 
> Resetting the state machine looks like it might be tricky from
> any arbitrary point in the sequence.

I agree and I think Matthew told me privately, it would probably be an
untested path, which would make things more problematic.  It would be nice
to have an NVRAM implementation.  That looks simpler and more robust in a
panic path.  But that is up to the vendor's BIOS. :-(

Cheers,
Don

> 
> -Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2011-07-14 13:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-08 20:17 pstore dump inside an nmi handler Don Zickus
2011-07-08 21:40 ` Luck, Tony
2011-07-08 21:48   ` David Miller
2011-07-08 21:49   ` Matthew Garrett
2011-07-08 22:10     ` Luck, Tony
2011-07-11 15:39     ` Don Zickus
2011-07-13 18:15       ` Luck, Tony
2011-07-14 13:49         ` Don Zickus
2011-07-11 21:55   ` Don Zickus
2011-07-12 15:34     ` Don Zickus
2011-07-13 16:58       ` Luck, Tony

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.