linux-kernel.vger.kernel.org archive mirror
From: Vishal Verma <vishal.l.verma@intel.com>
To: Borislav Petkov <bp@suse.de>
Cc: linux-kernel@vger.kernel.org, linux-nvdimm@ml01.01.org,
	x86@kernel.org, Ross Zwisler <ross.zwisler@linux.intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'
Date: Wed, 12 Apr 2017 13:59:03 -0600
Message-ID: <20170412195903.GA29506@omniknight.lm.intel.com>
In-Reply-To: <20170412091442.dwonfr4dwyta7nvx@pd.tnic>

On 04/12, Borislav Petkov wrote:
> On Tue, Apr 11, 2017 at 04:44:57PM -0600, Vishal Verma wrote:
> > The NFIT MCE handler callback (for handling media errors on NVDIMMs)
> > takes a mutex to add the location of a memory error to a list. But since
> > the notifier call chain for machine checks (x86_mce_decoder_chain) is
> > atomic, we get a lockdep splat like:
> > 
> >   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
> >   in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
> >   [..]
> >   Call Trace:
> >    dump_stack+0x86/0xc3
> >    ___might_sleep+0x178/0x240
> >    __might_sleep+0x4a/0x80
> >    mutex_lock_nested+0x43/0x3f0
> >    ? __lock_acquire+0xcbc/0x1290
> >    nfit_handle_mce+0x33/0x180 [nfit]
> >    notifier_call_chain+0x4a/0x70
> >    atomic_notifier_call_chain+0x6e/0x110
> >    ? atomic_notifier_call_chain+0x5/0x110
> >    mce_gen_pool_process+0x41/0x70
> > 
> > Commit 648ed94038c030245a06e4be59744fd5cdc18c40
> >       x86/mce: Provide a lockless memory pool to save error records
> > Changes the mce notifier callbacks to be run in a process context, and
> > this can allow us to use the 'blocking' type notifier, where we can take
> > mutexes etc. in the call chain functions.
> > 
> > Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> > ---
> >  arch/x86/kernel/cpu/mcheck/mce-genpool.c  | 2 +-
> >  arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +-
> >  arch/x86/kernel/cpu/mcheck/mce.c          | 8 ++++----
> >  3 files changed, 6 insertions(+), 6 deletions(-)
> > 
> > While this patch almost solves the problem, I think it is not quite right.
> > The x86_mce_decoder_chain is also called from print_mce for fatal machine
> > checks, and that is, afaict, still from an atomic context. One thing Tony
> > suggested was splitting the notifier chain into two distinct chains, one
> > for regular logging and recoverable actions that allows blocking, the
> > other from the panic path.
> 
> Well, if Mohammad won't come to the mountain...
> 
> So the NFIT handler has:
> 
>         /* We only care about memory errors */
>         if (!(mce->status & MCACOD))
>                 return NOTIFY_DONE;
> 
> what severity are we talking here? Errors which can be reported on the
> panic path, i.e., in atomic context or only AO/AR ones which don't raise
> an #MC exception?

I don't think we can do anything about the panic path errors. The NFIT
handler deals with the recoverable machine checks and essentially adds
the location of the error to a list.
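
To illustrate why that handler wants a context where it can sleep, here
is a minimal userspace sketch of the pattern: a notifier chain whose
callback filters on a status mask and appends the error address to a
list under a mutex. It is not the kernel code. Every sketch_* name, the
SKETCH_MEM_ERR_MASK value, and the use of pthread mutexes in place of
kernel mutexes are assumptions made for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's NOTIFY_* codes and MCACOD mask. */
#define SKETCH_NOTIFY_DONE   0
#define SKETCH_NOTIFY_OK     1
#define SKETCH_MEM_ERR_MASK  0x0080u   /* illustrative bit, not the real mask */

struct sketch_mce {
	uint64_t status;
	uint64_t addr;
};

struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, const struct sketch_mce *m);
	struct sketch_notifier *next;
};

/* A "blocking" chain: callbacks run with a sleepable lock held. */
struct sketch_chain {
	pthread_mutex_t lock;
	struct sketch_notifier *head;
};

struct sketch_bad_range {
	uint64_t addr;
	struct sketch_bad_range *next;
};

static pthread_mutex_t sketch_bad_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sketch_bad_range *sketch_bad_list;
static size_t sketch_bad_total;

/* Handler: ignore non-memory errors, otherwise record the address. */
int sketch_nfit_handler(struct sketch_notifier *nb, const struct sketch_mce *m)
{
	struct sketch_bad_range *r;

	(void)nb;
	if (!(m->status & SKETCH_MEM_ERR_MASK))
		return SKETCH_NOTIFY_DONE;

	r = malloc(sizeof(*r));
	if (!r)
		return SKETCH_NOTIFY_DONE;
	r->addr = m->addr;

	/* This lock may sleep, which is what an atomic chain forbids. */
	pthread_mutex_lock(&sketch_bad_lock);
	r->next = sketch_bad_list;
	sketch_bad_list = r;
	sketch_bad_total++;
	pthread_mutex_unlock(&sketch_bad_lock);

	return SKETCH_NOTIFY_OK;
}

void sketch_chain_register(struct sketch_chain *c, struct sketch_notifier *nb)
{
	assert(c != NULL && nb != NULL && nb->call != NULL);
	pthread_mutex_lock(&c->lock);
	nb->next = c->head;
	c->head = nb;
	pthread_mutex_unlock(&c->lock);
}

/* Walk the chain and return the result of the last callback invoked. */
int sketch_chain_call(struct sketch_chain *c, const struct sketch_mce *m)
{
	struct sketch_notifier *nb;
	int ret = SKETCH_NOTIFY_DONE;

	pthread_mutex_lock(&c->lock);
	for (nb = c->head; nb; nb = nb->next)
		ret = nb->call(nb, m);
	pthread_mutex_unlock(&c->lock);
	return ret;
}

size_t sketch_bad_count(void)
{
	size_t n;

	pthread_mutex_lock(&sketch_bad_lock);
	n = sketch_bad_total;
	pthread_mutex_unlock(&sketch_bad_lock);
	return n;
}
```

A caller would register one sketch_notifier pointing at
sketch_nfit_handler and pass each record through sketch_chain_call().
Only records whose status has the mask bit set end up on the list.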

> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> -- 

Thread overview: 23+ messages
2017-04-11 22:44 [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic' Vishal Verma
2017-04-12  9:14 ` Borislav Petkov
2017-04-12 19:59   ` Vishal Verma [this message]
2017-04-12 20:22     ` Borislav Petkov
2017-04-12 20:27       ` Verma, Vishal L
2017-04-12 20:52         ` Luck, Tony
2017-04-12 20:55           ` Dan Williams
2017-04-12 21:12             ` Thomas Gleixner
2017-04-12 21:19               ` Luck, Tony
2017-04-12 21:47                 ` Borislav Petkov
2017-04-12 22:16                   ` Borislav Petkov
2017-04-12 22:26                     ` Luck, Tony
2017-04-12 22:29                       ` Borislav Petkov
2017-04-13 11:31                         ` Borislav Petkov
2017-04-13 12:12                           ` Borislav Petkov
2017-04-18 16:28                             ` Luck, Tony
2017-04-21 21:39                           ` Verma, Vishal L
2017-04-12 21:13         ` Borislav Petkov
2017-04-12 21:50           ` Thomas Gleixner
2017-04-12 22:42             ` Paul E. McKenney
2017-04-12 23:45               ` Paul E. McKenney
2017-04-13 14:34                 ` Paul E. McKenney
2017-04-18 20:27 ` [tip:ras/urgent] x86/mce: Make the MCE notifier a blocking one tip-bot for Vishal Verma
