From: tip-bot for Vishal Verma <tipbot@zytor.com> To: linux-tip-commits@vger.kernel.org Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, ross.zwisler@linux.intel.com, x86@kernel.org, dan.j.williams@intel.com, tony.luck@intel.com, mingo@kernel.org, vishal.l.verma@intel.com, tglx@linutronix.de, bp@suse.de, stable@vger.kernel.org, linux-edac@vger.kernel.org Subject: [tip:ras/urgent] x86/mce: Make the MCE notifier a blocking one Date: Tue, 18 Apr 2017 13:27:57 -0700 [thread overview] Message-ID: <tip-0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2@git.kernel.org> (raw) In-Reply-To: <20170411224457.24777-1-vishal.l.verma@intel.com> Commit-ID: 0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2 Gitweb: http://git.kernel.org/tip/0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2 Author: Vishal Verma <vishal.l.verma@intel.com> AuthorDate: Tue, 18 Apr 2017 20:42:35 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Tue, 18 Apr 2017 22:23:48 +0200 x86/mce: Make the MCE notifier a blocking one The NFIT MCE handler callback (for handling media errors on NVDIMMs) takes a mutex to add the location of a memory error to a list. But since the notifier call chain for machine checks (x86_mce_decoder_chain) is atomic, we get a lockdep splat like: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0 [..] Call Trace: dump_stack ___might_sleep __might_sleep mutex_lock_nested ? __lock_acquire nfit_handle_mce notifier_call_chain atomic_notifier_call_chain ? atomic_notifier_call_chain mce_gen_pool_process Convert the notifier to a blocking one which gets to run only in process context. Boris: remove the notifier call in atomic context in print_mce(). For now, let's print the MCE on the atomic path so that we can make sure they go out and get logged at least. Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error") Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/cpu/mcheck/mce-genpool.c | 2 +- arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +- arch/x86/kernel/cpu/mcheck/mce.c | 17 +++-------------- 3 files changed, 5 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c index 1e5a50c..217cd44 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c +++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c @@ -85,7 +85,7 @@ void mce_gen_pool_process(struct work_struct *__unused) head = llist_reverse_order(head); llist_for_each_entry_safe(node, tmp, head, llnode) { mce = &node->mce; - atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, mce); + blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce); gen_pool_free(mce_evt_pool, (unsigned long)node, sizeof(*node)); } } diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h index 903043e..19592ba 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h @@ -13,7 +13,7 @@ enum severity_level { MCE_PANIC_SEVERITY, }; -extern struct atomic_notifier_head x86_mce_decoder_chain; +extern struct blocking_notifier_head x86_mce_decoder_chain; #define ATTR_LEN 16 #define INITIAL_CHECK_INTERVAL 5 * 60 /* 5 minutes */ diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 5accfbd..af44ebe 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -123,7 +123,7 @@ static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs); * CPU/chipset specific EDAC code can register a notifier call here to print * MCE errors in a human-readable form. */ -ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain); +BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain); /* Do initial initialization of a struct mce */ void mce_setup(struct mce *m) @@ -220,7 +220,7 @@ void mce_register_decode_chain(struct notifier_block *nb) WARN_ON(nb->priority > MCE_PRIO_LOWEST && nb->priority < MCE_PRIO_EDAC); - atomic_notifier_chain_register(&x86_mce_decoder_chain, nb); + blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); } EXPORT_SYMBOL_GPL(mce_register_decode_chain); @@ -228,7 +228,7 @@ void mce_unregister_decode_chain(struct notifier_block *nb) { atomic_dec(&num_notifiers); - atomic_notifier_chain_unregister(&x86_mce_decoder_chain, nb); + blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb); } EXPORT_SYMBOL_GPL(mce_unregister_decode_chain); @@ -321,18 +321,7 @@ static void __print_mce(struct mce *m) static void print_mce(struct mce *m) { - int ret = 0; - __print_mce(m); - - /* - * Print out human-readable details about the MCE error, - * (if the CPU has an implementation for that) - */ - ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m); - if (ret == NOTIFY_STOP) - return; - pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n"); }
WARNING: multiple messages have this Message-ID (diff)
From: tip-bot for Borislav Petkov <tipbot@zytor.com> To: linux-tip-commits@vger.kernel.org Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, ross.zwisler@linux.intel.com, x86@kernel.org, dan.j.williams@intel.com, tony.luck@intel.com, mingo@kernel.org, vishal.l.verma@intel.com, tglx@linutronix.de, bp@suse.de, stable@vger.kernel.org, linux-edac@vger.kernel.org Subject: [tip:ras/urgent] x86/mce: Make the MCE notifier a blocking one Date: Tue, 18 Apr 2017 13:27:57 -0700 [thread overview] Message-ID: <tip-0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2@git.kernel.org> (raw) Commit-ID: 0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2 Gitweb: http://git.kernel.org/tip/0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2 Author: Vishal Verma <vishal.l.verma@intel.com> AuthorDate: Tue, 18 Apr 2017 20:42:35 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Tue, 18 Apr 2017 22:23:48 +0200 x86/mce: Make the MCE notifier a blocking one The NFIT MCE handler callback (for handling media errors on NVDIMMs) takes a mutex to add the location of a memory error to a list. But since the notifier call chain for machine checks (x86_mce_decoder_chain) is atomic, we get a lockdep splat like: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0 [..] Call Trace: dump_stack ___might_sleep __might_sleep mutex_lock_nested ? __lock_acquire nfit_handle_mce notifier_call_chain atomic_notifier_call_chain ? atomic_notifier_call_chain mce_gen_pool_process Convert the notifier to a blocking one which gets to run only in process context. Boris: remove the notifier call in atomic context in print_mce(). For now, let's print the MCE on the atomic path so that we can make sure they go out and get logged at least. Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error") Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/cpu/mcheck/mce-genpool.c | 2 +- arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +- arch/x86/kernel/cpu/mcheck/mce.c | 17 +++-------------- 3 files changed, 5 insertions(+), 16 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-edac" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c index 1e5a50c..217cd44 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c +++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c @@ -85,7 +85,7 @@ void mce_gen_pool_process(struct work_struct *__unused) head = llist_reverse_order(head); llist_for_each_entry_safe(node, tmp, head, llnode) { mce = &node->mce; - atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, mce); + blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce); gen_pool_free(mce_evt_pool, (unsigned long)node, sizeof(*node)); } } diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h index 903043e..19592ba 100644 --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h @@ -13,7 +13,7 @@ enum severity_level { MCE_PANIC_SEVERITY, }; -extern struct atomic_notifier_head x86_mce_decoder_chain; +extern struct blocking_notifier_head x86_mce_decoder_chain; #define ATTR_LEN 16 #define INITIAL_CHECK_INTERVAL 5 * 60 /* 5 minutes */ diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 5accfbd..af44ebe 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -123,7 +123,7 @@ static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs); * CPU/chipset specific EDAC code can register a notifier call here to print * MCE errors in a human-readable form. */ -ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain); +BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain); /* Do initial initialization of a struct mce */ void mce_setup(struct mce *m) @@ -220,7 +220,7 @@ void mce_register_decode_chain(struct notifier_block *nb) WARN_ON(nb->priority > MCE_PRIO_LOWEST && nb->priority < MCE_PRIO_EDAC); - atomic_notifier_chain_register(&x86_mce_decoder_chain, nb); + blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); } EXPORT_SYMBOL_GPL(mce_register_decode_chain); @@ -228,7 +228,7 @@ void mce_unregister_decode_chain(struct notifier_block *nb) { atomic_dec(&num_notifiers); - atomic_notifier_chain_unregister(&x86_mce_decoder_chain, nb); + blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb); } EXPORT_SYMBOL_GPL(mce_unregister_decode_chain); @@ -321,18 +321,7 @@ static void __print_mce(struct mce *m) static void print_mce(struct mce *m) { - int ret = 0; - __print_mce(m); - - /* - * Print out human-readable details about the MCE error, - * (if the CPU has an implementation for that) - */ - ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m); - if (ret == NOTIFY_STOP) - return; - pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n"); }
next prev parent reply other threads:[~2017-04-18 20:29 UTC|newest] Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-04-11 22:44 [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic' Vishal Verma 2017-04-11 22:44 ` Vishal Verma 2017-04-12 9:14 ` Borislav Petkov 2017-04-12 9:14 ` Borislav Petkov 2017-04-12 19:59 ` Vishal Verma 2017-04-12 19:59 ` Vishal Verma 2017-04-12 20:22 ` Borislav Petkov 2017-04-12 20:22 ` Borislav Petkov 2017-04-12 20:27 ` Verma, Vishal L 2017-04-12 20:27 ` Verma, Vishal L 2017-04-12 20:52 ` Luck, Tony 2017-04-12 20:52 ` Luck, Tony 2017-04-12 20:55 ` Dan Williams 2017-04-12 20:55 ` Dan Williams 2017-04-12 21:12 ` Thomas Gleixner 2017-04-12 21:12 ` Thomas Gleixner 2017-04-12 21:19 ` Luck, Tony 2017-04-12 21:19 ` Luck, Tony 2017-04-12 21:47 ` Borislav Petkov 2017-04-12 21:47 ` Borislav Petkov 2017-04-12 22:16 ` Borislav Petkov 2017-04-12 22:16 ` Borislav Petkov 2017-04-12 22:26 ` Luck, Tony 2017-04-12 22:26 ` Luck, Tony 2017-04-12 22:29 ` Borislav Petkov 2017-04-12 22:29 ` Borislav Petkov 2017-04-13 11:31 ` Borislav Petkov 2017-04-13 11:31 ` Borislav Petkov 2017-04-13 12:12 ` Borislav Petkov 2017-04-13 12:12 ` Borislav Petkov 2017-04-18 16:28 ` Luck, Tony 2017-04-18 16:28 ` Luck, Tony [not found] ` <20170413113159.rc32ebiswn64nzrr-fF5Pk5pvG8Y@public.gmane.org> 2017-04-21 21:39 ` Verma, Vishal L 2017-04-21 21:39 ` Verma, Vishal L 2017-04-12 21:13 ` Borislav Petkov 2017-04-12 21:13 ` Borislav Petkov 2017-04-12 21:50 ` Thomas Gleixner 2017-04-12 21:50 ` Thomas Gleixner 2017-04-12 22:42 ` Paul E. McKenney 2017-04-12 22:42 ` Paul E. McKenney 2017-04-12 23:45 ` Paul E. McKenney 2017-04-12 23:45 ` Paul E. McKenney 2017-04-13 14:34 ` Paul E. McKenney 2017-04-13 14:34 ` Paul E. McKenney 2017-04-18 20:27 ` tip-bot for Vishal Verma [this message] 2017-04-18 20:27 ` [tip:ras/urgent] x86/mce: Make the MCE notifier a blocking one tip-bot for Borislav Petkov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=tip-0dc9c639e6553e39c13b2c0d54c8a1b098cb95e2@git.kernel.org \ --to=tipbot@zytor.com \ --cc=bp@suse.de \ --cc=dan.j.williams@intel.com \ --cc=hpa@zytor.com \ --cc=linux-edac@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-tip-commits@vger.kernel.org \ --cc=mingo@kernel.org \ --cc=ross.zwisler@linux.intel.com \ --cc=stable@vger.kernel.org \ --cc=tglx@linutronix.de \ --cc=tony.luck@intel.com \ --cc=vishal.l.verma@intel.com \ --cc=x86@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.