All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@suse.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Chen, Gong" <gong.chen@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/4 Rebase] x86, MCE: Avoid potential deadlock in MCE context
Date: Thu, 11 Jun 2015 10:27:47 +0200	[thread overview]
Message-ID: <20150611082747.GA30391@pd.tnic> (raw)
In-Reply-To: <20150608202658.GF5877@pd.tnic>

On Mon, Jun 08, 2015 at 10:26:58PM +0200, Borislav Petkov wrote:
> On Mon, Jun 08, 2015 at 08:03:08PM +0000, Luck, Tony wrote:
> > @@ -156,7 +156,8 @@ void mce_log(struct mce *mce)
> >  	/* Emit the trace record: */
> >  	trace_mce_record(mce);
> >  
> > -	atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, mce);
> > +	mce_genpool_add(mce);
> > +	irq_work_queue(&mce_irq_work);
> > 
> > Is it safe to call irq_work_queue() from MCE context?
> 
> Yeah, we're using it in contexts like:

Yeah, so that is safe but there's another issue which I triggered
yesterday. I hope the commit message explains it thoroughly. If you see
holes, please shout. I haven't run this on the EINJ box yet, that's
still outstanding.

In any case, I'm thinking of not rushing those changes to tip yet and
have them simmer in linux-next for a whole round and have them ready for
4.3. Just to be on the safe side and have some more time hammering on
them.

---
From: Borislav Petkov <bp@suse.de>
Date: Wed, 10 Jun 2015 18:49:18 +0200
Subject: [PATCH] x86/mce: Fix logging of early boot machine checks

With moving the irq_work queuing to mce_log() which simplified other
code, we're faced with the problem of logging MCEs during early boot
which might be there from before. In particular, facilities which we
rely on, like workqueues, for example, are not initialized yet. See
splat below.

As a first step, move the workqueue init before the first invocation of
mce_log() in

__mcheck_cpu_init_generic
|-> machine_check_poll
    |-> mce_log

But that's not enough because the normal system workqueue on which stuff
gets queued with schedule_work() is not initialized yet either.

Therefore, we still queue the work in the genpool but delay the
scheduling of that work to a late initcall where everything should be
initialized.

While at it, make sure we queue irq_work only if the addition to the
genpool has been successful.

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<          (null)>]           (null)
  PGD 0
  Oops: 0010 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-rc7+ #12
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
  task: ffffffff8197d580 ti: ffffffff8196c000 task.ti: ffffffff8196c000
  RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)

  <registers snipped>

  Call Trace:
   <IRQ>
    ? irq_work_run_list
    irq_work_run
    smp_irq_work_interrupt
    irq_work_interrupt
   <EOI>
    ? apic_send_IPI_self
    arch_irq_work_raise
    irq_work_queue
    mce_log
    machine_check_poll
    __mcheck_cpu_init_generic
    mcheck_cpu_init
    identify_cpu
    identify_boot_cpu
    check_bugs
    start_kernel
    x86_64_start_reservations
    x86_64_start_kernel
  Code:  Bad RIP value.
  RIP  [<          (null)>]           (null)
   RSP <ffff88007b603f40>
  CR2: 0000000000000000
  ---[ end trace 6d6dd507e2636bd4 ]---
  Kernel panic - not syncing: Fatal exception in interrupt
  ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a93a150c3583..17cef061c24a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -159,8 +159,8 @@ void mce_log(struct mce *mce)
 	/* Emit the trace record: */
 	trace_mce_record(mce);
 
-	mce_genpool_add(mce);
-	irq_work_queue(&mce_irq_work);
+	if (mce_genpool_add(mce))
+		irq_work_queue(&mce_irq_work);
 
 	mce->finished = 0;
 	wmb();
@@ -480,7 +480,7 @@ int mce_available(struct cpuinfo_x86 *c)
 
 static void mce_schedule_work(void)
 {
-	if (!mce_genpool_empty())
+	if (!mce_genpool_empty() && keventd_up())
 		schedule_work(&mce_work);
 }
 
@@ -648,8 +648,9 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 			if (m.status & MCI_STATUS_ADDRV) {
 				m.severity = severity;
 				m.usable_addr = mce_usable_address(&m);
-				mce_genpool_add(&m);
-				mce_schedule_work();
+
+				if (mce_genpool_add(&m))
+					mce_schedule_work();
 			}
 		}
 
@@ -1704,13 +1705,14 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
 		return;
 	}
 
+	INIT_WORK(&mce_work, mce_process_work);
+	init_irq_work(&mce_irq_work, mce_irq_work_cb);
+
 	machine_check_vector = do_machine_check;
 
 	__mcheck_cpu_init_generic();
 	__mcheck_cpu_init_vendor(c);
 	__mcheck_cpu_init_timer();
-	INIT_WORK(&mce_work, mce_process_work);
-	init_irq_work(&mce_irq_work, mce_irq_work_cb);
 }
 
 /*
@@ -2565,5 +2567,20 @@ static int __init mcheck_debugfs_init(void)
 
 	return 0;
 }
-late_initcall(mcheck_debugfs_init);
+#else
+static int __init mcheck_debugfs_init(void) {}
 #endif
+
+static int __init mcheck_late_init(void)
+{
+	mcheck_debugfs_init();
+
+	/*
+	 * Flush out everything that has been logged during early boot, now that
+	 * everything has been initialized (workqueues, decoders, ...).
+	 */
+	mce_schedule_work();
+
+	return 0;
+}
+late_initcall(mcheck_late_init);
-- 
2.3.5

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

      reply	other threads:[~2015-06-11  8:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-20 19:35 MCE ring buffer management (Rebase) Chen, Gong
2015-05-20 19:35 ` [PATCH 1/4 Rebase] x86, MCE: Provide a lock-less memory pool to save error record Chen, Gong
2015-05-20 10:36   ` Borislav Petkov
2015-05-22 21:06     ` Chen, Gong
2015-05-22  8:17       ` Borislav Petkov
2015-05-20 19:35 ` [PATCH 2/4 Rebase] x86, MCE: Don't use percpu for MCE workqueue/irq_work Chen, Gong
2015-05-20 19:35 ` [PATCH 3/4 Rebase] x86, MCE: Remove mce_ring for SRAO error Chen, Gong
2015-05-20 19:35 ` [PATCH 4/4 Rebase] x86, MCE: Avoid potential deadlock in MCE context Chen, Gong
2015-05-20  9:28   ` Borislav Petkov
2015-05-22 21:12     ` Chen, Gong
2015-05-22  9:09       ` Borislav Petkov
2015-06-08 13:41         ` Borislav Petkov
2015-06-08 20:03           ` Luck, Tony
2015-06-08 20:26             ` Borislav Petkov
2015-06-11  8:27               ` Borislav Petkov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150611082747.GA30391@pd.tnic \
    --to=bp@suse.de \
    --cc=gong.chen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.