From: Borislav Petkov <bp@alien8.de>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Corey Minyard <minyard@acm.org>,
	linux-rt-users@vger.kernel.org,
	Corey Minyard <cminyard@mvista.com>,
	Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH][RT] x86: Fix an RT MCE crash
Date: Thu, 30 Jun 2016 18:01:28 +0200
Message-ID: <20160630160128.GA4365@pd.tnic>
In-Reply-To: <20160630115101.6337c395@gandalf.local.home>

+ Tony.

On Thu, Jun 30, 2016 at 11:51:01AM -0400, Steven Rostedt wrote:
> > >> From: Corey Minyard <cminyard@mvista.com>
> > >>
> > >> On some x86 systems an MCE interrupt would come in before the kernel
> > >> was ready for it.  Looking at the latest RT code, it has similar
> > >> (but not quite the same) code, except it adds a bool that tells if
> > >> MCE handling is initialized.  Add the same bool for older versions.
> > >>
> > >> Signed-off-by: Corey Minyard <cminyard@mvista.com>
> > >> ---
> > >>   arch/x86/kernel/cpu/mcheck/mce.c | 5 ++++-
> > >>   1 file changed, 4 insertions(+), 1 deletion(-)
> > >>
> > >> We noticed this issue on a new Broadwell system when we booted RT

Do you have any logs which hint at when exactly the MCE gets raised?

> > >> on it.  This patch is for 3.10, I'm not sure if it applies to
> > >> other kernel versions.  
> > > Do you mean other 'older' versions? and that this works with the
> > > versions after 3.10 without this patch?  
> > 
> > I haven't looked at supported kernel versions besides 3.10 and 4.4.
> > The fix was from the 4.4 version of this code.  This patch fixes
> > v3.10-rt; I can look at finding which other versions need this.  I
> > was planning to do this, but I wanted to get the patch out for
> > comments first.
> 
> I'm not an MCE expert (I just Cc'd one though ;-)
> 
> OK, so you are saying that the fix was from 4.4-rt? I can go and look
> for it, and if so, I can add it to the "backport" patches I need to do.
> Which I need to go and do that soon (backport patches from previous
> versions). It may already be in that list.
> 
> -- Steve
> 
> > 
> > -corey
> > 
> > > -- Steve
> > >  
> > >> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> > >> index aaf4b9b..7125584 100644
> > >> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> > >> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> > >> @@ -1365,6 +1365,7 @@ static void __mce_notify_work(void)
> > >>   }
> > >>   
> > >>   #ifdef CONFIG_PREEMPT_RT_FULL
> > >> +static bool notify_work_ready __read_mostly;
> > >>   struct task_struct *mce_notify_helper;
> > >>   
> > >>   static int mce_notify_helper_thread(void *unused)
> > >> @@ -1386,12 +1387,14 @@ static int mce_notify_work_init(void)

Hmm, what is mce_notify_work_init() ?

This must be some RT-homegrown thing.

What is it supposed to do? Upstream is much different from 3.10 or
whatever that kernel version is.

> > >>   	if (!mce_notify_helper)
> > >>   		return -ENOMEM;
> > >>   
> > >> +	notify_work_ready = true;
> > >>   	return 0;
> > >>   }
> > >>   
> > >>   static void mce_notify_work(void)

That is gone upstream too AFAICT.

> > >>   {
> > >> -	wake_up_process(mce_notify_helper);
> > >> +	if (notify_work_ready)
> > >> +		wake_up_process(mce_notify_helper);
> > >>   }
> > >>   #else
> > >>   static void mce_notify_work(void)  

Color me puzzled.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
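
For readers following the quoted hunks, here is a minimal user-space sketch of the guard the patch adds: a flag that is set only after the helper exists, and that is checked before the helper is woken. It is an illustration under stated assumptions, not the kernel code. The names struct helper_task, create_helper(), wake_helper(), notify_work_init(), notify_work() and helper_wakeups() are hypothetical stand-ins for the task_struct, the helper thread creation, wake_up_process(), mce_notify_work_init() and mce_notify_work() seen in the diff. The sketch is single-threaded and ignores the memory-ordering and context questions that matter for a real machine check handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task_struct; not the real type. */
struct helper_task {
	int wakeups;	/* how many times the helper has been woken */
};

static struct helper_task helper_storage;
static struct helper_task *notify_helper;	/* NULL until init has run */
static bool notify_work_ready;			/* set last, once the helper exists */

/* Assumed stand-in for creating the helper thread; may return NULL. */
static struct helper_task *create_helper(void)
{
	return &helper_storage;
}

/*
 * Assumed stand-in for wake_up_process(). Like the real call, it
 * dereferences its argument, so calling it with NULL would crash.
 */
static void wake_helper(struct helper_task *t)
{
	t->wakeups++;
}

/* Analogue of mce_notify_work_init() from the quoted diff. */
static int notify_work_init(void)
{
	notify_helper = create_helper();
	if (!notify_helper)
		return -1;

	notify_work_ready = true;	/* only now may notify_work() wake it */
	return 0;
}

/* Analogue of the patched mce_notify_work(): a no-op until init is done. */
static void notify_work(void)
{
	if (notify_work_ready)
		wake_helper(notify_helper);
}

/* Small accessor so callers can observe the effect. */
static int helper_wakeups(void)
{
	return helper_storage.wakeups;
}
```

Calling notify_work() before notify_work_init() leaves the wakeup count at zero instead of dereferencing a NULL helper pointer; after a successful init, each call wakes the helper once. Events that arrive before init are simply dropped in this sketch, which may or may not be acceptable for the real handler.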
