From: Corey Minyard <minyard@acm.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org,
	Corey Minyard <cminyard@mvista.com>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH][RT] x86: Fix an RT MCE crash
Date: Thu, 30 Jun 2016 10:58:57 -0500
Message-ID: <577541C1.20302@acm.org>
In-Reply-To: <20160630115101.6337c395@gandalf.local.home>

On 06/30/2016 10:51 AM, Steven Rostedt wrote:
> On Thu, 30 Jun 2016 09:49:19 -0500
> Corey Minyard <minyard@acm.org> wrote:
>
>> On 06/30/2016 08:43 AM, Steven Rostedt wrote:
>>> On Thu, 30 Jun 2016 08:24:49 -0500
>>> minyard@acm.org wrote:
>>>   
>>>> From: Corey Minyard <cminyard@mvista.com>
>>>>
>>>> On some x86 systems an MCE interrupt would come in before the kernel
>>>> was ready for it.  Looking at the latest RT code, it has similar
>>>> (but not quite the same) code, except it adds a bool that tells if
>>>> MCE handling is initialized.  Add the same bool for older versions.
>>>>
>>>> Signed-off-by: Corey Minyard <cminyard@mvista.com>
>>>> ---
>>>>    arch/x86/kernel/cpu/mcheck/mce.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> We noticed this issue on a new Broadwell system when we booted RT
>>>> on it.  This patch is for 3.10, I'm not sure if it applies to
>>>> other kernel versions.
>>> Do you mean other 'older' versions? and that this works with the
>>> versions after 3.10 without this patch?
>> I haven't looked at supported kernel versions besides 3.10 and 4.4.
>> The fix was from the 4.4 version of this code.  This patch fixes
>> v3.10-rt; I can look at finding which other versions need this.  I
>> was planning to do this, but I wanted to get the patch out for
>> comments first.
> I'm not an MCE expert (I just Cc'd one though ;-)

Ok.  It's not really an MCE bug per se, just an initialization
order bug.

>
> OK, so you are saying that the fix was from 4.4-rt? I can go and look
> for it, and if so, I can add it to the "backport" patches I need to do.
> Which I need to go and do that soon (backport patches from previous
> versions). It may already be in that list.

The fix came from 4.4-rt, but it's not a separate fix there.  The
4.4-rt change is
  d21959b8ad98 ("x86/mce: use swait queue for mce wakeups")
and it does the same thing as the 3.10-rt change
  49fe500d2abd ("x86/mce: Defer mce wakeups to threads for PREEMPT_RT").

The 3.10-rt change just lacks the bool that fixes the
initialization order issue.
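
To make the ordering issue concrete, here is a rough userspace sketch of
the guard the patch adds.  It is only an illustration, not kernel code:
struct helper, wake_helper(), notify_work_init() and notify_work() are
made-up stand-ins for task_struct, wake_up_process(),
mce_notify_work_init() and mce_notify_work(), and notify_work() returns
a bool here (the real function returns void) only so the effect can be
checked.  It is single-threaded and ignores any concurrency concerns.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct; not the kernel type. */
struct helper {
	int wakeups;
};

static struct helper helper_storage;
static struct helper *notify_helper;	/* NULL until init has run */
static bool notify_work_ready;		/* mirrors the bool in the patch */

/* Stand-in for wake_up_process(); it dereferences its argument. */
static void wake_helper(struct helper *h)
{
	h->wakeups++;
}

/* Analogue of mce_notify_work_init(): set up the helper, then set the flag. */
static int notify_work_init(void)
{
	notify_helper = &helper_storage;
	notify_work_ready = true;
	return 0;
}

/*
 * Analogue of mce_notify_work(): skip the wakeup if init has not run yet,
 * so an early notification never touches the NULL helper pointer.
 * Returns true if a wakeup was delivered.
 */
static bool notify_work(void)
{
	if (!notify_work_ready)
		return false;

	wake_helper(notify_helper);
	return true;
}
```

Calling notify_work() before notify_work_init() is a no-op in this
sketch, which is the behaviour the bool gives mce_notify_work() when an
MCE arrives before the helper thread exists.  Without the check, the
early call would pass a NULL pointer to the wakeup routine.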

-corey

>
> -- Steve
>
>> -corey
>>
>>> -- Steve
>>>   
>>>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
>>>> index aaf4b9b..7125584 100644
>>>> --- a/arch/x86/kernel/cpu/mcheck/mce.c
>>>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
>>>> @@ -1365,6 +1365,7 @@ static void __mce_notify_work(void)
>>>>    }
>>>>    
>>>>    #ifdef CONFIG_PREEMPT_RT_FULL
>>>> +static bool notify_work_ready __read_mostly;
>>>>    struct task_struct *mce_notify_helper;
>>>>    
>>>>    static int mce_notify_helper_thread(void *unused)
>>>> @@ -1386,12 +1387,14 @@ static int mce_notify_work_init(void)
>>>>    	if (!mce_notify_helper)
>>>>    		return -ENOMEM;
>>>>    
>>>> +	notify_work_ready = true;
>>>>    	return 0;
>>>>    }
>>>>    
>>>>    static void mce_notify_work(void)
>>>>    {
>>>> -	wake_up_process(mce_notify_helper);
>>>> +	if (notify_work_ready)
>>>> +		wake_up_process(mce_notify_helper);
>>>>    }
>>>>    #else
>>>>    static void mce_notify_work(void)



Thread overview: 25+ messages
2016-06-30 13:24 [PATCH][RT] x86: Fix an RT MCE crash minyard
2016-06-30 13:43 ` Steven Rostedt
2016-06-30 14:49   ` Corey Minyard
2016-06-30 15:51     ` Steven Rostedt
2016-06-30 15:58       ` Corey Minyard [this message]
2016-06-30 16:01       ` Borislav Petkov
2016-06-30 16:17         ` Luck, Tony
2016-06-30 16:40           ` Corey Minyard
2016-06-30 17:01             ` Borislav Petkov
2016-06-30 17:18               ` Corey Minyard
2016-06-30 17:26                 ` Borislav Petkov
2016-06-30 17:54                   ` Corey Minyard
2016-06-30 18:22                     ` Borislav Petkov
2016-06-30 19:44                       ` Corey Minyard
2016-06-30 20:34                         ` Borislav Petkov
2016-06-30 22:47                           ` Corey Minyard
2016-07-01  7:20                             ` Borislav Petkov
2016-07-06  0:59                               ` Corey Minyard
2016-07-06  8:37                                 ` Borislav Petkov
2016-07-06 12:03                                   ` Corey Minyard
2016-07-06 13:32                                     ` Steven Rostedt
2016-07-06 13:43                                       ` Sebastian Andrzej Siewior
2016-07-11 17:32                                         ` Steven Rostedt
2016-07-01  9:20         ` Daniel Wagner
2016-06-30 16:04       ` Corey Minyard
