linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>, Borislav Petkov <bp@alien8.de>,
	X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace
Date: Thu, 13 Nov 2014 16:50:16 -0800	[thread overview]
Message-ID: <CALCETrVYWAOzaHqet8gS5hafvQC9prrH05qgc+v5dxSL9AF8bg@mail.gmail.com> (raw)
In-Reply-To: <CALCETrVJE25Txr5vRKAxHpeKx9p0VJBbat44w+X4gT8qX7a4bg@mail.gmail.com>

On Thu, Nov 13, 2014 at 3:13 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Nov 13, 2014 at 2:47 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Thu, Nov 13, 2014 at 2:33 PM, Luck, Tony <tony.luck@intel.com> wrote:
>>>> Are you sure that this works in an unmodified kernel
>>>
>>> Unmodified kernel has run tens of thousands of injection/consumption/recovery cycles.
>>>
>>> I did get a crash with the entry/exit traces you asked for.  Last 20000 lines of console log
>>> attached.  There are a couple of OOPs before things fall apart completely.  I haven't yet
>>> counted all the entry/exits from the last cycle to see if they match.
>>>
>>
>> That log was a good hint, and I am a fool.  I'll send a v3 once I test it.
>
> ...or not.  I confused myself there.  I thought I had a bug, but I was wrong.
>
> I'm stress-testing sleeping in an int3 handler that entered from user
> space, and I'm not seeing any problems, even with perf firing lots of
> NMIs.  I'm also passing the kprobes smoke test with my patch applied,
> and the stack switching code is correctly not switching stacks.
>
> Any chance you could try to trigger this this again with regs->sp,
> regs->ip, and regs->cs added to the cpu=%d regs=... message?  I feel
> like I'm missing something weird here.

Can you also try rebasing onto what will probably be v3?

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/tag/?id=paranoid-stack-v2.9

It adds debugging for inappropriate reschedules from the wrong stack.
Setting CONFIG_DEBUG_ATOMIC_SLEEP might also be a good idea.

It seems plausible to me that the failure mode is worst ==
MCE_AR_SEVERITY but regs->cs == 0 (i.e. in kernel).  This could blow
up in one of two ways:

1. current is the idle thread.  This causes the warning you saw.

2. current is a real user process, but that MCE was nested inside
another exception somehow or otherwise didn't switch stacks.  Now
we're on an IST stack and we schedule.  So far so good, but the next
thing that tries to use that IST stack cause lots of corruption.

It looks like if bug 1 is happening, then you might never notice it
without mce-stack.patch applied -- you'll set TIF_MCE_NOTIFY on the
idle thread, but the idle thread never returns to userspace, so the
mce notifier never has a chance to crash.

--Andy

  reply	other threads:[~2014-11-14  0:50 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-11 20:56 [RFC PATCH] x86, entry: Switch stacks on a paranoid entry from userspace Andy Lutomirski
2014-11-11 21:36 ` Borislav Petkov
2014-11-11 22:00   ` Luck, Tony
2014-11-11 22:15     ` Andy Lutomirski
2014-11-11 22:12   ` Andy Lutomirski
2014-11-11 22:33     ` Borislav Petkov
2014-11-11 22:40       ` Andy Lutomirski
2014-11-11 23:09         ` Borislav Petkov
2014-11-11 23:21           ` Andy Lutomirski
2014-11-12  0:22             ` Luck, Tony
2014-11-12  0:40               ` Andy Lutomirski
2014-11-12  1:06                 ` Luck, Tony
2014-11-12  2:01                   ` Andy Lutomirski
2014-11-12  2:06                   ` Tony Luck
2014-11-12 10:30                     ` Borislav Petkov
2014-11-12 15:48                       ` Andy Lutomirski
2014-11-12 16:22                         ` Borislav Petkov
2014-11-12 17:17                           ` Luck, Tony
2014-11-12 17:30                             ` Borislav Petkov
2014-11-13 18:04                           ` Borislav Petkov
2014-11-14 21:56                             ` Luck, Tony
2014-11-14 22:07                               ` Andy Lutomirski
2014-11-17 18:50                               ` Borislav Petkov
2014-11-17 19:57                                 ` Andy Lutomirski
2014-11-17 20:03                                   ` Borislav Petkov
2014-11-17 20:05                                     ` Andy Lutomirski
2014-11-17 21:55                                       ` Luck, Tony
2014-11-17 22:26                                         ` Andy Lutomirski
2014-11-17 23:16                                           ` Luck, Tony
2014-11-18  0:05                                             ` Andy Lutomirski
2014-11-18  0:22                                               ` Luck, Tony
2014-11-18  0:55                                                 ` Andy Lutomirski
2014-11-18 18:30                                                   ` Luck, Tony
2014-11-18 23:04                                                     ` Andy Lutomirski
2014-11-18 23:26                                                       ` Luck, Tony
2014-11-18 16:12                                       ` Borislav Petkov
2014-11-12 22:00 ` Oleg Nesterov
2014-11-12 23:17   ` Andy Lutomirski
2014-11-12 23:41     ` Luck, Tony
2014-11-13  0:02       ` Andy Lutomirski
2014-11-13  0:31         ` Luck, Tony
2014-11-13  1:34           ` Andy Lutomirski
2014-11-13  3:03           ` Andy Lutomirski
2014-11-13 18:43             ` Luck, Tony
2014-11-13 22:23               ` Andy Lutomirski
2014-11-13 22:25                 ` Andy Lutomirski
2014-11-13 22:33                 ` Luck, Tony
2014-11-13 22:47                   ` Andy Lutomirski
2014-11-13 23:13                     ` Andy Lutomirski
2014-11-14  0:50                       ` Andy Lutomirski [this message]
2014-11-14  1:20                         ` Luck, Tony
2014-11-14  1:36                           ` Andy Lutomirski
2014-11-14 17:49                         ` Luck, Tony
2014-11-14 19:10                           ` Andy Lutomirski
2014-11-14 19:37                             ` Luck, Tony
2014-11-14 18:27                         ` Luck, Tony
2014-11-14 10:34             ` Borislav Petkov
2014-11-14 17:18               ` Andy Lutomirski
2014-11-14 17:24                 ` Borislav Petkov
2014-11-14 17:26                   ` Andy Lutomirski
2014-11-14 18:53                     ` Borislav Petkov
2014-11-13 10:59           ` Borislav Petkov
2014-11-13 21:23             ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrVYWAOzaHqet8gS5hafvQC9prrH05qgc+v5dxSL9AF8bg@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=andi@firstfloor.org \
    --cc=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).