linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@kernel.org>
To: Andy Lutomirski <luto@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Rue <dan.rue@linaro.org>, Shuah Khan <shuah@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Dmitry Safonov <dsafonov@virtuozzo.com>,
	Borislav Petkov <bp@alien8.de>,
	"open list:KERNEL SELFTEST FRAMEWORK"
	<linux-kselftest@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: selftests/x86/fsgsbase_64 test problem
Date: Fri, 26 Jan 2018 11:46:19 -0800
Message-ID: <CALCETrWvkd68wPCxGwmzqgsLTr_59+=L9u8obwt+f+oUQwDY=w@mail.gmail.com>
In-Reply-To: <CALCETrUi7Ub2TbFy3Cvj+j4VXZeYULPY+mgL7OX7bz9L8GO9ew@mail.gmail.com>

On Fri, Jan 26, 2018 at 10:59 AM, Andy Lutomirski <luto@kernel.org> wrote:
> On Fri, Jan 26, 2018 at 8:22 AM, Andy Lutomirski <luto@kernel.org> wrote:
>> On Fri, Jan 26, 2018 at 7:36 AM, Dan Rue <dan.rue@linaro.org> wrote:
>>>
>>> We've noticed that fsgsbase_64 can fail intermittently with the
>>> following error:
>>>
>>>         [RUN]   ARCH_SET_GS(0x0) and clear gs, then schedule to 0x1
>>>                 Before schedule, set selector to 0x1
>>>                 other thread: ARCH_SET_GS(0x1) -- sel is 0x0
>>>         [FAIL]  GS/BASE changed from 0x1/0x0 to 0x0/0x0
>>>
>>> This can be reliably reproduced by running fsgsbase_64 in a loop, i.e.:
>>>
>>>     for i in $(seq 1 10000); do ./fsgsbase_64 || break; done
>>>
>>> This problem isn't new - I've reproduced it on latest mainline and every
>>> release going back to v4.12 (I did not try earlier). This was tested on
>>> a Supermicro board with a Xeon E3-1220 as well as an Intel NUC with an
>>> i3-5010U.
>>>
>>
>> Hmm, I can reproduce it, too.  I'll look in a bit.
>
> I'm triggering a different error, and I think what's going on is that
> the kernel doesn't currently re-save GSBASE when a task switches out
> and that task has saved gsbase != 0 and in-register GS == 0.  This is
> arguably a bug, but it's not an infoleak, and fixing it could be a wee
> bit expensive.  I'm not sure what, if anything, to do about this.  I
> suppose I could add some gross perf hackery to the test to detect this
> case and suppress the error.
>
> I can also trigger the problem you're seeing, and I don't know what's
> up.  It may be related to an old problem I've seen that causes signal
> delivery to sometimes corrupt %gs.  It's deterministic, but it depends
> in some odd way on register state.  I can currently reproduce that
> issue 100% of the time, and I'm trying to see if I can figure out
> what's happening.

I think it's a CPU bug, and I'm a bit mystified.  I can trigger the
following, plausibly related issue:

Write a program that writes %gs = 1.
Run that program under gdb
break at a point where %gs == 1
display/x $gs
si

Under QEMU TCG, gs stays equal to 1.  On native or KVM, on Skylake, it
changes to 0.

On KVM or native, I do not observe do_debug getting called with %gs ==
1.  On TCG, I do.  I don't think that's precisely the problem that's
causing the test to fail, since the test doesn't use TF or ptrace, but
I wouldn't be shocked if it's related.

hpa, any insight?

(NB: if you want to play with this as I've described it, you may need
to make invalid_selector() in ptrace.c always return false.  The
current implementation is too strict and causes problems.)


Thread overview: 16+ messages
2018-01-26 15:36 selftests/x86/fsgsbase_64 test problem Dan Rue
2018-01-26 16:22 ` Andy Lutomirski
2018-01-26 18:59   ` Andy Lutomirski
2018-01-26 19:46     ` Andy Lutomirski [this message]
2018-01-26 22:38       ` Andy Lutomirski
2018-01-26 22:42         ` Andy Lutomirski
2018-01-28 19:21           ` Andy Lutomirski
2018-01-29  9:13             ` H. Peter Anvin
2018-01-29 16:37               ` Andy Lutomirski
2018-01-29 18:12                 ` H. Peter Anvin
2018-01-29 18:26                   ` Andy Lutomirski
2018-01-29 18:30                     ` H. Peter Anvin
2018-02-27 22:59                       ` Dan Rue
2018-01-26 22:56         ` Borislav Petkov
2018-01-28 19:21           ` Andy Lutomirski
2018-01-26 22:51       ` H. Peter Anvin
