linux-kernel.vger.kernel.org archive mirror
* On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
@ 2013-09-01 12:20 H. Peter Anvin
  2013-09-01 15:58 ` Linus Torvalds
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2013-09-01 12:20 UTC (permalink / raw)
  To: Linus Torvalds, Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

A truly ancient commit (v2.6.23), dbe3ed1c078c193be34326728d494c5c4bc115e2:

    x86-64: page faults from user mode are always user faults

    Randy Dunlap noticed an interesting "crashme" behaviour on his dual
    Prescott Xeon setup, where he gets page faults with the error code
    having a zero "user" bit, but the register state points back to user
    mode.

    This may be a CPU microcode buglet triggered by some strange
    instruction pattern that crashme generates, and loading a microcode
    update seems to possibly have fixed it.

    Regardless, we really should trust the register state more than the
    error code, since it's really the register state that determines
    whether we can actually send a signal, or whether we're in kernel
    mode and need to oops/kill the process in the case of a page fault.

... introduced the following code (since slightly modified):

+	/*
+	 * User-mode registers count as a user access even for any
+	 * potential system fault or CPU buglet.
+	 */
+	if (user_mode_vm(regs))
+		error_code |= PF_USER;
+

This has the end result that we treat a user space instruction which
touches a privileged data structure that then page faults (e.g. a
segment load which causes #PF on the GDT) as a user-space fault.

This seems very wrong to me, since such a #PF would indicate a serious
error in the kernel.

If this was a buglet introduced by a specific processor ("Prescott Xeon"
I presume means Nocona) and then even fixed in a patch, I'm concerned
that we are putting the cart before the horse with this change.

I went through the errata sheets for the CPUs of the time, but nothing
jumped out at me as causing this kind of problem, although there is a
mention of a couple of undefined opcodes which ought to #UD being able
to generate a "load to an incorrect address".  Kind of a stretch, though.

	-hpa
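
The effect of the quoted hunk can be sketched in standalone C. This is a model, not kernel code: `struct fake_regs`, `fake_user_mode()` and `fixup_error_code()` are hypothetical names, and the model ignores the virtual-8086 check that user_mode_vm() also performs. The error-code bit layout is the architectural one. An implicit supervisor access made on behalf of user code (such as a descriptor-table read during a segment load) arrives with the user bit clear, and the hunk then sets it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page-fault error code bits as defined by the x86 architecture. */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER  (1u << 2)  /* 0: supervisor-mode access, 1: user-mode access */

/* Hypothetical stand-in for struct pt_regs: only the saved CS selector. */
struct fake_regs {
    uint16_t cs;
};

/* The low two bits of the saved CS selector hold the privilege level. */
bool fake_user_mode(const struct fake_regs *regs)
{
    return (regs->cs & 3) == 3;
}

/* Mirrors the quoted hunk: user-mode register state forces PF_USER on. */
unsigned long fixup_error_code(const struct fake_regs *regs,
                               unsigned long error_code)
{
    if (fake_user_mode(regs))
        error_code |= PF_USER;
    return error_code;
}
```

With a ring-3 selector in `cs`, a fault reported by the CPU as a supervisor access comes out of `fixup_error_code()` with PF_USER set, so later code can no longer tell it apart from an ordinary user access.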


* Re: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
  2013-09-01 12:20 On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2 H. Peter Anvin
@ 2013-09-01 15:58 ` Linus Torvalds
  2013-09-01 16:00   ` H. Peter Anvin
  2013-09-01 16:10   ` H. Peter Anvin
  0 siblings, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2013-09-01 15:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

On Sun, Sep 1, 2013 at 5:20 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> This has the end result that we treat a user space instruction which
> touches a privileged data structure that then page faults (e.g. a
> segment load which causes #PF on the GDT) as a user-space fault.
>
> This seems very wrong to me, since such a #PF would indicate a serious
> error in the kernel.

Not necessarily. Don't we basically do exactly that for the F00F bug
workaround, for example?

                Linus


* Re: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
  2013-09-01 15:58 ` Linus Torvalds
@ 2013-09-01 16:00   ` H. Peter Anvin
  2013-09-01 16:12     ` Linus Torvalds
  2013-09-01 16:10   ` H. Peter Anvin
  1 sibling, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2013-09-01 16:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

On 09/01/2013 08:58 AM, Linus Torvalds wrote:
> On Sun, Sep 1, 2013 at 5:20 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> This has the end result that we treat a user space instruction which
>> touches a privileged data structure that then page faults (e.g. a
>> segment load which causes #PF on the GDT) as a user-space fault.
>>
>> This seems very wrong to me, since such a #PF would indicate a serious
>> error in the kernel.
> 
> Not necessarily. Don't we basically do exactly that for the F00F bug
> workaround, for example?
> 

We do, but only after matching on an exact address (is_f00f_bug()).
Note also that is_f00f_bug() isn't conditional on PF_USER.

	-hpa



* Re: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
  2013-09-01 15:58 ` Linus Torvalds
  2013-09-01 16:00   ` H. Peter Anvin
@ 2013-09-01 16:10   ` H. Peter Anvin
  1 sibling, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2013-09-01 16:10 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

On 09/01/2013 08:58 AM, Linus Torvalds wrote:
> On Sun, Sep 1, 2013 at 5:20 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> This has the end result that we treat a user space instruction which
>> touches a privileged data structure that then page faults (e.g. a
>> segment load which causes #PF on the GDT) as a user-space fault.
>>
>> This seems very wrong to me, since such a #PF would indicate a serious
>> error in the kernel.
> 
> Not necessarily. Don't we basically do exactly that for the F00F bug
> workaround, for example?
> 
>                 Linus
> 

Actually, from looking at it, the F00F workaround is broken *exactly*
because of this patch.  By forcing PF_USER to be set, we go into the
if (error_code & PF_USER) branch of __bad_area_nosemaphore(), which means
we *don't* do the F00F checking, and will deliver a SIGSEGV with the IDT
address to the user space process instead of SIGILL.

	-hpa
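
The ordering problem described in this message can be modelled in a few lines of C. This is a sketch under assumptions: `fake_is_f00f_bug()` and `fake_bad_area()` are hypothetical stand-ins, the IDT base is passed in as a plain parameter, and the vector computation (8-byte IDT entries, vector 6 being #UD) is an assumption about how the real check locates the faulting entry. The only point is that the user branch returns before the F00F check is reached.

```c
#include <stdbool.h>

#define PF_USER (1u << 2)  /* architectural user-access bit of the error code */

enum outcome { SENT_SIGSEGV, SENT_SIGILL, KERNEL_OOPS };

/*
 * Hypothetical stand-in for is_f00f_bug(): true when the faulting address
 * lies in the IDT entry for vector 6 (#UD), assuming 8-byte entries.
 */
bool fake_is_f00f_bug(unsigned long address, unsigned long idt_base)
{
    unsigned long nr = (address - idt_base) >> 3;
    return nr == 6;
}

/* Model of the decision order described above. */
enum outcome fake_bad_area(unsigned long error_code, unsigned long address,
                           unsigned long idt_base)
{
    if (error_code & PF_USER)
        return SENT_SIGSEGV;   /* user branch: returns before the F00F check */
    if (fake_is_f00f_bug(address, idt_base))
        return SENT_SIGILL;    /* workaround path: treated as invalid opcode */
    return KERNEL_OOPS;        /* genuine kernel fault */
}
```

In this model a fault on the vector 6 entry yields SENT_SIGILL only while PF_USER is clear. Once the bit has been forced on, the same address yields SENT_SIGSEGV.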



* Re: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
  2013-09-01 16:00   ` H. Peter Anvin
@ 2013-09-01 16:12     ` Linus Torvalds
  2013-09-01 16:13       ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2013-09-01 16:12 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

On Sun, Sep 1, 2013 at 9:00 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 09/01/2013 08:58 AM, Linus Torvalds wrote:
>>
>> Not necessarily. Don't we basically do exactly that for the F00F bug
>> workaround, for example?
>
> We do, but only after matching on an exact address (is_f00f_bug()).
> Note also that is_f00f_bug() isn't conditional on PF_USER.

Right. But I'm wondering why you care? There's nothing we can do about
spurious page faults if they do happen. The PF_USER thing we do means
that bad_area_nosemaphore will go through the "send signal" path.

I guess we can remove the setting of PF_USER, but that would just mean
that then we have to test for "user_mode_vm()" in bad_area_nosemaphore
instead. So the end result would be exactly the same.

And my point was that we actually do have this "users can cause page
faults on IDT etc accesses" as a real thing.

So basically: what do you propose to do? You basically can't remove
the line without adding it somewhere else.

                  Linus
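
The equivalence claimed in this message ("the end result would be exactly the same") can be checked with a small model. Both function names are hypothetical, and the user-mode test is reduced to a boolean. Variant A folds the register state into the error code up front, as the existing hunk does. Variant B leaves the error code alone and consults the register state where the signal decision is made.

```c
#include <stdbool.h>

#define PF_USER (1u << 2)  /* architectural user-access bit of the error code */

/* Variant A: force PF_USER from register state, then test the error code. */
bool sends_signal_a(bool regs_user_mode, unsigned long error_code)
{
    if (regs_user_mode)
        error_code |= PF_USER;
    return (error_code & PF_USER) != 0;
}

/* Variant B: keep the error code intact, test register state at the decision. */
bool sends_signal_b(bool regs_user_mode, unsigned long error_code)
{
    return regs_user_mode || (error_code & PF_USER) != 0;
}
```

The two variants agree on whether a signal is sent for every combination of inputs. They differ only in whether the hardware-reported bit survives for any check that runs in between.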


* Re: On the correctness of dbe3ed1c078c193be34326728d494c5c4bc115e2
  2013-09-01 16:12     ` Linus Torvalds
@ 2013-09-01 16:13       ` H. Peter Anvin
  0 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2013-09-01 16:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Randy Dunlap, Ingo Molnar, Thomas Gleixner,
	Linux Kernel Mailing List, Arjan van de Ven

On 09/01/2013 09:12 AM, Linus Torvalds wrote:
> On Sun, Sep 1, 2013 at 9:00 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 09/01/2013 08:58 AM, Linus Torvalds wrote:
>>>
>>> Not necessarily. Don't we basically do exactly that for the F00F bug
>>> workaround, for example?
>>
>> We do, but only after matching on an exact address (is_f00f_bug()).
>> Note also that is_f00f_bug() isn't conditional on PF_USER.
> 
> Right. But I'm wondering why you care? There's nothing we can do about
> spurious page faults if they do happen. The PF_USER thing we do means
> that bad_area_nosemaphore will go through the "send signal" path.
> 
> I guess we can remove the setting of PF_USER, but that would just mean
> that then we have to test for "user_mode_vm()" in bad_area_nosemaphore
> instead. So the end result would be exactly the same.
> 
> And my point was that we actually do have this "users can cause page
> faults on IDT etc accesses" as a real thing.
> 
> So basically: what do you propose to do? You basically can't remove
> the line without adding it somewhere else.
> 

is_f00f_bug() already contains:

                if (nr == 6) {
                        do_invalid_op(regs, 0);
                        return 1;
                }

... that's where we're supposed to issue SIGILL.

	-hpa


