linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Luck, Tony" <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Youquan Song <youquan.song@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] x86/mce: Avoid tail copy when machine check terminated a copy from user
Date: Wed, 16 Sep 2020 12:26:59 -0700	[thread overview]
Message-ID: <20200916192659.GA30252@agluck-desk2.amr.corp.intel.com> (raw)
In-Reply-To: <20200916105336.GF2643@zn.tnic>

On Wed, Sep 16, 2020 at 12:53:36PM +0200, Borislav Petkov wrote:
> On Tue, Sep 08, 2020 at 10:55:16AM -0700, Tony Luck wrote:
> >  2:	mov %ecx,%eax
> >  	ASM_CLAC
> >  	ret
> > +3:	xorl %eax,%eax	/* pretend we succeeded? */
> 
> Hmm, but copy_*_user returns the uncopied bytes in eax. Users of this
> need to handle the MC case properly but if you return 0, they would
> think that they copied everything but there's some trailing stuff they
> didn't manage to take.
> 
> And it's not like they *should* have to retry to copy it because they
> will walk right into the faulty region and cause more MCEs.
> 
> So how is this "I-got-an-MCE-while-copying-from-user" handled on the
> higher level?

I got stuck on this part of the series for a long time. This code
was pretty subtle before I came to make it even more confusing.

In a normal copy from user operation the expectation is that there
will be some page faults. If a #PF happens the kernel either fixes
the fault and provides a good mapping, or finds that it can't and
doesn't map anything. Either way the extable fixup jumps to the
error return of the low level copy function which returns the number
of bytes still to be copied.

The subtle part is that the iterator that calls into the low level
copy calls fault_in_pages_readable().  I thought this was just some
optimization to prefault the pages, but actually it turns out to be
critical to how -EFAULT gets returned to a user when they pass a
bad address.

When the low level copy stops at a page fault, the next call to
fault_in_pages_readable() checks if the fault was fixed and returns
-EFAULT if it wasn't.

Interesting back story ... now look at what happens if we take a
machine check instead of a page fault.

The machine check handler will invoke the extable fixup to move
the RIP for the copy function to the early return path. Machine
check handler also set up some task_work for the application
to call memory_failure() which will unmap the page. But that work
won't happen until we try to return to the user. So the page (and
the poison) are still mapped.

So we take a another machine check on the same address when
fault_in_pages_readable() does __get_user().  That ought to break
us out ... but for some reason I still don't understand didn't.
But even if it did ... the second machine check is not at all
a good idea.

Returning zero bytes left to say we completed avoids that. The
user is guaranteed a SIGBUS when the task_work does fire. So whatever
system call was in progress is not going to see the apparent
successful return.

Both the code and the commit comment need much more of this
description.

Unless you have some better way out of the dilemmma that the
real fixup is only scheduled at the point that the extable
fixup just arranges for a simple local return from the copy.

> 
> Your 7/8 says:
> 
> "Add code to recover from a machine check while copying data from user
> space to the kernel. Action for this case is the same as if the user
> touched the poison directly; unmap the page and send a SIGBUS to the
> task."
> 
> So how are users of copy_*_user() expected to handle the page
> disappearing from under them?

When the return to user happens the task_work that was scheduled
in the machine check handler takes care of the error return to the
user.

-Tony

  reply	other threads:[~2020-09-16 19:29 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200908175519.14223-1-tony.luck@intel.com>
2020-09-08 17:55 ` [PATCH 1/8] x86/mce: Stop mce_reign() from re-computing severity for every CPU Tony Luck
2020-09-14 17:21   ` Borislav Petkov
2020-09-14 17:32   ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-09-08 17:55 ` [PATCH 4/8] x86/mce: Add _ASM_EXTABLE_CPY for copy user access Tony Luck
2020-09-16  9:59   ` Borislav Petkov
2020-09-08 17:55 ` [PATCH 5/8] x86/mce: Avoid tail copy when machine check terminated a copy from user Tony Luck
2020-09-16 10:53   ` Borislav Petkov
2020-09-16 19:26     ` Luck, Tony [this message]
2020-09-17 17:04       ` Borislav Petkov
2020-09-17 21:57         ` Luck, Tony
2020-09-18  7:51           ` Borislav Petkov
2020-09-08 17:55 ` [PATCH 6/8] x86/mce: Change fault_in_kernel_space() from static to global Tony Luck
2020-09-08 17:55 ` [PATCH 7/8] x86/mce: Recover from poison found while copying from user space Tony Luck
2020-09-18 16:13   ` Borislav Petkov
2020-09-08 17:55 ` [PATCH 8/8] x86/mce: Decode a kernel instruction to determine if it is copying from user Tony Luck
2020-09-21 11:31   ` Borislav Petkov
2020-09-30 23:26     ` [PATCH v2 0/7] Add machine check recovery when copying from user space Tony Luck
2020-09-30 23:26       ` [PATCH v2 1/7] x86/mce: Pass pointer to saved pt_regs to severity calculation routines Tony Luck
2020-09-30 23:26       ` [PATCH v2 2/7] x86/mce: Provide method to find out the type of exception handle Tony Luck
2020-10-05 16:35         ` Borislav Petkov
2020-09-30 23:26       ` [PATCH v2 3/7] x86/mce: Add _ASM_EXTABLE_CPY for copy user access Tony Luck
2020-10-05 16:34         ` Borislav Petkov
2020-09-30 23:26       ` [PATCH v2 4/7] x86/mce: Avoid tail copy when machine check terminated a copy from user Tony Luck
2020-09-30 23:26       ` [PATCH v2 5/7] x86/mce: Change fault_in_kernel_space() from static to global Tony Luck
2020-10-05 16:33         ` Borislav Petkov
2020-09-30 23:26       ` [PATCH v2 6/7] x86/mce: Recover from poison found while copying from user space Tony Luck
2020-10-05 16:32         ` Borislav Petkov
2020-10-05 17:47           ` Luck, Tony
2020-09-30 23:26       ` [PATCH v2 7/7] x86/mce: Decode a kernel instruction to determine if it is copying from user Tony Luck
2020-10-05 16:31         ` Borislav Petkov
2020-10-06 21:09           ` [PATCH v3 0/6] Add machine check recovery when copying from user space Tony Luck
2020-10-06 21:09             ` [PATCH v3 1/6] x86/mce: Pass pointer to saved pt_regs to severity calculation routines Tony Luck
2020-10-07 10:02               ` [tip: ras/core] " tip-bot2 for Youquan Song
2020-10-06 21:09             ` [PATCH v3 2/6] x86/mce: Provide method to find out the type of exception handle Tony Luck
2020-10-07 10:02               ` [tip: ras/core] x86/mce: Provide method to find out the type of an exception handler tip-bot2 for Tony Luck
2020-10-06 21:09             ` [PATCH v3 3/6] x86/mce: Add _ASM_EXTABLE_CPY for copy user access Tony Luck
2020-10-07 10:02               ` [tip: ras/core] " tip-bot2 for Youquan Song
2020-10-06 21:09             ` [PATCH v3 4/6] x86/mce: Avoid tail copy when machine check terminated a copy from user Tony Luck
2020-10-07  8:23               ` David Laight
2020-10-07 18:49                 ` Luck, Tony
2020-10-07 21:11                   ` David Laight
2020-10-07 10:02               ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-10-06 21:09             ` [PATCH v3 5/6] x86/mce: Recover from poison found while copying from user space Tony Luck
2020-10-07 10:02               ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-10-06 21:09             ` [PATCH v3 6/6] x86/mce: Decode a kernel instruction to determine if it is copying from user Tony Luck
2020-10-07 10:02               ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-09-09 15:05 ` [RESEND PATCH 0/8] Add machine check recovery when copying from user space Tony Luck
     [not found] ` <20200908175519.14223-4-tony.luck@intel.com>
2020-09-15  9:11   ` [PATCH 3/8] x86/mce: Provide method to find out the type of exception handle Borislav Petkov
2020-09-15 16:24     ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200916192659.GA30252@agluck-desk2.amr.corp.intel.com \
    --to=tony.luck@intel.com \
    --cc=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    --cc=youquan.song@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).