linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/4] Fix machine check recovery for copy_from_user
@ 2021-03-26  0:02 Tony Luck
  2021-03-26  0:02 ` [PATCH 1/4] x86/mce: Fix copyin code to return -EFAULT on machine check Tony Luck
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Tony Luck @ 2021-03-26  0:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, x86, linux-kernel, linux-mm, Andy Lutomirski,
	Aili Yao,
	HORIGUCHI NAOYA( 堀口 直也)

Maybe this is the way forward?  I made some poor choices before
to treat poison consumption in the kernel when accessing user data
(get_user() or copy_from_user()) ... in particular assuming that
the right action was sending a SIGBUS to the task as if it had
synchronously accessed the poison location.

First three patches may need to be combined (or broken up differently)
for bisectablilty. But they are presented separately here since they
touch separate parts of the problem.

Second part is definitley incomplete. But I'd like to check that it
is the right approach before expending more brain cells in the maze
of nested macros that is lib/iov_iter.c

Last part has been posted before. It covers the case where the kernel
takes more than one swing at reading poison data before returning to
user.

Tony Luck (4):
  x86/mce: Fix copyin code to return -EFAULT on machine check.
  mce/iter: Check for copyin failure & return error up stack
  mce/copyin: fix to not SIGBUS when copying from user hits poison
  x86/mce: Avoid infinite loop for copy from user recovery

 arch/x86/kernel/cpu/mce/core.c     | 63 +++++++++++++++++++++---------
 arch/x86/kernel/cpu/mce/severity.c |  2 -
 arch/x86/lib/copy_user_64.S        | 18 +++++----
 fs/iomap/buffered-io.c             |  8 +++-
 include/linux/sched.h              |  2 +-
 include/linux/uio.h                |  2 +-
 lib/iov_iter.c                     | 15 ++++++-
 7 files changed, 77 insertions(+), 33 deletions(-)

-- 
2.29.2



^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery
@ 2021-04-19 21:28 Jue Wang
  2021-04-19 21:41 ` Luck, Tony
  0 siblings, 1 reply; 20+ messages in thread
From: Jue Wang @ 2021-04-19 21:28 UTC (permalink / raw)
  To: tony.luck; +Cc: bp, linux-kernel, linux-mm, luto, naoya.horiguchi, x86, yaoaili

On Thu, 25 Mar 2021 17:02:35 -0700, Tony Luck wrote:
...

> But there are places in the kernel where the code assumes that this
> EFAULT return was simply because of a page fault. The code takes some
> action to fix that, and then retries the access. This results in a second
> machine check.

What about return EHWPOISON instead of EFAULT and update the callers
to handle EHWPOISON explicitly: i.e., not retry but give up on the page?

My main concern is that the strong assumptions that the kernel can't hit more
than a fixed number of poisoned cache lines before turning to user space
may simply not be true.

When DIMM goes bad, it can easily affect an entire bank or entire ram device
chip. Even with memory interleaving, it's possible that a kernel control path
touches lots of poisoned cache lines in the buffer it is working through.


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-04-19 21:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-26  0:02 [RFC 0/4] Fix machine check recovery for copy_from_user Tony Luck
2021-03-26  0:02 ` [PATCH 1/4] x86/mce: Fix copyin code to return -EFAULT on machine check Tony Luck
2021-04-06 19:24   ` Borislav Petkov
2021-03-26  0:02 ` [PATCH 2/4] mce/iter: Check for copyin failure & return error up stack Tony Luck
2021-03-26  0:02 ` [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison Tony Luck
2021-04-07 21:18   ` Borislav Petkov
2021-04-07 21:43     ` Luck, Tony
2021-04-08  8:49       ` Borislav Petkov
2021-04-08 17:08         ` Luck, Tony
2021-04-13 10:07           ` Borislav Petkov
2021-04-13 16:13             ` Luck, Tony
2021-04-14 13:05               ` Borislav Petkov
2021-03-26  0:02 ` [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck
2021-04-08 13:36   ` Borislav Petkov
2021-04-08 16:06     ` Luck, Tony
2021-04-08  2:13 ` [RFC 0/4] Fix machine check recovery for copy_from_user Aili Yao
2021-04-08 14:39   ` Luck, Tony
2021-04-09  6:49     ` Aili Yao
2021-04-19 21:28 [PATCH 4/4] x86/mce: Avoid infinite loop for copy from user recovery Jue Wang
2021-04-19 21:41 ` Luck, Tony

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).