All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>, X86 ML <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Darren Hart <dvhart@infradead.org>,
	Andy Lutomirski <luto@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-edac <linux-edac@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH v2 0/3] Fix infinite machine check loop in futex_wait_setup()
Date: Thu, 14 Jan 2021 09:22:44 -0800	[thread overview]
Message-ID: <CALCETrUvSWM5yOviJRD4Y+HZxDschAuTHfyffTVc8qifvh1AiA@mail.gmail.com> (raw)
In-Reply-To: <20210111214452.1826-1-tony.luck@intel.com>

On Mon, Jan 11, 2021 at 1:45 PM Tony Luck <tony.luck@intel.com> wrote:
>
> Linux can now recover from machine checks where kernel code is
> doing get_user() to access application memory. But there isn't
> a way to distinguish whether get_user() failed because of a page
> fault or a machine check.
>
> Thus there is a problem if any kernel code thinks it can retry
> an access after doing something that would fix the page fault.
>
> One such example (I'm sure there are more) is in futex_wait_setup()
> where an attempt to read the futex with page faults disabled. Then
> a retry (after dropping a lock so page faults are safe):
>
>
>         ret = get_futex_value_locked(&uval, uaddr);
>
>         if (ret) {
>                 queue_unlock(*hb);
>
>                 ret = get_user(uval, uaddr);
>
> It would be good to avoid deliberately taking a second machine
> check (especially as the recovery code does really bad things
> and ends up in an infinite loop!).
>
> V2 (thanks to feedback from PeterZ) fixes this by changing get_user() to
> return -ENXIO ("No such device or address") for the case where a machine
> check occurred. Peter left it open which error code to use (suggesting
> "-EMEMERR or whatever name we come up with"). I think the existing ENXIO
> error code seems appropriate (the address being accessed has effectively
> gone away). But I don't have a strong attachment if anyone thinks we
> need a new code.
>
> Callers can check for ENXIO in paths where the access would be
> retried so they can avoid a second machine check.
>

I don't love this -- I'm concerned that there will be some code path
that expects a failing get_user() to return -EFAULT, not -ENXIO.
Also, get_user() *can* return -EFAULT when it hits bad memory even
with your patch if the recovery code manages to yank the PTE before
get_user().

So I tend to think that the machine check code should arrange to
survive some reasonable number of duplicate machine checks.

  parent reply	other threads:[~2021-01-14 17:23 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 22:22 [PATCH 0/2] Fix infinite machine check loop in futex_wait_setup() Tony Luck
2021-01-08 22:22 ` [PATCH 1/2] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck
2021-01-08 22:22 ` [PATCH 2/2] futex, x86/mce: Avoid double machine checks Tony Luck
2021-01-08 22:47   ` Peter Zijlstra
2021-01-08 23:08     ` Luck, Tony
2021-01-08 23:14       ` Peter Zijlstra
2021-01-08 23:20         ` Luck, Tony
2021-01-11 21:44 ` [PATCH v2 0/3] Fix infinite machine check loop in futex_wait_setup() Tony Luck
2021-01-11 21:44   ` [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck
2021-01-11 22:11     ` Andy Lutomirski
2021-01-11 22:20       ` Luck, Tony
2021-01-12 17:00         ` Andy Lutomirski
2021-01-12 17:00           ` Andy Lutomirski
2021-01-12 17:16           ` Luck, Tony
2021-01-12 17:21             ` Andy Lutomirski
2021-01-12 17:21               ` Andy Lutomirski
2021-01-12 18:23               ` Luck, Tony
2021-01-12 18:57                 ` Andy Lutomirski
2021-01-12 18:57                   ` Andy Lutomirski
2021-01-12 20:52                   ` Luck, Tony
2021-01-12 22:04                     ` Andy Lutomirski
2021-01-13  1:50                       ` Luck, Tony
2021-01-13  4:15                         ` Andy Lutomirski
2021-01-13 10:00                           ` Borislav Petkov
2021-01-13 16:06                             ` Luck, Tony
2021-01-13 16:19                               ` Borislav Petkov
2021-01-13 16:32                                 ` Luck, Tony
2021-01-13 17:35                                   ` Borislav Petkov
2021-01-14 20:22     ` Borislav Petkov
2021-01-14 21:05       ` Luck, Tony
2021-01-11 21:44   ` [PATCH v2 2/3] x86/mce: Add new return value to get_user() for machine check Tony Luck
2021-01-11 21:44   ` [PATCH v2 3/3] futex, x86/mce: Avoid double machine checks Tony Luck
2021-01-14 17:22   ` Andy Lutomirski [this message]
2021-01-14 17:22     ` [PATCH v2 0/3] Fix infinite machine check loop in futex_wait_setup() Andy Lutomirski
2021-01-15  0:38   ` [PATCH v3] x86/mce: Avoid infinite loop for copy from user recovery Tony Luck
2021-01-15 15:27     ` Borislav Petkov
2021-01-15 19:34       ` Luck, Tony
2021-01-15 20:51         ` [PATCH v4] " Luck, Tony
2021-01-15 23:23           ` Luck, Tony
2021-01-19 10:56             ` Borislav Petkov
2021-01-19 23:57               ` Luck, Tony
2021-01-20 12:18                 ` Borislav Petkov
2021-01-20 17:17                   ` Luck, Tony
2021-01-21 21:09                   ` Luck, Tony
2021-01-25 22:55                     ` [PATCH v5] " Luck, Tony
2021-01-26 11:03                       ` Borislav Petkov
2021-01-26 22:36                         ` Luck, Tony
2021-01-28 17:57                           ` Borislav Petkov
2021-02-01 18:58                             ` Luck, Tony
2021-02-02 11:01                               ` Borislav Petkov
2021-02-02 16:04                                 ` Luck, Tony
2021-02-02 21:06                                   ` Borislav Petkov
2021-02-02 22:12                                     ` Luck, Tony
2021-01-18 15:39         ` [PATCH v3] " Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrUvSWM5yOviJRD4Y+HZxDschAuTHfyffTVc8qifvh1AiA@mail.gmail.com \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dvhart@infradead.org \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterz@infradead.org \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.