linux-mm.kvack.org archive mirror
From: Jue Wang <juew@google.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	luto@kernel.org,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	x86 <x86@kernel.org>,
	yaoaili@kingsoft.com
Subject: Re: [PATCH 3/4] mce/copyin: fix to not SIGBUS when copying from user hits poison
Date: Mon, 19 Apr 2021 13:32:35 -0700
Message-ID: <CAPcxDJ7fKHF69T6jepX+yVP=+t43i9hQD3W6SaaDLk9_UBy9uw@mail.gmail.com>

On Thu, 8 Apr 2021 10:08:52 -0700, Tony Luck wrote:
> KVM apparently passes a machine check into the guest. Though it seems
> to be missing the MCG_STATUS information to tell the guest whether this
> is an "Action Required" machine check, or an "Action Optional" (i.e.
> whether the poison was found synchronously by execution of the current
> instruction, or asynchronously).

The KVM_X86_SET_MCE ioctl takes a struct kvm_x86_mce parameter, which
includes an mcg_status field, so the hypervisor can set it with the
necessary semantics.

#ifdef KVM_CAP_MCE
/* x86 MCE */
struct kvm_x86_mce {
        __u64 status;
        __u64 addr;
        __u64 misc;
        __u64 mcg_status;
        __u8 bank;
        __u8 pad1[7];
        __u64 pad2[3];
};
#endif
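
To make the point concrete, here is a rough sketch of how a VMM might fill
in such a structure so that the guest can tell "Action Required" from
"Action Optional". It is only an illustration under stated assumptions:
struct mce_event is a local mirror of struct kvm_x86_mce so the sketch builds
without KVM headers, make_mce() and enum mce_kind are hypothetical names, the
bit positions follow the architectural MCG_STATUS / MCi_STATUS layout as I
understand it, and the MCACOD values (0x134, 0xc0) and the misc encoding are
illustrative choices rather than anything this thread specifies.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

/* Local mirror of struct kvm_x86_mce (assumption: same field layout). */
struct mce_event {
    uint64_t status;
    uint64_t addr;
    uint64_t misc;
    uint64_t mcg_status;
    uint8_t  bank;
    uint8_t  pad1[7];
    uint64_t pad2[3];
};
static_assert(sizeof(struct mce_event) == 64, "layout should match kvm_x86_mce");

/* IA32_MCG_STATUS bits. */
#define MCG_STATUS_RIPV  (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV  (1ULL << 1)  /* error IP valid: error tied to current insn */
#define MCG_STATUS_MCIP  (1ULL << 2)  /* machine check in progress */

/* IA32_MCi_STATUS bits. */
#define MCI_STATUS_VAL   (1ULL << 63)
#define MCI_STATUS_UC    (1ULL << 61)
#define MCI_STATUS_EN    (1ULL << 60)
#define MCI_STATUS_MISCV (1ULL << 59)
#define MCI_STATUS_ADDRV (1ULL << 58)
#define MCI_STATUS_S     (1ULL << 56)
#define MCI_STATUS_AR    (1ULL << 55)  /* action required */

enum mce_kind { MCE_ACTION_REQUIRED, MCE_ACTION_OPTIONAL };

/* Hypothetical helper: build an MCE record for a poisoned guest physical address. */
static struct mce_event make_mce(enum mce_kind kind, uint64_t guest_paddr,
                                 uint8_t bank)
{
    struct mce_event m = {0};

    m.status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN |
               MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S;
    m.mcg_status = MCG_STATUS_MCIP | MCG_STATUS_RIPV;

    if (kind == MCE_ACTION_REQUIRED) {
        /* Synchronous: poison consumed by the current instruction. */
        m.status |= MCI_STATUS_AR | 0x134;   /* illustrative MCACOD: data load */
        m.mcg_status |= MCG_STATUS_EIPV;
    } else {
        /* Asynchronous: poison found in the background. */
        m.status |= 0xc0;                    /* illustrative MCACOD: scrubbing */
    }

    m.addr = guest_paddr;
    m.misc = (0x2ULL << 6) | 0xc;  /* assumed: physical address mode, 4K granularity */
    m.bank = bank;
    return m;
}
```

A VMM would then copy these fields into a real struct kvm_x86_mce and pass
it to ioctl(vcpu_fd, KVM_X86_SET_MCE, &mce) on the vCPU file descriptor.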

> > Are we documenting somewhere: "if your process gets a SIGBUS and this
> > and that, which means your page got offlined, you should do this and
> > that to recover"?

> Essentially it boils down to:
> SIGBUS handler gets additional data giving virtual address that has gone away

> 1) Can the application replace the lost page?
> Use mmap(addr, MAP_FIXED, ...) to map a fresh page into the gap
> and fill with replacement data. This case can return from SIGBUS
> handler to re-execute failed instruction
> 2) Can the application continue in degraded mode w/o the lost page?
> Hunt down pointers to lost page and update structures to say
> "this data lost". Use siglongjmp() to go to preset recovery path
> 3) Can the application shut down gracefully?
> Record details of the lost page. Inform next-of-kin. Exit.
> 4) Default - just exit

Two possible additions to these great points:

5) If for some reason the page cannot be unmapped (e.g.,
either because it would lose too much memory, as with hugetlbfs
1G pages, or because of a THP split failure for SHMEM THP), the
kernel maintains consistent semantics (i.e., MCE SIGBUS with vaddr)
for all future accesses from user space, by leaving the hwpoisoned
page mapped or in the radix tree.
6) If for some reason the vaddr is not available upon the
first MCE recovery and the page is unmapped, the kernel provides
the correct semantics (MCE SIGBUS with vaddr) in subsequent
page faults from user space accessing the same vaddr.
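
For reference, option 1) in the quoted list (replace the lost page and return
from the handler) could look roughly like the sketch below. This is a minimal
illustration, not a recommended implementation: replace_lost_range(),
sigbus_handler() and install_sigbus_handler() are hypothetical names, the
handler assumes the kernel delivers si_code BUS_MCEERR_AR or BUS_MCEERR_AO
with si_addr and si_addr_lsb filled in, and an anonymous zero-filled mapping
stands in for whatever replacement data the application really has.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a fresh zero-filled region over the range containing fault_addr.
 * lsb is the log2 of the affected granularity (si_addr_lsb); it is clamped
 * to at least one page. Returns 0 on success, -1 on failure.
 */
static int replace_lost_range(void *fault_addr, unsigned lsb)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)1 << lsb;
    if (len < page)
        len = page;

    uintptr_t base = (uintptr_t)fault_addr & ~((uintptr_t)len - 1);
    void *p = mmap((void *)base, len, PROT_READ | PROT_WRITE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}

/* SIGBUS handler: try to replace the poisoned range, otherwise exit. */
static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;

    if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO) {
        if (replace_lost_range(si->si_addr, (unsigned)si->si_addr_lsb) == 0) {
            /* An application would refill the range with replacement data here. */
            return;  /* returning re-executes the failed instruction */
        }
    }
    _exit(EXIT_FAILURE);  /* option 4): default, just exit */
}

/* Register the handler with SA_SIGINFO so si_addr and si_code are available. */
static int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

An application would call install_sigbus_handler() once at startup. Options
2) and 3) would replace the early return with a siglongjmp() to a preset
recovery path or with a graceful shutdown, respectively.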

