From: Jue Wang <juew@google.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>,
"dinghui@sangfor.com.cn" <dinghui@sangfor.com.cn>,
"huangcun@sangfor.com.cn" <huangcun@sangfor.com.cn>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
"Oscar Salvador" <osalvador@suse.de>, x86 <x86@kernel.org>,
"Song, Youquan" <youquan.song@intel.com>
Subject: Re: [PATCH 2/3] x86/mce: Avoid infinite loop for copy from user recovery
Date: Thu, 22 Jul 2021 21:16:40 -0700
Message-ID: <CAPcxDJ7=UsAkDwVuoQcTt2B2UA4RWjs_o_=Fnk4Hfuqj+V8hAA@mail.gmail.com>
In-Reply-To: <0e39ef0e1b6d4532a09ad2d6e0b28310@intel.com>
On Thu, Jul 22, 2021 at 9:01 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> >> I'm not aware of, nor expecting to find, places where the kernel
> >> tries to access user address A and hits poison, and then tries to
> >> access user address B (without returning to user between access
> >> A and access B).
> > This seems a reasonably easy scenario.
> >
> > A user space app allocates a buffer of xyz KB/MB/GB.
> >
> > Unfortunately the dimms are bad and multiple cache lines have
> > uncorrectable errors in them on different pages.
> >
> > Then the user space app tries to write the content of the buffer into some
> > file via write(2) from the entire buffer in one go.
>
> Before this patch Linux gets into an infinite loop taking machine
> checks on the first of the poison addresses in the buffer.
>
> With this patch (and also patch 3/3 in this series), there are
> a few machine checks on the first poison address (I think the number
> depends on the alignment of the poison within a page ... but I'm
> not sure). My test code shows 4 machine checks at the same
> address. Then Linux returns a short byte count to the user
> showing how many bytes were actually written to the file.
>
> The fact that there are many more poison lines in the buffer
> beyond the place where the write stopped on the first one is
> irrelevant.
In our test, the application memory was anonymous.
With one UC error injected, the test always passes: the error is
recovered and a SIGBUS is delivered to user space.
With more than one UC error in the buffer, we hit an indefinite MCE loop.
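
For reference, here is a minimal user-space sketch of the write(2) side of
the scenario quoted above, where the whole buffer is written in one go. It
does not inject errors and is not our actual test code. It only shows how a
caller could tell a full write from the short byte count Tony describes.
The helper name write_in_one_go() and the enum are hypothetical, made up
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of a single write(2) attempt over the whole buffer. */
enum write_outcome {
	WRITE_FULL,	/* every requested byte was accepted */
	WRITE_SHORT,	/* fewer bytes than requested were accepted */
	WRITE_FAILED	/* write(2) returned -1; errno is set */
};

/*
 * Hypothetical helper: issue exactly one write(2) for the whole buffer
 * and classify the result. On success *written holds the byte count
 * returned by the kernel; on failure it is left at 0.
 */
static enum write_outcome write_in_one_go(int fd, const void *buf,
					  size_t len, size_t *written)
{
	ssize_t ret;

	assert(buf != NULL && written != NULL);

	*written = 0;
	ret = write(fd, buf, len);
	if (ret < 0)
		return WRITE_FAILED;

	*written = (size_t)ret;
	return *written == len ? WRITE_FULL : WRITE_SHORT;
}
```

A test would call this once with the anonymous buffer and the target file
descriptor. WRITE_SHORT with *written below len is the recovered case from
Tony's description, assuming the process is not killed by SIGBUS first.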
>
> [Well, if the second poisoned line is immediately after the first
> you may hit h/w prefetch issues and h/w may signal a fatal
> machine check ... but that's a different problem that s/w could
> only solve with painful LFENCE operations between each 64-bytes
> of the copy]
>
> -Tony