linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Aili Yao <yaoaili@kingsoft.com>
To: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <linux-mm@kvack.org>, Tony Luck <tony.luck@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	"David Hildenbrand" <david@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Jue Wang <juew@google.com>, <linux-kernel@vger.kernel.org>,
	<yaoaili126@gmail.com>
Subject: Re: [PATCH v4 2/2] mm,hwpoison: send SIGBUS when the page has already been poisoned
Date: Fri, 7 May 2021 17:38:52 +0800	[thread overview]
Message-ID: <20210507173852.0adc5cc4@alex-virtual-machine> (raw)
In-Reply-To: <20210427062953.2080293-3-nao.horiguchi@gmail.com>

On Tue, 27 Apr 2021 15:29:53 +0900
Naoya Horiguchi <nao.horiguchi@gmail.com> wrote:

> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> When memory_failure() is called with MF_ACTION_REQUIRED on the
> page that has already been hwpoisoned, memory_failure() could fail
> to send SIGBUS to the affected process, which results in infinite
> loop of MCEs.
> 
> Currently memory_failure() returns 0 if it's called for already
> hwpoisoned page, then the caller, kill_me_maybe(), could return
> without sending SIGBUS to current process.  An action required MCE
> is raised when the current process accesses to the broken memory,
> so no SIGBUS means that the current process continues to run and
> access to the error page again soon, so running into MCE loop.
> 
> This issue can arise for example in the following scenarios:
> 
>   - Two or more threads access to the poisoned page concurrently.
>     If local MCE is enabled, MCE handler independently handles the
>     MCE events.  So there's a race among MCE events, and the
>     second or latter threads fall into the situation in question.
> 
>   - If there was a precedent memory error event and memory_failure()
>     for the event failed to unmap the error page for some reason,
>     the subsequent memory access to the error page triggers the
>     MCE loop situation.
> 
> To fix the issue, make memory_failure() return some error code when the
> error page has already been hwpoisoned.  This allows memory error
> handler to control how it sends signals to userspace.  And make sure
> that any process touching a hwpoisoned page should get a SIGBUS (if
> possible) with the error virtual address, even in "already hwpoisoned"
> path of memory_failure() as is done in page fault path.
> 
> kill_accessing_process() does pagetable walk to find the error virtual
> address.  If multiple virtual addresses are found in the pagetable walk,
> no one knows which address is the correct one, so we fall back to sending
> SIGBUS in kill_me_maybe() without error address info as we do now.
> This corner case is left to be solved in the future.
> 
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

Sorry for my late response, I just get time to rethink the pagewalk patch. Please let me share my thoughts, 
If anything wrong, just point out, thanks!

This whole pagewalk patch is meant to fix invalid virtual address along SIGBUS, For this invalid virtual address issue,
It seems this is one existing issue before this race issue is posted. while the issue is not fixed for a long time.

Then I think why this issue is not fixed, maybe just no process will care this virtual address as it will be killed.
Maybe virtual guest will need this address to forward it to vCPU, but untill now the memory recovery function in the VM doesn't
work at all, and without this address, It seems not a big impact though.

Maybe there are some other cases will care the virtual address, if anyone knows, just point out.

But invalid virtual address is still no good.

Before this, I post one RFC patch try to fix this issue with one knowing issue:it failed for mutiple pte entry;
Then this patch is posted trying to address this.

First I read this patch, I think this method is good and right and i test it. But now I think it again, I am wondering even the process
have multi pte entry and wrong virtuall address, but it still pointing to the same page, right?
If the process won't exit and get the wrong virtual address, what wrong action will it do? Do I miss some thing? 
while I can just think the virtual machine example, but the qemu will translate the wrong virtual address to right guest physical address? 
I am not sure VM will have multi pte entry?

And I think the virtual address along SIGBUS is not mean to backtrace the code, it just want to tell where the error memory is, for multi pte
entry, one virtual address for the same physical page is not enough?

Compare this patch with my RFC patch, difference:
1.This patch will just fix the race issue's invalid virtual address. while my RFC patch will cover all the error case for recovery;
2.For multi entry, this patch will do one force_sig with no other infomation, But the RFC patch will take one possible right address, I don't know which one is better.

And if this multi pte entry is one real issue, it seems the normal recovey work will aslo trigger this, would it be better to fix that first?

Thanks!
Aili Yao






  reply	other threads:[~2021-05-07  9:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-27  6:29 [PATCH v4 0/2] mm,hwpoison: fix sending SIGBUS for Action Required MCE Naoya Horiguchi
2021-04-27  6:29 ` [PATCH v4 1/2] mm/memory-failure: Use a mutex to avoid memory_failure() races Naoya Horiguchi
2021-05-06  9:37   ` Aili Yao
2021-05-06 15:34     ` Luck, Tony
2021-04-27  6:29 ` [PATCH v4 2/2] mm,hwpoison: send SIGBUS when the page has already been poisoned Naoya Horiguchi
2021-05-07  9:38   ` Aili Yao [this message]
2021-05-10  7:21     ` HORIGUCHI NAOYA(堀口 直也)
2021-05-10  8:00       ` Aili Yao
2021-05-10  8:31         ` HORIGUCHI NAOYA(堀口 直也)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210507173852.0adc5cc4@alex-virtual-machine \
    --to=yaoaili@kingsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=david@redhat.com \
    --cc=juew@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=nao.horiguchi@gmail.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=tony.luck@intel.com \
    --cc=yaoaili126@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).