From: Aili Yao <yaoaili@kingsoft.com>
To: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>,
	<linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Mike Rapoport <rppt@kernel.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Oscar Salvador <osalvador@suse.de>,
	"Roman Gushchin" <guro@fb.com>,
	Alex Shi <alex.shi@linux.alibaba.com>,
	Steven Price <steven.price@arm.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Jiri Bohac <jbohac@suse.cz>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"Wei Liu" <wei.liu@kernel.org>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	<linux-hyperv@vger.kernel.org>,
	<virtualization@lists.linux-foundation.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-mm@kvack.org>,
	<yaoaili126@gmail.com>
Subject: Re: [PATCH v1 3/7] mm: rename and move page_is_poisoned()
Date: Thu, 6 May 2021 15:28:05 +0800
Message-ID: <20210506152805.13fe775e@alex-virtual-machine>
In-Reply-To: <YJOVZlFGcSG+mmIk@dhcp22.suse.cz>

On Thu, 6 May 2021 09:06:14 +0200
Michal Hocko <mhocko@suse.com> wrote:

> On Thu 06-05-21 08:56:11, Aili Yao wrote:
> > On Wed, 5 May 2021 15:27:39 +0200
> > Michal Hocko <mhocko@suse.com> wrote:
> >   
> > > On Wed 05-05-21 15:17:53, David Hildenbrand wrote:  
> > > > On 05.05.21 15:13, Michal Hocko wrote:    
> > > > > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:    
> > > > > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > > > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > > > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > > > > function and move it to page-flags.h, from where it can be used in other
> > > > > > -- kcore -- context.
> > > > > > 
> > > > > > Move the comment to the place where it belongs and simplify.
> > > > > > 
> > > > > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > > > > 
> > > > > > Signed-off-by: David Hildenbrand <david@redhat.com>    
> > > > > 
> > > > > > I do agree that being explicit about hwpoison is much better. A poisoned
> > > > > > page can also be an uninitialized one and I believe this is the reason why
> > > > > > you are bringing that up.    
> > > > 
> > > > > I'm bringing it up because I want to reuse that function as stated above :)
> > > >     
> > > > > 
> > > > > But you've made me look at d3378e86d182 and I am wondering whether this
> > > > > is really a valid patch. First of all it can leak a reference count
> > > > > AFAICS. Moreover it doesn't really fix anything because the page can be
> > > > > marked hwpoison right after the check is done. I do not think the race
> > > > > is feasible to be closed. So shouldn't we rather revert it?    
> > > > 
> > > > > I am not sure if we really care about races that much here? I mean,
> > > > essentially we are racing with HW breaking asynchronously. Just because we
> > > > would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> > > > from breaking.    
> > > 
> > > Right
> > >   
> > > > Long story short, this should be good enough for the cases we actually can
> > > > handle? What am I missing?    
> > > 
> > > I am not sure I follow. My point is that I fail to see any added value
> > > of the check as it doesn't prevent the race (it fundamentally cannot as
> > > the page can be poisoned at any time) but the failure path doesn't
> > > put_page which is incorrect even for hwpoison pages.  
> > 
> > Sorry, I have something to say:
> > 
> > I noticed the ref count leak in the previous thread, but I don't think
> > it really matters. In the memory recovery case for user pages, we keep one
> > reference to the poisoned page so that the error page is not freed to the buddy
> > allocator, which can be checked in the memory_failure() function.  
> 
> So what would happen if those pages are hwpoisoned from userspace rather
> than by HW. And repeatedly so?

Sorry, I may not fully understand what you mean.

Do you mean hard page offlining from mcelog?
If so, I think that is done not for a single real UC error but for CE storms.
When the kernel accesses such a page, the access may succeed even though the page was marked hwpoison.

Thanks!
Aili Yao 
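
The reference count concern raised in the quoted discussion (a failure
path that returns without put_page) can be illustrated with a small
user-space model. This is only a sketch under stated assumptions:
struct model_page, model_get_page(), model_put_page(), lookup_leaky()
and lookup_balanced() are hypothetical names invented for illustration.
They are not the kernel's get_page()/put_page(), and this is not the
code from commit d3378e86d182.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a page; not the kernel's struct page. */
struct model_page {
	int refcount;
	bool hwpoison;
};

/* Take one reference on the modelled page. */
static void model_get_page(struct model_page *p)
{
	p->refcount++;
}

/* Drop one reference; dropping more than were taken is a bug in the model. */
static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Leaky variant: the reference taken before the check is not dropped
 * when the page turns out to be hwpoisoned.
 */
static struct model_page *lookup_leaky(struct model_page *p)
{
	model_get_page(p);
	if (p->hwpoison)
		return NULL;	/* reference from model_get_page() is leaked */
	return p;
}

/*
 * Balanced variant: the failure path drops the reference it took,
 * so the caller only holds a reference on success.
 */
static struct model_page *lookup_balanced(struct model_page *p)
{
	model_get_page(p);
	if (p->hwpoison) {
		model_put_page(p);
		return NULL;
	}
	return p;
}
```

For a modelled page that starts with refcount 1 and hwpoison set,
lookup_leaky() leaves the count at 2, while lookup_balanced() returns
it to 1. Neither variant closes the race discussed in the thread: the
flag can still be set right after the check.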
