From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Liu Shixin <liushixin2@huawei.com>,
	Yang Shi <shy828301@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [mm-unstable PATCH v5 4/8] mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage
Date: Mon, 11 Jul 2022 09:24:01 +0000
Message-ID: <20220711092400.GA2741993@hori.linux.bs1.fc.nec.co.jp>
In-Reply-To: <3c66e7f6-e499-9802-409a-a32404fc87cc@huawei.com>

On Mon, Jul 11, 2022 at 03:09:01PM +0800, Miaohe Lin wrote:
> On 2022/7/8 13:36, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > The raw error info list needs to be removed when a hwpoisoned hugetlb
> > page is unpoisoned, and the unpoison handler needs to know how many
> > errors there are in the target hugepage.  So add both.
> > 
> > Hugepages with HPageVmemmapOptimized(hpage) or
> > HPageRawHwpUnreliable(hpage) set can't be unpoisoned, so let's skip them.
> > 
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> > ---
> > v4 -> v5:
> > - fix type of return value of free_raw_hwp_pages()
> >   (found by kernel test robot),
> > - prevent unpoison for HPageVmemmapOptimized and HPageRawHwpUnreliable.
> > ---
...
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 6833c5e4b410..89e74ec8a95f 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1720,22 +1720,41 @@ static int hugetlb_set_page_hwpoison(struct page *hpage, struct page *page)
> >  	return ret;
> >  }
> >  
> > -int hugetlb_clear_page_hwpoison(struct page *hpage)
> > +static long free_raw_hwp_pages(struct page *hpage, bool move_flag)
> 
> No strong opinion, but maybe the return type should be "unsigned" since it is always >= 0?

Yes, will update.

> 
> >  {
> >  	struct llist_head *head;
> >  	struct llist_node *t, *tnode;
> > +	long count = 0;
> >  
> > -	if (!HPageRawHwpUnreliable(hpage))
> > -		ClearPageHWPoison(hpage);
> > +	/*
> > +	 * HPageVmemmapOptimized hugepages can't be unpoisoned because
> > +	 * struct pages for tail pages are required to free hwpoisoned
> > +	 * hugepages.  HPageRawHwpUnreliable hugepages shouldn't be
> > +	 * unpoisoned by definition.
> > +	 */
> > +	if (HPageVmemmapOptimized(hpage) || HPageRawHwpUnreliable(hpage))
> > +		return 0;
> >  	head = raw_hwp_list_head(hpage);
> >  	llist_for_each_safe(tnode, t, head->first) {
> >  		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
> >  
> > -		SetPageHWPoison(p->page);
> > +		if (move_flag)
> > +			SetPageHWPoison(p->page);
> >  		kfree(p);
> > +		count++;
> >  	}
> >  	llist_del_all(head);
> > -	return 0;
> > +	return count;
> > +}
> > +
> > +int hugetlb_clear_page_hwpoison(struct page *hpage)
> 
> It seems the return value is unused?

Yes, the return value is not needed now.

> 
> > +{
> > +	int ret = -EBUSY;
> > +
> > +	if (!HPageRawHwpUnreliable(hpage))
> > +		ret = !TestClearPageHWPoison(hpage);
> > +	free_raw_hwp_pages(hpage, true);
> > +	return ret;
> >  }
> >  
> >  /*
> > @@ -1879,6 +1898,10 @@ static inline int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *
> >  	return 0;
> >  }
> >  
> > +static inline long free_raw_hwp_pages(struct page *hpage, bool move_flag)
> 
> If return type is changed, remember to change here too.

OK.

> > +{
> > +	return 0;
> > +}
> >  #endif	/* CONFIG_HUGETLB_PAGE */
> >  
> >  static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
> > @@ -2284,6 +2307,7 @@ int unpoison_memory(unsigned long pfn)
> >  	struct page *p;
> >  	int ret = -EBUSY;
> >  	int freeit = 0;
> > +	long count = 1;
> >  	static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
> >  					DEFAULT_RATELIMIT_BURST);
> >  
> > @@ -2331,6 +2355,13 @@ int unpoison_memory(unsigned long pfn)
> >  
> >  	ret = get_hwpoison_page(p, MF_UNPOISON);
> >  	if (!ret) {
> > +		if (PageHuge(p)) {
> > +			count = free_raw_hwp_pages(page, false);
> 
> It seems the current behavior is: if any subpage of a hugetlb page is unpoisoned, then all of the
> hwpoisoned subpages will be unpoisoned. I'm not sure whether this is what we want.

Basically, raw_hwp_info is not available to userspace (it might be recorded
in dmesg, but it is not available via /proc/kpageflags), so unpoisoning error
subpages one by one would be cumbersome.  If someone would like to
unpoison them one by one (I expect nobody would), I can implement that.
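
To make the behavior under discussion concrete, here is a minimal userspace
sketch, not the kernel code.  All model_* names, the fixed subpage count and
the plain singly linked list are stand-ins invented for illustration; the
real code uses struct page, llist and kfree() as in the hunks quoted above.
It assumes the head flag is cleared as part of the same unpoison request,
which the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
#define MODEL_NR_SUBPAGES 8

struct model_raw_hwp {
	struct model_raw_hwp *next;
	unsigned int subpage;	/* index of the subpage that took the error */
};

struct model_hugepage {
	struct model_raw_hwp *raw_list;		/* models the raw error info list */
	bool head_poisoned;			/* models PG_hwpoison on the head page */
	bool sub_poisoned[MODEL_NR_SUBPAGES];	/* models PG_hwpoison on subpages */
};

/* Record one raw error; returns 0 on success, -1 on bad index or failed allocation. */
static int model_record_error(struct model_hugepage *hp, unsigned int subpage)
{
	struct model_raw_hwp *p;

	assert(hp != NULL);
	if (subpage >= MODEL_NR_SUBPAGES)
		return -1;
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->subpage = subpage;
	p->next = hp->raw_list;
	hp->raw_list = p;
	hp->head_poisoned = true;
	return 0;
}

/*
 * Drain the whole list and return how many entries were freed.
 * With move_flag set, the poison mark is moved to each recorded subpage;
 * without it, the entries are simply dropped.
 */
static unsigned long model_free_raw_hwp(struct model_hugepage *hp, bool move_flag)
{
	struct model_raw_hwp *p, *next;
	unsigned long count = 0;

	for (p = hp->raw_list; p; p = next) {
		next = p->next;		/* save before freeing the node */
		if (move_flag)
			hp->sub_poisoned[p->subpage] = true;
		free(p);
		count++;
	}
	hp->raw_list = NULL;
	return count;
}

/* Assumed unpoison path: a single request clears every recorded error. */
static unsigned long model_unpoison(struct model_hugepage *hp)
{
	unsigned long count = model_free_raw_hwp(hp, false);

	hp->head_poisoned = false;
	return count;
}
```

In this model, one call to model_unpoison() on a hugepage with several
recorded errors returns the number of errors and leaves no subpage marked,
which is the "all subpages at once" behavior Miaohe asked about.  The count
is an unsigned type here, in line with the review comment above.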

> Thanks.

Thank you!

- Naoya Horiguchi

