From: Bin Wang <wangbin224@huawei.com>
To: <mike.kravetz@oracle.com>
Cc: <akpm@linux-foundation.org>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <naoya.horiguchi@nec.com>,
<wuxu.wu@huawei.com>
Subject: Re: Re: [PATCH v2] mm: hugetlb: add hwcrp_hugepages to record memory failure on hugetlbfs
Date: Mon, 28 Jun 2021 17:27:52 +0800 [thread overview]
Message-ID: <20210628092752.2135-1-wangbin224@huawei.com> (raw)
In-Reply-To: <c38e54ea-6aae-addf-0d6b-140b097a346b@oracle.com>
> > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > index 926eeb9bf4eb..ffb6e7b6756b 100644
> > --- a/fs/hugetlbfs/inode.c
> > +++ b/fs/hugetlbfs/inode.c
> > @@ -986,8 +986,7 @@ static int hugetlbfs_error_remove_page(struct address_space *mapping,
> > pgoff_t index = page->index;
> >
> > remove_huge_page(page);
> > - if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> > - hugetlb_fix_reserve_counts(inode);
>
> As mentioned, huge page reserve counts are not used to record number of
> poisioned pages. The calls to hugetlb_unreserve_pages and possibly
> hugetlb_fix_reserve_counts are necessary for reserve accounting. They
> can not be removed.
Thanks for your explanation very much. I didn't get the point of the
comments from the first patch. hugetlb_fix_reserve_counts() shouldn't
be removed and I will fix this.
> > + hugetlb_fix_hwcrp_counts(page);
>
> This new routine just counts memory errors on 'in use' huge pages.
> I do not see a call anywhere to count memory errors on huge pages
> not in use.
It's my oversight. I should have considered this situation. I tested
it with hwcrp_hugepages count, and this is the result:
# cat /proc/meminfo |grep -E 'HugePages_|Hard'
HardwareCorrupted: 0 kB
HugePages_Total: 64
HugePages_Free: 64
HugePages_Rsvd: 0
HugePages_Surp: 0
HugePages_Hwcrp: 0
No count changes, even the HardwareCorrupted. I'm not sure if this is
normal. This is what happens in kernel(stable master branch):
static int memory_failure_hugetlb(unsigned long pfn, int flags)
{
...
/* TestSetPageHWPoison return 0 */
if (TestSetPageHWPoison(head)) {
...
}
num_poisoned_pages_inc(); /* HardwareCorrupted += PAGE_SIZE */
/* get_hwpoison_page() return 0 */
if (!(flags & MF_COUNT_INCREASED) && !get_hwpoison_page(p, flags, 0)) {
/*
* Check "filter hit" and "race with other subpage."
*/
lock_page(head);
/* PageHWPoison() return 1 */
if (PageHWPoison(head)) {
/* (p != head && TestSetPageHWPoison(head)) is hit */
if ((hwpoison_filter(p) && TestClearPageHWPoison(p))
|| (p != head && TestSetPageHWPoison(head))) {
/* HardwareCorrupted -= PAGE_SIZE */
num_poisoned_pages_dec();
unlock_page(head);
return 0;
}
}
...
}
...
}
It seems like that memory errors on huge pages not in use hit the
"race with other subpage". I think I shouldn't add hwcrp_hugepages in
this routine. Maybe we need more conditions to distinguish this.
> >
> > return 0;
> > }
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index f7ca1a3870ea..1d5bada80aa5 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -171,6 +171,7 @@ void putback_active_hugepage(struct page *page);
> > void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason);
> > void free_huge_page(struct page *page);
> > void hugetlb_fix_reserve_counts(struct inode *inode);
> > +void hugetlb_fix_hwcrp_counts(struct page *page);
> > extern struct mutex *hugetlb_fault_mutex_table;
> > u32 hugetlb_fault_mutex_hash(struct address_space *mapping, pgoff_t idx);
> >
> > @@ -602,12 +603,14 @@ struct hstate {
> > unsigned long free_huge_pages;
> > unsigned long resv_huge_pages;
> > unsigned long surplus_huge_pages;
> > + unsigned long hwcrp_huge_pages;
> > unsigned long nr_overcommit_huge_pages;
> > struct list_head hugepage_activelist;
> > struct list_head hugepage_freelists[MAX_NUMNODES];
> > unsigned int nr_huge_pages_node[MAX_NUMNODES];
> > unsigned int free_huge_pages_node[MAX_NUMNODES];
> > unsigned int surplus_huge_pages_node[MAX_NUMNODES];
> > + unsigned int hwcrp_huge_pages_node[MAX_NUMNODES];
>
> I understand your requirement to count the number of memory errors on
> hugetlb pages. However, we need to think carefully about we represent
> that count.
>
> Noaya, do you have opinions on where would be the best place to store
> this information? The hugetlb memory error code has the comment 'needs
> work'. Ideally, we could isolate memory errors to a single base (4K for
> x86) page and free the remaining base pages to buddy. We could also
> potentially allocate a 'replacement' hugetlb page doing something like
> alloc_and_dissolve_huge_page.
>
> If we get an error on a hugetlb page and can isolate it to a base page
> and replace the huge page, is it still a huge page memory error?
>
> IMO, we should work on isolating memory errors to a base page and
> replacing the huge page. Then, the existing count of base pages with
> memory errors would be sufficient?
>
> This is something I would like to work, but I have higher priorities
> right now.
Yes, splitting the huge pages and isolating a base page is ideal. And
we do this with dissolve_free_huge_page() when page_mapping() return
NULL. I think there is a reason(but I do not get it) why we don't split
huge pags in hugetlbfs_error_remove_page() or after. So I choose to
add a new count.
--
Bin Wang
next prev parent reply other threads:[~2021-06-28 9:28 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-23 8:51 [PATCH v2] mm: hugetlb: add hwcrp_hugepages to record memory failure on hugetlbfs wangbin
2021-06-25 23:07 ` Mike Kravetz
2021-06-28 9:27 ` Bin Wang [this message]
2021-06-29 8:08 ` Naoya Horiguchi
2021-07-11 14:20 ` Bin Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210628092752.2135-1-wangbin224@huawei.com \
--to=wangbin224@huawei.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mike.kravetz@oracle.com \
--cc=naoya.horiguchi@nec.com \
--cc=wuxu.wu@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).