All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Yang Shi <shy828301@gmail.com>
Cc: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"Hugh Dickins" <hughd@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Linux FS-devel Mailing List" <linux-fsdevel@vger.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>
Subject: Re: [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
Date: Mon, 11 Oct 2021 20:55:26 -0400	[thread overview]
Message-ID: <YWTc/n4r6CJdvPpt@t490s> (raw)
In-Reply-To: <CAHbLzkqm_Os8TLXgbkL-oxQVsQqRbtmjdMdx0KxNke8mUF1mWA@mail.gmail.com>

On Thu, Oct 07, 2021 at 02:28:35PM -0700, Yang Shi wrote:
> On Wed, Oct 6, 2021 at 4:57 PM Yang Shi <shy828301@gmail.com> wrote:
> >
> > On Wed, Oct 6, 2021 at 1:15 PM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Sep 30, 2021 at 02:53:08PM -0700, Yang Shi wrote:
> > > > @@ -1148,8 +1148,12 @@ static int __get_hwpoison_page(struct page *page)
> > > >               return -EBUSY;
> > > >
> > > >       if (get_page_unless_zero(head)) {
> > > > -             if (head == compound_head(page))
> > > > +             if (head == compound_head(page)) {
> > > > +                     if (PageTransHuge(head))
> > > > +                             SetPageHasHWPoisoned(head);
> > > > +
> > > >                       return 1;
> > > > +             }
> > > >
> > > >               pr_info("Memory failure: %#lx cannot catch tail\n",
> > > >                       page_to_pfn(page));
> > >
> > > Sorry for the late comments.
> > >
> > > I'm wondering whether it's ideal to set this bit here, as get_hwpoison_page()
> > > sounds like a pure helper to get a refcount out of a sane hwpoisoned page.  I'm
> > > afraid there can be side effect that we set this without being noticed, so I'm
> > > also wondering we should keep it in memory_failure().
> > >
> > > Quotting comments for get_hwpoison_page():
> > >
> > >  * get_hwpoison_page() takes a page refcount of an error page to handle memory
> > >  * error on it, after checking that the error page is in a well-defined state
> > >  * (defined as a page-type we can successfully handle the memor error on it,
> > >  * such as LRU page and hugetlb page).
> > >
> > > For example, I see that both unpoison_memory() and soft_offline_page() will
> > > call it too, does it mean that we'll also set the bits e.g. even when we want
> > > to inject an unpoison event too?
> >
> > unpoison_memory() should be not a problem since it will just bail out
> > once THP is met as the comment says:
> >
> > /*
> > * unpoison_memory() can encounter thp only when the thp is being
> > * worked by memory_failure() and the page lock is not held yet.
> > * In such case, we yield to memory_failure() and make unpoison fail.
> > */
> >
> >
> > And I think we should set the flag for soft offline too, right? The
> > soft offline does set the hwpoison flag for the corrupted sub page and
> > doesn't split file THP, so it should be captured by page fault as
> > well. And yes for poison injection.
> 
> Err... I must be blind. The soft offline does *NOT* set hwpoison flag
> for any page. So your comment does stand. The flag should be set
> outside get_hwpoison_page().

I saw that page_handle_poison() sets it, so perhaps we do need to take care of
soft offline?  Though I still think the extra bit should be set outside of the
get_hwpoison_page() function.

Another thing is I noticed soft_offline_in_use_page() will still ignore file
backed split.  I'm not sure whether it means we'd better also handle that case
as well, so shmem thp can be split there too?

-- 
Peter Xu


  reply	other threads:[~2021-10-12  0:55 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-30 21:53 [RFC v3 PATCH 0/5] Solve silent data loss caused by poisoned page cache (shmem/tmpfs) Yang Shi
2021-09-30 21:53 ` [v3 PATCH 1/5] mm: hwpoison: remove the unnecessary THP check Yang Shi
2021-10-06  2:35   ` Yang Shi
2021-10-06  4:00     ` Naoya Horiguchi
2021-10-06 17:56       ` Yang Shi
2021-09-30 21:53 ` [v3 PATCH 2/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault Yang Shi
2021-10-01  7:23   ` Naoya Horiguchi
2021-10-01 21:07     ` Yang Shi
2021-10-04 14:06   ` Kirill A. Shutemov
2021-10-04 18:17     ` Yang Shi
2021-10-04 19:41       ` Kirill A. Shutemov
2021-10-04 20:13         ` Yang Shi
2021-10-06 19:54           ` Peter Xu
2021-10-06 23:41             ` Yang Shi
2021-10-07 16:14               ` Peter Xu
2021-10-07 18:28                 ` Yang Shi
2021-10-08  9:35             ` Kirill A. Shutemov
2021-10-11 22:57               ` Peter Xu
2021-10-06 20:15   ` Peter Xu
2021-10-06 23:57     ` Yang Shi
2021-10-07 16:06       ` Peter Xu
2021-10-07 18:19         ` Yang Shi
2021-10-07 20:27           ` Yang Shi
2021-10-07 21:28       ` Yang Shi
2021-10-12  0:55         ` Peter Xu [this message]
2021-10-12  1:44           ` Peter Xu
2021-10-12 18:02             ` Yang Shi
2021-10-12 22:10               ` Peter Xu
2021-10-13  2:48                 ` Yang Shi
2021-10-13  3:01                   ` Peter Xu
2021-10-13  3:27                     ` Yang Shi
2021-10-13  3:41                       ` Peter Xu
2021-10-13 21:42                         ` Yang Shi
2021-10-13 23:13                           ` Peter Xu
2021-10-14  6:54                     ` Naoya Horiguchi
2021-10-06 20:18   ` Peter Xu
2021-10-07  2:49     ` Yang Shi
2021-11-01 19:05   ` Naresh Kamboju
2021-11-01 19:26     ` Yang Shi
2021-09-30 21:53 ` [v3 PATCH 3/5] mm: hwpoison: refactor refcount check handling Yang Shi
2021-10-06 22:01   ` Peter Xu
2021-10-07  2:47     ` Yang Shi
2021-10-07 16:18       ` Peter Xu
2021-09-30 21:53 ` [v3 PATCH 4/5] mm: shmem: don't truncate page if memory failure happens Yang Shi
2021-10-01  7:05   ` Naoya Horiguchi
2021-10-01 21:08     ` Yang Shi
2021-10-12  1:57   ` Peter Xu
2021-10-12 19:17     ` Yang Shi
2021-10-12 22:26       ` Peter Xu
2021-10-13  3:00         ` Yang Shi
2021-10-13  3:06           ` Peter Xu
2021-10-13  3:29             ` Yang Shi
2021-09-30 21:53 ` [v3 PATCH 5/5] mm: hwpoison: handle non-anonymous THP correctly Yang Shi
2021-10-01  7:06   ` Naoya Horiguchi
2021-10-01 21:09     ` Yang Shi
2021-10-13  2:40 ` [RFC v3 PATCH 0/5] Solve silent data loss caused by poisoned page cache (shmem/tmpfs) Peter Xu
2021-10-13  3:09   ` Yang Shi
2021-10-13  3:24     ` Peter Xu
2021-10-14  6:54     ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YWTc/n4r6CJdvPpt@t490s \
    --to=peterx@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=shy828301@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.