From: Mike Kravetz <mike.kravetz@oracle.com>
To: David Hildenbrand <david@redhat.com>,
	Miaohe Lin <linmiaohe@huawei.com>,
	akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Oscar Salvador <osalvador@suse.de>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: Re: [PATCH] mm/memory_hotplug: avoid consuming corrupted data when offline pages
Date: Thu, 21 Apr 2022 15:04:56 -0700	[thread overview]
Message-ID: <bb011aaf-a45c-c0a8-5e5f-211900d17f19@oracle.com> (raw)
In-Reply-To: <8fae51af-b12f-4232-1330-54f7b0943907@redhat.com>

On 4/21/22 07:20, David Hildenbrand wrote:
> On 21.04.22 15:51, Miaohe Lin wrote:
>> When trying to offline pages, a HWPoisoned hugepage is migrated without
>> checking PageHWPoison first, so corrupted data could be consumed.  Fix
>> it by deferring isolate_huge_page() until PageHWPoison is handled.
>>
> 
> CCing Oscar, Mike and Naoya
> 
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/memory_hotplug.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 4c6065e5d274..093f85ec5c5c 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1600,11 +1600,9 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>  		folio = page_folio(page);
>>  		head = &folio->page;
>>  
>> -		if (PageHuge(page)) {
>> +		if (PageHuge(page))
>>  			pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> -			isolate_huge_page(head, &source);
>> -			continue;
>> -		} else if (PageTransHuge(page))
>> +		else if (PageTransHuge(page))
>>  			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>>  
>>  		/*
>> @@ -1622,6 +1620,11 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>  			continue;
>>  		}
>>  
>> +		if (PageHuge(page)) {
>> +			isolate_huge_page(head, &source);
>> +			continue;
>> +		}
>> +
>>  		if (!get_page_unless_zero(page))
>>  			continue;
>>  		/*
> 
> The problem statement makes sense to me but I am not sure about the
> details if we run into the "PageHWPoison" path with a huge page. I have
> the gut feeling that we have to do more for huge pages in the
> PageHWPoison() path, because we might be dealing with a free huge page
> after unmap succeeds. I might be wrong.
> 

Thinking about memory errors always makes my head hurt :)

What 'state' could a poisoned hugetlb page be in here?
- Mapped into a process address space?
- On the hugetlb free lists?

IIUC, when poisoning a hugetlb page we try to dissolve those that are
free and unmap those which are mapped.  So, this means those operations
must have failed for some reason.  Is that correct?
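
For reference, a very rough sketch of that behavior (this is NOT the actual
mm/memory-failure.c code and the helper name is made up; it only illustrates
the two outcomes described above):

	/* Illustration only; hypothetical helper. */
	static void hwpoison_hugetlb_sketch(struct page *p)
	{
		struct page *head = compound_head(p);
		struct folio *folio = page_folio(p);

		SetPageHWPoison(head);	/* poison is recorded on the head page */

		if (page_count(head) == 0) {
			/* free hugetlb page: dissolve it so the bad pfn leaves the pool */
			dissolve_free_huge_page(p);
		} else if (folio_mapped(folio)) {
			/*
			 * In-use page: unmap it so nobody keeps consuming bad
			 * data (the real code takes i_mmap_rwsem for shared
			 * mappings first, see below).
			 */
			try_to_unmap(folio, TTU_IGNORE_MLOCK);
		}

		/*
		 * If either step fails, a still-poisoned hugetlb page is left
		 * behind, which is what do_migrate_range() can then run into.
		 */
	}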

Now, IF the above is true this implies there is a poisoned page somewhere
within the hugetlb page.  But, poison is only marked in the head page.
So, we do not really know where within the page the actual error exists.
Is my reasoning still correct?
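
Put differently (hypothetical check, only to illustrate what is visible from
this point):

	if (PageHuge(page) && PageHWPoison(compound_head(page))) {
		/*
		 * Some subpage in [head, head + compound_nr(head)) holds the
		 * error, but nothing here records which one.
		 */
	}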

If my reasoning is correct, then I am not sure what we can do here.
The code to handle poison is:

		if (PageHWPoison(page)) {
			if (WARN_ON(folio_test_lru(folio)))
				folio_isolate_lru(folio);
			if (folio_mapped(folio))
				try_to_unmap(folio, TTU_IGNORE_MLOCK);
			continue;
		}

My first observation is that if a hugetlb page is passed to try_to_unmap
as above we may BUG, because we need to hold i_mmap_rwsem in write mode
to cover the possibility of try_to_unmap calling huge_pmd_unshare.  :(
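
For comparison, this is roughly the locking memory-failure.c uses when it
unmaps a file-backed hugetlb page (heavily simplified from
hwpoison_user_mappings(); a sketch, not a drop-in fix for do_migrate_range()):

	if (PageHuge(page) && !PageAnon(head)) {
		/*
		 * try_to_unmap() may call huge_pmd_unshare(), so i_mmap_rwsem
		 * must be held in write mode across the unmap.
		 */
		struct address_space *mapping;

		mapping = hugetlb_page_mapping_lock_write(head);
		if (mapping) {
			try_to_unmap(folio, TTU_IGNORE_MLOCK | TTU_RMAP_LOCKED);
			i_mmap_unlock_write(mapping);
		}
		/* else: lock not taken, the unmap has to be skipped/failed */
	} else {
		try_to_unmap(folio, TTU_IGNORE_MLOCK);
	}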

I 'think' try_to_unmap could succeed on a poisoned hugetlb page once we
add correct locking.  So, the page would be unmapped.  I do not think anyone
is holding a ref, so the last unmap should put the hugetlb page on the
free list.  Correct?  We will not migrate the page, but ...

After the call to do_migrate_range() in offline_pages, we will call
dissolve_free_huge_pages.  For each hugetlb page, dissolve_free_huge_pages
will call dissolve_free_huge_page likely passing in the 'head' page.
When dissolve_free_huge_page is called for poisoned hugetlb pages from
the memory error handling code, it passes in the sub page which contains
the memory error.  Before freeing the hugetlb page to buddy, there is this
code:

			/*
			 * Move PageHWPoison flag from head page to the raw
			 * error page, which makes any subpages rather than
			 * the error page reusable.
			 */
			if (PageHWPoison(head) && page != head) {
				SetPageHWPoison(page);
				ClearPageHWPoison(head);
			}
			update_and_free_page(h, head, false);

As previously mentioned, outside the memory error handling code we do
not know which page within the hugetlb page contains the poison.  So, this
will likely result in the page that actually contains the error going back
on the free list, while an OK page is left marked with the error.
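
For reference, the offline-path caller looks roughly like this (simplified
sketch, may not match the exact mm/hugetlb.c code), which is why only
head-aligned pfns ever reach dissolve_free_huge_page() from this path:

	int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn)
	{
		unsigned long pfn;
		int rc = 0;

		/* walk the range in minimum huge page order steps */
		for (pfn = start_pfn; pfn < end_pfn; pfn += 1 << minimum_order) {
			rc = dissolve_free_huge_page(pfn_to_page(pfn));
			if (rc)
				break;
		}
		return rc;
	}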

In other places, we 'bail' if we encounter a hugetlb page with poison.
It would be unfortunate to prevent memory offline if the range contains
a hugetlb page with poison.  After all, offlining a section with poison
makes sense.
-- 
Mike Kravetz


Thread overview: 6+ messages
2022-04-21 13:51 [PATCH] mm/memory_hotplug: avoid consuming corrupted data when offline pages Miaohe Lin
2022-04-21 14:20 ` David Hildenbrand
2022-04-21 22:04   ` Mike Kravetz [this message]
2022-04-22  6:53     ` Miaohe Lin
     [not found]       ` <20220425135555.GA3767079@hori.linux.bs1.fc.nec.co.jp>
2022-04-26 11:41         ` Miaohe Lin
2022-04-27  3:13           ` HORIGUCHI NAOYA(堀口 直也)
