All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joao Martins <joao.m.martins@oracle.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Jane Chu <jane.chu@oracle.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [RFC] Missing compound_head() in memory-failure
Date: Tue, 1 Feb 2022 18:34:03 +0000	[thread overview]
Message-ID: <dceac808-cb08-53db-b9fb-4e5502c25008@oracle.com> (raw)
In-Reply-To: <YflyKDTO4kC4FbVj@casper.infradead.org>

On 2/1/22 17:47, Matthew Wilcox wrote:
> On Tue, Feb 01, 2022 at 05:40:57PM +0000, Joao Martins wrote:
>> On 2/1/22 16:01, Joao Martins wrote:
>>> On 2/1/22 15:46, Matthew Wilcox wrote:
>>>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>>>> which has the error to collect_procs().  Am I mistaken?
>>>>>>
>>>>> -rc2 already has a fix for it:
>>>>>
>>>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>>>>
>>>>> Earlier in that function there's a:
>>>>>
>>>>> 	page = compound_head(page);
>>>>>
>>>>> So the @page passed to collect_procs() already is a head page.
>>>>
>>>> It's wrong though ;-(  You set the HWPoison bit on the page after
>>>> calling compound_head(), so you set the bit on the head page instead
>>>> of the precise page that had the poison.
>>>>
>>> Considering that on device-dax we would unmap the whole 2M page regardless
>>> of the poisoned subpage isn't that actually representative still?
>>>
>>
>> To say this another way. We do set the HWPoison on the head page,
>> and not the subpage as you say, but we end up propagating the resultant
>> MCE action on the superset of pages in the whole PMD or PUD.
>>
>> What I was trying to say in perhaps a convoluted way is that device-dax
>> case isn't different than HugeTLB that only wants to poison head. If there's
>> a head page, there's likely a PMD or PUD populated (depending what the device
>> was onlined with) and thus that's what gets unmapped. There's no idea of
>> subpages being treated any differently, at least as far as device-dax is
>> concerned -- unless I miss auditing some other code path.
> 
> Using HugeTLB as a model is not a good idea.  THP is the model to
> aim for; one can choose to map pages askew (ie not aligned with a
> PMD), and I don't think you'll find all the mappings with the current
> code (eg if someone has mapped a single page of the hugepage).
> 

Don't know if that last sentence referred to THP or DAX -- but to be extra sure,
on device-dax current code you can't map or remap subpages of a hugepage.
Even if you open a device-dax fd and touch just a subpage, what gets mapped
in the user mm is the entirety of the PMD (or PUD). You can't even map a
single random page, it always needs to be aligned /at a minimum/ to the
namespace boundary (which is the page table granularity i.e. PTE, PMD or PUD).

>> fsdax IIUC seems to rely more on the subpage bit being flagged but no
>> functional change here for fsdax as there's only base pages there (no heads).
>>
>>>> I'm fixing this up as part of the folio patches, but you may wish to
>>>> fix it earlier than that.
>> (+Dan in case I misrepresented or missed something)
>>
>> Should we deem it a problem, I'll fix for the next -rc.
>> Just in case, here's the diff stashed:
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 2e2f740c63dc..661c23df8115 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1577,7 +1577,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>>  static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>>                 struct dev_pagemap *pgmap)
>>  {
>> -       struct page *page = pfn_to_page(pfn);
>> +       struct page *page = pfn_to_page(pfn), *subpage = page;
>>         unsigned long size = 0;
>>         struct to_kill *tk;
>>         LIST_HEAD(tokill);
>> @@ -1631,7 +1631,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>>          * Use this flag as an indication that the dax page has been
>>          * remapped UC to prevent speculative consumption of poison.
>>          */
>> -       SetPageHWPoison(page);
>> +       SetPageHWPoison(subpage);
>>
>>         /*
>>          * Unlike System-RAM there is no possibility to swap in a


  reply	other threads:[~2022-02-01 18:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-31 20:29 [RFC] Missing compound_head() in memory-failure Matthew Wilcox
2022-01-31 20:54 ` Joao Martins
2022-02-01  2:34   ` Matthew Wilcox
2022-02-01 15:46   ` Matthew Wilcox
2022-02-01 16:01     ` Joao Martins
2022-02-01 17:40       ` Joao Martins
2022-02-01 17:47         ` Matthew Wilcox
2022-02-01 18:34           ` Joao Martins [this message]
2022-02-01 21:11     ` Jane Chu
2022-02-01 22:04       ` Joao Martins
2022-02-01 22:20         ` Dan Williams
2022-02-01 23:04           ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dceac808-cb08-53db-b9fb-4e5502c25008@oracle.com \
    --to=joao.m.martins@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.