linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 2/2] mm/memory-failure.c: Don't access uninitialized memmaps in memory_failure()
Date: Fri, 11 Oct 2019 12:13:17 +0200	[thread overview]
Message-ID: <3706d642-6c29-41b8-a676-1b5541af3169@redhat.com> (raw)
In-Reply-To: <20191011060249.GA30500@hori.linux.bs1.fc.nec.co.jp>

On 11.10.19 08:02, Naoya Horiguchi wrote:
> On Thu, Oct 10, 2019 at 09:58:40AM +0200, David Hildenbrand wrote:
>> On 10.10.19 09:52, David Hildenbrand wrote:
>>> On 10.10.19 09:35, Michal Hocko wrote:
>>>> On Thu 10-10-19 09:27:32, David Hildenbrand wrote:
>>>>> On 09.10.19 16:43, Michal Hocko wrote:
>>>>>> On Wed 09-10-19 16:24:35, David Hildenbrand wrote:
>>>>>>> We should check for pfn_to_online_page() to not access uninitialized
>>>>>>> memmaps. Reshuffle the code so we don't have to duplicate the error
>>>>>>> message.
>>>>>>>
>>>>>>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>>>> ---
>>>>>>>   mm/memory-failure.c | 14 ++++++++------
>>>>>>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>>>>> index 7ef849da8278..e866e6e5660b 100644
>>>>>>> --- a/mm/memory-failure.c
>>>>>>> +++ b/mm/memory-failure.c
>>>>>>> @@ -1253,17 +1253,19 @@ int memory_failure(unsigned long pfn, int flags)
>>>>>>>   	if (!sysctl_memory_failure_recovery)
>>>>>>>   		panic("Memory failure on page %lx", pfn);
>>>>>>>   
>>>>>>> -	if (!pfn_valid(pfn)) {
>>>>>>> +	p = pfn_to_online_page(pfn);
>>>>>>> +	if (!p) {
>>>>>>> +		if (pfn_valid(pfn)) {
>>>>>>> +			pgmap = get_dev_pagemap(pfn, NULL);
>>>>>>> +			if (pgmap)
>>>>>>> +				return memory_failure_dev_pagemap(pfn, flags,
>>>>>>> +								  pgmap);
>>>>>>> +		}
>>>>>>>   		pr_err("Memory failure: %#lx: memory outside kernel control\n",
>>>>>>>   			pfn);
>>>>>>>   		return -ENXIO;
>>>>>>
>>>>>> Don't we need that earlier at hwpoison_inject level?
>>>>>>
>>>>>
>>>>> Theoretically yes, this is another instance. But pfn_to_online_page(pfn)
>>>>> alone would not be sufficient as discussed. We would, again, have to
>>>>> special-case ZONE_DEVICE via things like get_dev_pagemap() ...
>>>>>
>>>>> But mm/hwpoison-inject.c:hwpoison_inject() is a pure debug feature either way:
>>>>>
>>>>> 	/*
>>>>> 	 * Note that the below poison/unpoison interfaces do not involve
>>>>> 	 * hardware status change, hence do not require hardware support.
>>>>> 	 * They are mainly for testing hwpoison in software level.
>>>>> 	 */
>>>>>
>>>>> So it's not that bad compared to memory_failure() called from real HW or
>>>>> drivers/base/memory.c:soft_offline_page_store()/hard_offline_page_store()
>>>>
>>>> Yes, this is just a toy. And yes we need to handle zone device pages
>>>> here because a) people likely want to test MCE behavior even on these
>>>> pages and b) HW can really trigger MCEs there as well. I was just
>>>> pointing that the patch is likely incomplete.
>>>>
>>>
>>> I rather think this deserves a separate patch as it is a separate
>>> interface :)
>>>
>>> I do wonder why hwpoison_inject() has to perform so much extra work
>>> compared to other memory_failure() users. This smells like legacy
>>> leftovers to me, but I might be wrong. The interface is fairly old,
>>> though. Does anybody know why we need this magic? I can spot quite some
>>> duplicate checks/things getting performed.
> 
> It concerns me too, this *is* an old legacy code. I guess it was left as-is
> because no one complained about it.  That's not good, so I'll do some cleanup.

Most of that stuff was introduced in

commit 31d3d3484f9bd263925ecaa341500ac2df3a5d9b
Author: Wu Fengguang <fengguang.wu@intel.com>
Date:   Wed Dec 16 12:19:59 2009 +0100

     HWPOISON: limit hwpoison injector to known page types

     __memory_failure()'s workflow is

             set PG_hwpoison
             //...
             unset PG_hwpoison if didn't pass hwpoison filter

     That could kill unrelated process if it happens to page fault on the
     page with the (temporary) PG_hwpoison. The race should be big enough to
     appear in stress tests.

     Fix it by grabbing the page and checking filter at inject time.  This
     also avoids the very noisy "Injecting memory failure..." messages.


Now, we still have the same "issue" in memory_failure() today:


	if (TestSetPageHWPoison(p)) {
		pr_err("Memory failure: %#lx: already hardware poisoned\n",
			pfn);
		return 0;
	}
[...]
	if (hwpoison_filter(p)) {
		if (TestClearPageHWPoison(p))
			num_poisoned_pages_dec();
		unlock_page(p);
		put_hwpoison_page(p);
		return 0;
	}

However, I don't understand why we need that special handling only for this
debug interface and not the other users.

I'd vote for ripping out that legacy crap (so the interface works correctly
with ZONE_DEVICE) and instead (if really required) rework memory_failure()
to not produce such side effects.

> 
>>>
>>> Naiive me would just make the interface perform the same as
>>> hard_offline_page_store(). But most probably I am not getting the real
>>> purpose of both different interfaces.
> 
> Maybe for historical reason, we have these slightly different interfaces:
> 
> - corrupt-pfn
>    - purely for debugging purpose
>    - paired with unpoison-pfn
>    - used by in-kernel tool tools/vm/page-types.c
> - hard_offline_page
>    - paired with soft_offline_page
>    - used by other userspace tools like mcelog
> 
> But these don't explain why implemented differently, so I think that both
> should be written in the same manner.
> 
>>>
>>> HWPOISON_INJECT is only selected for DEBUG_KERNEL, so I would have
>>> guessed that fixing this is not urgent.
>>>
>>> BTW: mm/memory-failure.c:soft_offline_page() also looks wrong and needs
>>> fixing to make sure we access initialized memmaps.
>>>
>>
>> To be more precise, soft_offline_page_store() needs a
>> pfn_to_online_page() check. Will send a patch.
> 
> Thanks for finding this.
> 
> - Naoya Horiguchi
> 


-- 

Thanks,

David / dhildenb


  reply	other threads:[~2019-10-11 10:13 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09 14:24 [PATCH v2 0/2] mm: Don't access uninitialized memmaps in PFN walkers David Hildenbrand
2019-10-09 14:24 ` [PATCH v2 1/2] mm: Don't access uninitialized memmaps in fs/proc/page.c David Hildenbrand
2019-10-09 14:41   ` Michal Hocko
2019-10-14  8:44   ` David Hildenbrand
2019-10-09 14:24 ` [PATCH v2 2/2] mm/memory-failure.c: Don't access uninitialized memmaps in memory_failure() David Hildenbrand
2019-10-09 14:43   ` Michal Hocko
2019-10-10  7:27     ` David Hildenbrand
2019-10-10  7:35       ` Michal Hocko
2019-10-10  7:52         ` David Hildenbrand
2019-10-10  7:58           ` David Hildenbrand
2019-10-11  6:02             ` Naoya Horiguchi
2019-10-11 10:13               ` David Hildenbrand [this message]
2019-10-14 13:36                 ` Michal Hocko
2019-10-15 14:23                   ` Oscar Salvador
2019-10-10  0:26   ` Naoya Horiguchi
2019-10-10  7:17     ` David Hildenbrand
2019-10-11  6:50       ` Naoya Horiguchi
2019-10-19  2:05       ` Andrew Morton
2019-10-21  9:44         ` David Hildenbrand
2019-10-14  8:41   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3706d642-6c29-41b8-a676-1b5541af3169@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).