All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <OSalvador@suse.com>
Subject: Re: [PATCH v2 2/2] mm/memory-failure.c: Don't access uninitialized memmaps in memory_failure()
Date: Mon, 14 Oct 2019 15:36:17 +0200	[thread overview]
Message-ID: <20191014133617.GJ317@dhcp22.suse.cz> (raw)
In-Reply-To: <3706d642-6c29-41b8-a676-1b5541af3169@redhat.com>

[Cc Oscar]

On Fri 11-10-19 12:13:17, David Hildenbrand wrote:
> On 11.10.19 08:02, Naoya Horiguchi wrote:
> > On Thu, Oct 10, 2019 at 09:58:40AM +0200, David Hildenbrand wrote:
> >> On 10.10.19 09:52, David Hildenbrand wrote:
> >>> On 10.10.19 09:35, Michal Hocko wrote:
> >>>> On Thu 10-10-19 09:27:32, David Hildenbrand wrote:
> >>>>> On 09.10.19 16:43, Michal Hocko wrote:
> >>>>>> On Wed 09-10-19 16:24:35, David Hildenbrand wrote:
> >>>>>>> We should check for pfn_to_online_page() to not access uninitialized
> >>>>>>> memmaps. Reshuffle the code so we don't have to duplicate the error
> >>>>>>> message.
> >>>>>>>
> >>>>>>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> >>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
> >>>>>>> Cc: Michal Hocko <mhocko@kernel.org>
> >>>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
> >>>>>>> ---
> >>>>>>>   mm/memory-failure.c | 14 ++++++++------
> >>>>>>>   1 file changed, 8 insertions(+), 6 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >>>>>>> index 7ef849da8278..e866e6e5660b 100644
> >>>>>>> --- a/mm/memory-failure.c
> >>>>>>> +++ b/mm/memory-failure.c
> >>>>>>> @@ -1253,17 +1253,19 @@ int memory_failure(unsigned long pfn, int flags)
> >>>>>>>   	if (!sysctl_memory_failure_recovery)
> >>>>>>>   		panic("Memory failure on page %lx", pfn);
> >>>>>>>   
> >>>>>>> -	if (!pfn_valid(pfn)) {
> >>>>>>> +	p = pfn_to_online_page(pfn);
> >>>>>>> +	if (!p) {
> >>>>>>> +		if (pfn_valid(pfn)) {
> >>>>>>> +			pgmap = get_dev_pagemap(pfn, NULL);
> >>>>>>> +			if (pgmap)
> >>>>>>> +				return memory_failure_dev_pagemap(pfn, flags,
> >>>>>>> +								  pgmap);
> >>>>>>> +		}
> >>>>>>>   		pr_err("Memory failure: %#lx: memory outside kernel control\n",
> >>>>>>>   			pfn);
> >>>>>>>   		return -ENXIO;
> >>>>>>
> >>>>>> Don't we need that earlier at hwpoison_inject level?
> >>>>>>
> >>>>>
> >>>>> Theoretically yes, this is another instance. But pfn_to_online_page(pfn)
> >>>>> alone would not be sufficient as discussed. We would, again, have to
> >>>>> special-case ZONE_DEVICE via things like get_dev_pagemap() ...
> >>>>>
> >>>>> But mm/hwpoison-inject.c:hwpoison_inject() is a pure debug feature either way:
> >>>>>
> >>>>> 	/*
> >>>>> 	 * Note that the below poison/unpoison interfaces do not involve
> >>>>> 	 * hardware status change, hence do not require hardware support.
> >>>>> 	 * They are mainly for testing hwpoison in software level.
> >>>>> 	 */
> >>>>>
> >>>>> So it's not that bad compared to memory_failure() called from real HW or
> >>>>> drivers/base/memory.c:soft_offline_page_store()/hard_offline_page_store()
> >>>>
> >>>> Yes, this is just a toy. And yes we need to handle zone device pages
> >>>> here because a) people likely want to test MCE behavior even on these
> >>>> pages and b) HW can really trigger MCEs there as well. I was just
> >>>> pointing that the patch is likely incomplete.
> >>>>
> >>>
> >>> I rather think this deserves a separate patch as it is a separate
> >>> interface :)
> >>>
> >>> I do wonder why hwpoison_inject() has to perform so much extra work
> >>> compared to other memory_failure() users. This smells like legacy
> >>> leftovers to me, but I might be wrong. The interface is fairly old,
> >>> though. Does anybody know why we need this magic? I can spot quite some
> >>> duplicate checks/things getting performed.
> > 
> > It concerns me too, this *is* an old legacy code. I guess it was left as-is
> > because no one complained about it.  That's not good, so I'll do some cleanup.
> 
> Most of that stuff was introduced in
> 
> commit 31d3d3484f9bd263925ecaa341500ac2df3a5d9b
> Author: Wu Fengguang <fengguang.wu@intel.com>
> Date:   Wed Dec 16 12:19:59 2009 +0100
> 
>      HWPOISON: limit hwpoison injector to known page types
> 
>      __memory_failure()'s workflow is
> 
>              set PG_hwpoison
>              //...
>              unset PG_hwpoison if didn't pass hwpoison filter
> 
>      That could kill unrelated process if it happens to page fault on the
>      page with the (temporary) PG_hwpoison. The race should be big enough to
>      appear in stress tests.
> 
>      Fix it by grabbing the page and checking filter at inject time.  This
>      also avoids the very noisy "Injecting memory failure..." messages.
> 
> 
> Now, we still have the same "issue" in memory_failure() today:
> 
> 
> 	if (TestSetPageHWPoison(p)) {
> 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
> 			pfn);
> 		return 0;
> 	}
> [...]
> 	if (hwpoison_filter(p)) {
> 		if (TestClearPageHWPoison(p))
> 			num_poisoned_pages_dec();
> 		unlock_page(p);
> 		put_hwpoison_page(p);
> 		return 0;
> 	}
> 
> However, I don't understand why we need that special handling only for this
> debug interface and not the other users.
> 
> I'd vote for ripping out that legacy crap (so the interface works correctly
> with ZONE_DEVICE) and instead (if really required) rework memory_failure()
> to not produce such side effects.

I do agree. The two should be really using the same code. My
understanding was that MADV_HWPOISON was there to test the actual MCE
behavior (and the man page seems to agree with that).

Oscar is working on a rewrite. Not sure he has considered this as well.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2019-10-14 13:36 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-09 14:24 [PATCH v2 0/2] mm: Don't access uninitialized memmaps in PFN walkers David Hildenbrand
2019-10-09 14:24 ` [PATCH v2 1/2] mm: Don't access uninitialized memmaps in fs/proc/page.c David Hildenbrand
2019-10-09 14:41   ` Michal Hocko
2019-10-14  8:44   ` David Hildenbrand
2019-10-09 14:24 ` [PATCH v2 2/2] mm/memory-failure.c: Don't access uninitialized memmaps in memory_failure() David Hildenbrand
2019-10-09 14:43   ` Michal Hocko
2019-10-10  7:27     ` David Hildenbrand
2019-10-10  7:35       ` Michal Hocko
2019-10-10  7:52         ` David Hildenbrand
2019-10-10  7:58           ` David Hildenbrand
2019-10-11  6:02             ` Naoya Horiguchi
2019-10-11 10:13               ` David Hildenbrand
2019-10-14 13:36                 ` Michal Hocko [this message]
2019-10-15 14:23                   ` Oscar Salvador
2019-10-10  0:26   ` Naoya Horiguchi
2019-10-10  7:17     ` David Hildenbrand
2019-10-11  6:50       ` Naoya Horiguchi
2019-10-19  2:05       ` Andrew Morton
2019-10-21  9:44         ` David Hildenbrand
2019-10-14  8:41   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191014133617.GJ317@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=OSalvador@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.