Re: [PATCH 3/7] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

From: Vlastimil Babka <vbabka@suse.cz>
To: Oscar Salvador <osalvador@suse.de>
Cc: akpm@linux-foundation.org, n-horiguchi@ah.jp.nec.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH 3/7] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED
Date: Fri, 4 Dec 2020 18:25:31 +0100	[thread overview]
Message-ID: <840d4669-ae3f-b7c4-6132-e20d1bf9e952@suse.cz> (raw)
In-Reply-To: <20201201113511.GA22242@linux>

On 12/1/20 12:35 PM, Oscar Salvador wrote:
> On Wed, Nov 25, 2020 at 07:20:33PM +0100, Vlastimil Babka wrote:
>> On 11/19/20 11:57 AM, Oscar Salvador wrote:
>> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
>> > 
>> > The call to get_user_pages_fast is only to get the pointer to a struct
>> > page of a given address, pinning it is memory-poisoning handler's job,
>> > so drop the refcount grabbed by get_user_pages_fast().
>> > 
>> > Note that the target page is still pinned after this put_page() because
>> > the current process should have refcount from mapping.
>> 
>> Well, but can't it go away due to reclaim, migration or whatever?
> 
> Yes, it can.
> 
>> > @@ -900,20 +900,23 @@ static int madvise_inject_error(int behavior,
>> >   		 */
>> >   		size = page_size(compound_head(page));
>> > +		/*
>> > +		 * The get_user_pages_fast() is just to get the pfn of the
>> > +		 * given address, and the refcount has nothing to do with
>> > +		 * what we try to test, so it should be released immediately.
>> > +		 * This is racy but it's intended because the real hardware
>> > +		 * errors could happen at any moment and memory error handlers
>> > +		 * must properly handle the race.
>> 
>> Sure they have to. We might just be unexpectedly messing with other process'
>> memory. Or does anything else prevent that?
> 
> No, nothing does, and I have to confess that I managed to confuse myself here.
> If we release such page and that page ends up in buddy, nothing prevents someone
> else to get that page, and then we would be messing with other process memory.
> 
> I guess the right thing to do is just to make sure we got that page and that
> that page remains pinned as long as the memory failure handling goes.

OK, so that means we don't introduce this race for MADV_SOFT_OFFLINE, but it's
already (and still) there for MADV_HWPOISON since Dan's 23e7b5c2e271 ("mm,
madvise_inject_error: Let memory_failure() optionally take a page reference") no?

> I will remove those patches from the patchset and re-submit with only the
> refactoring and pcp-disabling.
> 
> Thanks Vlastimil
>