Re: [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers

From: "HORIGUCHI NAOYA(堀口　直也)" <naoya.horiguchi@nec.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"aneesh.kumar@linux.vnet.ibm.com"
	<aneesh.kumar@linux.vnet.ibm.com>, Linux-MM <linux-mm@kvack.org>,
	Michal Hocko <mhocko@kernel.org>,
	"mm-commits@vger.kernel.org" <mm-commits@vger.kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	stable <stable@vger.kernel.org>, Tony Luck <tony.luck@intel.com>
Subject: Re: [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
Date: Sun, 14 Mar 2021 06:36:23 +0000	[thread overview]
Message-ID: <20210314062427.GA30930@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <CAHk-=wht_gk_d9k+NZs7eJvBeLOQT4xGcykgaCRHuiQ+LbReRw@mail.gmail.com>

On Sat, Mar 13, 2021 at 11:23:40AM -0800, Linus Torvalds wrote:
> On Fri, Mar 12, 2021 at 9:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Subject: mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
> 
> This patch can not possibly be correct, and is dangerous and very very wrong.
> 
> >  out:
> > -       unlock_page(head);
> > +       if (PageLocked(head))
> > +               unlock_page(head);
> 
> You absolutely CANNOT do things like this. It is fundamentally insane.
> 
> You have two situations:
> 
>  (a) you know you locked the page.
> 
>      In this case an unconditional "unlock_page()" is the only correct
> thing to do.
> 
>  (b) you don't know whether you maybe unlocked it again.
> 
>      In this case SOMEBODY ELSE might have locked the page, and you
> doing "if it's locked, then unlock it" is pure and utter drivel, and
> fundamentally and seriously wrong. You're unlocking SOMEBODY ELSES
> lock, after having already unlocked your own once.
> 
> Now, it is possible that this kind of pattern could be ok if you
> absolutely 100% know - and it is obvious from the code - that you are
> the only thread that can possibly access the page. But if that is the
> case, then the page should never have been locked in the first place,
> because that would be an insane and pointless operation, since the
> whole - and only - point of locking is to enforce some kind of
> exclusivity - and if exclusivity is explicit in the code-path, then
> locking the page is pointless.
> 
> And yes, in this case I can see a remote theoretical model: "all good
> cases keep the page locked, and the only trhing that unlocks the page
> also guarantees nothing else can possiblly see it".
> 
> But no. I don't actually believe this is fuaranteed to the case here,
> and even if it were, I don't think this is a code sequence we can or
> should accept.
> 
> End result: there is no possible situation that I can think of where the above
> 
>        if (PageLocked(head))
>                unlock_page(head);
> 
> kind of sequence can EVER possibly be correct, and it shows a complete
> disregard of everything that locking is all about.
> 
> Now, the memory error handling may be so special that this effectively
> "works" in practice, but there is no way I can apply such a
> fundamentally broken patch.

OK, I'll update the patch with the second approach you mentioned below.

> 
> I see two options:
> 
>  - make the result of identify_page_state() *tell* you whether the
> page was already unlocked (and then make the unlock conditional on
> *that*)
> 
>    This is valid, because now you removed that whole "somebody else
> might have transferred the lock" issue: you know you already unlocked
> the page based on the return value, so you don't unlock it again.
> That's perfectly valid.
> 
>  - probably better: make identify_page_state() itself always unlock
> the page, and turn the two different code sequences that currently do
> 
>         res = identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return res;
> 
> into just doing
> 
>         return identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return -EBUSY;
> 
> instead, and move that now pointless "res" variable into the only
> if-statement that actually uses it (for other things entirely! It
> should be a "enum mf_result" there!)
> 
> And yes, that second (and imho better) rule means that now
> {page_action()" itself needs to be the thing that unlocks the page,
> and that in turn might be done a few different ways:
> 
>  (a) either add a separate "MF_UNLOCKED" state bit to the possible
> return codes and make that me_huge_page() that unlocks the page set
> that bit (NOTE! It still needs to also return either MF_FAILED or
> MF_RECOVERED, so this "MF_UNLOCKED" bit really should be a separate
> bitmask entirely from the other MF_xyz bits)
> 
>  (b) make the rule be that *all* the ->action() functions need to just
> unlock the page.
> 
> ( suspect (b) is the better model to aim for, because honestly, static
> code rules for locking are basically almost always superior to dynamic
> "based on this flag" locking rules. You can in theory actually have
> static tooling check that the locking nests correctly with the
> unlocking that way (and it's just conceptually simpler to have a hard
> rule about "this function always unlocks the page").

I'll try (b). Thank you for the detailed advices.

- Naoya Horiguchi

> 
> But that existing patch is *so* fundamentally wrong that I cannot
> possibly apply it, and I feel bad about the fact that it has made it
> all the way to me with sign-offs and reviewed-by's and everything.
> 
>                   Linus
>