mm-commits.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"aneesh.kumar@linux.vnet.ibm.com"
	<aneesh.kumar@linux.vnet.ibm.com>, Linux-MM <linux-mm@kvack.org>,
	Michal Hocko <mhocko@kernel.org>,
	"mm-commits@vger.kernel.org" <mm-commits@vger.kernel.org>,
	Oscar Salvador <osalvador@suse.de>,
	stable <stable@vger.kernel.org>, Tony Luck <tony.luck@intel.com>
Subject: Re: [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
Date: Sun, 14 Mar 2021 06:36:23 +0000	[thread overview]
Message-ID: <20210314062427.GA30930@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <CAHk-=wht_gk_d9k+NZs7eJvBeLOQT4xGcykgaCRHuiQ+LbReRw@mail.gmail.com>

On Sat, Mar 13, 2021 at 11:23:40AM -0800, Linus Torvalds wrote:
> On Fri, Mar 12, 2021 at 9:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Subject: mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
> 
> This patch can not possibly be correct, and is dangerous and very very wrong.
> 
> >  out:
> > -       unlock_page(head);
> > +       if (PageLocked(head))
> > +               unlock_page(head);
> 
> You absolutely CANNOT do things like this. It is fundamentally insane.
> 
> You have two situations:
> 
>  (a) you know you locked the page.
> 
>      In this case an unconditional "unlock_page()" is the only correct
> thing to do.
> 
>  (b) you don't know whether you maybe unlocked it again.
> 
>      In this case SOMEBODY ELSE might have locked the page, and you
> doing "if it's locked, then unlock it" is pure and utter drivel, and
> fundamentally and seriously wrong. You're unlocking SOMEBODY ELSES
> lock, after having already unlocked your own once.
> 
> Now, it is possible that this kind of pattern could be ok if you
> absolutely 100% know - and it is obvious from the code - that you are
> the only thread that can possibly access the page. But if that is the
> case, then the page should never have been locked in the first place,
> because that would be an insane and pointless operation, since the
> whole - and only - point of locking is to enforce some kind of
> exclusivity - and if exclusivity is explicit in the code-path, then
> locking the page is pointless.
> 
> And yes, in this case I can see a remote theoretical model: "all good
> cases keep the page locked, and the only trhing that unlocks the page
> also guarantees nothing else can possiblly see it".
> 
> But no. I don't actually believe this is fuaranteed to the case here,
> and even if it were, I don't think this is a code sequence we can or
> should accept.
> 
> End result: there is no possible situation that I can think of where the above
> 
>        if (PageLocked(head))
>                unlock_page(head);
> 
> kind of sequence can EVER possibly be correct, and it shows a complete
> disregard of everything that locking is all about.
> 
> Now, the memory error handling may be so special that this effectively
> "works" in practice, but there is no way I can apply such a
> fundamentally broken patch.

OK, I'll update the patch with the second approach you mentioned below.

> 
> I see two options:
> 
>  - make the result of identify_page_state() *tell* you whether the
> page was already unlocked (and then make the unlock conditional on
> *that*)
> 
>    This is valid, because now you removed that whole "somebody else
> might have transferred the lock" issue: you know you already unlocked
> the page based on the return value, so you don't unlock it again.
> That's perfectly valid.
> 
>  - probably better: make identify_page_state() itself always unlock
> the page, and turn the two different code sequences that currently do
> 
>         res = identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return res;
> 
> into just doing
> 
>         return identify_page_state(pfn, p, page_flags);
>   out:
>         unlock_page(head);
>         return -EBUSY;
> 
> instead, and move that now pointless "res" variable into the only
> if-statement that actually uses it (for other things entirely! It
> should be a "enum mf_result" there!)
> 
> And yes, that second (and imho better) rule means that now
> {page_action()" itself needs to be the thing that unlocks the page,
> and that in turn might be done a few different ways:
> 
>  (a) either add a separate "MF_UNLOCKED" state bit to the possible
> return codes and make that me_huge_page() that unlocks the page set
> that bit (NOTE! It still needs to also return either MF_FAILED or
> MF_RECOVERED, so this "MF_UNLOCKED" bit really should be a separate
> bitmask entirely from the other MF_xyz bits)
> 
>  (b) make the rule be that *all* the ->action() functions need to just
> unlock the page.
> 
> ( suspect (b) is the better model to aim for, because honestly, static
> code rules for locking are basically almost always superior to dynamic
> "based on this flag" locking rules. You can in theory actually have
> static tooling check that the locking nests correctly with the
> unlocking that way (and it's just conceptually simpler to have a hard
> rule about "this function always unlocks the page").

I'll try (b). Thank you for the detailed advices.

- Naoya Horiguchi

> 
> But that existing patch is *so* fundamentally wrong that I cannot
> possibly apply it, and I feel bad about the fact that it has made it
> all the way to me with sign-offs and reviewed-by's and everything.
> 
>                   Linus
> 

  reply	other threads:[~2021-03-14  6:37 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-13  5:06 incoming Andrew Morton
2021-03-13  5:07 ` [patch 01/29] memblock: fix section mismatch warning Andrew Morton
2021-03-13  5:07 ` [patch 02/29] stop_machine: mark helpers __always_inline Andrew Morton
2021-03-13  5:07 ` [patch 03/29] init/Kconfig: make COMPILE_TEST depend on HAS_IOMEM Andrew Morton
2021-03-13  5:07 ` [patch 04/29] mm/page_alloc.c: refactor initialization of struct page for holes in memory layout Andrew Morton
2021-03-13  5:07 ` [patch 05/29] mm/fork: clear PASID for new mm Andrew Morton
2021-03-13  5:07 ` [patch 06/29] hugetlb: dedup the code to add a new file_region Andrew Morton
2021-03-13  5:07 ` [patch 07/29] hugetlb: break earlier in add_reservation_in_range() when we can Andrew Morton
2021-03-13  5:07 ` [patch 08/29] mm: introduce page_needs_cow_for_dma() for deciding whether cow Andrew Morton
2021-03-13  5:07 ` [patch 09/29] mm: use is_cow_mapping() across tree where proper Andrew Morton
2021-03-13  5:07 ` [patch 10/29] hugetlb: do early cow when page pinned on src mm Andrew Morton
2021-03-13  5:07 ` [patch 11/29] mm/highmem.c: fix zero_user_segments() with start > end Andrew Morton
2021-03-13  5:07 ` [patch 12/29] binfmt_misc: fix possible deadlock in bm_register_write Andrew Morton
2021-03-13  5:07 ` [patch 13/29] MAINTAINERS: exclude uapi directories in API/ABI section Andrew Morton
2021-03-13  5:07 ` [patch 14/29] linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP* Andrew Morton
2021-03-13  5:07 ` [patch 15/29] kfence: fix printk format for ptrdiff_t Andrew Morton
2021-03-13  5:07 ` [patch 16/29] kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations Andrew Morton
2021-03-13  5:08 ` [patch 17/29] kfence: fix reports if constant function prefixes exist Andrew Morton
2021-03-13  5:08 ` [patch 18/29] include/linux/sched/mm.h: use rcu_dereference in in_vfork() Andrew Morton
2021-03-13  5:08 ` [patch 19/29] mm/madvise: replace ptrace attach requirement for process_madvise Andrew Morton
2021-03-13  5:08 ` [patch 20/29] kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC Andrew Morton
2021-03-13  5:08 ` [patch 21/29] kasan: fix KASAN_STACK dependency for HW_TAGS Andrew Morton
2021-03-13  5:08 ` [patch 22/29] mm/userfaultfd: fix memory corruption due to writeprotect Andrew Morton
2021-03-13  5:08 ` [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers Andrew Morton
2021-03-13 19:23   ` Linus Torvalds
2021-03-14  6:36     ` HORIGUCHI NAOYA(堀口 直也) [this message]
2021-03-13  5:08 ` [patch 24/29] ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls Andrew Morton
2021-03-13  5:08 ` [patch 25/29] ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign Andrew Morton
2021-03-13  5:08 ` [patch 26/29] mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument Andrew Morton
2021-03-13  5:08 ` [patch 27/29] mm/memcg: set memcg when splitting page Andrew Morton
2021-03-13  5:08 ` [patch 28/29] zram: fix return value on writeback_store Andrew Morton
2021-03-13  5:08 ` [patch 29/29] zram: fix broken page writeback Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210314062427.GA30930@hori.linux.bs1.fc.nec.co.jp \
    --to=naoya.horiguchi@nec.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=osalvador@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=tony.luck@intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).