From: "HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"aneesh.kumar@linux.vnet.ibm.com"
<aneesh.kumar@linux.vnet.ibm.com>, Linux-MM <linux-mm@kvack.org>,
Michal Hocko <mhocko@kernel.org>,
"mm-commits@vger.kernel.org" <mm-commits@vger.kernel.org>,
Oscar Salvador <osalvador@suse.de>,
stable <stable@vger.kernel.org>, Tony Luck <tony.luck@intel.com>
Subject: Re: [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
Date: Sun, 14 Mar 2021 06:36:23 +0000 [thread overview]
Message-ID: <20210314062427.GA30930@hori.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <CAHk-=wht_gk_d9k+NZs7eJvBeLOQT4xGcykgaCRHuiQ+LbReRw@mail.gmail.com>
On Sat, Mar 13, 2021 at 11:23:40AM -0800, Linus Torvalds wrote:
> On Fri, Mar 12, 2021 at 9:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Subject: mm, hwpoison: do not lock page again when me_huge_page() successfully recovers
>
> This patch can not possibly be correct, and is dangerous and very very wrong.
>
> > out:
> > - unlock_page(head);
> > + if (PageLocked(head))
> > + unlock_page(head);
>
> You absolutely CANNOT do things like this. It is fundamentally insane.
>
> You have two situations:
>
> (a) you know you locked the page.
>
> In this case an unconditional "unlock_page()" is the only correct
> thing to do.
>
> (b) you don't know whether you maybe unlocked it again.
>
> In this case SOMEBODY ELSE might have locked the page, and you
> doing "if it's locked, then unlock it" is pure and utter drivel, and
> fundamentally and seriously wrong. You're unlocking SOMEBODY ELSES
> lock, after having already unlocked your own once.
>
> Now, it is possible that this kind of pattern could be ok if you
> absolutely 100% know - and it is obvious from the code - that you are
> the only thread that can possibly access the page. But if that is the
> case, then the page should never have been locked in the first place,
> because that would be an insane and pointless operation, since the
> whole - and only - point of locking is to enforce some kind of
> exclusivity - and if exclusivity is explicit in the code-path, then
> locking the page is pointless.
>
> And yes, in this case I can see a remote theoretical model: "all good
> cases keep the page locked, and the only trhing that unlocks the page
> also guarantees nothing else can possiblly see it".
>
> But no. I don't actually believe this is fuaranteed to the case here,
> and even if it were, I don't think this is a code sequence we can or
> should accept.
>
> End result: there is no possible situation that I can think of where the above
>
> if (PageLocked(head))
> unlock_page(head);
>
> kind of sequence can EVER possibly be correct, and it shows a complete
> disregard of everything that locking is all about.
>
> Now, the memory error handling may be so special that this effectively
> "works" in practice, but there is no way I can apply such a
> fundamentally broken patch.
OK, I'll update the patch with the second approach you mentioned below.
>
> I see two options:
>
> - make the result of identify_page_state() *tell* you whether the
> page was already unlocked (and then make the unlock conditional on
> *that*)
>
> This is valid, because now you removed that whole "somebody else
> might have transferred the lock" issue: you know you already unlocked
> the page based on the return value, so you don't unlock it again.
> That's perfectly valid.
>
> - probably better: make identify_page_state() itself always unlock
> the page, and turn the two different code sequences that currently do
>
> res = identify_page_state(pfn, p, page_flags);
> out:
> unlock_page(head);
> return res;
>
> into just doing
>
> return identify_page_state(pfn, p, page_flags);
> out:
> unlock_page(head);
> return -EBUSY;
>
> instead, and move that now pointless "res" variable into the only
> if-statement that actually uses it (for other things entirely! It
> should be a "enum mf_result" there!)
>
> And yes, that second (and imho better) rule means that now
> {page_action()" itself needs to be the thing that unlocks the page,
> and that in turn might be done a few different ways:
>
> (a) either add a separate "MF_UNLOCKED" state bit to the possible
> return codes and make that me_huge_page() that unlocks the page set
> that bit (NOTE! It still needs to also return either MF_FAILED or
> MF_RECOVERED, so this "MF_UNLOCKED" bit really should be a separate
> bitmask entirely from the other MF_xyz bits)
>
> (b) make the rule be that *all* the ->action() functions need to just
> unlock the page.
>
> ( suspect (b) is the better model to aim for, because honestly, static
> code rules for locking are basically almost always superior to dynamic
> "based on this flag" locking rules. You can in theory actually have
> static tooling check that the locking nests correctly with the
> unlocking that way (and it's just conceptually simpler to have a hard
> rule about "this function always unlocks the page").
I'll try (b). Thank you for the detailed advices.
- Naoya Horiguchi
>
> But that existing patch is *so* fundamentally wrong that I cannot
> possibly apply it, and I feel bad about the fact that it has made it
> all the way to me with sign-offs and reviewed-by's and everything.
>
> Linus
>
next prev parent reply other threads:[~2021-03-14 6:37 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-13 5:06 incoming Andrew Morton
2021-03-13 5:07 ` [patch 01/29] memblock: fix section mismatch warning Andrew Morton
2021-03-13 5:07 ` [patch 02/29] stop_machine: mark helpers __always_inline Andrew Morton
2021-03-13 5:07 ` [patch 03/29] init/Kconfig: make COMPILE_TEST depend on HAS_IOMEM Andrew Morton
2021-03-13 5:07 ` [patch 04/29] mm/page_alloc.c: refactor initialization of struct page for holes in memory layout Andrew Morton
2021-03-13 5:07 ` [patch 05/29] mm/fork: clear PASID for new mm Andrew Morton
2021-03-13 5:07 ` [patch 06/29] hugetlb: dedup the code to add a new file_region Andrew Morton
2021-03-13 5:07 ` [patch 07/29] hugetlb: break earlier in add_reservation_in_range() when we can Andrew Morton
2021-03-13 5:07 ` [patch 08/29] mm: introduce page_needs_cow_for_dma() for deciding whether cow Andrew Morton
2021-03-13 5:07 ` [patch 09/29] mm: use is_cow_mapping() across tree where proper Andrew Morton
2021-03-13 5:07 ` [patch 10/29] hugetlb: do early cow when page pinned on src mm Andrew Morton
2021-03-13 5:07 ` [patch 11/29] mm/highmem.c: fix zero_user_segments() with start > end Andrew Morton
2021-03-13 5:07 ` [patch 12/29] binfmt_misc: fix possible deadlock in bm_register_write Andrew Morton
2021-03-13 5:07 ` [patch 13/29] MAINTAINERS: exclude uapi directories in API/ABI section Andrew Morton
2021-03-13 5:07 ` [patch 14/29] linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP* Andrew Morton
2021-03-13 5:07 ` [patch 15/29] kfence: fix printk format for ptrdiff_t Andrew Morton
2021-03-13 5:07 ` [patch 16/29] kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations Andrew Morton
2021-03-13 5:08 ` [patch 17/29] kfence: fix reports if constant function prefixes exist Andrew Morton
2021-03-13 5:08 ` [patch 18/29] include/linux/sched/mm.h: use rcu_dereference in in_vfork() Andrew Morton
2021-03-13 5:08 ` [patch 19/29] mm/madvise: replace ptrace attach requirement for process_madvise Andrew Morton
2021-03-13 5:08 ` [patch 20/29] kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC Andrew Morton
2021-03-13 5:08 ` [patch 21/29] kasan: fix KASAN_STACK dependency for HW_TAGS Andrew Morton
2021-03-13 5:08 ` [patch 22/29] mm/userfaultfd: fix memory corruption due to writeprotect Andrew Morton
2021-03-13 5:08 ` [patch 23/29] mm, hwpoison: do not lock page again when me_huge_page() successfully recovers Andrew Morton
2021-03-13 19:23 ` Linus Torvalds
2021-03-14 6:36 ` HORIGUCHI NAOYA(堀口 直也) [this message]
2021-03-13 5:08 ` [patch 24/29] ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls Andrew Morton
2021-03-13 5:08 ` [patch 25/29] ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign Andrew Morton
2021-03-13 5:08 ` [patch 26/29] mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument Andrew Morton
2021-03-13 5:08 ` [patch 27/29] mm/memcg: set memcg when splitting page Andrew Morton
2021-03-13 5:08 ` [patch 28/29] zram: fix return value on writeback_store Andrew Morton
2021-03-13 5:08 ` [patch 29/29] zram: fix broken page writeback Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210314062427.GA30930@hori.linux.bs1.fc.nec.co.jp \
--to=naoya.horiguchi@nec.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mm-commits@vger.kernel.org \
--cc=osalvador@suse.de \
--cc=stable@vger.kernel.org \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).