From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, dave@stgolabs.net,
dvhart@infradead.org, hughd@google.com,
kirill.shutemov@linux.intel.com, linux-mm@kvack.org,
mgorman@techsingularity.net, mike.kravetz@oracle.com,
mingo@redhat.com, mm-commits@vger.kernel.org,
neelnatu@google.com, peterz@infradead.org,
stable@vger.kernel.org, tglx@linutronix.de,
torvalds@linux-foundation.org, wetpzy@gmail.com,
willy@infradead.org
Subject: [patch 17/24] mm, futex: fix shared futex pgoff on shmem huge page
Date: Thu, 24 Jun 2021 18:39:52 -0700 [thread overview]
Message-ID: <20210625013952.eaD8IsXDa%akpm@linux-foundation.org> (raw)
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>
From: Hugh Dickins <hughd@google.com>
Subject: mm, futex: fix shared futex pgoff on shmem huge page
If more than one futex is placed on a shmem huge page, it can happen that
waking the second wakes the first instead, and leaves the second waiting:
the key's shared.pgoff is wrong.
When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when
generating futex_key"), the only shared huge pages came from hugetlbfs,
and the code added to deal with its exceptional page->index was put into
hugetlb source. Then that was missed when 4.8 added shmem huge pages.
page_to_pgoff() is what others use for this nowadays: except that, as
currently written, it gives the right answer on hugetlbfs head, but
nonsense on hugetlbfs tails. Fix that by calling hugetlbfs-specific
hugetlb_basepage_index() on PageHuge tails as well as on head.
Yes, it's unconventional to declare hugetlb_basepage_index() there in
pagemap.h, rather than in hugetlb.h; but I do not expect anything but
page_to_pgoff() ever to need it.
[akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope]
Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Reported-by: Neel Natu <neelnatu@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Zhang Yi <wetpzy@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/hugetlb.h | 16 ----------------
include/linux/pagemap.h | 13 +++++++------
kernel/futex.c | 3 +--
mm/hugetlb.c | 5 +----
4 files changed, 9 insertions(+), 28 deletions(-)
--- a/include/linux/hugetlb.h~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/include/linux/hugetlb.h
@@ -741,17 +741,6 @@ static inline int hstate_index(struct hs
return h - hstates;
}
-pgoff_t __basepage_index(struct page *page);
-
-/* Return page->index in PAGE_SIZE units */
-static inline pgoff_t basepage_index(struct page *page)
-{
- if (!PageCompound(page))
- return page->index;
-
- return __basepage_index(page);
-}
-
extern int dissolve_free_huge_page(struct page *page);
extern int dissolve_free_huge_pages(unsigned long start_pfn,
unsigned long end_pfn);
@@ -988,11 +977,6 @@ static inline int hstate_index(struct hs
return 0;
}
-static inline pgoff_t basepage_index(struct page *page)
-{
- return page->index;
-}
-
static inline int dissolve_free_huge_page(struct page *page)
{
return 0;
--- a/include/linux/pagemap.h~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/include/linux/pagemap.h
@@ -516,7 +516,7 @@ static inline struct page *read_mapping_
}
/*
- * Get index of the page with in radix-tree
+ * Get index of the page within radix-tree (but not for hugetlb pages).
* (TODO: remove once hugetlb pages will have ->index in PAGE_SIZE)
*/
static inline pgoff_t page_to_index(struct page *page)
@@ -535,15 +535,16 @@ static inline pgoff_t page_to_index(stru
return pgoff;
}
+extern pgoff_t hugetlb_basepage_index(struct page *page);
+
/*
- * Get the offset in PAGE_SIZE.
- * (TODO: hugepage should have ->index in PAGE_SIZE)
+ * Get the offset in PAGE_SIZE (even for hugetlb pages).
+ * (TODO: hugetlb pages should have ->index in PAGE_SIZE)
*/
static inline pgoff_t page_to_pgoff(struct page *page)
{
- if (unlikely(PageHeadHuge(page)))
- return page->index << compound_order(page);
-
+ if (unlikely(PageHuge(page)))
+ return hugetlb_basepage_index(page);
return page_to_index(page);
}
--- a/kernel/futex.c~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/kernel/futex.c
@@ -35,7 +35,6 @@
#include <linux/jhash.h>
#include <linux/pagemap.h>
#include <linux/syscalls.h>
-#include <linux/hugetlb.h>
#include <linux/freezer.h>
#include <linux/memblock.h>
#include <linux/fault-inject.h>
@@ -650,7 +649,7 @@ again:
key->both.offset |= FUT_OFF_INODE; /* inode-based key */
key->shared.i_seq = get_inode_sequence_number(inode);
- key->shared.pgoff = basepage_index(tail);
+ key->shared.pgoff = page_to_pgoff(tail);
rcu_read_unlock();
}
--- a/mm/hugetlb.c~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/mm/hugetlb.c
@@ -1588,15 +1588,12 @@ struct address_space *hugetlb_page_mappi
return NULL;
}
-pgoff_t __basepage_index(struct page *page)
+pgoff_t hugetlb_basepage_index(struct page *page)
{
struct page *page_head = compound_head(page);
pgoff_t index = page_index(page_head);
unsigned long compound_idx;
- if (!PageHuge(page_head))
- return page_index(page);
-
if (compound_order(page_head) >= MAX_ORDER)
compound_idx = page_to_pfn(page) - page_to_pfn(page_head);
else
_
next prev parent reply other threads:[~2021-06-25 1:39 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-25 1:38 incoming Andrew Morton
2021-06-25 1:39 ` [patch 01/24] mm: page_vma_mapped_walk(): use page for pvmw->page Andrew Morton
2021-06-25 1:39 ` [patch 02/24] mm: page_vma_mapped_walk(): settle PageHuge on entry Andrew Morton
2021-06-25 1:39 ` [patch 03/24] mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd Andrew Morton
2021-06-25 1:39 ` [patch 04/24] mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block Andrew Morton
2021-06-25 1:39 ` [patch 05/24] mm: page_vma_mapped_walk(): crossing page table boundary Andrew Morton
2021-06-25 1:39 ` [patch 06/24] mm: page_vma_mapped_walk(): add a level of indentation Andrew Morton
2021-06-25 1:39 ` [patch 07/24] mm: page_vma_mapped_walk(): use goto instead of while (1) Andrew Morton
2021-06-25 1:39 ` [patch 08/24] mm: page_vma_mapped_walk(): get vma_address_end() earlier Andrew Morton
2021-06-25 1:39 ` [patch 09/24] mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes Andrew Morton
2021-06-25 1:39 ` [patch 10/24] mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk() Andrew Morton
2021-06-25 1:39 ` [patch 11/24] nilfs2: fix memory leak in nilfs_sysfs_delete_device_group Andrew Morton
2021-06-25 1:39 ` [patch 12/24] mm/vmalloc: add vmalloc_no_huge Andrew Morton
2021-06-25 1:39 ` [patch 13/24] KVM: s390: prepare for hugepage vmalloc Andrew Morton
2021-06-25 1:39 ` [patch 14/24] mm/vmalloc: unbreak kasan vmalloc support Andrew Morton
2021-06-25 1:39 ` [patch 15/24] kthread_worker: split code for canceling the delayed work timer Andrew Morton
2021-06-25 1:39 ` [patch 16/24] kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync() Andrew Morton
2021-06-25 1:39 ` Andrew Morton [this message]
2021-06-25 1:39 ` [patch 18/24] mm/memory-failure: use a mutex to avoid memory_failure() races Andrew Morton
2021-06-25 1:39 ` [patch 19/24] mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned Andrew Morton
2021-06-25 1:40 ` [patch 20/24] mm/hwpoison: do not lock page again when me_huge_page() successfully recovers Andrew Morton
2021-06-25 1:40 ` [patch 21/24] mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array Andrew Morton
2021-06-25 1:40 ` [patch 22/24] mm/page_alloc: do bulk array bounds check after checking populated elements Andrew Morton
2021-06-25 1:40 ` [patch 23/24] MAINTAINERS: fix Marek's identity again Andrew Morton
2021-06-25 1:40 ` [patch 24/24] mailmap: add Marek's other e-mail address and identity without diacritics Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210625013952.eaD8IsXDa%akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=dave@stgolabs.net \
--cc=dvhart@infradead.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mike.kravetz@oracle.com \
--cc=mingo@redhat.com \
--cc=mm-commits@vger.kernel.org \
--cc=neelnatu@google.com \
--cc=peterz@infradead.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=wetpzy@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).