From: Nicholas Piggin <npiggin@gmail.com>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,
	Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	Christoph Hellwig <hch@infradead.org>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Ding Tianhong <dingtianhong@huawei.com>,
	Christoph Hellwig <hch@lst.de>
Subject: [PATCH v11 01/13] mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page
Date: Tue, 26 Jan 2021 14:44:58 +1000
Message-ID: <20210126044510.2491820-2-npiggin@gmail.com>
In-Reply-To: <20210126044510.2491820-1-npiggin@gmail.com>

vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on architecture details,
alignment, boot options, etc., which the caller cannot be expected
to know. HUGE_VMAP is therefore a regression for vmalloc_to_page.

Teach vmalloc_to_page about larger pages, so that it returns the
struct page that corresponds to the offset within the large page.
This makes the API agnostic to mapping implementation details.

[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap:
    fail gracefully on unexpected huge vmap mappings")

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
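
Illustration only, not part of the commit: a minimal userspace sketch of
the offset arithmetic used for leaf entries, i.e. the expression
(addr & ~PMD_MASK) >> PAGE_SHIFT. It assumes 4 KiB base pages and 2 MiB
PMD-level leaves; real values are architecture dependent. The EX_* names
and both helper functions are hypothetical, and a page frame number
stands in for the struct page pointer that the kernel code offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration only: 4 KiB base pages, 2 MiB PMD leaves. */
#define EX_PAGE_SHIFT 12
#define EX_PMD_SHIFT  21
#define EX_PMD_SIZE   (UINT64_C(1) << EX_PMD_SHIFT)
#define EX_PMD_MASK   (~(EX_PMD_SIZE - 1))

/*
 * Index of the base page, within the huge mapping, that covers addr.
 * Mirrors (addr & ~PMD_MASK) >> PAGE_SHIFT from the patch.
 */
static inline uint64_t ex_pmd_subpage_index(uint64_t addr)
{
	return (addr & ~EX_PMD_MASK) >> EX_PAGE_SHIFT;
}

/*
 * Hypothetical stand-in for "pmd_page(*pmd) + index": given the page frame
 * number of the first page of the huge mapping, return the frame number of
 * the base page that backs addr.
 */
static inline uint64_t ex_pmd_leaf_pfn(uint64_t head_pfn, uint64_t addr)
{
	uint64_t idx = ex_pmd_subpage_index(addr);

	/* A 2 MiB leaf holds 512 base pages under the sizes assumed above. */
	assert(idx < (EX_PMD_SIZE >> EX_PAGE_SHIFT));
	return head_pfn + idx;
}
```

With these assumed sizes, an address 0x3000 bytes into a 2 MiB mapping
gives index 3, so the result is the fourth base page of the huge page,
the same page a small mapping of that address would resolve to.
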
 mm/vmalloc.c | 41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e6f352bf0498..62372f9e0167 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
 #include <linux/bitops.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/overflow.h>
-
+#include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
@@ -343,7 +343,9 @@ int is_vmalloc_or_module_addr(const void *x)
 }
 
 /*
- * Walk a vmap address to the struct page it maps.
+ * Walk a vmap address to the struct page it maps. Huge vmap mappings will
+ * return the tail page that corresponds to the base page address, which
+ * matches small vmap mappings.
  */
 struct page *vmalloc_to_page(const void *vmalloc_addr)
 {
@@ -363,25 +365,33 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 
 	if (pgd_none(*pgd))
 		return NULL;
+	if (WARN_ON_ONCE(pgd_leaf(*pgd)))
+		return NULL; /* XXX: no allowance for huge pgd */
+	if (WARN_ON_ONCE(pgd_bad(*pgd)))
+		return NULL;
+
 	p4d = p4d_offset(pgd, addr);
 	if (p4d_none(*p4d))
 		return NULL;
-	pud = pud_offset(p4d, addr);
+	if (p4d_leaf(*p4d))
+		return p4d_page(*p4d) + ((addr & ~P4D_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(p4d_bad(*p4d)))
+		return NULL;
 
-	/*
-	 * Don't dereference bad PUD or PMD (below) entries. This will also
-	 * identify huge mappings, which we may encounter on architectures
-	 * that define CONFIG_HAVE_ARCH_HUGE_VMAP=y. Such regions will be
-	 * identified as vmalloc addresses by is_vmalloc_addr(), but are
-	 * not [unambiguously] associated with a struct page, so there is
-	 * no correct value to return for them.
-	 */
-	WARN_ON_ONCE(pud_bad(*pud));
-	if (pud_none(*pud) || pud_bad(*pud))
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud))
+		return NULL;
+	if (pud_leaf(*pud))
+		return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pud_bad(*pud)))
 		return NULL;
+
 	pmd = pmd_offset(pud, addr);
-	WARN_ON_ONCE(pmd_bad(*pmd));
-	if (pmd_none(*pmd) || pmd_bad(*pmd))
+	if (pmd_none(*pmd))
+		return NULL;
+	if (pmd_leaf(*pmd))
+		return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pmd_bad(*pmd)))
 		return NULL;
 
 	ptep = pte_offset_map(pmd, addr);
@@ -389,6 +399,7 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 	if (pte_present(pte))
 		page = pte_page(pte);
 	pte_unmap(ptep);
+
 	return page;
 }
 EXPORT_SYMBOL(vmalloc_to_page);
-- 
2.23.0


