All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,
	Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	Christoph Hellwig <hch@infradead.org>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Ding Tianhong <dingtianhong@huawei.com>,
	Miaohe Lin <linmiaohe@huawei.com>, Christoph Hellwig <hch@lst.de>
Subject: [PATCH v12 02/14] mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page
Date: Tue,  2 Feb 2021 21:05:03 +1000	[thread overview]
Message-ID: <20210202110515.3575274-3-npiggin@gmail.com> (raw)
In-Reply-To: <20210202110515.3575274-1-npiggin@gmail.com>

vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on the architecture details,
alignments, boot options, etc., which the caller can not be expected
to know. Therefore HUGE_VMAP is a regression for vmalloc_to_page.

This change teaches vmalloc_to_page about larger pages, and returns
the struct page that corresponds to the offset within the large page.
This makes the API agnostic to mapping implementation details.

[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap:
    fail gracefully on unexpected huge vmap mappings")

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 mm/vmalloc.c | 41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e6f352bf0498..62372f9e0167 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
 #include <linux/bitops.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/overflow.h>
-
+#include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
@@ -343,7 +343,9 @@ int is_vmalloc_or_module_addr(const void *x)
 }
 
 /*
- * Walk a vmap address to the struct page it maps.
+ * Walk a vmap address to the struct page it maps. Huge vmap mappings will
+ * return the tail page that corresponds to the base page address, which
+ * matches small vmap mappings.
  */
 struct page *vmalloc_to_page(const void *vmalloc_addr)
 {
@@ -363,25 +365,33 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 
 	if (pgd_none(*pgd))
 		return NULL;
+	if (WARN_ON_ONCE(pgd_leaf(*pgd)))
+		return NULL; /* XXX: no allowance for huge pgd */
+	if (WARN_ON_ONCE(pgd_bad(*pgd)))
+		return NULL;
+
 	p4d = p4d_offset(pgd, addr);
 	if (p4d_none(*p4d))
 		return NULL;
-	pud = pud_offset(p4d, addr);
+	if (p4d_leaf(*p4d))
+		return p4d_page(*p4d) + ((addr & ~P4D_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(p4d_bad(*p4d)))
+		return NULL;
 
-	/*
-	 * Don't dereference bad PUD or PMD (below) entries. This will also
-	 * identify huge mappings, which we may encounter on architectures
-	 * that define CONFIG_HAVE_ARCH_HUGE_VMAP=y. Such regions will be
-	 * identified as vmalloc addresses by is_vmalloc_addr(), but are
-	 * not [unambiguously] associated with a struct page, so there is
-	 * no correct value to return for them.
-	 */
-	WARN_ON_ONCE(pud_bad(*pud));
-	if (pud_none(*pud) || pud_bad(*pud))
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud))
+		return NULL;
+	if (pud_leaf(*pud))
+		return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pud_bad(*pud)))
 		return NULL;
+
 	pmd = pmd_offset(pud, addr);
-	WARN_ON_ONCE(pmd_bad(*pmd));
-	if (pmd_none(*pmd) || pmd_bad(*pmd))
+	if (pmd_none(*pmd))
+		return NULL;
+	if (pmd_leaf(*pmd))
+		return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pmd_bad(*pmd)))
 		return NULL;
 
 	ptep = pte_offset_map(pmd, addr);
@@ -389,6 +399,7 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 	if (pte_present(pte))
 		page = pte_page(pte);
 	pte_unmap(ptep);
+
 	return page;
 }
 EXPORT_SYMBOL(vmalloc_to_page);
-- 
2.23.0


WARNING: multiple messages have this Message-ID (diff)
From: Nicholas Piggin <npiggin@gmail.com>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arch@vger.kernel.org, Miaohe Lin <linmiaohe@huawei.com>,
	Ding Tianhong <dingtianhong@huawei.com>,
	linux-kernel@vger.kernel.org, Nicholas Piggin <npiggin@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	linuxppc-dev@lists.ozlabs.org, Christoph Hellwig <hch@lst.de>
Subject: [PATCH v12 02/14] mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page
Date: Tue,  2 Feb 2021 21:05:03 +1000	[thread overview]
Message-ID: <20210202110515.3575274-3-npiggin@gmail.com> (raw)
In-Reply-To: <20210202110515.3575274-1-npiggin@gmail.com>

vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on the architecture details,
alignments, boot options, etc., which the caller can not be expected
to know. Therefore HUGE_VMAP is a regression for vmalloc_to_page.

This change teaches vmalloc_to_page about larger pages, and returns
the struct page that corresponds to the offset within the large page.
This makes the API agnostic to mapping implementation details.

[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap:
    fail gracefully on unexpected huge vmap mappings")

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 mm/vmalloc.c | 41 ++++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e6f352bf0498..62372f9e0167 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -34,7 +34,7 @@
 #include <linux/bitops.h>
 #include <linux/rbtree_augmented.h>
 #include <linux/overflow.h>
-
+#include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
@@ -343,7 +343,9 @@ int is_vmalloc_or_module_addr(const void *x)
 }
 
 /*
- * Walk a vmap address to the struct page it maps.
+ * Walk a vmap address to the struct page it maps. Huge vmap mappings will
+ * return the tail page that corresponds to the base page address, which
+ * matches small vmap mappings.
  */
 struct page *vmalloc_to_page(const void *vmalloc_addr)
 {
@@ -363,25 +365,33 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 
 	if (pgd_none(*pgd))
 		return NULL;
+	if (WARN_ON_ONCE(pgd_leaf(*pgd)))
+		return NULL; /* XXX: no allowance for huge pgd */
+	if (WARN_ON_ONCE(pgd_bad(*pgd)))
+		return NULL;
+
 	p4d = p4d_offset(pgd, addr);
 	if (p4d_none(*p4d))
 		return NULL;
-	pud = pud_offset(p4d, addr);
+	if (p4d_leaf(*p4d))
+		return p4d_page(*p4d) + ((addr & ~P4D_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(p4d_bad(*p4d)))
+		return NULL;
 
-	/*
-	 * Don't dereference bad PUD or PMD (below) entries. This will also
-	 * identify huge mappings, which we may encounter on architectures
-	 * that define CONFIG_HAVE_ARCH_HUGE_VMAP=y. Such regions will be
-	 * identified as vmalloc addresses by is_vmalloc_addr(), but are
-	 * not [unambiguously] associated with a struct page, so there is
-	 * no correct value to return for them.
-	 */
-	WARN_ON_ONCE(pud_bad(*pud));
-	if (pud_none(*pud) || pud_bad(*pud))
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud))
+		return NULL;
+	if (pud_leaf(*pud))
+		return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pud_bad(*pud)))
 		return NULL;
+
 	pmd = pmd_offset(pud, addr);
-	WARN_ON_ONCE(pmd_bad(*pmd));
-	if (pmd_none(*pmd) || pmd_bad(*pmd))
+	if (pmd_none(*pmd))
+		return NULL;
+	if (pmd_leaf(*pmd))
+		return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	if (WARN_ON_ONCE(pmd_bad(*pmd)))
 		return NULL;
 
 	ptep = pte_offset_map(pmd, addr);
@@ -389,6 +399,7 @@ struct page *vmalloc_to_page(const void *vmalloc_addr)
 	if (pte_present(pte))
 		page = pte_page(pte);
 	pte_unmap(ptep);
+
 	return page;
 }
 EXPORT_SYMBOL(vmalloc_to_page);
-- 
2.23.0


  parent reply	other threads:[~2021-02-02 11:06 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02 11:05 [PATCH v12 00/14] huge vmalloc mappings Nicholas Piggin
2021-02-02 11:05 ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 01/14] ARM: mm: add missing pud_page define to 2-level page tables Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:13   ` Russell King - ARM Linux admin
2021-02-02 11:13     ` Russell King - ARM Linux admin
2021-02-02 11:13     ` Russell King - ARM Linux admin
2021-02-02 11:47     ` Ding Tianhong
2021-02-02 11:47       ` Ding Tianhong
2021-02-02 11:47       ` Ding Tianhong
2021-02-02 11:48       ` Ding Tianhong
2021-02-02 11:48         ` Ding Tianhong
2021-02-02 11:48         ` Ding Tianhong
2021-02-02 12:07       ` Russell King - ARM Linux admin
2021-02-02 12:07         ` Russell King - ARM Linux admin
2021-02-02 12:07         ` Russell King - ARM Linux admin
2021-02-03  3:01     ` Nicholas Piggin
2021-02-03  3:01       ` Nicholas Piggin
2021-02-03  3:01       ` Nicholas Piggin
2021-02-02 11:05 ` Nicholas Piggin [this message]
2021-02-02 11:05   ` [PATCH v12 02/14] mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 03/14] mm: apply_to_pte_range warn and fail if a large pte is encountered Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 04/14] mm/vmalloc: rename vmap_*_range vmap_pages_*_range Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 05/14] mm/ioremap: rename ioremap_*_range to vmap_*_range Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 06/14] mm: HUGE_VMAP arch support cleanup Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 18:21   ` kernel test robot
2021-02-02 18:21     ` kernel test robot
2021-02-02 18:21     ` kernel test robot
2021-02-02 11:05 ` [PATCH v12 07/14] powerpc: inline huge vmap supported functions Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 08/14] arm64: " Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 09/14] x86: " Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 10/14] mm/vmalloc: provide fallback arch huge vmap support functions Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 20:53   ` kernel test robot
2021-02-02 20:53     ` kernel test robot
2021-02-02 20:53     ` kernel test robot
2021-02-02 11:05 ` [PATCH v12 11/14] mm: Move vmap_range from mm/ioremap.c to mm/vmalloc.c Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 12/14] mm/vmalloc: add vmap_range_noflush variant Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 11:05 ` [PATCH v12 13/14] mm/vmalloc: Hugepage vmalloc mappings Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-19  3:45   ` Ding Tianhong
2021-02-19  3:45     ` Ding Tianhong
2021-02-19  7:45     ` Nicholas Piggin
2021-02-19  7:45       ` Nicholas Piggin
2021-02-19  8:52       ` Ding Tianhong
2021-02-19  8:52         ` Ding Tianhong
2021-02-02 11:05 ` [PATCH v12 14/14] powerpc/64s/radix: Enable huge " Nicholas Piggin
2021-02-02 11:05   ` Nicholas Piggin
2021-02-02 13:48   ` kernel test robot
2021-02-02 13:48     ` kernel test robot
2021-02-02 13:48     ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210202110515.3575274-3-npiggin@gmail.com \
    --to=npiggin@gmail.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=dingtianhong@huawei.com \
    --cc=hch@infradead.org \
    --cc=hch@lst.de \
    --cc=linmiaohe@huawei.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=rick.p.edgecombe@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.