* [PATCH v3 0/4] add hugetlb_free_vmemmap sysctl
@ 2022-03-07 13:07 Muchun Song
  2022-03-07 13:07 ` [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 13:07 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin, osalvador, david
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

This series aims to add a hugetlb_free_vmemmap sysctl to enable or disable,
at runtime, the feature of freeing vmemmap pages of HugeTLB pages.

v3:
  - Add pr_warn_once() (Mike).
  - Handle the transition from enabling to disabling (Luis)

v2:
  - Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
    test robot <lkp@intel.com>.
  - Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.

Muchun Song (4):
  mm: hugetlb: disable freeing vmemmap pages when struct page crosses
    page boundaries
  mm: memory_hotplug: override memmap_on_memory when
    hugetlb_free_vmemmap=on
  sysctl: allow to set extra1 to SYSCTL_ONE
  mm: hugetlb: add hugetlb_free_vmemmap sysctl

 Documentation/admin-guide/sysctl/vm.rst |  14 ++++
 include/linux/memory_hotplug.h          |   9 +++
 kernel/sysctl.c                         |   2 +-
 mm/hugetlb_vmemmap.c                    | 113 +++++++++++++++++++++++++++-----
 mm/hugetlb_vmemmap.h                    |   4 +-
 mm/memory_hotplug.c                     |  27 ++++++--
 6 files changed, 143 insertions(+), 26 deletions(-)

-- 
2.11.0



* [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-07 13:07 [PATCH v3 0/4] add hugetlb_free_vmemmap sysctl Muchun Song
@ 2022-03-07 13:07 ` Muchun Song
  2022-03-07 16:35   ` Luis Chamberlain
  2022-03-07 13:07 ` [PATCH v3 2/4] mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on Muchun Song
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Muchun Song @ 2022-03-07 13:07 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin, osalvador, david
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

If the size of "struct page" is not the power of two and this
feature is enabled, then the vmemmap pages of HugeTLB will be
corrupted after remapping (panic is about to happen in theory).
But this can only happen when !CONFIG_MEMCG && !CONFIG_SLUB on
x86_64.  However, it is not a conventional configuration nowadays.
So it is not a real-world issue, just the result of a code review.
But we cannot prevent anyone from selecting that combination of
options.  This feature should be disabled in this case to fix
this issue.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/hugetlb_vmemmap.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index b3118dba0518..49bc7f845438 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -121,6 +121,18 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	if (!hugetlb_free_vmemmap_enabled())
 		return;
 
+	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
+	    !is_power_of_2(sizeof(struct page))) {
+		/*
+		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
+		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
+		 * be disabled if "struct page" crosses page boundaries.
+		 */
+		pr_warn_once("cannot free vmemmap pages because \"struct page\" crosses page boundaries\n");
+		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+		return;
+	}
+
 	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
 	/*
 	 * The head page is not to be freed to buddy allocator, the other tail
-- 
2.11.0
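
A small user-space sketch of why a non-power-of-two size is a problem,
assuming a 4096-byte page and a vmemmap array that starts on a page
boundary.  The sizes 64 and 56 used in the note below are illustrative
values only, and crosses_page_boundary() is a hypothetical helper, not a
kernel function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Same idea as the kernel's is_power_of_2(): exactly one bit is set. */
static bool is_power_of_2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: return true if, in an array of objects of
 * obj_size bytes that starts on a page boundary, at least one object
 * straddles two pages.  Assumes 0 < obj_size <= page_size.
 *
 * Object i covers [i * obj_size, (i + 1) * obj_size).  A page boundary
 * falls strictly inside some object exactly when page_size is not a
 * multiple of obj_size.
 */
static bool crosses_page_boundary(size_t obj_size, size_t page_size)
{
	return page_size % obj_size != 0;
}
```

Under these assumptions, crosses_page_boundary(64, 4096) is false, while
crosses_page_boundary(56, 4096) is true because 4096 is not a multiple
of 56.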



* [PATCH v3 2/4] mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on
  2022-03-07 13:07 [PATCH v3 0/4] add hugetlb_free_vmemmap sysctl Muchun Song
  2022-03-07 13:07 ` [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
@ 2022-03-07 13:07 ` Muchun Song
  2022-03-07 13:07 ` [PATCH v3 3/4] sysctl: allow to set extra1 to SYSCTL_ONE Muchun Song
  2022-03-07 13:07 ` [PATCH v3 4/4] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
  3 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 13:07 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin, osalvador, david
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

When both "hugetlb_free_vmemmap=on" and "memory_hotplug.memmap_on_memory"
are passed on the boot cmdline, the variable "memmap_on_memory" is set
to 1 even though the vmemmap pages will not be allocated from the
hot-added memory, since the former takes precedence over the latter.
In the next patch, we want to enable or disable the feature of freeing
vmemmap pages of HugeTLB via sysctl.  Since those two features are not
compatible, we need a way to know whether memory_hotplug.memmap_on_memory
is in effect when enabling the feature of freeing vmemmap pages.
However, the variable "memmap_on_memory" cannot indicate this today.
Do not set "memmap_on_memory" to 1 when both parameters are passed on
the cmdline, so that "memmap_on_memory" indicates whether this feature
is actually enabled by the user.

Also introduce the mhp_memmap_on_memory() helper and move the definition
of "memmap_on_memory" into the scope of CONFIG_MHP_MEMMAP_ON_MEMORY.
This saves sizeof(bool) bytes of memory when !CONFIG_MHP_MEMMAP_ON_MEMORY.
In the next patch, mhp_memmap_on_memory() will also be exported to be
used in hugetlb_vmemmap.c.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/memory_hotplug.c | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c226a337c1ef..d92edf102cfe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -42,14 +42,36 @@
 #include "internal.h"
 #include "shuffle.h"
 
+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+static int memmap_on_memory_set(const char *val, const struct kernel_param *kp)
+{
+	if (hugetlb_free_vmemmap_enabled())
+		return 0;
+	return param_set_bool(val, kp);
+}
+
+static const struct kernel_param_ops memmap_on_memory_ops = {
+	.flags	= KERNEL_PARAM_OPS_FL_NOARG,
+	.set	= memmap_on_memory_set,
+	.get	= param_get_bool,
+};
 
 /*
  * memory_hotplug.memmap_on_memory parameter
  */
 static bool memmap_on_memory __ro_after_init;
-#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
-module_param(memmap_on_memory, bool, 0444);
+module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
 MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+static inline bool mhp_memmap_on_memory(void)
+{
+	return memmap_on_memory;
+}
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+	return false;
+}
 #endif
 
 enum {
@@ -1289,9 +1311,7 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
 	 *       altmap as an alternative source of memory, and we do not exactly
 	 *       populate a single PMD.
 	 */
-	return memmap_on_memory &&
-	       !hugetlb_free_vmemmap_enabled() &&
-	       IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
+	return mhp_memmap_on_memory() &&
 	       size == memory_block_size_bytes() &&
 	       IS_ALIGNED(vmemmap_size, PMD_SIZE) &&
 	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
@@ -2075,7 +2095,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
 	 * We only support removing memory added with MHP_MEMMAP_ON_MEMORY in
 	 * the same granularity it was added - a single memory block.
 	 */
-	if (memmap_on_memory) {
+	if (mhp_memmap_on_memory()) {
 		nr_vmemmap_pages = walk_memory_blocks(start, size, NULL,
 						      get_nr_vmemmap_pages_cb);
 		if (nr_vmemmap_pages) {
-- 
2.11.0
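
The behaviour of the new setter can be modelled in user space roughly as
follows.  This is only a sketch under assumptions: hugetlb_free_vmemmap_on
and parse_bool() are hypothetical stand-ins for
hugetlb_free_vmemmap_enabled() and param_set_bool(), and only a few
boolean spellings are accepted here:

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for kernel state; not the real symbols. */
static bool hugetlb_free_vmemmap_on;
static bool memmap_on_memory;

/* Simplified boolean parser; a missing value means "set". */
static int parse_bool(const char *val, bool *res)
{
	if (!val || !strcmp(val, "1") || !strcmp(val, "y") ||
	    !strcmp(val, "on")) {
		*res = true;
		return 0;
	}
	if (!strcmp(val, "0") || !strcmp(val, "n") || !strcmp(val, "off")) {
		*res = false;
		return 0;
	}
	return -EINVAL;
}

/*
 * Model of the setter: when the HugeTLB feature is already on, the
 * request is accepted but ignored, so memmap_on_memory stays false
 * and keeps reflecting what is really in effect.
 */
static int memmap_on_memory_set(const char *val)
{
	if (hugetlb_free_vmemmap_on)
		return 0;
	return parse_bool(val, &memmap_on_memory);
}
```

In this model, a caller that later reads memmap_on_memory sees true only
if the parameter was given while the HugeTLB feature was off.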



* [PATCH v3 3/4] sysctl: allow to set extra1 to SYSCTL_ONE
  2022-03-07 13:07 [PATCH v3 0/4] add hugetlb_free_vmemmap sysctl Muchun Song
  2022-03-07 13:07 ` [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
  2022-03-07 13:07 ` [PATCH v3 2/4] mm: memory_hotplug: override memmap_on_memory when hugetlb_free_vmemmap=on Muchun Song
@ 2022-03-07 13:07 ` Muchun Song
  2022-03-07 13:07 ` [PATCH v3 4/4] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
  3 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 13:07 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin, osalvador, david
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

proc_do_static_key() does not consider the situation where a sysctl is only
allowed to be enabled and cannot be disabled under certain circumstances,
since it sets "->extra1" to SYSCTL_ZERO unconditionally.  This patch adds the
ability to set "->extra1" to SYSCTL_ONE in that case.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 kernel/sysctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 64065abf361e..ab3e9c937268 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1631,7 +1631,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
 		.data   = &val,
 		.maxlen = sizeof(val),
 		.mode   = table->mode,
-		.extra1 = SYSCTL_ZERO,
+		.extra1 = table->extra1 == SYSCTL_ONE ? SYSCTL_ONE : SYSCTL_ZERO,
 		.extra2 = SYSCTL_ONE,
 	};
 
-- 
2.11.0
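
The effect of the one-line change can be sketched as a plain range check.
This is a simplified model, not the kernel code: write_static_key() and
its min_allowed argument are hypothetical names standing in for the
min/max validation that proc_do_static_key() delegates to, with
min_allowed playing the role of "->extra1":

```c
#include <errno.h>

/*
 * Hypothetical model: accept new_val only if it lies in [lo, 1], where
 * lo is 1 when the caller asked for a floor of 1 and 0 otherwise.
 * On success the key is updated; on failure it is left untouched.
 */
static int write_static_key(int *key, int new_val, int min_allowed)
{
	int lo = (min_allowed == 1) ? 1 : 0;	/* mirrors the extra1 choice */
	int hi = 1;				/* extra2 is always SYSCTL_ONE */

	if (new_val < lo || new_val > hi)
		return -EINVAL;

	*key = new_val;
	return 0;
}
```

With a floor of 1, writing 0 is rejected and the key stays enabled; with
a floor of 0, both 0 and 1 are accepted as before.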



* [PATCH v3 4/4] mm: hugetlb: add hugetlb_free_vmemmap sysctl
  2022-03-07 13:07 [PATCH v3 0/4] add hugetlb_free_vmemmap sysctl Muchun Song
                   ` (2 preceding siblings ...)
  2022-03-07 13:07 ` [PATCH v3 3/4] sysctl: allow to set extra1 to SYSCTL_ONE Muchun Song
@ 2022-03-07 13:07 ` Muchun Song
  3 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 13:07 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin, osalvador, david
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

Currently, we must add "hugetlb_free_vmemmap=on" to the boot cmdline and
reboot the server to enable the feature of freeing vmemmap pages of
HugeTLB pages.  Rebooting usually takes a long time.  Add a sysctl to
enable or disable the feature at runtime without rebooting.

Disabling requires that there are no optimized HugeTLB pages in the
system.  If you fail to disable it, you can set "nr_hugepages" to 0
and then retry.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 Documentation/admin-guide/sysctl/vm.rst |  14 ++++
 include/linux/memory_hotplug.h          |   9 +++
 mm/hugetlb_vmemmap.c                    | 113 +++++++++++++++++++++++++-------
 mm/hugetlb_vmemmap.h                    |   4 +-
 mm/memory_hotplug.c                     |   7 +-
 5 files changed, 116 insertions(+), 31 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index f4804ce37c58..9e0e153ed935 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -561,6 +561,20 @@ Change the minimum size of the hugepage pool.
 See Documentation/admin-guide/mm/hugetlbpage.rst
 
 
+hugetlb_free_vmemmap
+====================
+
+Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap
+pages associated with each HugeTLB page.  Once enabled, the vmemmap pages of
+HugeTLB pages subsequently allocated from the buddy system will be optimized,
+whereas already allocated HugeTLB pages will not be optimized.  If you fail
+to disable this feature, you can set "nr_hugepages" to 0 and then retry,
+since it can only be disabled when there are no optimized HugeTLB pages
+in the system.
+
+See Documentation/admin-guide/mm/hugetlbpage.rst
+
+
 nr_hugepages_mempolicy
 ======================
 
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index e0b2209ab71c..20d7edf62a6a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -351,4 +351,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
 extern bool mhp_supports_memmap_on_memory(unsigned long size);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+bool mhp_memmap_on_memory(void);
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+	return false;
+}
+#endif
+
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 49bc7f845438..0f7fe49220cf 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -10,6 +10,7 @@
 
 #define pr_fmt(fmt)	"HugeTLB: " fmt
 
+#include <linux/memory_hotplug.h>
 #include "hugetlb_vmemmap.h"
 
 /*
@@ -26,6 +27,10 @@ DEFINE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON,
 			hugetlb_free_vmemmap_enabled_key);
 EXPORT_SYMBOL(hugetlb_free_vmemmap_enabled_key);
 
+/* How many HugeTLB pages with vmemmap pages optimized. */
+static atomic_long_t optimized_pages = ATOMIC_LONG_INIT(0);
+static DECLARE_RWSEM(sysctl_rwsem);
+
 static int __init early_hugetlb_free_vmemmap_param(char *buf)
 {
 	/* We cannot optimize if a "struct page" crosses page boundaries. */
@@ -48,11 +53,6 @@ static int __init early_hugetlb_free_vmemmap_param(char *buf)
 }
 early_param("hugetlb_free_vmemmap", early_hugetlb_free_vmemmap_param);
 
-static inline unsigned long free_vmemmap_pages_size_per_hpage(struct hstate *h)
-{
-	return (unsigned long)free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
-}
-
 /*
  * Previously discarded vmemmap pages will be allocated and remapping
  * after this function returns zero.
@@ -61,14 +61,16 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
 {
 	int ret;
 	unsigned long vmemmap_addr = (unsigned long)head;
-	unsigned long vmemmap_end, vmemmap_reuse;
+	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
 
 	if (!HPageVmemmapOptimized(head))
 		return 0;
 
-	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
-	vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
-	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
+	vmemmap_addr	+= RESERVE_VMEMMAP_SIZE;
+	vmemmap_pages	= free_vmemmap_pages_per_hpage(h);
+	vmemmap_end	= vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
+	vmemmap_reuse	= vmemmap_addr - PAGE_SIZE;
+
 	/*
 	 * The pages which the vmemmap virtual address range [@vmemmap_addr,
 	 * @vmemmap_end) are mapped to are freed to the buddy allocator, and
@@ -78,8 +80,14 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
 	 */
 	ret = vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
 				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
-	if (!ret)
+	if (!ret) {
 		ClearHPageVmemmapOptimized(head);
+		/*
+		 * Paired with acquire semantic in
+		 * hugetlb_free_vmemmap_handler().
+		 */
+		atomic_long_dec_return_release(&optimized_pages);
+	}
 
 	return ret;
 }
@@ -87,22 +95,28 @@ int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
 void free_huge_page_vmemmap(struct hstate *h, struct page *head)
 {
 	unsigned long vmemmap_addr = (unsigned long)head;
-	unsigned long vmemmap_end, vmemmap_reuse;
+	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
 
-	if (!free_vmemmap_pages_per_hpage(h))
-		return;
+	down_read(&sysctl_rwsem);
+	vmemmap_pages = free_vmemmap_pages_per_hpage(h);
+	if (!vmemmap_pages)
+		goto out;
 
-	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
-	vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
-	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
+	vmemmap_addr	+= RESERVE_VMEMMAP_SIZE;
+	vmemmap_end	= vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
+	vmemmap_reuse	= vmemmap_addr - PAGE_SIZE;
 
 	/*
 	 * Remap the vmemmap virtual address range [@vmemmap_addr, @vmemmap_end)
 	 * to the page which @vmemmap_reuse is mapped to, then free the pages
 	 * which the range [@vmemmap_addr, @vmemmap_end] is mapped to.
 	 */
-	if (!vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse))
+	if (!vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse)) {
 		SetHPageVmemmapOptimized(head);
+		atomic_long_inc(&optimized_pages);
+	}
+out:
+	up_read(&sysctl_rwsem);
 }
 
 void __init hugetlb_vmemmap_init(struct hstate *h)
@@ -118,18 +132,16 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	BUILD_BUG_ON(__NR_USED_SUBPAGE >=
 		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
 
-	if (!hugetlb_free_vmemmap_enabled())
-		return;
-
-	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
-	    !is_power_of_2(sizeof(struct page))) {
+	if (!is_power_of_2(sizeof(struct page))) {
 		/*
 		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
 		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
 		 * be disabled if "struct page" crosses page boundaries.
 		 */
-		pr_warn_once("cannot free vmemmap pages because \"struct page\" crosses page boundaries\n");
-		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+		if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON)) {
+			pr_warn_once("cannot free vmemmap pages because \"struct page\" crosses page boundaries\n");
+			static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+		}
 		return;
 	}
 
@@ -148,3 +160,56 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
 		h->name);
 }
+
+static int hugetlb_free_vmemmap_handler(struct ctl_table *table, int write,
+					void *buffer, size_t *length,
+					loff_t *ppos)
+{
+	int ret;
+
+	down_write(&sysctl_rwsem);
+	/*
+	 * Cannot be disabled when there is at least one optimized
+	 * HugeTLB page in the system.
+	 *
+	 * The acquire semantic is paired with release semantic in
+	 * alloc_huge_page_vmemmap(). If we saw the @optimized_pages
+	 * with 0, all the operations of vmemmap pages remapping from
+	 * alloc_huge_page_vmemmap() are visible too so that we can
+	 * safely disable static key.
+	 */
+	table->extra1 = atomic_long_read_acquire(&optimized_pages) ?
+			SYSCTL_ONE : SYSCTL_ZERO;
+	ret = proc_do_static_key(table, write, buffer, length, ppos);
+	up_write(&sysctl_rwsem);
+
+	return ret;
+}
+
+static struct ctl_table hugetlb_vmemmap_sysctls[] = {
+	{
+		.procname	= "hugetlb_free_vmemmap",
+		.data		= &hugetlb_free_vmemmap_enabled_key.key,
+		.mode		= 0644,
+		.proc_handler	= hugetlb_free_vmemmap_handler,
+	},
+	{ }
+};
+
+static __init int hugetlb_vmemmap_sysctls_init(void)
+{
+	if (!is_power_of_2(sizeof(struct page)))
+		return 0;
+
+	/*
+	 * The vmemmap pages cannot be optimized if
+	 * "memory_hotplug.memmap_on_memory" is enabled.
+	 */
+	if (mhp_memmap_on_memory())
+		return 0;
+
+	register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+
+	return 0;
+}
+late_initcall(hugetlb_vmemmap_sysctls_init);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index cb2bef8f9e73..b67a159027f4 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
  */
 static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
-	return h->nr_free_vmemmap_pages;
+	if (hugetlb_free_vmemmap_enabled())
+		return h->nr_free_vmemmap_pages;
+	return 0;
 }
 #else
 static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d92edf102cfe..e69c31cea917 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -63,15 +63,10 @@ static bool memmap_on_memory __ro_after_init;
 module_param_cb(memmap_on_memory, &memmap_on_memory_ops, &memmap_on_memory, 0444);
 MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
 
-static inline bool mhp_memmap_on_memory(void)
+bool mhp_memmap_on_memory(void)
 {
 	return memmap_on_memory;
 }
-#else
-static inline bool mhp_memmap_on_memory(void)
-{
-	return false;
-}
 #endif
 
 enum {
-- 
2.11.0
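
The gating rule in hugetlb_free_vmemmap_handler() can be sketched in user
space with C11 atomics.  This is a simplified model under assumptions:
it omits sysctl_rwsem entirely, feature_enabled stands in for the static
key, and page_optimized(), page_restored() and set_feature() are
hypothetical names for the counter updates and the sysctl write:

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Number of HugeTLB pages whose vmemmap is currently optimized. */
static atomic_long optimized_pages;
/* Stand-in for the static key. */
static bool feature_enabled;

/* Called after a page's vmemmap has been optimized. */
static void page_optimized(void)
{
	atomic_fetch_add_explicit(&optimized_pages, 1, memory_order_relaxed);
}

/*
 * Called after a page's vmemmap has been restored.  The release
 * ordering pairs with the acquire load in set_feature().
 */
static void page_restored(void)
{
	atomic_fetch_sub_explicit(&optimized_pages, 1, memory_order_release);
}

/* Disabling is refused while any optimized page still exists. */
static int set_feature(bool enable)
{
	long n = atomic_load_explicit(&optimized_pages, memory_order_acquire);

	if (!enable && n != 0)
		return -EINVAL;

	feature_enabled = enable;
	return 0;
}
```

In this model, an attempt to disable the feature fails until every
optimized page has been restored, which corresponds to the advice to set
"nr_hugepages" to 0 and retry.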



* Re: [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-07 13:07 ` [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
@ 2022-03-07 16:35   ` Luis Chamberlain
  2022-03-07 17:03     ` Muchun Song
  0 siblings, 1 reply; 10+ messages in thread
From: Luis Chamberlain @ 2022-03-07 16:35 UTC (permalink / raw)
  To: Muchun Song
  Cc: corbet, mike.kravetz, akpm, keescook, yzaikin, osalvador, david,
	linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun

On Mon, Mar 07, 2022 at 09:07:05PM +0800, Muchun Song wrote:
> If the size of "struct page" is not the power of two and this
> feature is enabled, then the vmemmap pages of HugeTLB will be
> corrupted after remapping (panic is about to happen in theory).

Huh what? If a panic is possible best we prevent this in kconfig
all together. I'd instead just put some work into this instead of
adding all this run time hacks.

Can you try to add kconfig magic to detect if a PAGE_SIZE is PO2?

  Luis


* Re: [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-07 16:35   ` Luis Chamberlain
@ 2022-03-07 17:03     ` Muchun Song
  2022-03-07 17:12       ` Muchun Song
  2022-03-10 21:31       ` Luis Chamberlain
  0 siblings, 2 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 17:03 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Oscar Salvador, David Hildenbrand,
	Linux Doc Mailing List, LKML, Linux Memory Management List,
	Xiongchun duan, Muchun Song

On Tue, Mar 8, 2022 at 12:35 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Mon, Mar 07, 2022 at 09:07:05PM +0800, Muchun Song wrote:
> > If the size of "struct page" is not the power of two and this
> > feature is enabled, then the vmemmap pages of HugeTLB will be
> > corrupted after remapping (panic is about to happen in theory).
>
> Huh what? If a panic is possible best we prevent this in kconfig
> all together. I'd instead just put some work into this instead of
> adding all this run time hacks.

If the size of `struct page` is not power of 2, then those lines added
by this patch will be optimized away by the compiler, therefore there
is going to be no extra overhead to detect this.

>
> Can you try to add kconfig magic to detect if a PAGE_SIZE is PO2?
>

I agree with you that it is better if we can move this check
into Kconfig. I tried this a few months ago. It is not easy to
do this. How to check if a `struct page size` is PO2 in
Kconfig? If you have any thoughts please let me know.

Thanks.
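
A sketch of the compile-time argument, under assumptions: since
sizeof() of a fixed-size type is a constant expression, the power-of-two
test folds to a constant, and when the size is a power of two the
guarded branch is dead code that a compiler can remove.  struct
fake_page and its 64-byte size are hypothetical, as is the
IS_POWER_OF_2() macro:

```c
#include <stdbool.h>
#include <stddef.h>

/* Constant-expression form of a power-of-two test. */
#define IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* Hypothetical stand-in for struct page; the size is illustrative. */
struct fake_page {
	unsigned char bytes[64];
};

/* The test is usable at compile time, so it costs nothing at run time. */
_Static_assert(IS_POWER_OF_2(sizeof(struct fake_page)),
	       "fake_page size is expected to be a power of two");

static bool vmemmap_optimization_allowed(void)
{
	/*
	 * The condition is a compile-time constant.  For a 64-byte
	 * struct it is always false, so this branch is dead code.
	 */
	if (!IS_POWER_OF_2(sizeof(struct fake_page)))
		return false;

	return true;
}
```

The same constant could in principle drive a build-time check, but it is
only known to the C compiler, which is why expressing it in Kconfig is
the harder part of the discussion.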


* Re: [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-07 17:03     ` Muchun Song
@ 2022-03-07 17:12       ` Muchun Song
  2022-03-10 21:31       ` Luis Chamberlain
  1 sibling, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-07 17:12 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Oscar Salvador, David Hildenbrand,
	Linux Doc Mailing List, LKML, Linux Memory Management List,
	Xiongchun duan, Muchun Song

On Tue, Mar 8, 2022 at 1:03 AM Muchun Song <songmuchun@bytedance.com> wrote:
>
> On Tue, Mar 8, 2022 at 12:35 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Mon, Mar 07, 2022 at 09:07:05PM +0800, Muchun Song wrote:
> > > If the size of "struct page" is not the power of two and this
> > > feature is enabled, then the vmemmap pages of HugeTLB will be
> > > corrupted after remapping (panic is about to happen in theory).
> >
> > Huh what? If a panic is possible best we prevent this in kconfig
> > all together. I'd instead just put some work into this instead of
> > adding all this run time hacks.
>
> If the size of `struct page` is not power of 2, then those lines added
> by this patch will be optimized away by the compiler, therefore there
> is going to be no extra overhead to detect this.
>
> >
> > Can you try to add kconfig magic to detect if a PAGE_SIZE is PO2?
> >
>
> I agree with you that it is better if we can move this check
> into Kconfig. I tried this a few months ago. It is not easy to
> do this. How to check if a `struct page size` is PO2 in
> Kconfig? If you have any thoughts please let me know.
>
> Thanks.

Here is a discussion [1] from a few months ago.

[1] https://lore.kernel.org/all/CAMZfGtWfz8DcwKBLdf3j0x9Dt6ZvOd+MvjX6yXrAoKDeXxW95w@mail.gmail.com/


* Re: [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-07 17:03     ` Muchun Song
  2022-03-07 17:12       ` Muchun Song
@ 2022-03-10 21:31       ` Luis Chamberlain
  2022-03-11  7:22         ` Muchun Song
  1 sibling, 1 reply; 10+ messages in thread
From: Luis Chamberlain @ 2022-03-10 21:31 UTC (permalink / raw)
  To: Muchun Song
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Oscar Salvador, David Hildenbrand,
	Linux Doc Mailing List, LKML, Linux Memory Management List,
	Xiongchun duan, Muchun Song

On Tue, Mar 08, 2022 at 01:03:08AM +0800, Muchun Song wrote:
> On Tue, Mar 8, 2022 at 12:35 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Mon, Mar 07, 2022 at 09:07:05PM +0800, Muchun Song wrote:
> > > If the size of "struct page" is not the power of two and this
> > > feature is enabled, then the vmemmap pages of HugeTLB will be
> > > corrupted after remapping (panic is about to happen in theory).
> >
> > Huh what? If a panic is possible best we prevent this in kconfig
> > all together. I'd instead just put some work into this instead of
> > adding all this run time hacks.
> 
> If the size of `struct page` is not power of 2, then those lines added
> by this patch will be optimized away by the compiler, therefore there
> is going to be no extra overhead to detect this.
> 
> >
> > Can you try to add kconfig magic to detect if a PAGE_SIZE is PO2?
> >
> 
> I agree with you that it is better if we can move this check
> into Kconfig. I tried this a few months ago. It is not easy to
> do this. How to check if a `struct page size` is PO2 in
> Kconfig? If you have any thoughts please let me know.

Can you query this with a script?

config HAS_PAGE_SIZE_PO2
	bool                                                                    
	default $(shell, scripts/check_po2_page_size.sh arguments_are_allowed)

  Luis


* Re: [PATCH v3 1/4] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-10 21:31       ` Luis Chamberlain
@ 2022-03-11  7:22         ` Muchun Song
  0 siblings, 0 replies; 10+ messages in thread
From: Muchun Song @ 2022-03-11  7:22 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Oscar Salvador, David Hildenbrand,
	Linux Doc Mailing List, LKML, Linux Memory Management List,
	Xiongchun duan, Muchun Song

On Fri, Mar 11, 2022 at 5:31 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, Mar 08, 2022 at 01:03:08AM +0800, Muchun Song wrote:
> > On Tue, Mar 8, 2022 at 12:35 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Mon, Mar 07, 2022 at 09:07:05PM +0800, Muchun Song wrote:
> > > > If the size of "struct page" is not the power of two and this
> > > > feature is enabled, then the vmemmap pages of HugeTLB will be
> > > > corrupted after remapping (panic is about to happen in theory).
> > >
> > > Huh what? If a panic is possible best we prevent this in kconfig
> > > all together. I'd instead just put some work into this instead of
> > > adding all this run time hacks.
> >
> > If the size of `struct page` is not power of 2, then those lines added
> > by this patch will be optimized away by the compiler, therefore there
> > is going to be no extra overhead to detect this.
> >
> > >
> > > Can you try to add kconfig magic to detect if a PAGE_SIZE is PO2?
> > >
> >
> > I agree with you that it is better if we can move this check
> > into Kconfig. I tried this a few months ago. It is not easy to
> > do this. How to check if a `struct page size` is PO2 in
> > Kconfig? If you have any thoughts please let me know.
>
> Can you query this with a script?
>
> config HAS_PAGE_SIZE_PO2
>         bool
>         default $(shell, scripts/check_po2_page_size.sh arguments_are_allowed)
>

Excellent. I'll try this approach.

Thanks very much.
