linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl
@ 2022-03-02  8:37 Muchun Song
  2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Muchun Song @ 2022-03-02  8:37 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

This series amis to add hugetlb_free_vmemmap sysctl to enable the feature
of freeing vmemmap pages of HugeTLB pages.

v2:
  - Fix compilation when !CONFIG_MHP_MEMMAP_ON_MEMORY reported by kernel
    test robot <lkp@intel.com>.
  - Move sysctl code from kernel/sysctl.c to mm/hugetlb_vmemmap.c.

Muchun Song (3):
  mm: hugetlb: disable freeing vmemmap pages when struct page crosses
    page boundaries
  sysctl: allow to set extra1 to SYSCTL_ONE
  mm: hugetlb: add hugetlb_free_vmemmap sysctl

 Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
 include/linux/memory_hotplug.h          |  9 +++++++
 kernel/sysctl.c                         |  2 +-
 mm/hugetlb_vmemmap.c                    | 43 ++++++++++++++++++++++++++++++++-
 mm/hugetlb_vmemmap.h                    |  4 ++-
 mm/memory_hotplug.c                     |  5 ++++
 6 files changed, 73 insertions(+), 3 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-02  8:37 [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl Muchun Song
@ 2022-03-02  8:37 ` Muchun Song
  2022-03-02 21:21   ` Luis Chamberlain
  2022-03-03  0:25   ` Mike Kravetz
  2022-03-02  8:37 ` [PATCH v2 2/3] sysctl: allow to set extra1 to SYSCTL_ONE Muchun Song
  2022-03-02  8:37 ` [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
  2 siblings, 2 replies; 11+ messages in thread
From: Muchun Song @ 2022-03-02  8:37 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
of "struct page" is not power of two, we cannot optimize vmemmap pages
of HugeTLB pages. We should disable this feature in this case.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 mm/hugetlb_vmemmap.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index b3118dba0518..836d1117f08b 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	if (!hugetlb_free_vmemmap_enabled())
 		return;
 
+	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
+	    !is_power_of_2(sizeof(struct page))) {
+		/*
+		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
+		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
+		 * be disabled if "struct page" crosses page boundaries.
+		 */
+		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+		return;
+	}
+
 	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
 	/*
 	 * The head page is not to be freed to buddy allocator, the other tail
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/3] sysctl: allow to set extra1 to SYSCTL_ONE
  2022-03-02  8:37 [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl Muchun Song
  2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
@ 2022-03-02  8:37 ` Muchun Song
  2022-03-02  8:37 ` [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
  2 siblings, 0 replies; 11+ messages in thread
From: Muchun Song @ 2022-03-02  8:37 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

Some sysctls only allow to be enabled and cannot be set back to be
disabled. But proc_do_static_key() does not consider this situation,
which set ->extra1 to SYSCTL_ZERO unconditionally. This patch add
the ability to set ->extra1 to SYSCTL_ONE, which will be used in
the next patch.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 kernel/sysctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 64065abf361e..ab3e9c937268 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1631,7 +1631,7 @@ int proc_do_static_key(struct ctl_table *table, int write,
 		.data   = &val,
 		.maxlen = sizeof(val),
 		.mode   = table->mode,
-		.extra1 = SYSCTL_ZERO,
+		.extra1 = table->extra1 == SYSCTL_ONE ? SYSCTL_ONE : SYSCTL_ZERO,
 		.extra2 = SYSCTL_ONE,
 	};
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl
  2022-03-02  8:37 [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl Muchun Song
  2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
  2022-03-02  8:37 ` [PATCH v2 2/3] sysctl: allow to set extra1 to SYSCTL_ONE Muchun Song
@ 2022-03-02  8:37 ` Muchun Song
  2022-03-02 21:25   ` Luis Chamberlain
  2 siblings, 1 reply; 11+ messages in thread
From: Muchun Song @ 2022-03-02  8:37 UTC (permalink / raw)
  To: corbet, mike.kravetz, akpm, mcgrof, keescook, yzaikin
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun, Muchun Song

We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
server to enable the feature of freeing vmemmap pages of HugeTLB
pages. Rebooting usually taske a long time. Add a sysctl to enable
the feature at runtime and do not need to reboot.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
 include/linux/memory_hotplug.h          |  9 +++++++
 mm/hugetlb_vmemmap.c                    | 42 ++++++++++++++++++++++++++++-----
 mm/hugetlb_vmemmap.h                    |  4 +++-
 mm/memory_hotplug.c                     |  5 ++++
 5 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index f4804ce37c58..01f18e6cc227 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
 See Documentation/admin-guide/mm/hugetlbpage.rst
 
 
+hugetlb_free_vmemmap
+====================
+
+A toggle value indicating if vmemmap pages are allowed to be optimized.
+If it is off (0), then it can be set true (1).  Once true, the vmemmap
+pages associated with each HugeTLB page will be optimized, and the toggle
+cannot be set back to false.  It only optimizes the subsequent allocation
+of HugeTLB pages from buddy system, while already allocated HugeTLB pages
+will not be optimized.
+
+See Documentation/admin-guide/mm/hugetlbpage.rst
+
+
 nr_hugepages_mempolicy
 ======================
 
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index e0b2209ab71c..20d7edf62a6a 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -351,4 +351,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
 extern bool mhp_supports_memmap_on_memory(unsigned long size);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
+#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
+bool mhp_memmap_on_memory(void);
+#else
+static inline bool mhp_memmap_on_memory(void)
+{
+	return false;
+}
+#endif
+
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 836d1117f08b..3bcc8f25bd50 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -10,6 +10,7 @@
 
 #define pr_fmt(fmt)	"HugeTLB: " fmt
 
+#include <linux/memory_hotplug.h>
 #include "hugetlb_vmemmap.h"
 
 /*
@@ -118,17 +119,14 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	BUILD_BUG_ON(__NR_USED_SUBPAGE >=
 		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
 
-	if (!hugetlb_free_vmemmap_enabled())
-		return;
-
-	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
-	    !is_power_of_2(sizeof(struct page))) {
+	if (!is_power_of_2(sizeof(struct page))) {
 		/*
 		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
 		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
 		 * be disabled if "struct page" crosses page boundaries.
 		 */
-		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
+		if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON))
+			static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
 		return;
 	}
 
@@ -147,3 +145,35 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
 		h->name);
 }
+
+static struct ctl_table hugetlb_vmemmap_sysctls[] = {
+	{
+		.procname	= "hugetlb_free_vmemmap",
+		.data		= &hugetlb_free_vmemmap_enabled_key.key,
+		.mode		= 0644,
+		/* only handle a transition from default "0" to "1" */
+		.proc_handler	= proc_do_static_key,
+		.extra1		= SYSCTL_ONE,
+		.extra2		= SYSCTL_ONE,
+	},
+	{ }
+};
+
+static __init int hugetlb_vmemmap_sysctls_init(void)
+{
+	/*
+	 * The vmemmap pages cannot be optimized if
+	 * "memory_hotplug.memmap_on_memory" is enabled unless
+	 * "hugetlb_free_vmemmap" is enabled as well since
+	 * "hugetlb_free_vmemmap" takes precedence over
+	 * "memory_hotplug.memmap_on_memory".
+	 */
+	if (mhp_memmap_on_memory() && !hugetlb_free_vmemmap_enabled())
+		return 0;
+
+	if (is_power_of_2(sizeof(struct page)))
+		register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+
+	return 0;
+}
+late_initcall(hugetlb_vmemmap_sysctls_init);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index cb2bef8f9e73..b67a159027f4 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
  */
 static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
-	return h->nr_free_vmemmap_pages;
+	if (hugetlb_free_vmemmap_enabled())
+		return h->nr_free_vmemmap_pages;
+	return 0;
 }
 #else
 static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c226a337c1ef..c2115e566abc 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -50,6 +50,11 @@ static bool memmap_on_memory __ro_after_init;
 #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
 module_param(memmap_on_memory, bool, 0444);
 MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
+
+bool mhp_memmap_on_memory(void)
+{
+	return memmap_on_memory;
+}
 #endif
 
 enum {
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
@ 2022-03-02 21:21   ` Luis Chamberlain
  2022-03-03  2:38     ` Muchun Song
  2022-03-03  0:25   ` Mike Kravetz
  1 sibling, 1 reply; 11+ messages in thread
From: Luis Chamberlain @ 2022-03-02 21:21 UTC (permalink / raw)
  To: Muchun Song
  Cc: corbet, mike.kravetz, akpm, keescook, yzaikin, linux-doc,
	linux-kernel, linux-mm, duanxiongchun, smuchun, Adam Manzanares,
	Davidlohr Bueso

On Wed, Mar 02, 2022 at 04:37:56PM +0800, Muchun Song wrote:
> If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> of "struct page" is not power of two, we cannot optimize vmemmap pages
> of HugeTLB pages. We should disable this feature in this case.

The commit log does not descrie what happens if this is left enabled in
that case? Is this a fix? Why would it be a fix? Was something failing?
How did you spot this issue? What are the consequences of not applying
this patch?

  Luis

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl
  2022-03-02  8:37 ` [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
@ 2022-03-02 21:25   ` Luis Chamberlain
  2022-03-03 11:15     ` Muchun Song
  0 siblings, 1 reply; 11+ messages in thread
From: Luis Chamberlain @ 2022-03-02 21:25 UTC (permalink / raw)
  To: Muchun Song
  Cc: corbet, mike.kravetz, akpm, keescook, yzaikin, linux-doc,
	linux-kernel, linux-mm, duanxiongchun, smuchun, Adam Manzanares,
	Davidlohr Bueso

On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> server to enable the feature of freeing vmemmap pages of HugeTLB
> pages. Rebooting usually taske a long time. Add a sysctl to enable
> the feature at runtime and do not need to reboot.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
>  include/linux/memory_hotplug.h          |  9 +++++++
>  mm/hugetlb_vmemmap.c                    | 42 ++++++++++++++++++++++++++++-----
>  mm/hugetlb_vmemmap.h                    |  4 +++-
>  mm/memory_hotplug.c                     |  5 ++++
>  5 files changed, 66 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index f4804ce37c58..01f18e6cc227 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
>  See Documentation/admin-guide/mm/hugetlbpage.rst
>  
>  
> +hugetlb_free_vmemmap
> +====================
> +
> +A toggle value indicating if vmemmap pages are allowed to be optimized.
> +If it is off (0), then it can be set true (1).  Once true, the vmemmap
> +pages associated with each HugeTLB page will be optimized, and the toggle
> +cannot be set back to false.  It only optimizes the subsequent allocation
> +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> +will not be optimized.

The commit log or documentation does not descrie why its safe to toggle
one way and not the other?

  Luis

> +
> +See Documentation/admin-guide/mm/hugetlbpage.rst
> +
> +
>  nr_hugepages_mempolicy
>  ======================
>  
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index e0b2209ab71c..20d7edf62a6a 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -351,4 +351,13 @@ void arch_remove_linear_mapping(u64 start, u64 size);
>  extern bool mhp_supports_memmap_on_memory(unsigned long size);
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>  
> +#ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
> +bool mhp_memmap_on_memory(void);
> +#else
> +static inline bool mhp_memmap_on_memory(void)
> +{
> +	return false;
> +}
> +#endif
> +
>  #endif /* __LINUX_MEMORY_HOTPLUG_H */
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 836d1117f08b..3bcc8f25bd50 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -10,6 +10,7 @@
>  
>  #define pr_fmt(fmt)	"HugeTLB: " fmt
>  
> +#include <linux/memory_hotplug.h>
>  #include "hugetlb_vmemmap.h"
>  
>  /*
> @@ -118,17 +119,14 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>  	BUILD_BUG_ON(__NR_USED_SUBPAGE >=
>  		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
>  
> -	if (!hugetlb_free_vmemmap_enabled())
> -		return;
> -
> -	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> -	    !is_power_of_2(sizeof(struct page))) {
> +	if (!is_power_of_2(sizeof(struct page))) {
>  		/*
>  		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
>  		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
>  		 * be disabled if "struct page" crosses page boundaries.
>  		 */
> -		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
> +		if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON))
> +			static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
>  		return;
>  	}
>  
> @@ -147,3 +145,35 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>  	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
>  		h->name);
>  }
> +
> +static struct ctl_table hugetlb_vmemmap_sysctls[] = {
> +	{
> +		.procname	= "hugetlb_free_vmemmap",
> +		.data		= &hugetlb_free_vmemmap_enabled_key.key,
> +		.mode		= 0644,
> +		/* only handle a transition from default "0" to "1" */
> +		.proc_handler	= proc_do_static_key,
> +		.extra1		= SYSCTL_ONE,
> +		.extra2		= SYSCTL_ONE,
> +	},
> +	{ }
> +};
> +
> +static __init int hugetlb_vmemmap_sysctls_init(void)
> +{
> +	/*
> +	 * The vmemmap pages cannot be optimized if
> +	 * "memory_hotplug.memmap_on_memory" is enabled unless
> +	 * "hugetlb_free_vmemmap" is enabled as well since
> +	 * "hugetlb_free_vmemmap" takes precedence over
> +	 * "memory_hotplug.memmap_on_memory".
> +	 */
> +	if (mhp_memmap_on_memory() && !hugetlb_free_vmemmap_enabled())
> +		return 0;
> +
> +	if (is_power_of_2(sizeof(struct page)))
> +		register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
> +
> +	return 0;
> +}
> +late_initcall(hugetlb_vmemmap_sysctls_init);
> diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
> index cb2bef8f9e73..b67a159027f4 100644
> --- a/mm/hugetlb_vmemmap.h
> +++ b/mm/hugetlb_vmemmap.h
> @@ -21,7 +21,9 @@ void hugetlb_vmemmap_init(struct hstate *h);
>   */
>  static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
>  {
> -	return h->nr_free_vmemmap_pages;
> +	if (hugetlb_free_vmemmap_enabled())
> +		return h->nr_free_vmemmap_pages;
> +	return 0;
>  }
>  #else
>  static inline int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c226a337c1ef..c2115e566abc 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -50,6 +50,11 @@ static bool memmap_on_memory __ro_after_init;
>  #ifdef CONFIG_MHP_MEMMAP_ON_MEMORY
>  module_param(memmap_on_memory, bool, 0444);
>  MODULE_PARM_DESC(memmap_on_memory, "Enable memmap on memory for memory hotplug");
> +
> +bool mhp_memmap_on_memory(void)
> +{
> +	return memmap_on_memory;
> +}
>  #endif
>  
>  enum {
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
  2022-03-02 21:21   ` Luis Chamberlain
@ 2022-03-03  0:25   ` Mike Kravetz
  2022-03-03  2:28     ` Muchun Song
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2022-03-03  0:25 UTC (permalink / raw)
  To: Muchun Song, corbet, akpm, mcgrof, keescook, yzaikin
  Cc: linux-doc, linux-kernel, linux-mm, duanxiongchun, smuchun

On 3/2/22 00:37, Muchun Song wrote:
> If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> of "struct page" is not power of two, we cannot optimize vmemmap pages
> of HugeTLB pages. We should disable this feature in this case.

I'll let you reply to the question from Luis, but IIUC there is no issue
today as "struct page" is certainly a power of two.  This is more future
looking.  Correct?

> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  mm/hugetlb_vmemmap.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index b3118dba0518..836d1117f08b 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
>  	if (!hugetlb_free_vmemmap_enabled())
>  		return;
>  
> +	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> +	    !is_power_of_2(sizeof(struct page))) {
> +		/*
> +		 * The hugetlb_free_vmemmap_enabled_key can be enabled when
> +		 * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
> +		 * be disabled if "struct page" crosses page boundaries.
> +		 */
> +		static_branch_disable(&hugetlb_free_vmemmap_enabled_key);

Should we possibly print a warning here as in the routine early_hugetlb_free_vmemmap_param?  This is called once per hstate, so
perhaps pr_warn_once.

-- 
Mike Kravetz

> +		return;
> +	}
> +
>  	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
>  	/*
>  	 * The head page is not to be freed to buddy allocator, the other tail



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-03  0:25   ` Mike Kravetz
@ 2022-03-03  2:28     ` Muchun Song
  0 siblings, 0 replies; 11+ messages in thread
From: Muchun Song @ 2022-03-03  2:28 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Jonathan Corbet, Andrew Morton, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Linux Doc Mailing List, LKML,
	Linux Memory Management List, Xiongchun duan, Muchun Song

On Thu, Mar 3, 2022 at 8:25 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 3/2/22 00:37, Muchun Song wrote:
> > If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> > of "struct page" is not power of two, we cannot optimize vmemmap pages
> > of HugeTLB pages. We should disable this feature in this case.
>
> I'll let you reply to the question from Luis, but IIUC there is no issue
> today as "struct page" is certainly a power of two.  This is more future
> looking.  Correct?

Partly right. The size of "struct page" is not the power of two if
!CONFIG_MEMCG && CONFIG_SLAB on x86_64.  But it is not
a conventional configuration nowadays.  So it is not a critical
problem. I am not sure if a Fixes tag is necessary.

>
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index b3118dba0518..836d1117f08b 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -121,6 +121,17 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
> >       if (!hugetlb_free_vmemmap_enabled())
> >               return;
> >
> > +     if (IS_ENABLED(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON) &&
> > +         !is_power_of_2(sizeof(struct page))) {
> > +             /*
> > +              * The hugetlb_free_vmemmap_enabled_key can be enabled when
> > +              * CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON. It should
> > +              * be disabled if "struct page" crosses page boundaries.
> > +              */
> > +             static_branch_disable(&hugetlb_free_vmemmap_enabled_key);
>
> Should we possibly print a warning here as in the routine early_hugetlb_free_vmemmap_param?  This is called once per hstate, so
> perhaps pr_warn_once.

Good point. Will do.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries
  2022-03-02 21:21   ` Luis Chamberlain
@ 2022-03-03  2:38     ` Muchun Song
  0 siblings, 0 replies; 11+ messages in thread
From: Muchun Song @ 2022-03-03  2:38 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Linux Doc Mailing List, LKML,
	Linux Memory Management List, Xiongchun duan, Muchun Song,
	Adam Manzanares, Davidlohr Bueso

On Thu, Mar 3, 2022 at 5:21 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Wed, Mar 02, 2022 at 04:37:56PM +0800, Muchun Song wrote:
> > If CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is enabled and the size
> > of "struct page" is not power of two, we cannot optimize vmemmap pages
> > of HugeTLB pages. We should disable this feature in this case.
>
> The commit log does not describe what happens if this is left enabled in
> that case? Is this a fix? Why would it be a fix? Was something failing?
> How did you spot this issue? What are the consequences of not applying
> this patch?
>

If the size of "struct page" is not the power of two and this feature is
enabled, then the vmemmap pages of HugeTLB will be corrupted
after remapping (panic is about to happen in theory).  But this only
exists when !CONFIG_MEMCG && CONFIG_SLAB on x86_64.
However, it is not a conventional configuration nowadays.  So it is
not a real word issue, just the result of a code review.  But we cannot
prevent someone from configuring that combined configure.  OK,
this information should go to the commit log. Will update it.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl
  2022-03-02 21:25   ` Luis Chamberlain
@ 2022-03-03 11:15     ` Muchun Song
  2022-03-03 14:59       ` Luis Chamberlain
  0 siblings, 1 reply; 11+ messages in thread
From: Muchun Song @ 2022-03-03 11:15 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Linux Doc Mailing List, LKML,
	Linux Memory Management List, Xiongchun duan, Muchun Song,
	Adam Manzanares, Davidlohr Bueso

On Thu, Mar 3, 2022 at 5:25 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> > We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> > server to enable the feature of freeing vmemmap pages of HugeTLB
> > pages. Rebooting usually taske a long time. Add a sysctl to enable
> > the feature at runtime and do not need to reboot.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
> >  include/linux/memory_hotplug.h          |  9 +++++++
> >  mm/hugetlb_vmemmap.c                    | 42 ++++++++++++++++++++++++++++-----
> >  mm/hugetlb_vmemmap.h                    |  4 +++-
> >  mm/memory_hotplug.c                     |  5 ++++
> >  5 files changed, 66 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index f4804ce37c58..01f18e6cc227 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
> >  See Documentation/admin-guide/mm/hugetlbpage.rst
> >
> >
> > +hugetlb_free_vmemmap
> > +====================
> > +
> > +A toggle value indicating if vmemmap pages are allowed to be optimized.
> > +If it is off (0), then it can be set true (1).  Once true, the vmemmap
> > +pages associated with each HugeTLB page will be optimized, and the toggle
> > +cannot be set back to false.  It only optimizes the subsequent allocation
> > +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> > +will not be optimized.
>
> The commit log or documentation does not descrie why its safe to toggle
> one way and not the other?
>

I thought it was easy to handle the transition from disable to enable
(code is simple).  I might be wrong. I'll try to handle the other side in
the next version if it is not hard to handle.

Thanks Luis.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl
  2022-03-03 11:15     ` Muchun Song
@ 2022-03-03 14:59       ` Luis Chamberlain
  0 siblings, 0 replies; 11+ messages in thread
From: Luis Chamberlain @ 2022-03-03 14:59 UTC (permalink / raw)
  To: Muchun Song
  Cc: Jonathan Corbet, Mike Kravetz, Andrew Morton, Kees Cook,
	Iurii Zaikin, Linux Doc Mailing List, LKML,
	Linux Memory Management List, Xiongchun duan, Muchun Song,
	Adam Manzanares, Davidlohr Bueso

On Thu, Mar 03, 2022 at 07:15:05PM +0800, Muchun Song wrote:
> On Thu, Mar 3, 2022 at 5:25 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > On Wed, Mar 02, 2022 at 04:37:58PM +0800, Muchun Song wrote:
> > > We must add "hugetlb_free_vmemmap=on" to boot cmdline and reboot the
> > > server to enable the feature of freeing vmemmap pages of HugeTLB
> > > pages. Rebooting usually taske a long time. Add a sysctl to enable
> > > the feature at runtime and do not need to reboot.
> > >
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > ---
> > >  Documentation/admin-guide/sysctl/vm.rst | 13 ++++++++++
> > >  include/linux/memory_hotplug.h          |  9 +++++++
> > >  mm/hugetlb_vmemmap.c                    | 42 ++++++++++++++++++++++++++++-----
> > >  mm/hugetlb_vmemmap.h                    |  4 +++-
> > >  mm/memory_hotplug.c                     |  5 ++++
> > >  5 files changed, 66 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > > index f4804ce37c58..01f18e6cc227 100644
> > > --- a/Documentation/admin-guide/sysctl/vm.rst
> > > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > > @@ -561,6 +561,19 @@ Change the minimum size of the hugepage pool.
> > >  See Documentation/admin-guide/mm/hugetlbpage.rst
> > >
> > >
> > > +hugetlb_free_vmemmap
> > > +====================
> > > +
> > > +A toggle value indicating if vmemmap pages are allowed to be optimized.
> > > +If it is off (0), then it can be set true (1).  Once true, the vmemmap
> > > +pages associated with each HugeTLB page will be optimized, and the toggle
> > > +cannot be set back to false.  It only optimizes the subsequent allocation
> > > +of HugeTLB pages from buddy system, while already allocated HugeTLB pages
> > > +will not be optimized.
> >
> > The commit log or documentation does not descrie why its safe to toggle
> > one way and not the other?
> >
> 
> I thought it was easy to handle the transition from disable to enable
> (code is simple).  I might be wrong. I'll try to handle the other side in
> the next version if it is not hard to handle.

You should do the homework and explain why something is not possible.
And if you are enabling to disable something why is it safe to do so
at runtime?

  Luis

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-03 14:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-02  8:37 [PATCH v2 0/3] add hugetlb_free_vmemmap sysctl Muchun Song
2022-03-02  8:37 ` [PATCH v2 1/3] mm: hugetlb: disable freeing vmemmap pages when struct page crosses page boundaries Muchun Song
2022-03-02 21:21   ` Luis Chamberlain
2022-03-03  2:38     ` Muchun Song
2022-03-03  0:25   ` Mike Kravetz
2022-03-03  2:28     ` Muchun Song
2022-03-02  8:37 ` [PATCH v2 2/3] sysctl: allow to set extra1 to SYSCTL_ONE Muchun Song
2022-03-02  8:37 ` [PATCH v2 3/3] mm: hugetlb: add hugetlb_free_vmemmap sysctl Muchun Song
2022-03-02 21:25   ` Luis Chamberlain
2022-03-03 11:15     ` Muchun Song
2022-03-03 14:59       ` Luis Chamberlain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).