linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Clean up hugetlb boot command line processing
@ 2020-04-01 18:38 Mike Kravetz
  2020-04-01 18:38 ` [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size Mike Kravetz
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Mike Kravetz @ 2020-04-01 18:38 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton, Mike Kravetz

v2 -
   Fix build errors with patch 1 (Will)
   Change arch_hugetlb_valid_size arg to unsigned long and remove
     irrelevant 'extern' keyword (Christophe)
   Documentation and other misc changes (Randy, Christophe, Mina)
   Do not process command line options if !hugepages_supported()
     (Dave, but it sounds like we may want to additional changes to
      hugepages_supported() for x86?  If that is needed I would prefer
      a separate patch.)

Longpeng(Mike) reported a weird message from hugetlb command line processing
and proposed a solution [1].  While the proposed patch does address the
specific issue, there are other related issues in command line processing.
As hugetlbfs evolved, updates to command line processing have been made to
meet immediate needs and not necessarily in a coordinated manner.  The result
is that some processing is done in arch specific code, some is done in arch
independent code and coordination is problematic.  Semantics can vary between
architectures.

The patch series does the following:
- Define arch specific arch_hugetlb_valid_size routine used to validate
  passed huge page sizes.
- Move hugepagesz= command line parsing out of arch specific code and into
  an arch independent routine.
- Clean up command line processing to follow desired semantics and
  document those semantics.

[1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpeng2@huawei.com

Mike Kravetz (4):
  hugetlbfs: add arch_hugetlb_valid_size
  hugetlbfs: move hugepagesz= parsing to arch independent code
  hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
  hugetlbfs: clean up command line processing

 .../admin-guide/kernel-parameters.txt         |  35 +++--
 Documentation/admin-guide/mm/hugetlbpage.rst  |  44 ++++++
 arch/arm64/include/asm/hugetlb.h              |   2 +
 arch/arm64/mm/hugetlbpage.c                   |  30 +---
 arch/powerpc/include/asm/hugetlb.h            |   3 +
 arch/powerpc/mm/hugetlbpage.c                 |  30 ++--
 arch/riscv/include/asm/hugetlb.h              |   3 +
 arch/riscv/mm/hugetlbpage.c                   |  24 +--
 arch/s390/include/asm/hugetlb.h               |   3 +
 arch/s390/mm/hugetlbpage.c                    |  24 +--
 arch/sparc/include/asm/hugetlb.h              |   3 +
 arch/sparc/mm/init_64.c                       |  43 ++----
 arch/x86/include/asm/hugetlb.h                |   5 +
 arch/x86/mm/hugetlbpage.c                     |  23 +--
 include/linux/hugetlb.h                       |   8 +-
 mm/hugetlb.c                                  | 141 ++++++++++++++----
 16 files changed, 252 insertions(+), 169 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size
  2020-04-01 18:38 [PATCH v2 0/4] Clean up hugetlb boot command line processing Mike Kravetz
@ 2020-04-01 18:38 ` Mike Kravetz
  2020-04-10 19:16   ` Peter Xu
  2020-04-01 18:38 ` [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code Mike Kravetz
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2020-04-01 18:38 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton, Mike Kravetz

The architecture independent routine hugetlb_default_setup sets up
the default huge pages size.  It has no way to verify if the passed
value is valid, so it accepts it and attempts to validate at a later
time.  This requires undocumented cooperation between the arch specific
and arch independent code.

For architectures that support more than one huge page size, provide
a routine arch_hugetlb_valid_size to validate a huge page size.
hugetlb_default_setup can use this to validate passed values.

arch_hugetlb_valid_size will also be used in a subsequent patch to
move processing of the "hugepagesz=" in arch specific code to a common
routine in arch independent code.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm64/include/asm/hugetlb.h   |  2 ++
 arch/arm64/mm/hugetlbpage.c        | 17 +++++++++++++----
 arch/powerpc/include/asm/hugetlb.h |  3 +++
 arch/powerpc/mm/hugetlbpage.c      | 20 +++++++++++++-------
 arch/riscv/include/asm/hugetlb.h   |  3 +++
 arch/riscv/mm/hugetlbpage.c        | 26 +++++++++++++++++---------
 arch/s390/include/asm/hugetlb.h    |  3 +++
 arch/s390/mm/hugetlbpage.c         | 16 ++++++++++++----
 arch/sparc/include/asm/hugetlb.h   |  3 +++
 arch/sparc/mm/init_64.c            | 24 ++++++++++++++++--------
 arch/x86/include/asm/hugetlb.h     |  5 +++++
 arch/x86/mm/hugetlbpage.c          | 17 +++++++++++++----
 include/linux/hugetlb.h            |  7 +++++++
 mm/hugetlb.c                       | 15 ++++++++++++---
 14 files changed, 122 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 2eb6c234d594..81606223494f 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -59,6 +59,8 @@ extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
 				 pte_t *ptep, pte_t pte, unsigned long sz);
 #define set_huge_swap_pte_at set_huge_swap_pte_at
+bool __init arch_hugetlb_valid_size(unsigned long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
 
 #include <asm-generic/hugetlb.h>
 
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bbeb6a5a6ba6..069b96ee2aec 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -462,17 +462,26 @@ static int __init hugetlbpage_init(void)
 }
 arch_initcall(hugetlbpage_init);
 
-static __init int setup_hugepagesz(char *opt)
+bool __init arch_hugetlb_valid_size(unsigned long size)
 {
-	unsigned long ps = memparse(opt, &opt);
-
-	switch (ps) {
+	switch (size) {
 #ifdef CONFIG_ARM64_4K_PAGES
 	case PUD_SIZE:
 #endif
 	case CONT_PMD_SIZE:
 	case PMD_SIZE:
 	case CONT_PTE_SIZE:
+		return true;
+	}
+
+	return false;
+}
+
+static __init int setup_hugepagesz(char *opt)
+{
+	unsigned long ps = memparse(opt, &opt);
+
+	if (arch_hugetlb_valid_size(ps)) {
 		add_huge_page_size(ps);
 		return 1;
 	}
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index bd6504c28c2f..19b453ee1431 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -64,6 +64,9 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+bool __init arch_hugetlb_valid_size(unsigned long size);
+
 #include <asm-generic/hugetlb.h>
 
 #else /* ! CONFIG_HUGETLB_PAGE */
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 33b3461d91e8..de54d2a37830 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -558,7 +558,7 @@ unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
 	return vma_kernel_pagesize(vma);
 }
 
-static int __init add_huge_page_size(unsigned long long size)
+bool __init arch_hugetlb_valid_size(unsigned long size)
 {
 	int shift = __ffs(size);
 	int mmu_psize;
@@ -566,20 +566,26 @@ static int __init add_huge_page_size(unsigned long long size)
 	/* Check that it is a page size supported by the hardware and
 	 * that it fits within pagetable and slice limits. */
 	if (size <= PAGE_SIZE || !is_power_of_2(size))
-		return -EINVAL;
+		return false;
 
 	mmu_psize = check_and_get_huge_psize(shift);
 	if (mmu_psize < 0)
-		return -EINVAL;
+		return false;
 
 	BUG_ON(mmu_psize_defs[mmu_psize].shift != shift);
 
-	/* Return if huge page size has already been setup */
-	if (size_to_hstate(size))
-		return 0;
+	return true;
+}
 
-	hugetlb_add_hstate(shift - PAGE_SHIFT);
+static int __init add_huge_page_size(unsigned long long size)
+{
+	int shift = __ffs(size);
+
+	if (!arch_hugetlb_valid_size((unsigned long)size))
+		return -EINVAL;
 
+	if (!size_to_hstate(size))
+		hugetlb_add_hstate(shift - PAGE_SHIFT);
 	return 0;
 }
 
diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index 728a5db66597..a6c414fa5b82 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -5,6 +5,9 @@
 #include <asm-generic/hugetlb.h>
 #include <asm/page.h>
 
+bool __init arch_hugetlb_valid_size(unsigned long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
 					 unsigned long len) {
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index a6189ed36c5f..da1f516bc451 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -12,21 +12,29 @@ int pmd_huge(pmd_t pmd)
 	return pmd_leaf(pmd);
 }
 
+bool __init arch_hugetlb_valid_size(unsigned long size)
+{
+	if (size == HPAGE_SIZE)
+		return true;
+	else if (IS_ENABLED(CONFIG_64BIT) && size == PUD_SIZE)
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
 	unsigned long ps = memparse(opt, &opt);
 
-	if (ps == HPAGE_SIZE) {
-		hugetlb_add_hstate(HPAGE_SHIFT - PAGE_SHIFT);
-	} else if (IS_ENABLED(CONFIG_64BIT) && ps == PUD_SIZE) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
-	} else {
-		hugetlb_bad_size();
-		pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20);
-		return 0;
+	if (arch_hugetlb_valid_size(ps)) {
+		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
+		return 1;
 	}
 
-	return 1;
+	hugetlb_bad_size();
+	pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20);
+	return 0;
+
 }
 __setup("hugepagesz=", setup_hugepagesz);
 
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index de8f0bf5f238..f49981c00e72 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -15,6 +15,9 @@
 #define hugetlb_free_pgd_range			free_pgd_range
 #define hugepages_supported()			(MACHINE_HAS_EDAT1)
 
+bool __init arch_hugetlb_valid_size(unsigned long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 pte_t huge_ptep_get(pte_t *ptep);
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index f01daddcbc5e..ac25b207624c 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -251,16 +251,24 @@ follow_huge_pud(struct mm_struct *mm, unsigned long address,
 	return pud_page(*pud) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
 }
 
+bool __init arch_hugetlb_valid_size(unsigned long size)
+{
+	if (MACHINE_HAS_EDAT1 && size == PMD_SIZE)
+		return true;
+	else if (MACHINE_HAS_EDAT2 && size == PUD_SIZE)
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
 	unsigned long size;
 	char *string = opt;
 
 	size = memparse(opt, &opt);
-	if (MACHINE_HAS_EDAT1 && size == PMD_SIZE) {
-		hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
-	} else if (MACHINE_HAS_EDAT2 && size == PUD_SIZE) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+	if (arch_hugetlb_valid_size(size)) {
+		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	} else {
 		hugetlb_bad_size();
 		pr_err("hugepagesz= specifies an unsupported page size %s\n",
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 3963f80d1cb3..a88668b455d3 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -10,6 +10,9 @@ struct pud_huge_patch_entry {
 	unsigned int insn;
 };
 extern struct pud_huge_patch_entry __pud_huge_patch, __pud_huge_patch_end;
+
+bool __init arch_hugetlb_valid_size(unsigned long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
 #endif
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 1cf0d666dea3..2bfe8e22b706 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -360,16 +360,11 @@ static void __init pud_huge_patch(void)
 	__asm__ __volatile__("flush %0" : : "r" (addr));
 }
 
-static int __init setup_hugepagesz(char *string)
+bool __init arch_hugetlb_valid_size(unsigned long size)
 {
-	unsigned long long hugepage_size;
-	unsigned int hugepage_shift;
+	unsigned int hugepage_shift = ilog2(size);
 	unsigned short hv_pgsz_idx;
 	unsigned int hv_pgsz_mask;
-	int rc = 0;
-
-	hugepage_size = memparse(string, &string);
-	hugepage_shift = ilog2(hugepage_size);
 
 	switch (hugepage_shift) {
 	case HPAGE_16GB_SHIFT:
@@ -397,7 +392,20 @@ static int __init setup_hugepagesz(char *string)
 		hv_pgsz_mask = 0;
 	}
 
-	if ((hv_pgsz_mask & cpu_pgsz_mask) == 0U) {
+	if ((hv_pgsz_mask & cpu_pgsz_mask) == 0U)
+		return false;
+
+	return true;
+}
+
+static int __init setup_hugepagesz(char *string)
+{
+	unsigned long long hugepage_size;
+	int rc = 0;
+
+	hugepage_size = memparse(string, &string);
+
+	if (!arch_hugetlb_valid_size((unsigned long)hugepage_size)) {
 		hugetlb_bad_size();
 		pr_err("hugepagesz=%llu not supported by MMU.\n",
 			hugepage_size);
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index f65cfb48cfdd..53ae4df41433 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -7,6 +7,11 @@
 
 #define hugepages_supported() boot_cpu_has(X86_FEATURE_PSE)
 
+#ifdef CONFIG_X86_64
+bool __init arch_hugetlb_valid_size(unsigned long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+#endif
+
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
 					 unsigned long len) {
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5bfd5aef5378..1c4372bfe782 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -181,13 +181,22 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 #endif /* CONFIG_HUGETLB_PAGE */
 
 #ifdef CONFIG_X86_64
+bool __init arch_hugetlb_valid_size(unsigned long size)
+{
+	if (size == PMD_SIZE)
+		return true;
+	else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES))
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
 	unsigned long ps = memparse(opt, &opt);
-	if (ps == PMD_SIZE) {
-		hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
-	} else if (ps == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+
+	if (arch_hugetlb_valid_size(ps)) {
+		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
 	} else {
 		hugetlb_bad_size();
 		printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n",
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2fc9bc91894e..962bb1e6682b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -688,6 +688,13 @@ static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 	return &mm->page_table_lock;
 }
 
+#ifndef arch_hugetlb_valid_size
+static inline bool arch_hugetlb_valid_size(unsigned long size)
+{
+	return size == HPAGE_SIZE;
+}
+#endif
+
 #ifndef hugepages_supported
 /*
  * Some platform decide whether they support huge pages at boot
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 38c3363b46e1..e9d9d179cf12 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3324,12 +3324,21 @@ static int __init hugetlb_nrpages_setup(char *s)
 }
 __setup("hugepages=", hugetlb_nrpages_setup);
 
-static int __init hugetlb_default_setup(char *s)
+static int __init default_hugepagesz_setup(char *s)
 {
-	default_hstate_size = memparse(s, &s);
+	unsigned long size;
+
+	size = (unsigned long)memparse(s, NULL);
+
+	if (!arch_hugetlb_valid_size(size)) {
+		pr_err("HugeTLB: unsupported default_hugepagesz %s\n", s);
+		return 0;
+	}
+
+	default_hstate_size = size;
 	return 1;
 }
-__setup("default_hugepagesz=", hugetlb_default_setup);
+__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int cpuset_mems_nr(unsigned int *array)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code
  2020-04-01 18:38 [PATCH v2 0/4] Clean up hugetlb boot command line processing Mike Kravetz
  2020-04-01 18:38 ` [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size Mike Kravetz
@ 2020-04-01 18:38 ` Mike Kravetz
  2020-04-10 19:26   ` Peter Xu
  2020-04-01 18:38 ` [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate Mike Kravetz
  2020-04-01 18:38 ` [PATCH v2 4/4] hugetlbfs: clean up command line processing Mike Kravetz
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2020-04-01 18:38 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton, Mike Kravetz

Now that architectures provide arch_hugetlb_valid_size(), parsing
of "hugepagesz=" can be done in architecture independent code.
Create a single routine to handle hugepagesz= parsing and remove
all arch specific routines.  We can also remove the interface
hugetlb_bad_size() as this is no longer used outside arch independent
code.

This also provides consistent behavior of hugetlbfs command line
options.  The hugepagesz= option should only be specified once for
a specific size, but some architectures allow multiple instances.
This appears to be more of an oversight when code was added by some
architectures to set up ALL huge pages sizes.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm64/mm/hugetlbpage.c   | 15 ---------------
 arch/powerpc/mm/hugetlbpage.c | 15 ---------------
 arch/riscv/mm/hugetlbpage.c   | 16 ----------------
 arch/s390/mm/hugetlbpage.c    | 18 ------------------
 arch/sparc/mm/init_64.c       | 22 ----------------------
 arch/x86/mm/hugetlbpage.c     | 16 ----------------
 include/linux/hugetlb.h       |  1 -
 mm/hugetlb.c                  | 23 +++++++++++++++++------
 8 files changed, 17 insertions(+), 109 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 069b96ee2aec..f706b821aba6 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -476,18 +476,3 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 
 	return false;
 }
-
-static __init int setup_hugepagesz(char *opt)
-{
-	unsigned long ps = memparse(opt, &opt);
-
-	if (arch_hugetlb_valid_size(ps)) {
-		add_huge_page_size(ps);
-		return 1;
-	}
-
-	hugetlb_bad_size();
-	pr_err("hugepagesz: Unsupported page size %lu K\n", ps >> 10);
-	return 0;
-}
-__setup("hugepagesz=", setup_hugepagesz);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index de54d2a37830..2c3fa0a7787b 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -589,21 +589,6 @@ static int __init add_huge_page_size(unsigned long long size)
 	return 0;
 }
 
-static int __init hugepage_setup_sz(char *str)
-{
-	unsigned long long size;
-
-	size = memparse(str, &str);
-
-	if (add_huge_page_size(size) != 0) {
-		hugetlb_bad_size();
-		pr_err("Invalid huge page size specified(%llu)\n", size);
-	}
-
-	return 1;
-}
-__setup("hugepagesz=", hugepage_setup_sz);
-
 static int __init hugetlbpage_init(void)
 {
 	bool configured = false;
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index da1f516bc451..4e5d7e9f0eef 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -22,22 +22,6 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 		return false;
 }
 
-static __init int setup_hugepagesz(char *opt)
-{
-	unsigned long ps = memparse(opt, &opt);
-
-	if (arch_hugetlb_valid_size(ps)) {
-		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
-		return 1;
-	}
-
-	hugetlb_bad_size();
-	pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20);
-	return 0;
-
-}
-__setup("hugepagesz=", setup_hugepagesz);
-
 #ifdef CONFIG_CONTIG_ALLOC
 static __init int gigantic_pages_init(void)
 {
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index ac25b207624c..242dfc0d462d 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -261,24 +261,6 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 		return false;
 }
 
-static __init int setup_hugepagesz(char *opt)
-{
-	unsigned long size;
-	char *string = opt;
-
-	size = memparse(opt, &opt);
-	if (arch_hugetlb_valid_size(size)) {
-		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
-	} else {
-		hugetlb_bad_size();
-		pr_err("hugepagesz= specifies an unsupported page size %s\n",
-			string);
-		return 0;
-	}
-	return 1;
-}
-__setup("hugepagesz=", setup_hugepagesz);
-
 static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
 		unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 2bfe8e22b706..4618f96fd30f 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -397,28 +397,6 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 
 	return true;
 }
-
-static int __init setup_hugepagesz(char *string)
-{
-	unsigned long long hugepage_size;
-	int rc = 0;
-
-	hugepage_size = memparse(string, &string);
-
-	if (!arch_hugetlb_valid_size((unsigned long)hugepage_size)) {
-		hugetlb_bad_size();
-		pr_err("hugepagesz=%llu not supported by MMU.\n",
-			hugepage_size);
-		goto out;
-	}
-
-	add_huge_page_size(hugepage_size);
-	rc = 1;
-
-out:
-	return rc;
-}
-__setup("hugepagesz=", setup_hugepagesz);
 #endif	/* CONFIG_HUGETLB_PAGE */
 
 void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 1c4372bfe782..937d640a89e3 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -191,22 +191,6 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 		return false;
 }
 
-static __init int setup_hugepagesz(char *opt)
-{
-	unsigned long ps = memparse(opt, &opt);
-
-	if (arch_hugetlb_valid_size(ps)) {
-		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
-	} else {
-		hugetlb_bad_size();
-		printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n",
-			ps >> 20);
-		return 0;
-	}
-	return 1;
-}
-__setup("hugepagesz=", setup_hugepagesz);
-
 #ifdef CONFIG_CONTIG_ALLOC
 static __init int gigantic_pages_init(void)
 {
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 962bb1e6682b..d40340b13198 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -519,7 +519,6 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
 int __init __alloc_bootmem_huge_page(struct hstate *h);
 int __init alloc_bootmem_huge_page(struct hstate *h);
 
-void __init hugetlb_bad_size(void);
 void __init hugetlb_add_hstate(unsigned order);
 struct hstate *size_to_hstate(unsigned long size);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e9d9d179cf12..02b13ee26790 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3249,12 +3249,6 @@ static int __init hugetlb_init(void)
 }
 subsys_initcall(hugetlb_init);
 
-/* Should be called on processing a hugepagesz=... option */
-void __init hugetlb_bad_size(void)
-{
-	parsed_valid_hugepagesz = false;
-}
-
 void __init hugetlb_add_hstate(unsigned int order)
 {
 	struct hstate *h;
@@ -3324,6 +3318,23 @@ static int __init hugetlb_nrpages_setup(char *s)
 }
 __setup("hugepages=", hugetlb_nrpages_setup);
 
+static int __init hugepagesz_setup(char *s)
+{
+	unsigned long size;
+
+	size = (unsigned long)memparse(s, NULL);
+
+	if (!arch_hugetlb_valid_size(size)) {
+		parsed_valid_hugepagesz = false;
+		pr_err("HugeTLB: unsupported hugepagesz %s\n", s);
+		return 0;
+	}
+
+	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
+	return 1;
+}
+__setup("hugepagesz=", hugepagesz_setup);
+
 static int __init default_hugepagesz_setup(char *s)
 {
 	unsigned long size;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
  2020-04-01 18:38 [PATCH v2 0/4] Clean up hugetlb boot command line processing Mike Kravetz
  2020-04-01 18:38 ` [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size Mike Kravetz
  2020-04-01 18:38 ` [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code Mike Kravetz
@ 2020-04-01 18:38 ` Mike Kravetz
  2020-04-10 19:34   ` Peter Xu
  2020-04-01 18:38 ` [PATCH v2 4/4] hugetlbfs: clean up command line processing Mike Kravetz
  3 siblings, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2020-04-01 18:38 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton, Mike Kravetz

The routine hugetlb_add_hstate prints a warning if the hstate already
exists.  This was originally done as part of kernel command line
parsing.  If 'hugepagesz=' was specified more than once, the warning
	pr_warn("hugepagesz= specified twice, ignoring\n");
would be printed.

Some architectures want to enable all huge page sizes.  They would
call hugetlb_add_hstate for all supported sizes.  However, this was
done after command line processing and as a result hstates could have
already been created for some sizes.  To make sure no warning were
printed, there would often be code like:
	if (!size_to_hstate(size)
		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT)

The only time we want to print the warning is as the result of command
line processing.  So, remove the warning from hugetlb_add_hstate and
add it to the single arch independent routine processing "hugepagesz=".
After this, calls to size_to_hstate() in arch specific code can be
removed and hugetlb_add_hstate can be called without worrying about
warning messages.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm64/mm/hugetlbpage.c   | 16 ++++------------
 arch/powerpc/mm/hugetlbpage.c |  3 +--
 arch/riscv/mm/hugetlbpage.c   |  2 +-
 arch/sparc/mm/init_64.c       | 19 ++++---------------
 arch/x86/mm/hugetlbpage.c     |  2 +-
 mm/hugetlb.c                  |  9 ++++++---
 6 files changed, 17 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index f706b821aba6..21fa98b51e00 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -441,22 +441,14 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
 	clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
 }
 
-static void __init add_huge_page_size(unsigned long size)
-{
-	if (size_to_hstate(size))
-		return;
-
-	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
-}
-
 static int __init hugetlbpage_init(void)
 {
 #ifdef CONFIG_ARM64_4K_PAGES
-	add_huge_page_size(PUD_SIZE);
+	hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
 #endif
-	add_huge_page_size(CONT_PMD_SIZE);
-	add_huge_page_size(PMD_SIZE);
-	add_huge_page_size(CONT_PTE_SIZE);
+	hugetlb_add_hstate(CONT_PMD_SHIFT - PAGE_SHIFT);
+	hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
+	hugetlb_add_hstate(CONT_PTE_SHIFT - PAGE_SHIFT);
 
 	return 0;
 }
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 2c3fa0a7787b..4d5ed1093615 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -584,8 +584,7 @@ static int __init add_huge_page_size(unsigned long long size)
 	if (!arch_hugetlb_valid_size((unsigned long)size))
 		return -EINVAL;
 
-	if (!size_to_hstate(size))
-		hugetlb_add_hstate(shift - PAGE_SHIFT);
+	hugetlb_add_hstate(shift - PAGE_SHIFT);
 	return 0;
 }
 
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 4e5d7e9f0eef..932dadfdca54 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -26,7 +26,7 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 static __init int gigantic_pages_init(void)
 {
 	/* With CONTIG_ALLOC, we can allocate gigantic pages at runtime */
-	if (IS_ENABLED(CONFIG_64BIT) && !size_to_hstate(1UL << PUD_SHIFT))
+	if (IS_ENABLED(CONFIG_64BIT))
 		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
 	return 0;
 }
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4618f96fd30f..ae819a16d07a 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -325,23 +325,12 @@ static void __update_mmu_tsb_insert(struct mm_struct *mm, unsigned long tsb_inde
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-static void __init add_huge_page_size(unsigned long size)
-{
-	unsigned int order;
-
-	if (size_to_hstate(size))
-		return;
-
-	order = ilog2(size) - PAGE_SHIFT;
-	hugetlb_add_hstate(order);
-}
-
 static int __init hugetlbpage_init(void)
 {
-	add_huge_page_size(1UL << HPAGE_64K_SHIFT);
-	add_huge_page_size(1UL << HPAGE_SHIFT);
-	add_huge_page_size(1UL << HPAGE_256MB_SHIFT);
-	add_huge_page_size(1UL << HPAGE_2GB_SHIFT);
+	hugetlb_add_hstate(HPAGE_64K_SHIFT - PAGE_SHIFT);
+	hugetlb_add_hstate(HPAGE_SHIFT - PAGE_SHIFT);
+	hugetlb_add_hstate(HPAGE_256MB_SHIFT - PAGE_SHIFT);
+	hugetlb_add_hstate(HPAGE_2GB_SHIFT - PAGE_SHIFT);
 
 	return 0;
 }
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 937d640a89e3..cf5781142716 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -195,7 +195,7 @@ bool __init arch_hugetlb_valid_size(unsigned long size)
 static __init int gigantic_pages_init(void)
 {
 	/* With compaction or CMA we can allocate gigantic pages at runtime */
-	if (boot_cpu_has(X86_FEATURE_GBPAGES) && !size_to_hstate(1UL << PUD_SHIFT))
+	if (boot_cpu_has(X86_FEATURE_GBPAGES))
 		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
 	return 0;
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 02b13ee26790..72a4343509d5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3216,8 +3216,7 @@ static int __init hugetlb_init(void)
 		}
 
 		default_hstate_size = HPAGE_SIZE;
-		if (!size_to_hstate(default_hstate_size))
-			hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
+		hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
 	}
 	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
 	if (default_hstate_max_huge_pages) {
@@ -3255,7 +3254,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	unsigned long i;
 
 	if (size_to_hstate(PAGE_SIZE << order)) {
-		pr_warn("hugepagesz= specified twice, ignoring\n");
 		return;
 	}
 	BUG_ON(hugetlb_max_hstate >= HUGE_MAX_HSTATE);
@@ -3330,6 +3328,11 @@ static int __init hugepagesz_setup(char *s)
 		return 0;
 	}
 
+	if (size_to_hstate(size)) {
+		pr_warn("HugeTLB: hugepagesz %s specified twice, ignoring\n", s);
+		return 0;
+	}
+
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	return 1;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/4] hugetlbfs: clean up command line processing
  2020-04-01 18:38 [PATCH v2 0/4] Clean up hugetlb boot command line processing Mike Kravetz
                   ` (2 preceding siblings ...)
  2020-04-01 18:38 ` [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate Mike Kravetz
@ 2020-04-01 18:38 ` Mike Kravetz
  2020-04-01 18:55   ` Randy Dunlap
  2020-04-10 20:37   ` Peter Xu
  3 siblings, 2 replies; 13+ messages in thread
From: Mike Kravetz @ 2020-04-01 18:38 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton, Mike Kravetz

With all hugetlb page processing done in a single file clean up code.
- Make code match desired semantics
  - Update documentation with semantics
- Make all warnings and errors messages start with 'HugeTLB:'.
- Consistently name command line parsing routines.
- Check for hugepages_supported() before processing parameters.
- Add comments to code
  - Describe some of the subtle interactions
  - Describe semantics of command line arguments

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 .../admin-guide/kernel-parameters.txt         | 35 ++++---
 Documentation/admin-guide/mm/hugetlbpage.rst  | 44 +++++++++
 mm/hugetlb.c                                  | 96 +++++++++++++++----
 3 files changed, 142 insertions(+), 33 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1bd5454b5e5f..de653cfe1726 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -832,12 +832,15 @@
 			See also Documentation/networking/decnet.txt.
 
 	default_hugepagesz=
-			[same as hugepagesz=] The size of the default
-			HugeTLB page size. This is the size represented by
-			the legacy /proc/ hugepages APIs, used for SHM, and
-			default size when mounting hugetlbfs filesystems.
-			Defaults to the default architecture's huge page size
-			if not specified.
+			[HW] The size of the default HugeTLB page size. This
+			is the size represented by the legacy /proc/ hugepages
+			APIs.  In addition, this is the default hugetlb size
+			used for shmget(), mmap() and mounting hugetlbfs
+			filesystems.  If not specified, defaults to the
+			architecture's default huge page size.  Huge page
+			sizes are architecture dependent.  See also
+			Documentation/admin-guide/mm/hugetlbpage.rst.
+			Format: size[KMG]
 
 	deferred_probe_timeout=
 			[KNL] Debugging option to set a timeout in seconds for
@@ -1480,13 +1483,19 @@
 			If enabled, boot-time allocation of gigantic hugepages
 			is skipped.
 
-	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
-	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
-			On x86-64 and powerpc, this option can be specified
-			multiple times interleaved with hugepages= to reserve
-			huge pages of different sizes. Valid pages sizes on
-			x86-64 are 2M (when the CPU supports "pse") and 1G
-			(when the CPU supports the "pdpe1gb" cpuinfo flag).
+	hugepages=	[HW] Number of HugeTLB pages to allocate at boot.
+			If this follows hugepagesz (below), it specifies
+			the number of pages of hugepagesz to be allocated.
+			Format: <integer>
+	hugepagesz=
+			[HW] The size of the HugeTLB pages.  This is used in
+			conjunction with hugepages (above) to allocate huge
+			pages of a specific size at boot.  The pair
+			hugepagesz=X hugepages=Y can be specified once for
+			each supported huge page size. Huge page sizes are
+			architecture dependent.  See also
+			Documentation/admin-guide/mm/hugetlbpage.rst.
+			Format: size[KMG]
 
 	hung_task_panic=
 			[KNL] Should the hung task detector generate panics.
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 1cc0bc78d10e..de340c586995 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -100,6 +100,50 @@ with a huge page size selection parameter "hugepagesz=<size>".  <size> must
 be specified in bytes with optional scale suffix [kKmMgG].  The default huge
 page size may be selected with the "default_hugepagesz=<size>" boot parameter.
 
+Hugetlb boot command line parameter semantics
+hugepagesz - Specify a huge page size.  Used in conjunction with hugepages
+	parameter to preallocate a number of huge pages of the specified
+	size.  Hence, hugepagesz and hugepages are typically specified in
+	pairs such as:
+		hugepagesz=2M hugepages=512
+	hugepagesz can only be specified once on the command line for a
+	specific huge page size.  Valid huge page sizes are architecture
+	dependent.
+hugepages - Specify the number of huge pages to preallocate.  This typically
+	follows a valid hugepagesz parameter.  However, if hugepages is the
+	first or only hugetlb command line parameter it specifies the number
+	of huge pages of default size to allocate.  The number of huge pages
+	of default size specified in this manner can be overwritten by a
+	hugepagesz,hugepages parameter pair for the default size.
+	For example, on an architecture with 2M default huge page size:
+		hugepages=256 hugepagesz=2M hugepages=512
+	will result in 512 2M huge pages being allocated.  If a hugepages
+	parameter is preceded by an invalid hugepagesz parameter, it will
+	be ignored.
+default_hugepagesz - Specify the default huge page size.  This parameter can
+	only be specified once on the command line.  No other hugetlb command
+	line parameter is associated with default_hugepagesz.  Therefore, it
+	can appear anywhere on the command line.  If hugepages= is the first
+	hugetlb command line parameter, the specified number of huge pages
+	will apply to the default huge page size specified with
+	default_hugepagesz.  For example,
+		hugepages=512 default_hugepagesz=2M
+	will result in 512 2M huge pages being allocated.  However, specifying
+	the number of default huge pages in this manner will not apply to
+	gigantic huge pages.  For example,
+		hugepages=10 default_hugepagesz=1G
+				or
+		default_hugepagesz=1G hugepages=10
+	will NOT result in the allocation of 10 1G huge pages.  In order to
+	preallocate gigantic huge pages, there must be hugepagesz, hugepages
+	parameter pair.  For example,
+		hugepagesz=1G hugepages=10 default_hugepagesz=1G
+				or
+		default_hugepagesz=1G hugepagesz=1G hugepages=10
+	will result 10 1G huge pages being allocated and the default huge
+	page size will be set to 1G.  Valid default huge page size is
+	architecture dependent.
+
 When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
 indicates the current number of pre-allocated huge pages of the default size.
 Thus, one can use the following command to dynamically allocate/deallocate
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 72a4343509d5..74ef53f7c5a7 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3054,7 +3054,7 @@ static void __init hugetlb_sysfs_init(void)
 		err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
 					 hstate_kobjs, &hstate_attr_group);
 		if (err)
-			pr_err("Hugetlb: Unable to add hstate %s", h->name);
+			pr_err("HugeTLB: Unable to add hstate %s", h->name);
 	}
 }
 
@@ -3158,7 +3158,7 @@ static void hugetlb_register_node(struct node *node)
 						nhs->hstate_kobjs,
 						&per_node_hstate_attr_group);
 		if (err) {
-			pr_err("Hugetlb: Unable to add hstate %s for node %d\n",
+			pr_err("HugeTLB: Unable to add hstate %s for node %d\n",
 				h->name, node->dev.id);
 			hugetlb_unregister_node(node);
 			break;
@@ -3209,19 +3209,35 @@ static int __init hugetlb_init(void)
 	if (!hugepages_supported())
 		return 0;
 
-	if (!size_to_hstate(default_hstate_size)) {
-		if (default_hstate_size != 0) {
-			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
-			       default_hstate_size, HPAGE_SIZE);
-		}
-
+	/*
+	 * Make sure HPAGE_SIZE (HUGETLB_PAGE_ORDER) hstate exists.  Some
+	 * architectures depend on setup being done here.
+	 *
+	 * If a valid default huge page size was specified on the command line,
+	 * add associated hstate if necessary.  If not, set default_hstate_size
+	 * to default size.  default_hstate_idx is used at runtime to identify
+	 * the default huge page size/hstate.
+	 */
+	hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
+	if (default_hstate_size)
+		hugetlb_add_hstate(ilog2(default_hstate_size) - PAGE_SHIFT);
+	else
 		default_hstate_size = HPAGE_SIZE;
-		hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
-	}
 	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
+
+	/*
+	 * default_hstate_max_huge_pages != 0 indicates a count (hugepages=)
+	 * specified before a size (hugepagesz=).  Use this count for the
+	 * default huge page size, unless a specific value was specified for
+	 * this size in a hugepagesz/hugepages pair.
+	 */
 	if (default_hstate_max_huge_pages) {
 		if (!default_hstate.max_huge_pages)
-			default_hstate.max_huge_pages = default_hstate_max_huge_pages;
+			default_hstate.max_huge_pages =
+				default_hstate_max_huge_pages;
+		else
+			pr_warn("HugeTLB: First hugepages=%lu ignored\n",
+				default_hstate_max_huge_pages);
 	}
 
 	hugetlb_init_hstates();
@@ -3274,20 +3290,31 @@ void __init hugetlb_add_hstate(unsigned int order)
 	parsed_hstate = h;
 }
 
-static int __init hugetlb_nrpages_setup(char *s)
+/*
+ * hugepages command line processing
+ * hugepages normally follows a valid hugepagsz specification.  If not, ignore
+ * the hugepages value.  hugepages can also be the first huge page command line
+ * option in which case it specifies the number of huge pages for the default
+ * size.
+ */
+static int __init hugepages_setup(char *s)
 {
 	unsigned long *mhp;
 	static unsigned long *last_mhp;
 
+	if (!hugepages_supported()) {
+		pr_warn("HugeTLB: huge pages not supported, ignoring hugepages = %s\n", s);
+		return 0;
+	}
+
 	if (!parsed_valid_hugepagesz) {
-		pr_warn("hugepages = %s preceded by "
-			"an unsupported hugepagesz, ignoring\n", s);
+		pr_warn("HugeTLB: hugepages = %s preceded by an unsupported hugepagesz, ignoring\n", s);
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return 0;
 	}
 	/*
-	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter yet,
-	 * so this hugepages= parameter goes to the "default hstate".
+	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter
+	 * yet, so this hugepages= parameter goes to the "default hstate".
 	 */
 	else if (!hugetlb_max_hstate)
 		mhp = &default_hstate_max_huge_pages;
@@ -3295,8 +3322,8 @@ static int __init hugetlb_nrpages_setup(char *s)
 		mhp = &parsed_hstate->max_huge_pages;
 
 	if (mhp == last_mhp) {
-		pr_warn("hugepages= specified twice without interleaving hugepagesz=, ignoring\n");
-		return 1;
+		pr_warn("HugeTLB: hugepages= specified twice without interleaving hugepagesz=, ignoring hugepages=%s\n", s);
+		return 0;
 	}
 
 	if (sscanf(s, "%lu", mhp) <= 0)
@@ -3314,12 +3341,24 @@ static int __init hugetlb_nrpages_setup(char *s)
 
 	return 1;
 }
-__setup("hugepages=", hugetlb_nrpages_setup);
+__setup("hugepages=", hugepages_setup);
 
+/*
+ * hugepagesz command line processing
+ * A specific huge page size can only be specified once with hugepagesz.
+ * hugepagesz is followed by hugepages on the command line.  The global
+ * variable 'parsed_valid_hugepagesz' is used to determine if prior
+ * hugepagesz argument was valid.
+ */
 static int __init hugepagesz_setup(char *s)
 {
 	unsigned long size;
 
+	if (!hugepages_supported()) {
+		pr_warn("HugeTLB: huge pages not supported, ignoring hugepagesz = %s\n", s);
+		return 0;
+	}
+
 	size = (unsigned long)memparse(s, NULL);
 
 	if (!arch_hugetlb_valid_size(size)) {
@@ -3329,19 +3368,31 @@ static int __init hugepagesz_setup(char *s)
 	}
 
 	if (size_to_hstate(size)) {
+		parsed_valid_hugepagesz = false;
 		pr_warn("HugeTLB: hugepagesz %s specified twice, ignoring\n", s);
 		return 0;
 	}
 
+	parsed_valid_hugepagesz = true;
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	return 1;
 }
 __setup("hugepagesz=", hugepagesz_setup);
 
+/*
+ * default_hugepagesz command line input
+ * Only one instance of default_hugepagesz allowed on command line.  Do not
+ * add hstate here as that will confuse hugepagesz/hugepages processing.
+ */
 static int __init default_hugepagesz_setup(char *s)
 {
 	unsigned long size;
 
+	if (!hugepages_supported()) {
+		pr_warn("HugeTLB: huge pages not supported, ignoring default_hugepagesz = %s\n", s);
+		return 0;
+	}
+
 	size = (unsigned long)memparse(s, NULL);
 
 	if (!arch_hugetlb_valid_size(size)) {
@@ -3349,6 +3400,11 @@ static int __init default_hugepagesz_setup(char *s)
 		return 0;
 	}
 
+	if (default_hstate_size) {
+		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
+		return 0;
+	}
+
 	default_hstate_size = size;
 	return 1;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] hugetlbfs: clean up command line processing
  2020-04-01 18:38 ` [PATCH v2 4/4] hugetlbfs: clean up command line processing Mike Kravetz
@ 2020-04-01 18:55   ` Randy Dunlap
  2020-04-10 20:37   ` Peter Xu
  1 sibling, 0 replies; 13+ messages in thread
From: Randy Dunlap @ 2020-04-01 18:55 UTC (permalink / raw)
  To: Mike Kravetz, linux-mm, linux-kernel, linux-arm-kernel,
	linuxppc-dev, linux-riscv, linux-s390, sparclinux, linux-doc
  Cc: Catalin Marinas, Will Deacon, Benjamin Herrenschmidt,
	Paul Mackerras, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Heiko Carstens, Vasily Gorbik, Christian Borntraeger,
	David S . Miller, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	Jonathan Corbet, Longpeng, Christophe Leroy, Mina Almasry,
	Andrew Morton

On 4/1/20 11:38 AM, Mike Kravetz wrote:
> With all hugetlb page processing done in a single file clean up code.
> - Make code match desired semantics
>   - Update documentation with semantics
> - Make all warnings and errors messages start with 'HugeTLB:'.
> - Consistently name command line parsing routines.
> - Check for hugepages_supported() before processing parameters.
> - Add comments to code
>   - Describe some of the subtle interactions
>   - Describe semantics of command line arguments
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---

Hi Mike,
One nit, please see below:

>  .../admin-guide/kernel-parameters.txt         | 35 ++++---
>  Documentation/admin-guide/mm/hugetlbpage.rst  | 44 +++++++++
>  mm/hugetlb.c                                  | 96 +++++++++++++++----
>  3 files changed, 142 insertions(+), 33 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 1bd5454b5e5f..de653cfe1726 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -832,12 +832,15 @@
>  			See also Documentation/networking/decnet.txt.
>  
>  	default_hugepagesz=
> -			[same as hugepagesz=] The size of the default
> -			HugeTLB page size. This is the size represented by
> -			the legacy /proc/ hugepages APIs, used for SHM, and
> -			default size when mounting hugetlbfs filesystems.
> -			Defaults to the default architecture's huge page size
> -			if not specified.
> +			[HW] The size of the default HugeTLB page size. This

			Drop one "size" above?

> +			is the size represented by the legacy /proc/ hugepages
> +			APIs.  In addition, this is the default hugetlb size
> +			used for shmget(), mmap() and mounting hugetlbfs
> +			filesystems.  If not specified, defaults to the
> +			architecture's default huge page size.  Huge page
> +			sizes are architecture dependent.  See also
> +			Documentation/admin-guide/mm/hugetlbpage.rst.
> +			Format: size[KMG]



-- 
~Randy


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size
  2020-04-01 18:38 ` [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size Mike Kravetz
@ 2020-04-10 19:16   ` Peter Xu
  2020-04-13 17:04     ` Mike Kravetz
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2020-04-10 19:16 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On Wed, Apr 01, 2020 at 11:38:16AM -0700, Mike Kravetz wrote:
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index 2eb6c234d594..81606223494f 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -59,6 +59,8 @@ extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>  extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
>  				 pte_t *ptep, pte_t pte, unsigned long sz);
>  #define set_huge_swap_pte_at set_huge_swap_pte_at
> +bool __init arch_hugetlb_valid_size(unsigned long size);
> +#define arch_hugetlb_valid_size arch_hugetlb_valid_size

Sorry for chimming in late.

Since we're working on removing arch-dependent codes after all.. I'm
thinking whether we can define arch_hugetlb_valid_size() once in the
common header (e.g. linux/hugetlb.h), then in mm/hugetlb.c:

bool __init __attribute((weak)) arch_hugetlb_valid_size(unsigned long size)
{
	return size == HPAGE_SIZE;
}

We can simply redefine arch_hugetlb_valid_size() in arch specific C
files where we want to override the default.  Would that be slightly
cleaner?

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code
  2020-04-01 18:38 ` [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code Mike Kravetz
@ 2020-04-10 19:26   ` Peter Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2020-04-10 19:26 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On Wed, Apr 01, 2020 at 11:38:17AM -0700, Mike Kravetz wrote:
> Now that architectures provide arch_hugetlb_valid_size(), parsing
> of "hugepagesz=" can be done in architecture independent code.
> Create a single routine to handle hugepagesz= parsing and remove
> all arch specific routines.  We can also remove the interface
> hugetlb_bad_size() as this is no longer used outside arch independent
> code.
> 
> This also provides consistent behavior of hugetlbfs command line
> options.  The hugepagesz= option should only be specified once for
> a specific size, but some architectures allow multiple instances.
> This appears to be more of an oversight when code was added by some
> architectures to set up ALL huge pages sizes.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

This could change the error messages for a wrong setup on archs, but I
guess it's not a big deal, assuming even to capture error people will
majorly still look for error lines in general..

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
  2020-04-01 18:38 ` [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate Mike Kravetz
@ 2020-04-10 19:34   ` Peter Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2020-04-10 19:34 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On Wed, Apr 01, 2020 at 11:38:18AM -0700, Mike Kravetz wrote:

[...]

> @@ -3255,7 +3254,6 @@ void __init hugetlb_add_hstate(unsigned int order)
>  	unsigned long i;
>  
>  	if (size_to_hstate(PAGE_SIZE << order)) {
> -		pr_warn("hugepagesz= specified twice, ignoring\n");
>  		return;
>  	}

Nitpick: I think the brackets need to be removed to follow linux
coding style.  With that:

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] hugetlbfs: clean up command line processing
  2020-04-01 18:38 ` [PATCH v2 4/4] hugetlbfs: clean up command line processing Mike Kravetz
  2020-04-01 18:55   ` Randy Dunlap
@ 2020-04-10 20:37   ` Peter Xu
  2020-04-13 17:59     ` Mike Kravetz
  1 sibling, 1 reply; 13+ messages in thread
From: Peter Xu @ 2020-04-10 20:37 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S . Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On Wed, Apr 01, 2020 at 11:38:19AM -0700, Mike Kravetz wrote:
> With all hugetlb page processing done in a single file clean up code.
> - Make code match desired semantics
>   - Update documentation with semantics
> - Make all warnings and errors messages start with 'HugeTLB:'.
> - Consistently name command line parsing routines.
> - Check for hugepages_supported() before processing parameters.
> - Add comments to code
>   - Describe some of the subtle interactions
>   - Describe semantics of command line arguments
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  .../admin-guide/kernel-parameters.txt         | 35 ++++---
>  Documentation/admin-guide/mm/hugetlbpage.rst  | 44 +++++++++
>  mm/hugetlb.c                                  | 96 +++++++++++++++----
>  3 files changed, 142 insertions(+), 33 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 1bd5454b5e5f..de653cfe1726 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -832,12 +832,15 @@
>  			See also Documentation/networking/decnet.txt.
>  
>  	default_hugepagesz=
> -			[same as hugepagesz=] The size of the default
> -			HugeTLB page size. This is the size represented by
> -			the legacy /proc/ hugepages APIs, used for SHM, and
> -			default size when mounting hugetlbfs filesystems.
> -			Defaults to the default architecture's huge page size
> -			if not specified.
> +			[HW] The size of the default HugeTLB page size. This

Could I ask what's "HW"?  Sorry this is not a comment at all but
really a pure question I wanted to ask... :)

> +			is the size represented by the legacy /proc/ hugepages
> +			APIs.  In addition, this is the default hugetlb size
> +			used for shmget(), mmap() and mounting hugetlbfs
> +			filesystems.  If not specified, defaults to the
> +			architecture's default huge page size.  Huge page
> +			sizes are architecture dependent.  See also
> +			Documentation/admin-guide/mm/hugetlbpage.rst.
> +			Format: size[KMG]
>  
>  	deferred_probe_timeout=
>  			[KNL] Debugging option to set a timeout in seconds for
> @@ -1480,13 +1483,19 @@
>  			If enabled, boot-time allocation of gigantic hugepages
>  			is skipped.
>  
> -	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
> -	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
> -			On x86-64 and powerpc, this option can be specified
> -			multiple times interleaved with hugepages= to reserve
> -			huge pages of different sizes. Valid pages sizes on
> -			x86-64 are 2M (when the CPU supports "pse") and 1G
> -			(when the CPU supports the "pdpe1gb" cpuinfo flag).
> +	hugepages=	[HW] Number of HugeTLB pages to allocate at boot.
> +			If this follows hugepagesz (below), it specifies
> +			the number of pages of hugepagesz to be allocated.

"... Otherwise it specifies the number of pages to allocate for the
default huge page size." ?

> +			Format: <integer>

How about add a new line here?

> +	hugepagesz=
> +			[HW] The size of the HugeTLB pages.  This is used in
> +			conjunction with hugepages (above) to allocate huge
> +			pages of a specific size at boot.  The pair
> +			hugepagesz=X hugepages=Y can be specified once for
> +			each supported huge page size. Huge page sizes are
> +			architecture dependent.  See also
> +			Documentation/admin-guide/mm/hugetlbpage.rst.
> +			Format: size[KMG]
>  
>  	hung_task_panic=
>  			[KNL] Should the hung task detector generate panics.
> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> index 1cc0bc78d10e..de340c586995 100644
> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> @@ -100,6 +100,50 @@ with a huge page size selection parameter "hugepagesz=<size>".  <size> must
>  be specified in bytes with optional scale suffix [kKmMgG].  The default huge
>  page size may be selected with the "default_hugepagesz=<size>" boot parameter.
>  
> +Hugetlb boot command line parameter semantics
> +hugepagesz - Specify a huge page size.  Used in conjunction with hugepages
> +	parameter to preallocate a number of huge pages of the specified
> +	size.  Hence, hugepagesz and hugepages are typically specified in
> +	pairs such as:
> +		hugepagesz=2M hugepages=512
> +	hugepagesz can only be specified once on the command line for a
> +	specific huge page size.  Valid huge page sizes are architecture
> +	dependent.
> +hugepages - Specify the number of huge pages to preallocate.  This typically
> +	follows a valid hugepagesz parameter.  However, if hugepages is the
> +	first or only hugetlb command line parameter it specifies the number
> +	of huge pages of default size to allocate.  The number of huge pages
> +	of default size specified in this manner can be overwritten by a
> +	hugepagesz,hugepages parameter pair for the default size.
> +	For example, on an architecture with 2M default huge page size:
> +		hugepages=256 hugepagesz=2M hugepages=512
> +	will result in 512 2M huge pages being allocated.  If a hugepages
> +	parameter is preceded by an invalid hugepagesz parameter, it will
> +	be ignored.
> +default_hugepagesz - Specify the default huge page size.  This parameter can
> +	only be specified once on the command line.  No other hugetlb command
> +	line parameter is associated with default_hugepagesz.  Therefore, it
> +	can appear anywhere on the command line.  If hugepages= is the first
> +	hugetlb command line parameter, the specified number of huge pages
> +	will apply to the default huge page size specified with
> +	default_hugepagesz.  For example,
> +		hugepages=512 default_hugepagesz=2M

No strong opinion, but considering to the special case of gigantic
huge page mentioned below, I'm thinking maybe it's easier to just ask
the user to always use "hugepagesz=X hugepages=Y" pair when people
want to reserve huge pages.

For example, some user might start to use this after this series
legally:

    default_hugepagesz=2M hugepages=1024

Then the user thinks, hmm, maybe it's good to use 1G pages, by just
changing some numbers:

    default_hugepagesz=1G hugepages=2

Then if it stops working it could really confuse the user.

(Besides, it could be an extra maintainaince burden for linux itself)

> +	will result in 512 2M huge pages being allocated.  However, specifying
> +	the number of default huge pages in this manner will not apply to
> +	gigantic huge pages.  For example,
> +		hugepages=10 default_hugepagesz=1G
> +				or
> +		default_hugepagesz=1G hugepages=10
> +	will NOT result in the allocation of 10 1G huge pages.  In order to
> +	preallocate gigantic huge pages, there must be hugepagesz, hugepages
> +	parameter pair.  For example,
> +		hugepagesz=1G hugepages=10 default_hugepagesz=1G
> +				or
> +		default_hugepagesz=1G hugepagesz=1G hugepages=10
> +	will result 10 1G huge pages being allocated and the default huge
> +	page size will be set to 1G.  Valid default huge page size is
> +	architecture dependent.
> +
>  When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
>  indicates the current number of pre-allocated huge pages of the default size.
>  Thus, one can use the following command to dynamically allocate/deallocate
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 72a4343509d5..74ef53f7c5a7 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3054,7 +3054,7 @@ static void __init hugetlb_sysfs_init(void)
>  		err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
>  					 hstate_kobjs, &hstate_attr_group);
>  		if (err)
> -			pr_err("Hugetlb: Unable to add hstate %s", h->name);
> +			pr_err("HugeTLB: Unable to add hstate %s", h->name);
>  	}
>  }
>  
> @@ -3158,7 +3158,7 @@ static void hugetlb_register_node(struct node *node)
>  						nhs->hstate_kobjs,
>  						&per_node_hstate_attr_group);
>  		if (err) {
> -			pr_err("Hugetlb: Unable to add hstate %s for node %d\n",
> +			pr_err("HugeTLB: Unable to add hstate %s for node %d\n",
>  				h->name, node->dev.id);
>  			hugetlb_unregister_node(node);
>  			break;
> @@ -3209,19 +3209,35 @@ static int __init hugetlb_init(void)
>  	if (!hugepages_supported())
>  		return 0;
>  
> -	if (!size_to_hstate(default_hstate_size)) {
> -		if (default_hstate_size != 0) {
> -			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
> -			       default_hstate_size, HPAGE_SIZE);
> -		}
> -
> +	/*
> +	 * Make sure HPAGE_SIZE (HUGETLB_PAGE_ORDER) hstate exists.  Some
> +	 * architectures depend on setup being done here.
> +	 *
> +	 * If a valid default huge page size was specified on the command line,
> +	 * add associated hstate if necessary.  If not, set default_hstate_size
> +	 * to default size.  default_hstate_idx is used at runtime to identify
> +	 * the default huge page size/hstate.
> +	 */
> +	hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> +	if (default_hstate_size)
> +		hugetlb_add_hstate(ilog2(default_hstate_size) - PAGE_SHIFT);
> +	else
>  		default_hstate_size = HPAGE_SIZE;
> -		hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> -	}
>  	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
> +
> +	/*
> +	 * default_hstate_max_huge_pages != 0 indicates a count (hugepages=)
> +	 * specified before a size (hugepagesz=).  Use this count for the
> +	 * default huge page size, unless a specific value was specified for
> +	 * this size in a hugepagesz/hugepages pair.
> +	 */
>  	if (default_hstate_max_huge_pages) {

Since we're refactoring this - Could default_hstate_max_huge_pages be
dropped directly (in hugepages= we can create the default hstate, then
we set max_huge_pages of the default hstate there)?  Or did I miss
anything important?

>  		if (!default_hstate.max_huge_pages)
> -			default_hstate.max_huge_pages = default_hstate_max_huge_pages;
> +			default_hstate.max_huge_pages =
> +				default_hstate_max_huge_pages;
> +		else
> +			pr_warn("HugeTLB: First hugepages=%lu ignored\n",
> +				default_hstate_max_huge_pages);
>  	}
>  
>  	hugetlb_init_hstates();
> @@ -3274,20 +3290,31 @@ void __init hugetlb_add_hstate(unsigned int order)
>  	parsed_hstate = h;
>  }
>  
> -static int __init hugetlb_nrpages_setup(char *s)
> +/*
> + * hugepages command line processing
> + * hugepages normally follows a valid hugepagsz specification.  If not, ignore
> + * the hugepages value.  hugepages can also be the first huge page command line
> + * option in which case it specifies the number of huge pages for the default
> + * size.
> + */
> +static int __init hugepages_setup(char *s)
>  {
>  	unsigned long *mhp;
>  	static unsigned long *last_mhp;
>  
> +	if (!hugepages_supported()) {
> +		pr_warn("HugeTLB: huge pages not supported, ignoring hugepages = %s\n", s);
> +		return 0;
> +	}
> +
>  	if (!parsed_valid_hugepagesz) {
> -		pr_warn("hugepages = %s preceded by "
> -			"an unsupported hugepagesz, ignoring\n", s);
> +		pr_warn("HugeTLB: hugepages = %s preceded by an unsupported hugepagesz, ignoring\n", s);

s/preceded/is preceded/?

>  		parsed_valid_hugepagesz = true;
> -		return 1;
> +		return 0;
>  	}
>  	/*
> -	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter yet,
> -	 * so this hugepages= parameter goes to the "default hstate".
> +	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter
> +	 * yet, so this hugepages= parameter goes to the "default hstate".
>  	 */
>  	else if (!hugetlb_max_hstate)
>  		mhp = &default_hstate_max_huge_pages;
> @@ -3295,8 +3322,8 @@ static int __init hugetlb_nrpages_setup(char *s)
>  		mhp = &parsed_hstate->max_huge_pages;
>  
>  	if (mhp == last_mhp) {
> -		pr_warn("hugepages= specified twice without interleaving hugepagesz=, ignoring\n");
> -		return 1;
> +		pr_warn("HugeTLB: hugepages= specified twice without interleaving hugepagesz=, ignoring hugepages=%s\n", s);
> +		return 0;
>  	}
>  
>  	if (sscanf(s, "%lu", mhp) <= 0)
> @@ -3314,12 +3341,24 @@ static int __init hugetlb_nrpages_setup(char *s)
>  
>  	return 1;
>  }
> -__setup("hugepages=", hugetlb_nrpages_setup);
> +__setup("hugepages=", hugepages_setup);
>  
> +/*
> + * hugepagesz command line processing
> + * A specific huge page size can only be specified once with hugepagesz.
> + * hugepagesz is followed by hugepages on the command line.  The global
> + * variable 'parsed_valid_hugepagesz' is used to determine if prior
> + * hugepagesz argument was valid.
> + */
>  static int __init hugepagesz_setup(char *s)
>  {
>  	unsigned long size;
>  
> +	if (!hugepages_supported()) {
> +		pr_warn("HugeTLB: huge pages not supported, ignoring hugepagesz = %s\n", s);
> +		return 0;
> +	}
> +
>  	size = (unsigned long)memparse(s, NULL);
>  
>  	if (!arch_hugetlb_valid_size(size)) {
> @@ -3329,19 +3368,31 @@ static int __init hugepagesz_setup(char *s)
>  	}
>  
>  	if (size_to_hstate(size)) {
> +		parsed_valid_hugepagesz = false;
>  		pr_warn("HugeTLB: hugepagesz %s specified twice, ignoring\n", s);
>  		return 0;
>  	}
>  
> +	parsed_valid_hugepagesz = true;
>  	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
>  	return 1;
>  }
>  __setup("hugepagesz=", hugepagesz_setup);
>  
> +/*
> + * default_hugepagesz command line input
> + * Only one instance of default_hugepagesz allowed on command line.  Do not
> + * add hstate here as that will confuse hugepagesz/hugepages processing.
> + */
>  static int __init default_hugepagesz_setup(char *s)
>  {
>  	unsigned long size;
>  
> +	if (!hugepages_supported()) {
> +		pr_warn("HugeTLB: huge pages not supported, ignoring default_hugepagesz = %s\n", s);
> +		return 0;
> +	}
> +
>  	size = (unsigned long)memparse(s, NULL);
>  
>  	if (!arch_hugetlb_valid_size(size)) {
> @@ -3349,6 +3400,11 @@ static int __init default_hugepagesz_setup(char *s)
>  		return 0;
>  	}
>  
> +	if (default_hstate_size) {
> +		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
> +		return 0;
> +	}

Nitpick: ideally this can be moved before memparse().

Thanks,

> +
>  	default_hstate_size = size;
>  	return 1;
>  }
> -- 
> 2.25.1
> 
> 

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size
  2020-04-10 19:16   ` Peter Xu
@ 2020-04-13 17:04     ` Mike Kravetz
  0 siblings, 0 replies; 13+ messages in thread
From: Mike Kravetz @ 2020-04-13 17:04 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S.Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On 4/10/20 12:16 PM, Peter Xu wrote:
> On Wed, Apr 01, 2020 at 11:38:16AM -0700, Mike Kravetz wrote:
>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>> index 2eb6c234d594..81606223494f 100644
>> --- a/arch/arm64/include/asm/hugetlb.h
>> +++ b/arch/arm64/include/asm/hugetlb.h
>> @@ -59,6 +59,8 @@ extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>>  extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
>>  				 pte_t *ptep, pte_t pte, unsigned long sz);
>>  #define set_huge_swap_pte_at set_huge_swap_pte_at
>> +bool __init arch_hugetlb_valid_size(unsigned long size);
>> +#define arch_hugetlb_valid_size arch_hugetlb_valid_size
> 
> Sorry for chimming in late.

Thank you for taking a look!

> Since we're working on removing arch-dependent codes after all.. I'm
> thinking whether we can define arch_hugetlb_valid_size() once in the
> common header (e.g. linux/hugetlb.h), then in mm/hugetlb.c:
> 
> bool __init __attribute((weak)) arch_hugetlb_valid_size(unsigned long size)
> {
> 	return size == HPAGE_SIZE;
> }
> 
> We can simply redefine arch_hugetlb_valid_size() in arch specific C
> files where we want to override the default.  Would that be slightly
> cleaner?

I think both the #define X X and weak attribute methods are acceptable.
I went with the #define method only because it was most familiar to me.
Using the weak attribute method does appear to be cleaner.  I'll code it up.

Anyone else have a preference?
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] hugetlbfs: clean up command line processing
  2020-04-10 20:37   ` Peter Xu
@ 2020-04-13 17:59     ` Mike Kravetz
  2020-04-14 15:27       ` Peter Xu
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2020-04-13 17:59 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S.Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On 4/10/20 1:37 PM, Peter Xu wrote:
> On Wed, Apr 01, 2020 at 11:38:19AM -0700, Mike Kravetz wrote:
>> With all hugetlb page processing done in a single file clean up code.
>> - Make code match desired semantics
>>   - Update documentation with semantics
>> - Make all warnings and errors messages start with 'HugeTLB:'.
>> - Consistently name command line parsing routines.
>> - Check for hugepages_supported() before processing parameters.
>> - Add comments to code
>>   - Describe some of the subtle interactions
>>   - Describe semantics of command line arguments
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>> ---
>>  .../admin-guide/kernel-parameters.txt         | 35 ++++---
>>  Documentation/admin-guide/mm/hugetlbpage.rst  | 44 +++++++++
>>  mm/hugetlb.c                                  | 96 +++++++++++++++----
>>  3 files changed, 142 insertions(+), 33 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 1bd5454b5e5f..de653cfe1726 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -832,12 +832,15 @@
>>  			See also Documentation/networking/decnet.txt.
>>  
>>  	default_hugepagesz=
>> -			[same as hugepagesz=] The size of the default
>> -			HugeTLB page size. This is the size represented by
>> -			the legacy /proc/ hugepages APIs, used for SHM, and
>> -			default size when mounting hugetlbfs filesystems.
>> -			Defaults to the default architecture's huge page size
>> -			if not specified.
>> +			[HW] The size of the default HugeTLB page size. This
> 
> Could I ask what's "HW"?  Sorry this is not a comment at all but
> really a pure question I wanted to ask... :)

kernel-parameters.rst includes kernel-parameters.txt and included the meaning
for these codes.

       HW      Appropriate hardware is enabled.

Previously, it listed an obsolete list of architectures.

>> +			is the size represented by the legacy /proc/ hugepages
>> +			APIs.  In addition, this is the default hugetlb size
>> +			used for shmget(), mmap() and mounting hugetlbfs
>> +			filesystems.  If not specified, defaults to the
>> +			architecture's default huge page size.  Huge page
>> +			sizes are architecture dependent.  See also
>> +			Documentation/admin-guide/mm/hugetlbpage.rst.
>> +			Format: size[KMG]
>>  
>>  	deferred_probe_timeout=
>>  			[KNL] Debugging option to set a timeout in seconds for
>> @@ -1480,13 +1483,19 @@
>>  			If enabled, boot-time allocation of gigantic hugepages
>>  			is skipped.
>>  
>> -	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
>> -	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
>> -			On x86-64 and powerpc, this option can be specified
>> -			multiple times interleaved with hugepages= to reserve
>> -			huge pages of different sizes. Valid pages sizes on
>> -			x86-64 are 2M (when the CPU supports "pse") and 1G
>> -			(when the CPU supports the "pdpe1gb" cpuinfo flag).
>> +	hugepages=	[HW] Number of HugeTLB pages to allocate at boot.
>> +			If this follows hugepagesz (below), it specifies
>> +			the number of pages of hugepagesz to be allocated.
> 
> "... Otherwise it specifies the number of pages to allocate for the
> default huge page size." ?

Yes, best to be specific.  I suspect this is the most common way this
parameter is used.

> 
>> +			Format: <integer>
> 
> How about add a new line here?

Sure

>> +	hugepagesz=
>> +			[HW] The size of the HugeTLB pages.  This is used in
>> +			conjunction with hugepages (above) to allocate huge
>> +			pages of a specific size at boot.  The pair
>> +			hugepagesz=X hugepages=Y can be specified once for
>> +			each supported huge page size. Huge page sizes are
>> +			architecture dependent.  See also
>> +			Documentation/admin-guide/mm/hugetlbpage.rst.
>> +			Format: size[KMG]
>>  
>>  	hung_task_panic=
>>  			[KNL] Should the hung task detector generate panics.
>> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
>> index 1cc0bc78d10e..de340c586995 100644
>> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
>> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
>> @@ -100,6 +100,50 @@ with a huge page size selection parameter "hugepagesz=<size>".  <size> must
>>  be specified in bytes with optional scale suffix [kKmMgG].  The default huge
>>  page size may be selected with the "default_hugepagesz=<size>" boot parameter.
>>  
>> +Hugetlb boot command line parameter semantics
>> +hugepagesz - Specify a huge page size.  Used in conjunction with hugepages
>> +	parameter to preallocate a number of huge pages of the specified
>> +	size.  Hence, hugepagesz and hugepages are typically specified in
>> +	pairs such as:
>> +		hugepagesz=2M hugepages=512
>> +	hugepagesz can only be specified once on the command line for a
>> +	specific huge page size.  Valid huge page sizes are architecture
>> +	dependent.
>> +hugepages - Specify the number of huge pages to preallocate.  This typically
>> +	follows a valid hugepagesz parameter.  However, if hugepages is the
>> +	first or only hugetlb command line parameter it specifies the number
>> +	of huge pages of default size to allocate.  The number of huge pages
>> +	of default size specified in this manner can be overwritten by a
>> +	hugepagesz,hugepages parameter pair for the default size.
>> +	For example, on an architecture with 2M default huge page size:
>> +		hugepages=256 hugepagesz=2M hugepages=512
>> +	will result in 512 2M huge pages being allocated.  If a hugepages
>> +	parameter is preceded by an invalid hugepagesz parameter, it will
>> +	be ignored.
>> +default_hugepagesz - Specify the default huge page size.  This parameter can
>> +	only be specified once on the command line.  No other hugetlb command
>> +	line parameter is associated with default_hugepagesz.  Therefore, it
>> +	can appear anywhere on the command line.  If hugepages= is the first
>> +	hugetlb command line parameter, the specified number of huge pages
>> +	will apply to the default huge page size specified with
>> +	default_hugepagesz.  For example,
>> +		hugepages=512 default_hugepagesz=2M
> 
> No strong opinion, but considering to the special case of gigantic
> huge page mentioned below, I'm thinking maybe it's easier to just ask
> the user to always use "hugepagesz=X hugepages=Y" pair when people
> want to reserve huge pages.

We can ask people to do this.  However, I do not think we can force it at
this time.  Why?  Mostly because I have seen many instances where people
only specify 'hugepages=X' on the command line to preallocate X huge pages
of default size.  So, forcing 'hugepagesz=X hugepages=Y' would break those
users.

> For example, some user might start to use this after this series
> legally:
> 
>     default_hugepagesz=2M hugepages=1024

Well, that 'works' today.  You get that silly error message:

HugeTLB: unsupported default_hugepagesz 2097152. Reverting to 2097152

But, it does preallocate 1024 huge pages of size 2M.  Because people
have noticed the silly error message, I suspect this usage,

	default_hugepagesz=X hugepages=Y

is in use today and we need to support it.

> Then the user thinks, hmm, maybe it's good to use 1G pages, by just
> changing some numbers:
> 
>     default_hugepagesz=1G hugepages=2
> 
> Then if it stops working it could really confuse the user.
> 
> (Besides, it could be an extra maintainaince burden for linux itself)

I did not think about/look into the different behavior for gigantic pages
until updating the documentation.  This is not 'new' behavior introduced
by this patch series.  It comes about because we do not definitively set
the default huge page size until after command line processing
(in hugetlb_init).  And, we must preallocate gigantic huge pages during
command line processing because that is when the bootmem allocater is
available.

I really did not want to change the code to handle this (gigantic page)
situation.  There is no indication it is a problem today.  However, the
more I think about it the more I think the code should be changed.  I'll
work on code modifications to support this.

>> +	will result in 512 2M huge pages being allocated.  However, specifying
>> +	the number of default huge pages in this manner will not apply to
>> +	gigantic huge pages.  For example,
>> +		hugepages=10 default_hugepagesz=1G
>> +				or
>> +		default_hugepagesz=1G hugepages=10
>> +	will NOT result in the allocation of 10 1G huge pages.  In order to
>> +	preallocate gigantic huge pages, there must be hugepagesz, hugepages
>> +	parameter pair.  For example,
>> +		hugepagesz=1G hugepages=10 default_hugepagesz=1G
>> +				or
>> +		default_hugepagesz=1G hugepagesz=1G hugepages=10
>> +	will result 10 1G huge pages being allocated and the default huge
>> +	page size will be set to 1G.  Valid default huge page size is
>> +	architecture dependent.
>> +
>>  When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
>>  indicates the current number of pre-allocated huge pages of the default size.
>>  Thus, one can use the following command to dynamically allocate/deallocate
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 72a4343509d5..74ef53f7c5a7 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3054,7 +3054,7 @@ static void __init hugetlb_sysfs_init(void)
>>  		err = hugetlb_sysfs_add_hstate(h, hugepages_kobj,
>>  					 hstate_kobjs, &hstate_attr_group);
>>  		if (err)
>> -			pr_err("Hugetlb: Unable to add hstate %s", h->name);
>> +			pr_err("HugeTLB: Unable to add hstate %s", h->name);
>>  	}
>>  }
>>  
>> @@ -3158,7 +3158,7 @@ static void hugetlb_register_node(struct node *node)
>>  						nhs->hstate_kobjs,
>>  						&per_node_hstate_attr_group);
>>  		if (err) {
>> -			pr_err("Hugetlb: Unable to add hstate %s for node %d\n",
>> +			pr_err("HugeTLB: Unable to add hstate %s for node %d\n",
>>  				h->name, node->dev.id);
>>  			hugetlb_unregister_node(node);
>>  			break;
>> @@ -3209,19 +3209,35 @@ static int __init hugetlb_init(void)
>>  	if (!hugepages_supported())
>>  		return 0;
>>  
>> -	if (!size_to_hstate(default_hstate_size)) {
>> -		if (default_hstate_size != 0) {
>> -			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
>> -			       default_hstate_size, HPAGE_SIZE);
>> -		}
>> -
>> +	/*
>> +	 * Make sure HPAGE_SIZE (HUGETLB_PAGE_ORDER) hstate exists.  Some
>> +	 * architectures depend on setup being done here.
>> +	 *
>> +	 * If a valid default huge page size was specified on the command line,
>> +	 * add associated hstate if necessary.  If not, set default_hstate_size
>> +	 * to default size.  default_hstate_idx is used at runtime to identify
>> +	 * the default huge page size/hstate.
>> +	 */
>> +	hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
>> +	if (default_hstate_size)
>> +		hugetlb_add_hstate(ilog2(default_hstate_size) - PAGE_SHIFT);
>> +	else
>>  		default_hstate_size = HPAGE_SIZE;
>> -		hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
>> -	}
>>  	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
>> +
>> +	/*
>> +	 * default_hstate_max_huge_pages != 0 indicates a count (hugepages=)
>> +	 * specified before a size (hugepagesz=).  Use this count for the
>> +	 * default huge page size, unless a specific value was specified for
>> +	 * this size in a hugepagesz/hugepages pair.
>> +	 */
>>  	if (default_hstate_max_huge_pages) {
> 
> Since we're refactoring this - Could default_hstate_max_huge_pages be
> dropped directly (in hugepages= we can create the default hstate, then
> we set max_huge_pages of the default hstate there)?  Or did I miss
> anything important?

I do not think that works for 'hugepages=X default_hugepagesz=Y' processing?
It seems like there will need to be more work done on default_hugepagesz
processing.

>>  		if (!default_hstate.max_huge_pages)
>> -			default_hstate.max_huge_pages = default_hstate_max_huge_pages;
>> +			default_hstate.max_huge_pages =
>> +				default_hstate_max_huge_pages;
>> +		else
>> +			pr_warn("HugeTLB: First hugepages=%lu ignored\n",
>> +				default_hstate_max_huge_pages);
>>  	}
>>  
>>  	hugetlb_init_hstates();
>> @@ -3274,20 +3290,31 @@ void __init hugetlb_add_hstate(unsigned int order)
>>  	parsed_hstate = h;
>>  }
>>  
>> -static int __init hugetlb_nrpages_setup(char *s)
>> +/*
>> + * hugepages command line processing
>> + * hugepages normally follows a valid hugepagsz specification.  If not, ignore
>> + * the hugepages value.  hugepages can also be the first huge page command line
>> + * option in which case it specifies the number of huge pages for the default
>> + * size.
>> + */
>> +static int __init hugepages_setup(char *s)
>>  {
>>  	unsigned long *mhp;
>>  	static unsigned long *last_mhp;
>>  
>> +	if (!hugepages_supported()) {
>> +		pr_warn("HugeTLB: huge pages not supported, ignoring hugepages = %s\n", s);
>> +		return 0;
>> +	}
>> +
>>  	if (!parsed_valid_hugepagesz) {
>> -		pr_warn("hugepages = %s preceded by "
>> -			"an unsupported hugepagesz, ignoring\n", s);
>> +		pr_warn("HugeTLB: hugepages = %s preceded by an unsupported hugepagesz, ignoring\n", s);
> 
> s/preceded/is preceded/?

Thanks

> 
>>  		parsed_valid_hugepagesz = true;
>> -		return 1;
>> +		return 0;
>>  	}
>>  	/*
>> -	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter yet,
>> -	 * so this hugepages= parameter goes to the "default hstate".
>> +	 * !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter
>> +	 * yet, so this hugepages= parameter goes to the "default hstate".
>>  	 */
>>  	else if (!hugetlb_max_hstate)
>>  		mhp = &default_hstate_max_huge_pages;
>> @@ -3295,8 +3322,8 @@ static int __init hugetlb_nrpages_setup(char *s)
>>  		mhp = &parsed_hstate->max_huge_pages;
>>  
>>  	if (mhp == last_mhp) {
>> -		pr_warn("hugepages= specified twice without interleaving hugepagesz=, ignoring\n");
>> -		return 1;
>> +		pr_warn("HugeTLB: hugepages= specified twice without interleaving hugepagesz=, ignoring hugepages=%s\n", s);
>> +		return 0;
>>  	}
>>  
>>  	if (sscanf(s, "%lu", mhp) <= 0)
>> @@ -3314,12 +3341,24 @@ static int __init hugetlb_nrpages_setup(char *s)
>>  
>>  	return 1;
>>  }
>> -__setup("hugepages=", hugetlb_nrpages_setup);
>> +__setup("hugepages=", hugepages_setup);
>>  
>> +/*
>> + * hugepagesz command line processing
>> + * A specific huge page size can only be specified once with hugepagesz.
>> + * hugepagesz is followed by hugepages on the command line.  The global
>> + * variable 'parsed_valid_hugepagesz' is used to determine if prior
>> + * hugepagesz argument was valid.
>> + */
>>  static int __init hugepagesz_setup(char *s)
>>  {
>>  	unsigned long size;
>>  
>> +	if (!hugepages_supported()) {
>> +		pr_warn("HugeTLB: huge pages not supported, ignoring hugepagesz = %s\n", s);
>> +		return 0;
>> +	}
>> +
>>  	size = (unsigned long)memparse(s, NULL);
>>  
>>  	if (!arch_hugetlb_valid_size(size)) {
>> @@ -3329,19 +3368,31 @@ static int __init hugepagesz_setup(char *s)
>>  	}
>>  
>>  	if (size_to_hstate(size)) {
>> +		parsed_valid_hugepagesz = false;
>>  		pr_warn("HugeTLB: hugepagesz %s specified twice, ignoring\n", s);
>>  		return 0;
>>  	}
>>  
>> +	parsed_valid_hugepagesz = true;
>>  	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
>>  	return 1;
>>  }
>>  __setup("hugepagesz=", hugepagesz_setup);
>>  
>> +/*
>> + * default_hugepagesz command line input
>> + * Only one instance of default_hugepagesz allowed on command line.  Do not
>> + * add hstate here as that will confuse hugepagesz/hugepages processing.
>> + */
>>  static int __init default_hugepagesz_setup(char *s)
>>  {
>>  	unsigned long size;
>>  
>> +	if (!hugepages_supported()) {
>> +		pr_warn("HugeTLB: huge pages not supported, ignoring default_hugepagesz = %s\n", s);
>> +		return 0;
>> +	}
>> +
>>  	size = (unsigned long)memparse(s, NULL);
>>  
>>  	if (!arch_hugetlb_valid_size(size)) {
>> @@ -3349,6 +3400,11 @@ static int __init default_hugepagesz_setup(char *s)
>>  		return 0;
>>  	}
>>  
>> +	if (default_hstate_size) {
>> +		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
>> +		return 0;
>> +	}
> 
> Nitpick: ideally this can be moved before memparse().
> 
> Thanks,

Thanks you.

Unless someone thinks otherwise, I believe more work is needed to handle
preallocation of gigantic huge page sizes in instances like:

hugepages=128 default_hugepagesz=1G
or
default_hugepagesz=1G hugepages=128

I'll work on making such changes.
-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/4] hugetlbfs: clean up command line processing
  2020-04-13 17:59     ` Mike Kravetz
@ 2020-04-14 15:27       ` Peter Xu
  0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2020-04-14 15:27 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, linux-arm-kernel, linuxppc-dev,
	linux-riscv, linux-s390, sparclinux, linux-doc, Catalin Marinas,
	Will Deacon, Benjamin Herrenschmidt, Paul Mackerras,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, Heiko Carstens,
	Vasily Gorbik, Christian Borntraeger, David S.Miller,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, Jonathan Corbet,
	Longpeng, Christophe Leroy, Mina Almasry, Andrew Morton

On Mon, Apr 13, 2020 at 10:59:26AM -0700, Mike Kravetz wrote:
> On 4/10/20 1:37 PM, Peter Xu wrote:
> > On Wed, Apr 01, 2020 at 11:38:19AM -0700, Mike Kravetz wrote:
> >> With all hugetlb page processing done in a single file clean up code.
> >> - Make code match desired semantics
> >>   - Update documentation with semantics
> >> - Make all warnings and errors messages start with 'HugeTLB:'.
> >> - Consistently name command line parsing routines.
> >> - Check for hugepages_supported() before processing parameters.
> >> - Add comments to code
> >>   - Describe some of the subtle interactions
> >>   - Describe semantics of command line arguments
> >>
> >> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> >> ---
> >>  .../admin-guide/kernel-parameters.txt         | 35 ++++---
> >>  Documentation/admin-guide/mm/hugetlbpage.rst  | 44 +++++++++
> >>  mm/hugetlb.c                                  | 96 +++++++++++++++----
> >>  3 files changed, 142 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index 1bd5454b5e5f..de653cfe1726 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -832,12 +832,15 @@
> >>  			See also Documentation/networking/decnet.txt.
> >>  
> >>  	default_hugepagesz=
> >> -			[same as hugepagesz=] The size of the default
> >> -			HugeTLB page size. This is the size represented by
> >> -			the legacy /proc/ hugepages APIs, used for SHM, and
> >> -			default size when mounting hugetlbfs filesystems.
> >> -			Defaults to the default architecture's huge page size
> >> -			if not specified.
> >> +			[HW] The size of the default HugeTLB page size. This
> > 
> > Could I ask what's "HW"?  Sorry this is not a comment at all but
> > really a pure question I wanted to ask... :)
> 
> kernel-parameters.rst includes kernel-parameters.txt and included the meaning
> for these codes.
> 
>        HW      Appropriate hardware is enabled.
> 
> Previously, it listed an obsolete list of architectures.

I see. It was a bit confusing since hugepage is not a real hardware,
"CAP (capability)" might be easier, but I get the point now, thanks!

[...]

> >> diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
> >> index 1cc0bc78d10e..de340c586995 100644
> >> --- a/Documentation/admin-guide/mm/hugetlbpage.rst
> >> +++ b/Documentation/admin-guide/mm/hugetlbpage.rst
> >> @@ -100,6 +100,50 @@ with a huge page size selection parameter "hugepagesz=<size>".  <size> must
> >>  be specified in bytes with optional scale suffix [kKmMgG].  The default huge
> >>  page size may be selected with the "default_hugepagesz=<size>" boot parameter.
> >>  
> >> +Hugetlb boot command line parameter semantics
> >> +hugepagesz - Specify a huge page size.  Used in conjunction with hugepages
> >> +	parameter to preallocate a number of huge pages of the specified
> >> +	size.  Hence, hugepagesz and hugepages are typically specified in
> >> +	pairs such as:
> >> +		hugepagesz=2M hugepages=512
> >> +	hugepagesz can only be specified once on the command line for a
> >> +	specific huge page size.  Valid huge page sizes are architecture
> >> +	dependent.
> >> +hugepages - Specify the number of huge pages to preallocate.  This typically
> >> +	follows a valid hugepagesz parameter.  However, if hugepages is the
> >> +	first or only hugetlb command line parameter it specifies the number
> >> +	of huge pages of default size to allocate.  The number of huge pages
> >> +	of default size specified in this manner can be overwritten by a
> >> +	hugepagesz,hugepages parameter pair for the default size.
> >> +	For example, on an architecture with 2M default huge page size:
> >> +		hugepages=256 hugepagesz=2M hugepages=512
> >> +	will result in 512 2M huge pages being allocated.  If a hugepages
> >> +	parameter is preceded by an invalid hugepagesz parameter, it will
> >> +	be ignored.
> >> +default_hugepagesz - Specify the default huge page size.  This parameter can
> >> +	only be specified once on the command line.  No other hugetlb command
> >> +	line parameter is associated with default_hugepagesz.  Therefore, it
> >> +	can appear anywhere on the command line.  If hugepages= is the first
> >> +	hugetlb command line parameter, the specified number of huge pages
> >> +	will apply to the default huge page size specified with
> >> +	default_hugepagesz.  For example,
> >> +		hugepages=512 default_hugepagesz=2M
> > 
> > No strong opinion, but considering to the special case of gigantic
> > huge page mentioned below, I'm thinking maybe it's easier to just ask
> > the user to always use "hugepagesz=X hugepages=Y" pair when people
> > want to reserve huge pages.
> 
> We can ask people to do this.  However, I do not think we can force it at
> this time.  Why?  Mostly because I have seen many instances where people
> only specify 'hugepages=X' on the command line to preallocate X huge pages
> of default size.  So, forcing 'hugepagesz=X hugepages=Y' would break those
> users.
> 
> > For example, some user might start to use this after this series
> > legally:
> > 
> >     default_hugepagesz=2M hugepages=1024
> 
> Well, that 'works' today.  You get that silly error message:
> 
> HugeTLB: unsupported default_hugepagesz 2097152. Reverting to 2097152
> 
> But, it does preallocate 1024 huge pages of size 2M.  Because people
> have noticed the silly error message, I suspect this usage,
> 
> 	default_hugepagesz=X hugepages=Y
> 
> is in use today and we need to support it.

Fair enough.

[...]

> >> @@ -3209,19 +3209,35 @@ static int __init hugetlb_init(void)
> >>  	if (!hugepages_supported())
> >>  		return 0;
> >>  
> >> -	if (!size_to_hstate(default_hstate_size)) {
> >> -		if (default_hstate_size != 0) {
> >> -			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
> >> -			       default_hstate_size, HPAGE_SIZE);
> >> -		}
> >> -
> >> +	/*
> >> +	 * Make sure HPAGE_SIZE (HUGETLB_PAGE_ORDER) hstate exists.  Some
> >> +	 * architectures depend on setup being done here.
> >> +	 *
> >> +	 * If a valid default huge page size was specified on the command line,
> >> +	 * add associated hstate if necessary.  If not, set default_hstate_size
> >> +	 * to default size.  default_hstate_idx is used at runtime to identify
> >> +	 * the default huge page size/hstate.
> >> +	 */
> >> +	hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> >> +	if (default_hstate_size)
> >> +		hugetlb_add_hstate(ilog2(default_hstate_size) - PAGE_SHIFT);
> >> +	else
> >>  		default_hstate_size = HPAGE_SIZE;
> >> -		hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> >> -	}
> >>  	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_size));
> >> +
> >> +	/*
> >> +	 * default_hstate_max_huge_pages != 0 indicates a count (hugepages=)
> >> +	 * specified before a size (hugepagesz=).  Use this count for the
> >> +	 * default huge page size, unless a specific value was specified for
> >> +	 * this size in a hugepagesz/hugepages pair.
> >> +	 */
> >>  	if (default_hstate_max_huge_pages) {
> > 
> > Since we're refactoring this - Could default_hstate_max_huge_pages be
> > dropped directly (in hugepages= we can create the default hstate, then
> > we set max_huge_pages of the default hstate there)?  Or did I miss
> > anything important?
> 
> I do not think that works for 'hugepages=X default_hugepagesz=Y' processing?
> It seems like there will need to be more work done on default_hugepagesz
> processing.

That was really an awkward kernel cmdline... But I guess you're right.

I think it awkward because it can be also read in sequence as "reserve
X huge pages of default huge page size, then change default value to
Y".  So instead of awkward, maybe "ambiguous".  However I have totally
no clue on how to make this better either - there's really quite a lot
of freedom right now on specifying all these options right now.

Thanks,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-04-14 15:28 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01 18:38 [PATCH v2 0/4] Clean up hugetlb boot command line processing Mike Kravetz
2020-04-01 18:38 ` [PATCH v2 1/4] hugetlbfs: add arch_hugetlb_valid_size Mike Kravetz
2020-04-10 19:16   ` Peter Xu
2020-04-13 17:04     ` Mike Kravetz
2020-04-01 18:38 ` [PATCH v2 2/4] hugetlbfs: move hugepagesz= parsing to arch independent code Mike Kravetz
2020-04-10 19:26   ` Peter Xu
2020-04-01 18:38 ` [PATCH v2 3/4] hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate Mike Kravetz
2020-04-10 19:34   ` Peter Xu
2020-04-01 18:38 ` [PATCH v2 4/4] hugetlbfs: clean up command line processing Mike Kravetz
2020-04-01 18:55   ` Randy Dunlap
2020-04-10 20:37   ` Peter Xu
2020-04-13 17:59     ` Mike Kravetz
2020-04-14 15:27       ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).