linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
@ 2017-06-03  0:54 Liam R. Howlett
  2017-06-05  4:57 ` Michal Hocko
  2017-06-05 22:38 ` Andrew Morton
  0 siblings, 2 replies; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-03  0:54 UTC (permalink / raw)
  To: linux-mm
  Cc: akpm, mike.kravetz, mhocko, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

When the user specifies too many hugepages or an invalid
default_hugepagesz the communication to the user is implicit in the
allocation message.  This patch adds a warning when the desired page
count is not allocated and prints an error when the default_hugepagesz
is invalid on boot.

Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
---
 mm/hugetlb.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e5828875f7bb..6de30bbac23e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -70,6 +70,7 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
 
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
+static char * __init memfmt(char *buf, unsigned long n);
 
 static inline void unlock_or_release_subpool(struct hugepage_subpool *spool)
 {
@@ -2189,7 +2190,14 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 					 &node_states[N_MEMORY]))
 			break;
 	}
-	h->max_huge_pages = i;
+	if (i < h->max_huge_pages) {
+		char buf[32];
+
+		memfmt(buf, huge_page_size(h)),
+		pr_warn("HugeTLB: allocating %lu of page size %s failed.  Only allocated %lu hugepages.\n",
+			h->max_huge_pages, buf, i);
+		h->max_huge_pages = i;
+	}
 }
 
 static void __init hugetlb_init_hstates(void)
@@ -2785,6 +2793,11 @@ static int __init hugetlb_init(void)
 		return 0;
 
 	if (!size_to_hstate(default_hstate_size)) {
+		if (default_hstate_size != 0) {
+			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
+			       default_hstate_size, HPAGE_SIZE);
+		}
+
 		default_hstate_size = HPAGE_SIZE;
 		if (!size_to_hstate(default_hstate_size))
 			hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
-- 
2.13.0.92.gcd65a7235

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-03  0:54 [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages Liam R. Howlett
@ 2017-06-05  4:57 ` Michal Hocko
  2017-06-05 15:15   ` Liam R. Howlett
  2017-06-05 22:38 ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-05  4:57 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Fri 02-06-17 20:54:13, Liam R. Howlett wrote:
> When the user specifies too many hugepages or an invalid
> default_hugepagesz the communication to the user is implicit in the
> allocation message.  This patch adds a warning when the desired page
> count is not allocated and prints an error when the default_hugepagesz
> is invalid on boot.

We do not warn when doing echo $NUM > nr_hugepages, so why should we
behave any different during the boot?
 
> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> ---
>  mm/hugetlb.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e5828875f7bb..6de30bbac23e 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -70,6 +70,7 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
>  
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
> +static char * __init memfmt(char *buf, unsigned long n);
>  
>  static inline void unlock_or_release_subpool(struct hugepage_subpool *spool)
>  {
> @@ -2189,7 +2190,14 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>  					 &node_states[N_MEMORY]))
>  			break;
>  	}
> -	h->max_huge_pages = i;
> +	if (i < h->max_huge_pages) {
> +		char buf[32];
> +
> +		memfmt(buf, huge_page_size(h)),
> +		pr_warn("HugeTLB: allocating %lu of page size %s failed.  Only allocated %lu hugepages.\n",
> +			h->max_huge_pages, buf, i);
> +		h->max_huge_pages = i;
> +	}
>  }
>  
>  static void __init hugetlb_init_hstates(void)
> @@ -2785,6 +2793,11 @@ static int __init hugetlb_init(void)
>  		return 0;
>  
>  	if (!size_to_hstate(default_hstate_size)) {
> +		if (default_hstate_size != 0) {
> +			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
> +			       default_hstate_size, HPAGE_SIZE);
> +		}
> +
>  		default_hstate_size = HPAGE_SIZE;
>  		if (!size_to_hstate(default_hstate_size))
>  			hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> -- 
> 2.13.0.92.gcd65a7235
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-05  4:57 ` Michal Hocko
@ 2017-06-05 15:15   ` Liam R. Howlett
  2017-06-06  5:49     ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-05 15:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170605 00:57]:
> On Fri 02-06-17 20:54:13, Liam R. Howlett wrote:
> > When the user specifies too many hugepages or an invalid
> > default_hugepagesz the communication to the user is implicit in the
> > allocation message.  This patch adds a warning when the desired page
> > count is not allocated and prints an error when the default_hugepagesz
> > is invalid on boot.
> 
> We do not warn when doing echo $NUM > nr_hugepages, so why should we
> behave any different during the boot?

During boot hugepages will allocate until there is a fraction of the
hugepage size left.  That is, we allocate until either the request is
satisfied or memory for the pages is exhausted.  When memory for the
pages is exhausted, it will most likely lead to the system failing with
the OOM manager not finding enough (or anything) to kill (unless you're
using really big hugepages in the order of 100s of MB or in the GBs).
The user will most likely see the OOM messages much later in the boot
sequence than the implicitly stated message.  Worse yet, you may even
get an OOM for each processor which causes many pages of OOMs on modern
systems.  Although these messages will be printed earlier than the OOM
messages, at least giving the user errors and warnings will highlight
the configuration as an issue.  I'm trying to point the user in the
right direction by providing a more robust statement of what is failing.

During the sysctl or echo command, the user can check the results much
easier than if the system hangs during boot and the scenario of having
nothing to OOM for kernel memory is highly unlikely.

Thanks,
Liam

>  
> > Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
> > ---
> >  mm/hugetlb.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index e5828875f7bb..6de30bbac23e 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -70,6 +70,7 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
> >  
> >  /* Forward declaration */
> >  static int hugetlb_acct_memory(struct hstate *h, long delta);
> > +static char * __init memfmt(char *buf, unsigned long n);
> >  
> >  static inline void unlock_or_release_subpool(struct hugepage_subpool *spool)
> >  {
> > @@ -2189,7 +2190,14 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
> >  					 &node_states[N_MEMORY]))
> >  			break;
> >  	}
> > -	h->max_huge_pages = i;
> > +	if (i < h->max_huge_pages) {
> > +		char buf[32];
> > +
> > +		memfmt(buf, huge_page_size(h)),
> > +		pr_warn("HugeTLB: allocating %lu of page size %s failed.  Only allocated %lu hugepages.\n",
> > +			h->max_huge_pages, buf, i);
> > +		h->max_huge_pages = i;
> > +	}
> >  }
> >  
> >  static void __init hugetlb_init_hstates(void)
> > @@ -2785,6 +2793,11 @@ static int __init hugetlb_init(void)
> >  		return 0;
> >  
> >  	if (!size_to_hstate(default_hstate_size)) {
> > +		if (default_hstate_size != 0) {
> > +			pr_err("HugeTLB: unsupported default_hugepagesz %lu. Reverting to %lu\n",
> > +			       default_hstate_size, HPAGE_SIZE);
> > +		}
> > +
> >  		default_hstate_size = HPAGE_SIZE;
> >  		if (!size_to_hstate(default_hstate_size))
> >  			hugetlb_add_hstate(HUGETLB_PAGE_ORDER);
> > -- 
> > 2.13.0.92.gcd65a7235
> > 
> 
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-03  0:54 [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages Liam R. Howlett
  2017-06-05  4:57 ` Michal Hocko
@ 2017-06-05 22:38 ` Andrew Morton
  2017-06-06 19:03   ` Matthew Wilcox
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2017-06-05 22:38 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, mike.kravetz, mhocko, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Fri,  2 Jun 2017 20:54:13 -0400 "Liam R. Howlett" <Liam.Howlett@Oracle.com> wrote:

> When the user specifies too many hugepages or an invalid
> default_hugepagesz the communication to the user is implicit in the
> allocation message.  This patch adds a warning when the desired page
> count is not allocated and prints an error when the default_hugepagesz
> is invalid on boot.
> 
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -70,6 +70,7 @@ struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
>  
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
> +static char * __init memfmt(char *buf, unsigned long n);

It's better to just move memfmt() to the right place.  After all, you
have revealed that it was in the wrong place, no?

(Am a bit surprised that something as general as memfmt is private to
hugetlb.c)

--- a/mm/hugetlb.c~mm-hugetlb-warn-the-user-when-issues-arise-on-boot-due-to-hugepages-fix
+++ a/mm/hugetlb.c
@@ -69,7 +69,17 @@ struct mutex *hugetlb_fault_mutex_table
 
 /* Forward declaration */
 static int hugetlb_acct_memory(struct hstate *h, long delta);
-static char * __init memfmt(char *buf, unsigned long n);
+
+static char * __init memfmt(char *buf, unsigned long n)
+{
+	if (n >= (1UL << 30))
+		sprintf(buf, "%lu GB", n >> 30);
+	else if (n >= (1UL << 20))
+		sprintf(buf, "%lu MB", n >> 20);
+	else
+		sprintf(buf, "%lu KB", n >> 10);
+	return buf;
+}
 
 static inline void unlock_or_release_subpool(struct hugepage_subpool *spool)
 {
@@ -2238,17 +2248,6 @@ static void __init hugetlb_init_hstates(
 	VM_BUG_ON(minimum_order == UINT_MAX);
 }
 
-static char * __init memfmt(char *buf, unsigned long n)
-{
-	if (n >= (1UL << 30))
-		sprintf(buf, "%lu GB", n >> 30);
-	else if (n >= (1UL << 20))
-		sprintf(buf, "%lu MB", n >> 20);
-	else
-		sprintf(buf, "%lu KB", n >> 10);
-	return buf;
-}
-
 static void __init report_hugepages(void)
 {
 	struct hstate *h;
_

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-05 15:15   ` Liam R. Howlett
@ 2017-06-06  5:49     ` Michal Hocko
  2017-06-06  6:01       ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-06  5:49 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Mon 05-06-17 11:15:41, Liam R. Howlett wrote:
> * Michal Hocko <mhocko@suse.com> [170605 00:57]:
> > On Fri 02-06-17 20:54:13, Liam R. Howlett wrote:
> > > When the user specifies too many hugepages or an invalid
> > > default_hugepagesz the communication to the user is implicit in the
> > > allocation message.  This patch adds a warning when the desired page
> > > count is not allocated and prints an error when the default_hugepagesz
> > > is invalid on boot.
> > 
> > We do not warn when doing echo $NUM > nr_hugepages, so why should we
> > behave any different during the boot?
> 
> During boot hugepages will allocate until there is a fraction of the
> hugepage size left.  That is, we allocate until either the request is
> satisfied or memory for the pages is exhausted.  When memory for the
> pages is exhausted, it will most likely lead to the system failing with
> the OOM manager not finding enough (or anything) to kill (unless you're
> using really big hugepages in the order of 100s of MB or in the GBs).
> The user will most likely see the OOM messages much later in the boot
> sequence than the implicitly stated message.  Worse yet, you may even
> get an OOM for each processor which causes many pages of OOMs on modern
> systems.  Although these messages will be printed earlier than the OOM
> messages, at least giving the user errors and warnings will highlight
> the configuration as an issue.  I'm trying to point the user in the
> right direction by providing a more robust statement of what is failing.

Well, an oom report will tell us how much memory is eaten by hugetlb so
you would get a clue that something is misconfigured.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-06  5:49     ` Michal Hocko
@ 2017-06-06  6:01       ` Michal Hocko
  2017-06-12 17:28         ` Liam R. Howlett
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-06  6:01 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Tue 06-06-17 07:49:17, Michal Hocko wrote:
> On Mon 05-06-17 11:15:41, Liam R. Howlett wrote:
> > * Michal Hocko <mhocko@suse.com> [170605 00:57]:
> > > On Fri 02-06-17 20:54:13, Liam R. Howlett wrote:
> > > > When the user specifies too many hugepages or an invalid
> > > > default_hugepagesz the communication to the user is implicit in the
> > > > allocation message.  This patch adds a warning when the desired page
> > > > count is not allocated and prints an error when the default_hugepagesz
> > > > is invalid on boot.
> > > 
> > > We do not warn when doing echo $NUM > nr_hugepages, so why should we
> > > behave any different during the boot?
> > 
> > During boot hugepages will allocate until there is a fraction of the
> > hugepage size left.  That is, we allocate until either the request is
> > satisfied or memory for the pages is exhausted.  When memory for the
> > pages is exhausted, it will most likely lead to the system failing with
> > the OOM manager not finding enough (or anything) to kill (unless you're
> > using really big hugepages in the order of 100s of MB or in the GBs).
> > The user will most likely see the OOM messages much later in the boot
> > sequence than the implicitly stated message.  Worse yet, you may even
> > get an OOM for each processor which causes many pages of OOMs on modern
> > systems.  Although these messages will be printed earlier than the OOM
> > messages, at least giving the user errors and warnings will highlight
> > the configuration as an issue.  I'm trying to point the user in the
> > right direction by providing a more robust statement of what is failing.
> 
> Well, an oom report will tell us how much memory is eaten by hugetlb so
> you would get a clue that something is misconfigured.

And just to be more clear. I do not _object_ to the warning I just
_think_ it is not very useful actually. If somebody misconfigure so
badly that hugetlb allocations fail during the boot then it will be
very likely visible. But if somebody misconfigures slightly less to not
fail the system is very likely to not work properly and there will be no
warning that this might be the source of problems. So is it worth adding
more code with that limited usefulness?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-05 22:38 ` Andrew Morton
@ 2017-06-06 19:03   ` Matthew Wilcox
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Wilcox @ 2017-06-06 19:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Liam R. Howlett, linux-mm, mike.kravetz, mhocko, n-horiguchi,
	aneesh.kumar, gerald.schaefer, zhongjiang, aarcange,
	kirill.shutemov

On Mon, Jun 05, 2017 at 03:38:19PM -0700, Andrew Morton wrote:
> It's better to just move memfmt() to the right place.  After all, you
> have revealed that it was in the wrong place, no?
> 
> (Am a bit surprised that something as general as memfmt is private to
> hugetlb.c)

Oh, hey, look, memory management people and storage people have
their own ideas about "general" code.  Storage people have been using
string_get_size() for a while.  It feels a bit over-engineered to me,
but since we already have it, we should use it.

---- 8< ----

Subject: [PATCH] Replace memfmt with string_get_size

The hugetlb code has its own function to report human-readable sizes.
Convert it to use the shared string_get_size function.  This will lead
to a minor difference in user visible output (MiB/GiB instead of MB/GB),
but some would argue that's desirable anyway.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e5828875f7bb..7f2b7d9f1f45 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -20,6 +20,7 @@
 #include <linux/slab.h>
 #include <linux/sched/signal.h>
 #include <linux/rmap.h>
+#include <linux/string_helpers.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
 #include <linux/page-isolation.h>
@@ -2207,26 +2208,15 @@ static void __init hugetlb_init_hstates(void)
 	VM_BUG_ON(minimum_order == UINT_MAX);
 }
 
-static char * __init memfmt(char *buf, unsigned long n)
-{
-	if (n >= (1UL << 30))
-		sprintf(buf, "%lu GB", n >> 30);
-	else if (n >= (1UL << 20))
-		sprintf(buf, "%lu MB", n >> 20);
-	else
-		sprintf(buf, "%lu KB", n >> 10);
-	return buf;
-}
-
 static void __init report_hugepages(void)
 {
 	struct hstate *h;
 
 	for_each_hstate(h) {
 		char buf[32];
+		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
 		pr_info("HugeTLB registered %s page size, pre-allocated %ld pages\n",
-			memfmt(buf, huge_page_size(h)),
-			h->free_huge_pages);
+			buf, h->free_huge_pages);
 	}
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-06  6:01       ` Michal Hocko
@ 2017-06-12 17:28         ` Liam R. Howlett
  2017-06-12 17:49           ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-12 17:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170606 02:01]:
> On Tue 06-06-17 07:49:17, Michal Hocko wrote:
> > On Mon 05-06-17 11:15:41, Liam R. Howlett wrote:
> > > * Michal Hocko <mhocko@suse.com> [170605 00:57]:
> > > > On Fri 02-06-17 20:54:13, Liam R. Howlett wrote:
> > > > > When the user specifies too many hugepages or an invalid
> > > > > default_hugepagesz the communication to the user is implicit in the
> > > > > allocation message.  This patch adds a warning when the desired page
> > > > > count is not allocated and prints an error when the default_hugepagesz
> > > > > is invalid on boot.
> > > > 
> > > > We do not warn when doing echo $NUM > nr_hugepages, so why should we
> > > > behave any different during the boot?
> > > 
> > > During boot hugepages will allocate until there is a fraction of the
> > > hugepage size left.  That is, we allocate until either the request is
> > > satisfied or memory for the pages is exhausted.  When memory for the
> > > pages is exhausted, it will most likely lead to the system failing with
> > > the OOM manager not finding enough (or anything) to kill (unless you're
> > > using really big hugepages in the order of 100s of MB or in the GBs).
> > > The user will most likely see the OOM messages much later in the boot
> > > sequence than the implicitly stated message.  Worse yet, you may even
> > > get an OOM for each processor which causes many pages of OOMs on modern
> > > systems.  Although these messages will be printed earlier than the OOM
> > > messages, at least giving the user errors and warnings will highlight
> > > the configuration as an issue.  I'm trying to point the user in the
> > > right direction by providing a more robust statement of what is failing.
> > 
> > Well, an oom report will tell us how much memory is eaten by hugetlb so
> > you would get a clue that something is misconfigured.

Absolutely, however this is again implicitly telling the user why the
system is failing to boot.  A lot of time may be - and has been - spent
finding what went wrong, and by multiple users.

> 
> And just to be more clear. I do not _object_ to the warning I just
> _think_ it is not very useful actually. If somebody misconfigure so
> badly that hugetlb allocations fail during the boot then it will be
> very likely visible. But if somebody misconfigures slightly less to not
> fail the system is very likely to not work properly and there will be no
> warning that this might be the source of problems. So is it worth adding
> more code with that limited usefulness?

I think telling the user that something failed is very useful.  This
obviously does not cover off all failure cases as you have pointed out,
but it is certainly better than silently continuing as is the case
today.

Are you suggesting that the error message be provided if the failure
happens after boot as well?

Thanks,
Liam

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-12 17:28         ` Liam R. Howlett
@ 2017-06-12 17:49           ` Michal Hocko
  2017-06-12 18:37             ` Liam R. Howlett
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-12 17:49 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Mon 12-06-17 13:28:30, Liam R. Howlett wrote:
> * Michal Hocko <mhocko@suse.com> [170606 02:01]:
[..]
> > And just to be more clear. I do not _object_ to the warning I just
> > _think_ it is not very useful actually. If somebody misconfigure so
> > badly that hugetlb allocations fail during the boot then it will be
> > very likely visible. But if somebody misconfigures slightly less to not
> > fail the system is very likely to not work properly and there will be no
> > warning that this might be the source of problems. So is it worth adding
> > more code with that limited usefulness?
> 
> I think telling the user that something failed is very useful.  This
> obviously does not cover off all failure cases as you have pointed out,
> but it is certainly better than silently continuing as is the case
> today.
> 
> Are you suggesting that the error message be provided if the failure
> happens after boot as well?

No, I am just suggesting that the warning as proposed is not useful and
it is worth the additional (aleit little) code. It doesn't cover many
other miscofigurations which might be even more serious because there
would be still _some_ memory left while the system would crawl to death.

My objections are not hard enough to give a right NAK I just think this
is a pointless code which won't help the current situation much.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-12 17:49           ` Michal Hocko
@ 2017-06-12 18:37             ` Liam R. Howlett
  2017-06-12 18:52               ` Michal Hocko
  0 siblings, 1 reply; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-12 18:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170612 13:49]:
> On Mon 12-06-17 13:28:30, Liam R. Howlett wrote:
> > * Michal Hocko <mhocko@suse.com> [170606 02:01]:
> [..]
> > > And just to be more clear. I do not _object_ to the warning I just
> > > _think_ it is not very useful actually. If somebody misconfigure so
> > > badly that hugetlb allocations fail during the boot then it will be
> > > very likely visible. But if somebody misconfigures slightly less to not
> > > fail the system is very likely to not work properly and there will be no
> > > warning that this might be the source of problems. So is it worth adding
> > > more code with that limited usefulness?
> > 
> > I think telling the user that something failed is very useful.  This
> > obviously does not cover off all failure cases as you have pointed out,
> > but it is certainly better than silently continuing as is the case
> > today.
> > 
> > Are you suggesting that the error message be provided if the failure
> > happens after boot as well?
> 
> No, I am just suggesting that the warning as proposed is not useful and
> it is worth the additional (aleit little) code. It doesn't cover many
> other miscofigurations which might be even more serious because there
> would be still _some_ memory left while the system would crawl to death.

There is already some memory left as long as the huge page size doesn't
work out to be exactly the amount of free pages.  This is why it's so
annoying as the OOM kicks in much later in the boot process and leaves
it up to the user to debug a kernel dump with zero error or warning
messages about what happened before things went bad.  Worse yet, I've
seen several pages of OOMs scroll by as each processor takes turns
telling the user it is out of memory.  If there's no message stating any
configuration issue, then many admins would probably think something is
seriously broken and it's not just a simple typo of K vs M.

Even though this doesn't catch all errors, I think it's a worth while
change since this is currently a silent failure which results in a
system crash.

> 
> My objections are not hard enough to give a right NAK I just think this
> is a pointless code which won't help the current situation much.

I'm opened to other suggestions on how to make this better, but
providing the user with more information seems like a good start.  I
think it's reasonable to think many end users would not know what to do
with a kernel oops when no errors or warnings have occurred.

Thanks,
Liam

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-12 18:37             ` Liam R. Howlett
@ 2017-06-12 18:52               ` Michal Hocko
  2017-06-13  1:35                 ` Liam R. Howlett
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-12 18:52 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Mon 12-06-17 14:37:18, Liam R. Howlett wrote:
> * Michal Hocko <mhocko@suse.com> [170612 13:49]:
> > On Mon 12-06-17 13:28:30, Liam R. Howlett wrote:
> > > * Michal Hocko <mhocko@suse.com> [170606 02:01]:
> > [..]
> > > > And just to be more clear. I do not _object_ to the warning I just
> > > > _think_ it is not very useful actually. If somebody misconfigure so
> > > > badly that hugetlb allocations fail during the boot then it will be
> > > > very likely visible. But if somebody misconfigures slightly less to not
> > > > fail the system is very likely to not work properly and there will be no
> > > > warning that this might be the source of problems. So is it worth adding
> > > > more code with that limited usefulness?
> > > 
> > > I think telling the user that something failed is very useful.  This
> > > obviously does not cover off all failure cases as you have pointed out,
> > > but it is certainly better than silently continuing as is the case
> > > today.
> > > 
> > > Are you suggesting that the error message be provided if the failure
> > > happens after boot as well?
> > 
> > No, I am just suggesting that the warning as proposed is not useful and
> > it is worth the additional (aleit little) code. It doesn't cover many
> > other miscofigurations which might be even more serious because there
> > would be still _some_ memory left while the system would crawl to death.
> 
> There is already some memory left as long as the huge page size doesn't
> work out to be exactly the amount of free pages.  This is why it's so
> annoying as the OOM kicks in much later in the boot process and leaves
> it up to the user to debug a kernel dump with zero error or warning
> messages about what happened before things went bad.

Exactly. And I my argument is that this won't get handled by your patch.

> Worse yet, I've
> seen several pages of OOMs scroll by as each processor takes turns
> telling the user it is out of memory.

This is not how the oom report works. We only report when _killing_ a
task. And the reason you have seen so many of them is that killing any
number of processes will not help. Yes this is quite subtimal and it
would be great to see that the OOM is due to hugetlb configuration or
e.g. too large ramdisk or unreclaimable shmem. Fixing that would be much
more reasonable than sticking a warning that will almost never trigger
unless somebody messed up royally.

> If there's no message stating any
> configuration issue, then many admins would probably think something is
> seriously broken and it's not just a simple typo of K vs M.
> 
> Even though this doesn't catch all errors, I think it's a worth while
> change since this is currently a silent failure which results in a
> system crash.

Seriously, this warning just doesn't help in _most_ miscofigurations. It
just focuses on one particular which really requires to misconfigure
really badly. And there are way too many other ways to screw your system
that way, yet we do not warn about many of those. So just try to step
back and think whether this is something we actually do care about and
if yes then try to come up with a more reasonable warning which would
cover a wider range of misconfigurations.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-12 18:52               ` Michal Hocko
@ 2017-06-13  1:35                 ` Liam R. Howlett
  2017-06-13  5:42                   ` Michal Hocko
  2017-06-16 19:07                   ` Andrew Morton
  0 siblings, 2 replies; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-13  1:35 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170612 14:52]:
> On Mon 12-06-17 14:37:18, Liam R. Howlett wrote:
> > * Michal Hocko <mhocko@suse.com> [170612 13:49]:
> > > On Mon 12-06-17 13:28:30, Liam R. Howlett wrote:
> > > > * Michal Hocko <mhocko@suse.com> [170606 02:01]:
> > > [..]
> > > > > And just to be more clear. I do not _object_ to the warning I just
> > > > > _think_ it is not very useful actually. If somebody misconfigure so
> > > > > badly that hugetlb allocations fail during the boot then it will be
> > > > > very likely visible. But if somebody misconfigures slightly less to not
> > > > > fail the system is very likely to not work properly and there will be no
> > > > > warning that this might be the source of problems. So is it worth adding
> > > > > more code with that limited usefulness?
> > > > 
> > > > I think telling the user that something failed is very useful.  This
> > > > obviously does not cover off all failure cases as you have pointed out,
> > > > but it is certainly better than silently continuing as is the case
> > > > today.
> > > > 
> > > > Are you suggesting that the error message be provided if the failure
> > > > happens after boot as well?
> > > 
> > > No, I am just suggesting that the warning as proposed is not useful and
> > > it is worth the additional (aleit little) code. It doesn't cover many
> > > other miscofigurations which might be even more serious because there
> > > would be still _some_ memory left while the system would crawl to death.
> > 
> > There is already some memory left as long as the huge page size doesn't
> > work out to be exactly the amount of free pages.  This is why it's so
> > annoying as the OOM kicks in much later in the boot process and leaves
> > it up to the user to debug a kernel dump with zero error or warning
> > messages about what happened before things went bad.
> 
> Exactly. And I my argument is that this won't get handled by your patch.

Right, but it was more explicit on an error that did occur.  More in
line with an invalid hugepagesz error message from platform specific
code.

> 
> > Worse yet, I've
> > seen several pages of OOMs scroll by as each processor takes turns
> > telling the user it is out of memory.
> 
> This is not how the oom report works. We only report when _killing_ a
> task. And the reason you have seen so many of them is that killing any
> number of processes will not help. Yes this is quite subtimal and it
> would be great to see that the OOM is due to hugetlb configuration or
> e.g. too large ramdisk or unreclaimable shmem. Fixing that would be much
> more reasonable than sticking a warning that will almost never trigger
> unless somebody messed up royally.

Thanks, I appreciate you taking the time to explain this to me.

> 
> > If there's no message stating any
> > configuration issue, then many admins would probably think something is
> > seriously broken and it's not just a simple typo of K vs M.
> > 
> > Even though this doesn't catch all errors, I think it's a worth while
> > change since this is currently a silent failure which results in a
> > system crash.
> 
> Seriously, this warning just doesn't help in _most_ miscofigurations. It
> just focuses on one particular which really requires to misconfigure
> really badly. And there are way too many other ways to screw your system
> that way, yet we do not warn about many of those. So just try to step
> back and think whether this is something we actually do care about and
> if yes then try to come up with a more reasonable warning which would
> cover a wider range of misconfigurations.

Understood.  Again, I appreciate all the time you have taken on my
patch and explaining your points.  I will look at this again as you
have suggested.

Thanks,
Liam

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13  1:35                 ` Liam R. Howlett
@ 2017-06-13  5:42                   ` Michal Hocko
  2017-06-13 15:25                     ` Liam R. Howlett
  2017-06-16 19:07                   ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-13  5:42 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Mon 12-06-17 21:35:17, Liam R. Howlett wrote:
[...]
> Understood.  Again, I appreciate all the time you have taken on my
> patch and explaining your points.  I will look at this again as you
> have suggested.

One way to go forward might be to check the size of the per node pool
and warn if it grows over a certain threshold of the available memory
on that node. I do not have a good idea what would be that threshold,
though. It will certainly depend on workloads. I can also imagine that
somebody might want to dedicate the full numa node for hugetlb pages
and still be OK so take this suggestion with some reserve. It is hard
to protect against misconfigurations in general but maybe you will find
some way here.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13  5:42                   ` Michal Hocko
@ 2017-06-13 15:25                     ` Liam R. Howlett
  2017-06-13 16:26                       ` Mike Kravetz
  2017-06-14  7:25                       ` Michal Hocko
  0 siblings, 2 replies; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-13 15:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170613 01:42]:
> On Mon 12-06-17 21:35:17, Liam R. Howlett wrote:
> [...]
> > Understood.  Again, I appreciate all the time you have taken on my
> > patch and explaining your points.  I will look at this again as you
> > have suggested.
> 
> One way to go forward might be to check the size of the per node pool
> and warn if it grows over a certain threshold of the available memory
> on that node. I do not have a good idea what would be that threshold,
> though. It will certainly depend on workloads. I can also imagine that
> somebody might want to dedicate the full numa node for hugetlb pages
> and still be OK so take this suggestion with some reserve. It is hard
> to protect against misconfigurations in general but maybe you will find
> some way here.

I thought about an upper threshold of memory and discussed it
internally, but came to the same conclusion; it may be desired and
there's no safe bet beyond warning if the user requests over 100% of the
memory.  In the case of requesting over 100% of the memory, we could
warn the user and specify what was allocated.  Would it be reasonable to
warn on both boot and through sysfs of such requests?  I'm concerned
that this is yet another too-targeted approach.

The OOM issue would still arise much later in boot for the init
situation.  I have an OOM patch I'd like to send out for another hugetlb
corner case which may improve this situation.  I was going to send it
out separately as I thought of it as unrelated to this scenario and I
believe it should be a config option.  The OOM patch has its own issues
and would only be an RFC at this point.

Thanks,
Liam

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13 15:25                     ` Liam R. Howlett
@ 2017-06-13 16:26                       ` Mike Kravetz
  2017-06-14  7:27                         ` Michal Hocko
  2017-06-14  7:25                       ` Michal Hocko
  1 sibling, 1 reply; 22+ messages in thread
From: Mike Kravetz @ 2017-06-13 16:26 UTC (permalink / raw)
  To: Liam R. Howlett, Michal Hocko
  Cc: linux-mm, akpm, n-horiguchi, aneesh.kumar, gerald.schaefer,
	zhongjiang, aarcange, kirill.shutemov

A thought somewhat related to this discussion:

I noticed that huge pages specified on the kernel command line are allocated
via 'subsys_initcall'.  This is before 'fs_initcall', even though these huge
pages are only used by hugetlbfs.  Was just thinking that it might be better
to move huge page allocations to later in the init process.  At least make
them part of fs_initcall if not late_initcall?

Only reason for doing this is because huge page allocations are fairly
tolerant of allocation failure.

Of course, I could be missing some init dependency.
-- 
Mike Kravetz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13 15:25                     ` Liam R. Howlett
  2017-06-13 16:26                       ` Mike Kravetz
@ 2017-06-14  7:25                       ` Michal Hocko
  1 sibling, 0 replies; 22+ messages in thread
From: Michal Hocko @ 2017-06-14  7:25 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: linux-mm, akpm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Tue 13-06-17 11:25:02, Liam R. Howlett wrote:
> * Michal Hocko <mhocko@suse.com> [170613 01:42]:
> > On Mon 12-06-17 21:35:17, Liam R. Howlett wrote:
> > [...]
> > > Understood.  Again, I appreciate all the time you have taken on my
> > > patch and explaining your points.  I will look at this again as you
> > > have suggested.
> > 
> > One way to go forward might be to check the size of the per node pool
> > and warn if it grows over a certain threshold of the available memory
> > on that node. I do not have a good idea what would be that threshold,
> > though. It will certainly depend on workloads. I can also imagine that
> > somebody might want to dedicate the full numa node for hugetlb pages
> > and still be OK so take this suggestion with some reserve. It is hard
> > to protect against misconfigurations in general but maybe you will find
> > some way here.
> 
> I thought about an upper threshold of memory and discussed it
> internally, but came to the same conclusion; it may be desired and
> there's no safe bet beyond warning if the user requests over 100% of the
> memory.  In the case of requesting over 100% of the memory, we could
> warn the user and specify what was allocated.  Would it be reasonable to
> warn on both boot and through sysfs of such requests?  I'm concerned
> that this is yet another too-targeted approach.

No, I think 100% is just too targeted. As I've said already said, a good
enough treshold might be hard to get right but filling up more than 90%
of memory with hugetlb pages will just bite you unless you know what you
are doing. And if so you can safely ignore such a warning...
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13 16:26                       ` Mike Kravetz
@ 2017-06-14  7:27                         ` Michal Hocko
  2017-06-14 17:02                           ` Mike Kravetz
  0 siblings, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-14  7:27 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Liam R. Howlett, linux-mm, akpm, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Tue 13-06-17 09:26:15, Mike Kravetz wrote:
> A thought somewhat related to this discussion:
> 
> I noticed that huge pages specified on the kernel command line are allocated
> via 'subsys_initcall'.  This is before 'fs_initcall', even though these huge
> pages are only used by hugetlbfs.  Was just thinking that it might be better
> to move huge page allocations to later in the init process.  At least make
> them part of fs_initcall if not late_initcall?
> 
> Only reason for doing this is because huge page allocations are fairly
> tolerant of allocation failure.

I am not really familiar with the initcall hierarchy to be honest. I
even do not understand what relattion does fs_initcall have to
allocation failures. Could you be more specific?

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-14  7:27                         ` Michal Hocko
@ 2017-06-14 17:02                           ` Mike Kravetz
  0 siblings, 0 replies; 22+ messages in thread
From: Mike Kravetz @ 2017-06-14 17:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Liam R. Howlett, linux-mm, akpm, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On 06/14/2017 12:27 AM, Michal Hocko wrote:
> On Tue 13-06-17 09:26:15, Mike Kravetz wrote:
>> A thought somewhat related to this discussion:
>>
>> I noticed that huge pages specified on the kernel command line are allocated
>> via 'subsys_initcall'.  This is before 'fs_initcall', even though these huge
>> pages are only used by hugetlbfs.  Was just thinking that it might be better
>> to move huge page allocations to later in the init process.  At least make
>> them part of fs_initcall if not late_initcall?
>>
>> Only reason for doing this is because huge page allocations are fairly
>> tolerant of allocation failure.
> 
> I am not really familiar with the initcall hierarchy to be honest. I
> even do not understand what relattion does fs_initcall have to
> allocation failures. Could you be more specific?

The short answer is never mind.  We can not easily change where huge pages
are allocated as pointed out here:
https://patchwork.kernel.org/patch/7837381/

The longer explanation of my thoughts on moving location of allocations:
Liam was saying that someone was getting OOMs because they allocated too
many huge pages.  It also seems that this happened during boot.  I may be
reading too much into his comments.

My thought is that this 'may' not happen if huge pages were allocated
later in the boot process.  If they were allocated after boot, this would
not even be an issue.  So, I started looking at where they were allocated
in the initcall hierarchy.  huge pages are allocated via subsys_initcall
which (IIUC) happens before fs_initcall.  It seems to me that perhaps
huge page allocation belongs in fs_initcall as the pre-allocated pages
are only used by hugetlbfs.

Moving huge page allocation from subsys_initcall to fs_initcall is unlikely
to have any impact on the OOMs that Liam was trying to flag.  In fact
it may not have an impact on any such issues.  But, in a way it does seem
'more correct'.  Nobody in the init process should be depending on huge
page allocation (I think), so what if we moved it all the way to the end
of initcall processing: late_initcall.  Then perhaps it might have an 
impact on such issues.

-- 
Mike Kravetz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-13  1:35                 ` Liam R. Howlett
  2017-06-13  5:42                   ` Michal Hocko
@ 2017-06-16 19:07                   ` Andrew Morton
  2017-06-17  0:25                     ` Mike Kravetz
  2017-06-17  6:51                     ` Michal Hocko
  1 sibling, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2017-06-16 19:07 UTC (permalink / raw)
  To: Liam R. Howlett
  Cc: Michal Hocko, linux-mm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On Mon, 12 Jun 2017 21:35:17 -0400 "Liam R. Howlett" <Liam.Howlett@Oracle.com> wrote:

> > 
> > > If there's no message stating any
> > > configuration issue, then many admins would probably think something is
> > > seriously broken and it's not just a simple typo of K vs M.
> > > 
> > > Even though this doesn't catch all errors, I think it's a worth while
> > > change since this is currently a silent failure which results in a
> > > system crash.
> > 
> > Seriously, this warning just doesn't help in _most_ miscofigurations. It
> > just focuses on one particular which really requires to misconfigure
> > really badly. And there are way too many other ways to screw your system
> > that way, yet we do not warn about many of those. So just try to step
> > back and think whether this is something we actually do care about and
> > if yes then try to come up with a more reasonable warning which would
> > cover a wider range of misconfigurations.
> 
> Understood.  Again, I appreciate all the time you have taken on my
> patch and explaining your points.  I will look at this again as you
> have suggested.

So do we want to drop
mm-hugetlb-warn-the-user-when-issues-arise-on-boot-due-to-hugepages.patch?

I'd be inclined to keep it if Liam found it a bit useful - it does have
some overhead, but half the patch is in __init code...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-16 19:07                   ` Andrew Morton
@ 2017-06-17  0:25                     ` Mike Kravetz
  2017-06-17  6:51                     ` Michal Hocko
  1 sibling, 0 replies; 22+ messages in thread
From: Mike Kravetz @ 2017-06-17  0:25 UTC (permalink / raw)
  To: Andrew Morton, Liam R. Howlett
  Cc: Michal Hocko, linux-mm, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

On 06/16/2017 12:07 PM, Andrew Morton wrote:
> On Mon, 12 Jun 2017 21:35:17 -0400 "Liam R. Howlett" <Liam.Howlett@Oracle.com> wrote:
> 
>>>
>>>> If there's no message stating any
>>>> configuration issue, then many admins would probably think something is
>>>> seriously broken and it's not just a simple typo of K vs M.
>>>>
>>>> Even though this doesn't catch all errors, I think it's a worth while
>>>> change since this is currently a silent failure which results in a
>>>> system crash.
>>>
>>> Seriously, this warning just doesn't help in _most_ miscofigurations. It
>>> just focuses on one particular which really requires to misconfigure
>>> really badly. And there are way too many other ways to screw your system
>>> that way, yet we do not warn about many of those. So just try to step
>>> back and think whether this is something we actually do care about and
>>> if yes then try to come up with a more reasonable warning which would
>>> cover a wider range of misconfigurations.
>>
>> Understood.  Again, I appreciate all the time you have taken on my
>> patch and explaining your points.  I will look at this again as you
>> have suggested.
> 
> So do we want to drop
> mm-hugetlb-warn-the-user-when-issues-arise-on-boot-due-to-hugepages.patch?
> 
> I'd be inclined to keep it if Liam found it a bit useful - it does have
> some overhead, but half the patch is in __init code...
> 

Before sending out this patch, I asked Liam off list why he was doing it.
Was it something he just thought would be useful?  Or, was there some type
of user situation/need.  He said that he had been called in to assist on
several occasions when a system OOMed during boot.  In almost all of these
situations, the user had grossly misconfigured huge pages.  DB users want
to pre-allocate just the right amount of huge pages, but sometimes they
can be really off.  In such situations, the huge page init code just
allocates as many huge pages as it can and reports the number allocated.
There is no indication that it quit allocating because it ran out of memory.
Of course, a user could compare the number in the message to what they
requested on the command line to determine if they got all the huge pages
they requested.  The thought was that it would be useful to at least flag
this situation.  That way, the user might be able to better relate the huge
page allocation failure to the OOM.

I'm not sure if the e-mail discussion made it obvious that this is
something he has seen on several occasions.

I see Michal's point that this will only flag the situation where someone
configures huge pages very badly.  And, a more extensive look at the situation
of misconfiguring huge pages might be in order.  But, this has happened on
several occasions which led to the creation of this patch.

-- 
Mike Kravetz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-16 19:07                   ` Andrew Morton
  2017-06-17  0:25                     ` Mike Kravetz
@ 2017-06-17  6:51                     ` Michal Hocko
  2017-06-19 19:59                       ` Liam R. Howlett
  1 sibling, 1 reply; 22+ messages in thread
From: Michal Hocko @ 2017-06-17  6:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Liam R. Howlett, linux-mm, mike.kravetz, n-horiguchi,
	aneesh.kumar, gerald.schaefer, zhongjiang, aarcange,
	kirill.shutemov

On Fri 16-06-17 12:07:55, Andrew Morton wrote:
> On Mon, 12 Jun 2017 21:35:17 -0400 "Liam R. Howlett" <Liam.Howlett@Oracle.com> wrote:
> 
> > > 
> > > > If there's no message stating any
> > > > configuration issue, then many admins would probably think something is
> > > > seriously broken and it's not just a simple typo of K vs M.
> > > > 
> > > > Even though this doesn't catch all errors, I think it's a worth while
> > > > change since this is currently a silent failure which results in a
> > > > system crash.
> > > 
> > > Seriously, this warning just doesn't help in _most_ miscofigurations. It
> > > just focuses on one particular which really requires to misconfigure
> > > really badly. And there are way too many other ways to screw your system
> > > that way, yet we do not warn about many of those. So just try to step
> > > back and think whether this is something we actually do care about and
> > > if yes then try to come up with a more reasonable warning which would
> > > cover a wider range of misconfigurations.
> > 
> > Understood.  Again, I appreciate all the time you have taken on my
> > patch and explaining your points.  I will look at this again as you
> > have suggested.
> 
> So do we want to drop
> mm-hugetlb-warn-the-user-when-issues-arise-on-boot-due-to-hugepages.patch?
> 
> I'd be inclined to keep it if Liam found it a bit useful - it does have
> some overhead, but half the patch is in __init code...

I would rather see a more generic warning that would catch more
misconfiguration than those ultimately broken ones. If we find out that
such a warning is not feasible then I would not oppose to go with the
current approach but let's try a bit harder before we go that way.

Liam, are you willing to go that way?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages
  2017-06-17  6:51                     ` Michal Hocko
@ 2017-06-19 19:59                       ` Liam R. Howlett
  0 siblings, 0 replies; 22+ messages in thread
From: Liam R. Howlett @ 2017-06-19 19:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, mike.kravetz, n-horiguchi, aneesh.kumar,
	gerald.schaefer, zhongjiang, aarcange, kirill.shutemov

* Michal Hocko <mhocko@suse.com> [170617 02:51]:
> On Fri 16-06-17 12:07:55, Andrew Morton wrote:
> > On Mon, 12 Jun 2017 21:35:17 -0400 "Liam R. Howlett" <Liam.Howlett@Oracle.com> wrote:
> > 
> > > > 
> > > > > If there's no message stating any
> > > > > configuration issue, then many admins would probably think something is
> > > > > seriously broken and it's not just a simple typo of K vs M.
> > > > > 
> > > > > Even though this doesn't catch all errors, I think it's a worth while
> > > > > change since this is currently a silent failure which results in a
> > > > > system crash.
> > > > 
> > > > Seriously, this warning just doesn't help in _most_ miscofigurations. It
> > > > just focuses on one particular which really requires to misconfigure
> > > > really badly. And there are way too many other ways to screw your system
> > > > that way, yet we do not warn about many of those. So just try to step
> > > > back and think whether this is something we actually do care about and
> > > > if yes then try to come up with a more reasonable warning which would
> > > > cover a wider range of misconfigurations.
> > > 
> > > Understood.  Again, I appreciate all the time you have taken on my
> > > patch and explaining your points.  I will look at this again as you
> > > have suggested.
> > 
> > So do we want to drop
> > mm-hugetlb-warn-the-user-when-issues-arise-on-boot-due-to-hugepages.patch?
> > 
> > I'd be inclined to keep it if Liam found it a bit useful - it does have
> > some overhead, but half the patch is in __init code...
> 
> I would rather see a more generic warning that would catch more
> misconfiguration than those ultimately broken ones. If we find out that
> such a warning is not feasible then I would not oppose to go with the
> current approach but let's try a bit harder before we go that way.
> 
> Liam, are you willing to go that way?

As I see it, and as you have pointed out, we can only be sure it's an
error if it's over 100% of the memory.  Although it's certainly worth
while looking for a way to detect an incorrect configuration that
doesn't meet this criteria, I'm not sure it's worth holding out to make
the change.  I think giving any direct message could save someone a lot
of debug time.  If it's okay, I'd like to go ahead with the change and
also look for a way to correct and notify of a broader range of
configurations that cause severe issues in regards to hugepages.

Thanks,
Liam

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-06-19 19:59 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-03  0:54 [PATCH] mm/hugetlb: Warn the user when issues arise on boot due to hugepages Liam R. Howlett
2017-06-05  4:57 ` Michal Hocko
2017-06-05 15:15   ` Liam R. Howlett
2017-06-06  5:49     ` Michal Hocko
2017-06-06  6:01       ` Michal Hocko
2017-06-12 17:28         ` Liam R. Howlett
2017-06-12 17:49           ` Michal Hocko
2017-06-12 18:37             ` Liam R. Howlett
2017-06-12 18:52               ` Michal Hocko
2017-06-13  1:35                 ` Liam R. Howlett
2017-06-13  5:42                   ` Michal Hocko
2017-06-13 15:25                     ` Liam R. Howlett
2017-06-13 16:26                       ` Mike Kravetz
2017-06-14  7:27                         ` Michal Hocko
2017-06-14 17:02                           ` Mike Kravetz
2017-06-14  7:25                       ` Michal Hocko
2017-06-16 19:07                   ` Andrew Morton
2017-06-17  0:25                     ` Mike Kravetz
2017-06-17  6:51                     ` Michal Hocko
2017-06-19 19:59                       ` Liam R. Howlett
2017-06-05 22:38 ` Andrew Morton
2017-06-06 19:03   ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).