* [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
@ 2017-09-11  6:52 ` Mikulas Patocka
  0 siblings, 0 replies; 20+ messages in thread
From: Mikulas Patocka @ 2017-09-11  6:52 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Tetsuo Handa, Johannes Weiner, Mel Gorman,
	Dave Hansen, Andrew Morton, linux-mm, linux-kernel

I am occasionally getting these warnings in khugepaged. It is an old 
machine with 550MHz CPU and 512 MB RAM.

Note that khugepaged runs at nice value 19, so when the machine is busy
with other work, khugepaged is starved of CPU time and the resulting stall
produces a warning in the allocator.

khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
is masked off when calling warn_alloc. This patch removes the masking of
__GFP_NOWARN, so that the warning is suppressed.

khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
 ? warn_alloc+0xb9/0x140
 ? __alloc_pages_nodemask+0x724/0x880
 ? arch_irq_stat_cpu+0x1/0x40
 ? detach_if_pending+0x80/0x80
 ? khugepaged+0x10a/0x1d40
 ? pick_next_task_fair+0xd2/0x180
 ? wait_woken+0x60/0x60
 ? kthread+0xcf/0x100
 ? release_pte_page+0x40/0x40
 ? kthread_create_on_node+0x40/0x40
 ? ret_from_fork+0x19/0x30

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")

---
 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -3923,7 +3923,7 @@ retry:
 
 	/* Make sure we know about allocations which stall for too long */
 	if (time_after(jiffies, alloc_start + stall_timeout)) {
-		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
+		warn_alloc(gfp_mask, ac->nodemask,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;
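
A minimal sketch of what the one-line change above does, assuming that
warn_alloc() returns early when __GFP_NOWARN is set (the description above
implies this, but it is an assumption here). The flag values and function
names below are hypothetical stand-ins for illustration, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the kernel's real layout. */
#define SKETCH_GFP_NOWARN (1u << 0)
#define SKETCH_GFP_IO     (1u << 1)

/* Stand-in for warn_alloc(): returns true if a warning would be printed.
 * Assumption: the real function stays silent when NOWARN is set. */
bool sketch_warn_alloc(uint32_t gfp_mask)
{
	if (gfp_mask & SKETCH_GFP_NOWARN)
		return false;
	return true;
}

/* Before the patch: the caller clears NOWARN, so a stall is always reported. */
bool stall_warns_before_patch(uint32_t gfp_mask)
{
	return sketch_warn_alloc(gfp_mask & ~SKETCH_GFP_NOWARN);
}

/* After the patch: the mask is passed through, so NOWARN is honoured. */
bool stall_warns_after_patch(uint32_t gfp_mask)
{
	return sketch_warn_alloc(gfp_mask);
}
```

With a mask of SKETCH_GFP_NOWARN | SKETCH_GFP_IO, the "before" variant
reports the stall and the "after" variant stays silent; a mask without
NOWARN warns in both cases.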

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-11  6:52 ` Mikulas Patocka
@ 2017-09-11  8:26   ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-09-11  8:26 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Vlastimil Babka, Tetsuo Handa, Johannes Weiner, Mel Gorman,
	Dave Hansen, Andrew Morton, linux-mm, linux-kernel

On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
> I am occasionally getting these warnings in khugepaged. It is an old 
> machine with 550MHz CPU and 512 MB RAM.
> 
> Note that khugepaged has nice value 19, so when the machine is loaded with 
> some work, khugepaged is stalled and this stall produces warning in the 
> allocator.
> 
> khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
> is masked off when calling warn_alloc. This patch removes the masking of
> __GFP_NOWARN, so that the warning is suppressed.
> 
> khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
> CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
> Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> Call Trace:
>  ? warn_alloc+0xb9/0x140
>  ? __alloc_pages_nodemask+0x724/0x880
>  ? arch_irq_stat_cpu+0x1/0x40
>  ? detach_if_pending+0x80/0x80
>  ? khugepaged+0x10a/0x1d40
>  ? pick_next_task_fair+0xd2/0x180
>  ? wait_woken+0x60/0x60
>  ? kthread+0xcf/0x100
>  ? release_pte_page+0x40/0x40
>  ? kthread_create_on_node+0x40/0x40
>  ? ret_from_fork+0x19/0x30
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")

Commit 63f53dea0c98 did not introduce this behavior; it deliberately
skipped the warning for __GFP_NOWARN allocations. The masking was
introduced later by 822519634142 ("mm: page_alloc: __GFP_NOWARN shouldn't
suppress stall warnings"). I disagreed [1], but the overall consensus was
that such a warning won't be harmful. Could you be more specific about why
you consider it wrong, please?

[1] http://lkml.kernel.org/r/20170125184548.GB32041@dhcp22.suse.cz

> 
> ---
>  mm/page_alloc.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c
> +++ linux-2.6/mm/page_alloc.c
> @@ -3923,7 +3923,7 @@ retry:
>  
>  	/* Make sure we know about allocations which stall for too long */
>  	if (time_after(jiffies, alloc_start + stall_timeout)) {
> -		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
> +		warn_alloc(gfp_mask, ac->nodemask,
>  			"page allocation stalls for %ums, order:%u",
>  			jiffies_to_msecs(jiffies-alloc_start), order);
>  		stall_timeout += 10 * HZ;

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-11  8:26   ` Michal Hocko
@ 2017-09-11 23:36     ` Mikulas Patocka
  -1 siblings, 0 replies; 20+ messages in thread
From: Mikulas Patocka @ 2017-09-11 23:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Vlastimil Babka, Tetsuo Handa, Johannes Weiner, Mel Gorman,
	Dave Hansen, Andrew Morton, linux-mm, linux-kernel



On Mon, 11 Sep 2017, Michal Hocko wrote:

> On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
> > I am occasionally getting these warnings in khugepaged. It is an old 
> > machine with 550MHz CPU and 512 MB RAM.
> > 
> > Note that khugepaged has nice value 19, so when the machine is loaded with 
> > some work, khugepaged is stalled and this stall produces warning in the 
> > allocator.
> > 
> > khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
> > is masked off when calling warn_alloc. This patch removes the masking of
> > __GFP_NOWARN, so that the warning is suppressed.
> > 
> > khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
> > CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
> > Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> > Call Trace:
> >  ? warn_alloc+0xb9/0x140
> >  ? __alloc_pages_nodemask+0x724/0x880
> >  ? arch_irq_stat_cpu+0x1/0x40
> >  ? detach_if_pending+0x80/0x80
> >  ? khugepaged+0x10a/0x1d40
> >  ? pick_next_task_fair+0xd2/0x180
> >  ? wait_woken+0x60/0x60
> >  ? kthread+0xcf/0x100
> >  ? release_pte_page+0x40/0x40
> >  ? kthread_create_on_node+0x40/0x40
> >  ? ret_from_fork+0x19/0x30
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")
> 
> This patch hasn't introduced this behavior. It deliberately skipped
> warning on __GFP_NOWARN. This has been introduced later by 822519634142
> ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall warnings"). I
> disagreed [1] but overall consensus was that such a warning won't be
> harmful. Could you be more specific why do you consider it wrong,
> please?

I consider the warning wrong, because it warns when nothing goes wrong. 
I've got 7 of these warnings in 4 weeks of uptime. The warnings typically
happen when I run some compilation.

A process with low priority is expected to be running slowly when there's 
some high-priority process, so there's no need to warn that the 
low-priority process runs slowly.

What else can be done to avoid the warning? Skip the warning if the 
process has lower priority?

Mikulas

> [1] http://lkml.kernel.org/r/20170125184548.GB32041@dhcp22.suse.cz
> 
> > 
> > ---
> >  mm/page_alloc.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux-2.6/mm/page_alloc.c
> > ===================================================================
> > --- linux-2.6.orig/mm/page_alloc.c
> > +++ linux-2.6/mm/page_alloc.c
> > @@ -3923,7 +3923,7 @@ retry:
> >  
> >  	/* Make sure we know about allocations which stall for too long */
> >  	if (time_after(jiffies, alloc_start + stall_timeout)) {
> > -		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
> > +		warn_alloc(gfp_mask, ac->nodemask,
> >  			"page allocation stalls for %ums, order:%u",
> >  			jiffies_to_msecs(jiffies-alloc_start), order);
> >  		stall_timeout += 10 * HZ;
> 
> -- 
> Michal Hocko
> SUSE Labs
> 

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-11 23:36     ` Mikulas Patocka
@ 2017-09-12  7:14       ` Vlastimil Babka
  -1 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-09-12  7:14 UTC (permalink / raw)
  To: Mikulas Patocka, Michal Hocko
  Cc: Tetsuo Handa, Johannes Weiner, Mel Gorman, Dave Hansen,
	Andrew Morton, linux-mm, linux-kernel

On 09/12/2017 01:36 AM, Mikulas Patocka wrote:
> 
> 
> On Mon, 11 Sep 2017, Michal Hocko wrote:
> 
>> On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
>>
>> This patch hasn't introduced this behavior. It deliberately skipped
>> warning on __GFP_NOWARN. This has been introduced later by 822519634142
>> ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall warnings"). I
>> disagreed [1] but overall consensus was that such a warning won't be
>> harmful. Could you be more specific why do you consider it wrong,
>> please?
> 
> I consider the warning wrong, because it warns when nothing goes wrong. 
> I've got 7 these warnings for 4 weeks of uptime. The warnings typically 
> happen when I run some compilation.
> 
> A process with low priority is expected to be running slowly when there's 
> some high-priority process, so there's no need to warn that the 
> low-priority process runs slowly.
> 
> What else can be done to avoid the warning? Skip the warning if the 
> process has lower priority?

We would have to consider (instead of jiffies) the time the process was
either running, or waiting on something that's related to memory
allocation/reclaim (page lock etc.). I.e. deduct the time the process
was runnable but there was no available CPU. I expect, however, that such
a level of detail wouldn't be feasible here?

Vlastimil
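
The accounting described above could be sketched roughly as follows. This
is a user-space illustration only, assuming the time spent runnable but
without a CPU were available as a number; the struct, its fields and both
functions are hypothetical names, not existing allocator code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-allocation timing sample. */
struct stall_sample {
	uint64_t wall_ms;      /* wall-clock time since the allocation started */
	uint64_t run_delay_ms; /* part of wall_ms spent runnable, waiting for a CPU */
};

/* Time attributable to running or to waiting on allocation/reclaim. */
uint64_t effective_stall_ms(const struct stall_sample *s)
{
	assert(s->run_delay_ms <= s->wall_ms);
	return s->wall_ms - s->run_delay_ms;
}

/* Warn only if the stall, minus CPU starvation, exceeds the timeout. */
bool should_warn_stall(const struct stall_sample *s, uint64_t timeout_ms)
{
	return effective_stall_ms(s) > timeout_ms;
}
```

For example, a 10273 ms stall of which (hypothetically) 9000 ms was spent
waiting for a CPU would not warn against a 10000 ms timeout, while the same
stall with no run-queue delay would.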

> Mikulas
> 
>> [1] http://lkml.kernel.org/r/20170125184548.GB32041@dhcp22.suse.cz
>>
>>>
>>> ---
>>>  mm/page_alloc.c |    2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> Index: linux-2.6/mm/page_alloc.c
>>> ===================================================================
>>> --- linux-2.6.orig/mm/page_alloc.c
>>> +++ linux-2.6/mm/page_alloc.c
>>> @@ -3923,7 +3923,7 @@ retry:
>>>  
>>>  	/* Make sure we know about allocations which stall for too long */
>>>  	if (time_after(jiffies, alloc_start + stall_timeout)) {
>>> -		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
>>> +		warn_alloc(gfp_mask, ac->nodemask,
>>>  			"page allocation stalls for %ums, order:%u",
>>>  			jiffies_to_msecs(jiffies-alloc_start), order);
>>>  		stall_timeout += 10 * HZ;
>>
>> -- 
>> Michal Hocko
>> SUSE Labs
>>

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-11 23:36     ` Mikulas Patocka
@ 2017-09-13 11:54       ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-09-13 11:54 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Vlastimil Babka, Tetsuo Handa, Johannes Weiner, Mel Gorman,
	Dave Hansen, Andrew Morton, linux-mm, linux-kernel

On Mon 11-09-17 19:36:59, Mikulas Patocka wrote:
> 
> 
> On Mon, 11 Sep 2017, Michal Hocko wrote:
> 
> > On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
> > > I am occasionally getting these warnings in khugepaged. It is an old 
> > > machine with 550MHz CPU and 512 MB RAM.
> > > 
> > > Note that khugepaged has nice value 19, so when the machine is loaded with 
> > > some work, khugepaged is stalled and this stall produces warning in the 
> > > allocator.
> > > 
> > > khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
> > > is masked off when calling warn_alloc. This patch removes the masking of
> > > __GFP_NOWARN, so that the warning is suppressed.
> > > 
> > > khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
> > > CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
> > > Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> > > Call Trace:
> > >  ? warn_alloc+0xb9/0x140
> > >  ? __alloc_pages_nodemask+0x724/0x880
> > >  ? arch_irq_stat_cpu+0x1/0x40
> > >  ? detach_if_pending+0x80/0x80
> > >  ? khugepaged+0x10a/0x1d40
> > >  ? pick_next_task_fair+0xd2/0x180
> > >  ? wait_woken+0x60/0x60
> > >  ? kthread+0xcf/0x100
> > >  ? release_pte_page+0x40/0x40
> > >  ? kthread_create_on_node+0x40/0x40
> > >  ? ret_from_fork+0x19/0x30
> > > 
> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > Cc: stable@vger.kernel.org
> > > Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")
> > 
> > This patch hasn't introduced this behavior. It deliberately skipped
> > warning on __GFP_NOWARN. This has been introduced later by 822519634142
> > ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall warnings"). I
> > disagreed [1] but overall consensus was that such a warning won't be
> > harmful. Could you be more specific why do you consider it wrong,
> > please?
> 
> I consider the warning wrong, because it warns when nothing goes wrong. 
> I've got 7 these warnings for 4 weeks of uptime. The warnings typically 
> happen when I run some compilation.
> 
> A process with low priority is expected to be running slowly when there's 
> some high-priority process, so there's no need to warn that the 
> low-priority process runs slowly.

I would tend to agree. It is certainly noise in the log, and the kind of
thing I was worried about when objecting to the patch previously.
 
> What else can be done to avoid the warning? Skip the warning if the 
> process has lower priority?

No, I wouldn't play with priorities. Either we agree that NOWARN
allocations simply do _not_warn_, or we simply explain to users that some
of those warnings might not be that critical and that an overloaded system
might show them.

Let's see what others think about this.
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-13 11:54       ` Michal Hocko
@ 2017-09-13 13:54         ` Tetsuo Handa
  -1 siblings, 0 replies; 20+ messages in thread
From: Tetsuo Handa @ 2017-09-13 13:54 UTC (permalink / raw)
  To: mhocko, mpatocka
  Cc: vbabka, hannes, mgorman, dave.hansen, akpm, linux-mm, linux-kernel

Michal Hocko wrote:
> On Mon 11-09-17 19:36:59, Mikulas Patocka wrote:
> > 
> > 
> > On Mon, 11 Sep 2017, Michal Hocko wrote:
> > 
> > > On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
> > > > I am occasionally getting these warnings in khugepaged. It is an old 
> > > > machine with 550MHz CPU and 512 MB RAM.
> > > > 
> > > > Note that khugepaged has nice value 19, so when the machine is loaded with 
> > > > some work, khugepaged is stalled and this stall produces warning in the 
> > > > allocator.
> > > > 
> > > > khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
> > > > is masked off when calling warn_alloc. This patch removes the masking of
> > > > __GFP_NOWARN, so that the warning is suppressed.
> > > > 
> > > > khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
> > > > CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
> > > > Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> > > > Call Trace:
> > > >  ? warn_alloc+0xb9/0x140
> > > >  ? __alloc_pages_nodemask+0x724/0x880
> > > >  ? arch_irq_stat_cpu+0x1/0x40
> > > >  ? detach_if_pending+0x80/0x80
> > > >  ? khugepaged+0x10a/0x1d40
> > > >  ? pick_next_task_fair+0xd2/0x180
> > > >  ? wait_woken+0x60/0x60
> > > >  ? kthread+0xcf/0x100
> > > >  ? release_pte_page+0x40/0x40
> > > >  ? kthread_create_on_node+0x40/0x40
> > > >  ? ret_from_fork+0x19/0x30
> > > > 
> > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > > > Cc: stable@vger.kernel.org
> > > > Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")
> > > 
> > > This patch hasn't introduced this behavior. It deliberately skipped
> > > warning on __GFP_NOWARN. This has been introduced later by 822519634142
> > > ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall warnings"). I
> > > disagreed [1] but overall consensus was that such a warning won't be
> > > harmful. Could you be more specific why do you consider it wrong,
> > > please?
> > 
> > I consider the warning wrong, because it warns when nothing goes wrong. 
> > I've got 7 these warnings for 4 weeks of uptime. The warnings typically 
> > happen when I run some compilation.
> > 
> > A process with low priority is expected to be running slowly when there's 
> > some high-priority process, so there's no need to warn that the 
> > low-priority process runs slowly.
> 
> I would tend to agree. It is certainly a noise in the log. And a kind of
> thing I was worried about when objecting the patch previously. 
>  
> > What else can be done to avoid the warning? Skip the warning if the 
> > process has lower priority?
> 
> No, I wouldn't play with priorities. Either we agree that NOWARN
> allocations simply do _not_warn_ or we simply explain users that some of
> those warnings might not be that critical and overloaded system might
> show them.
> 
> Let's see what others think about this.

Whether __GFP_NOWARN should warn about stalls is not a topic to discuss.
I consider warn_alloc() for reporting stalls to be broken. It fails to provide
a backtrace of the stalling location. For example, an OOM lockup with oom_lock
held cannot be reported by warn_alloc(). It also fails to provide readable
output when called concurrently. For example, concurrent calls can cause a
printk()/schedule_timeout_killable() lockup with oom_lock held. printk()
offloading is not an option, for there will be situations where printk()
offloading cannot be used (e.g. queuing via printk() is faster than writing to
serial consoles, which results in unreadable logs due to log_buf overflow).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-13 13:54         ` Tetsuo Handa
@ 2017-09-13 14:03           ` Vlastimil Babka
  -1 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-09-13 14:03 UTC (permalink / raw)
  To: Tetsuo Handa, mhocko, mpatocka
  Cc: hannes, mgorman, dave.hansen, akpm, linux-mm, linux-kernel

On 09/13/2017 03:54 PM, Tetsuo Handa wrote:
> Michal Hocko wrote:
>> On Mon 11-09-17 19:36:59, Mikulas Patocka wrote:
>>>
>>>
>>> On Mon, 11 Sep 2017, Michal Hocko wrote:
>>>
>>>> On Mon 11-09-17 02:52:53, Mikulas Patocka wrote:
>>>>> I am occasionally getting these warnings in khugepaged. It is an old 
>>>>> machine with 550MHz CPU and 512 MB RAM.
>>>>>
>>>>> Note that khugepaged has nice value 19, so when the machine is loaded with 
>>>>> some work, khugepaged is stalled and this stall produces warning in the 
>>>>> allocator.
>>>>>
>>>>> khugepaged does allocations with __GFP_NOWARN, but the flag __GFP_NOWARN
>>>>> is masked off when calling warn_alloc. This patch removes the masking of
>>>>> __GFP_NOWARN, so that the warning is suppressed.
>>>>>
>>>>> khugepaged: page allocation stalls for 10273ms, order:10, mode:0x4340ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
>>>>> CPU: 0 PID: 3936 Comm: khugepaged Not tainted 4.12.3 #1
>>>>> Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
>>>>> Call Trace:
>>>>>  ? warn_alloc+0xb9/0x140
>>>>>  ? __alloc_pages_nodemask+0x724/0x880
>>>>>  ? arch_irq_stat_cpu+0x1/0x40
>>>>>  ? detach_if_pending+0x80/0x80
>>>>>  ? khugepaged+0x10a/0x1d40
>>>>>  ? pick_next_task_fair+0xd2/0x180
>>>>>  ? wait_woken+0x60/0x60
>>>>>  ? kthread+0xcf/0x100
>>>>>  ? release_pte_page+0x40/0x40
>>>>>  ? kthread_create_on_node+0x40/0x40
>>>>>  ? ret_from_fork+0x19/0x30
>>>>>
>>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
>>>>> Cc: stable@vger.kernel.org
>>>>> Fixes: 63f53dea0c98 ("mm: warn about allocations which stall for too long")
>>>>
>>>> This patch hasn't introduced this behavior. It deliberately skipped
>>>> warning on __GFP_NOWARN. This has been introduced later by 822519634142
>>>> ("mm: page_alloc: __GFP_NOWARN shouldn't suppress stall warnings"). I
>>>> disagreed [1] but overall consensus was that such a warning won't be
>>>> harmful. Could you be more specific why do you consider it wrong,
>>>> please?
>>>
>>> I consider the warning wrong, because it warns when nothing goes wrong. 
>>> I've got 7 these warnings for 4 weeks of uptime. The warnings typically 
>>> happen when I run some compilation.
>>>
>>> A process with low priority is expected to be running slowly when there's 
>>> some high-priority process, so there's no need to warn that the 
>>> low-priority process runs slowly.
>>
>> I would tend to agree. It is certainly a noise in the log. And a kind of
>> thing I was worried about when objecting the patch previously. 
>>  
>>> What else can be done to avoid the warning? Skip the warning if the 
>>> process has lower priority?
>>
>> No, I wouldn't play with priorities. Either we agree that NOWARN
>> allocations simply do _not_warn_ or we simply explain users that some of
>> those warnings might not be that critical and overloaded system might
>> show them.
>>
>> Let's see what others think about this.
> 
> Whether __GFP_NOWARN should warn about stalls is not a topic to discuss.

It is the topic of this thread, which tries to address a concrete
problem somebody has experienced. In that context, the rest of your
concerns seem to me not related to this problem, IMHO.

> I consider warn_alloc() for reporting stalls is broken. It fails to provide
> backtrace of stalling location. For example, OOM lockup with oom_lock held
> cannot be reported by warn_alloc(). It fails to provide readable output when
> called concurrently. For example, concurrent calls can cause printk()/
> schedule_timeout_killable() lockup with oom_lock held. printk() offloading is
> not an option, for there will be situations where printk() offloading cannot
> be used (e.g. queuing via printk() is faster than writing to serial consoles
> which results in unreadable logs due to log_bug overflow).
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-13 14:03           ` Vlastimil Babka
@ 2017-09-13 14:14             ` Tetsuo Handa
  -1 siblings, 0 replies; 20+ messages in thread
From: Tetsuo Handa @ 2017-09-13 14:14 UTC (permalink / raw)
  To: vbabka, mhocko, mpatocka
  Cc: hannes, mgorman, dave.hansen, akpm, linux-mm, linux-kernel

Vlastimil Babka wrote:
> On 09/13/2017 03:54 PM, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> >> Let's see what others think about this.
> > 
> > Whether __GFP_NOWARN should warn about stalls is not a topic to discuss.
> 
> It is the topic of this thread, which tries to address a concrete
> problem somebody has experienced. In that context, the rest of your
> concerns seem to me not related to this problem, IMHO.

I suggested replacing warn_alloc() with a safe/useful one rather than tweaking
how warn_alloc() handles __GFP_NOWARN.

> 
> > I consider warn_alloc() for reporting stalls is broken. It fails to provide
> > backtrace of stalling location. For example, OOM lockup with oom_lock held
> > cannot be reported by warn_alloc(). It fails to provide readable output when
> > called concurrently. For example, concurrent calls can cause printk()/
> > schedule_timeout_killable() lockup with oom_lock held. printk() offloading is
> > not an option, for there will be situations where printk() offloading cannot
> > be used (e.g. queuing via printk() is faster than writing to serial consoles
> > which results in unreadable logs due to log_bug overflow).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-13 14:14             ` Tetsuo Handa
@ 2017-09-13 14:35               ` Michal Hocko
  -1 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-09-13 14:35 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: vbabka, mpatocka, hannes, mgorman, dave.hansen, akpm, linux-mm,
	linux-kernel

On Wed 13-09-17 23:14:43, Tetsuo Handa wrote:
> Vlastimil Babka wrote:
> > On 09/13/2017 03:54 PM, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > >> Let's see what others think about this.
> > > 
> > > Whether __GFP_NOWARN should warn about stalls is not a topic to discuss.
> > 
> > It is the topic of this thread, which tries to address a concrete
> > problem somebody has experienced. In that context, the rest of your
> > concerns seem to me not related to this problem, IMHO.
> 
> I suggested replacing warn_alloc() with safe/useful one rather than tweaking
> warn_alloc() about __GFP_NOWARN.

What you seem to ignore is that whatever method you use for reporting
stalling allocations, you would still have to consider whether to dump
stall information for __GFP_NOWARN ones. And as the current report
shows, that might be a bad idea. So please stick to the topic and do not
move it towards _what_ is the proper way of stall detection.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread
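
The question debated above (should a stall report honour __GFP_NOWARN or
not) can be modelled in a few lines of userspace C. This is only a sketch
under stated assumptions, not the code in mm/page_alloc.c: the names
sketch_warn_alloc() and sketch_report_stall(), the SKETCH_GFP_NOWARN value
and the mask_nowarn switch are all made up for illustration. The stall
length and mode value in the usage note are taken from the report quoted
in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for __GFP_NOWARN; the bit value is an assumption. */
#define SKETCH_GFP_NOWARN 0x200u

/* Models a reporter that stays silent when the caller set the NOWARN bit. */
static bool sketch_warn_alloc(unsigned int gfp_mask, unsigned int stall_ms)
{
	if (gfp_mask & SKETCH_GFP_NOWARN)
		return false;	/* caller asked for no warnings */
	printf("page allocation stalls for %ums, mode:%#x\n", stall_ms, gfp_mask);
	return true;
}

/*
 * Models the stall check. mask_nowarn == true strips the NOWARN bit before
 * reporting (the behaviour the patch wants to remove); mask_nowarn == false
 * passes the flags through unchanged (the behaviour the patch proposes).
 * Returns true if a warning was printed.
 */
static bool sketch_report_stall(unsigned int gfp_mask, unsigned int stall_ms,
				unsigned int timeout_ms, bool mask_nowarn)
{
	assert(timeout_ms > 0);
	if (stall_ms <= timeout_ms)
		return false;	/* not stalled long enough to report */
	if (mask_nowarn)
		gfp_mask &= ~SKETCH_GFP_NOWARN;
	return sketch_warn_alloc(gfp_mask, stall_ms);
}
```

With mask_nowarn set, a call such as
sketch_report_stall(0x4340ca | SKETCH_GFP_NOWARN, 10273, 10000, true)
prints a warning; with mask_nowarn cleared the same call stays silent.
Whatever replaces the reporter, this one decision still has to be made.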

* Re: [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls
  2017-09-12  7:14       ` Vlastimil Babka
@ 2017-09-13 14:52         ` Shakeel Butt
  -1 siblings, 0 replies; 20+ messages in thread
From: Shakeel Butt @ 2017-09-13 14:52 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Mikulas Patocka, Michal Hocko, Tetsuo Handa, Johannes Weiner,
	Mel Gorman, Dave Hansen, Andrew Morton, Linux MM, LKML

>
> We would have to consider (instead of jiffies) the time the process was
> either running, or waiting on something that's related to memory
> allocation/reclaim (page lock etc.). I.e. deduct the time the process
> was runable but there was no available cpu. I expect however that such
> level of detail wouldn't be feasible here, though?
>

Johannes' memdelay work (once merged) might be useful here. I think
memdelay can differentiate between an allocating process getting
delayed due to preemption or due to unsuccessful reclaim/compaction.
If the delay is due to unsuccessful reclaim/compaction then we should
warn here.

^ permalink raw reply	[flat|nested] 20+ messages in thread
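
The accounting idea discussed above (count only the time the task was
running or blocked on reclaim, and deduct the time it was runnable but had
no CPU) could look roughly like the following userspace C sketch. It is
not the memdelay interface and not kernel code: the function names and the
runnable_wait_ms input are hypothetical, and where such a number would come
from inside the allocator is left open, as it is in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Elapsed time in the allocator minus the time the task sat on a runqueue
 * waiting for a CPU. Both inputs are in milliseconds.
 */
static uint64_t sketch_effective_stall_ms(uint64_t elapsed_ms,
					  uint64_t runnable_wait_ms)
{
	/* The wait time is a part of the elapsed time, never more. */
	assert(runnable_wait_ms <= elapsed_ms);
	return elapsed_ms - runnable_wait_ms;
}

/* Warn only if the time not explained by CPU starvation exceeds the timeout. */
static bool sketch_should_warn(uint64_t elapsed_ms, uint64_t runnable_wait_ms,
			       uint64_t timeout_ms)
{
	return sketch_effective_stall_ms(elapsed_ms, runnable_wait_ms) > timeout_ms;
}
```

As an illustration with an invented wait time: if the 10273ms stall from
the original report included 9800ms of waiting for a CPU, the effective
stall would be 473ms and a 10000ms timeout would not trigger a warning.
A nice-19 task starved by a compile job would then stay quiet, while a
task stuck in reclaim/compaction for the same wall-clock time would not.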

end of thread, other threads:[~2017-09-13 14:52 UTC | newest]

Thread overview: 20+ messages
2017-09-11  6:52 [PATCH] mm: respect the __GFP_NOWARN flag when warning about stalls Mikulas Patocka
2017-09-11  6:52 ` Mikulas Patocka
2017-09-11  8:26 ` Michal Hocko
2017-09-11  8:26   ` Michal Hocko
2017-09-11 23:36   ` Mikulas Patocka
2017-09-11 23:36     ` Mikulas Patocka
2017-09-12  7:14     ` Vlastimil Babka
2017-09-12  7:14       ` Vlastimil Babka
2017-09-13 14:52       ` Shakeel Butt
2017-09-13 14:52         ` Shakeel Butt
2017-09-13 11:54     ` Michal Hocko
2017-09-13 11:54       ` Michal Hocko
2017-09-13 13:54       ` Tetsuo Handa
2017-09-13 13:54         ` Tetsuo Handa
2017-09-13 14:03         ` Vlastimil Babka
2017-09-13 14:03           ` Vlastimil Babka
2017-09-13 14:14           ` Tetsuo Handa
2017-09-13 14:14             ` Tetsuo Handa
2017-09-13 14:35             ` Michal Hocko
2017-09-13 14:35               ` Michal Hocko
