* v5.13 regression - suspend went belly up
From: Mike Galbraith @ 2021-06-28 13:33 UTC
  To: lkml; +Cc: Mel Gorman

Greetings,

A regression popping up post rc7 is a bit unusual, but hohum, I suppose
they can bite whenever they damn well feel like it.  This one was
bisected rc7..release, the (surprising to me) result then confirmed in
four disgruntled local trees.

b3b64ebd38225d8032b5db42938d969b602040c2 is the first bad commit
commit b3b64ebd38225d8032b5db42938d969b602040c2
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Thu Jun 24 18:40:07 2021 -0700

    mm/page_alloc: do bulk array bounds check after checking populated elements

    Dan Carpenter reported the following

      The patch 0f87d9d30f21: "mm/page_alloc: add an array-based interface
      to the bulk page allocator" from Apr 29, 2021, leads to the following
      static checker warning:

            mm/page_alloc.c:5338 __alloc_pages_bulk()
            warn: potentially one past the end of array 'page_array[nr_populated]'

    The problem can occur if an array is passed in that is fully
    populated.  That potentially ends up allocating a single page and
    storing it past the end of the array.  This patch returns 0 if the
    array is fully populated; the added check is quoted below the
    diffstat.

    Link: https://lkml.kernel.org/r/20210618125102.GU30378@techsingularity.net
    Fixes: 0f87d9d30f21 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Cc: Jesper Dangaard Brouer <brouer@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

 mm/page_alloc.c | 4 ++++
 1 file changed, 4 insertions(+)
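
For reference, the check this commit added to __alloc_pages_bulk() is
visible as context in Mel's follow-up patch later in this thread;
quoted here for convenience, with surrounding code elided:

        /* Already populated array? */
        if (unlikely(page_array && nr_pages - nr_populated == 0))
                return 0;

nr_populated counts the leading array slots that already hold a page,
so this check fires exactly when the caller passes in a fully populated
array.  As the follow-up below shows, 0 turned out to be the wrong
value to return in that case.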

[   45.503409] PM: suspend entry (deep)
[   45.529287] Filesystems sync: 0.026 seconds
[   45.532934] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   45.534340] OOM killer disabled.
[   45.534341] Freezing remaining freezable tasks ...
[   65.505035] Freezing of tasks failed after 20.008 seconds (5 tasks refusing to freeze, wq_busy=0):
[   65.505079] task:lockd           state:S stack:    0 pid: 1809 ppid:     2 flags:0x00004000
[   65.505093] Call Trace:
[   65.505102]  __schedule+0x28b/0x870
[   65.505121]  schedule+0x3c/0xa0
[   65.505130]  schedule_timeout+0x1d2/0x260
[   65.505145]  ? del_timer_sync+0x40/0x40
[   65.505160]  svc_recv+0xc3/0x8a0 [sunrpc]
[   65.505310]  ? grace_ender+0x10/0x10 [lockd]
[   65.505331]  lockd+0x8b/0x190 [lockd]
[   65.505348]  ? __kthread_parkme+0x4c/0x70
[   65.505361]  kthread+0x115/0x130
[   65.505371]  ? kthread_park+0x90/0x90
[   65.505381]  ret_from_fork+0x1f/0x30
[   65.505401] task:nfsd            state:S stack:    0 pid: 2044 ppid:     2 flags:0x00004000
[   65.505413] Call Trace:
[   65.505417]  __schedule+0x28b/0x870
[   65.505428]  schedule+0x3c/0xa0
[   65.505437]  schedule_timeout+0x1d2/0x260
[   65.505449]  ? del_timer_sync+0x40/0x40
[   65.505459]  svc_recv+0xc3/0x8a0 [sunrpc]
[   65.505572]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   65.505636]  nfsd+0xd6/0x150 [nfsd]
[   65.505691]  kthread+0x115/0x130
[   65.505701]  ? kthread_park+0x90/0x90
[   65.505710]  ret_from_fork+0x1f/0x30
[   65.505722] task:nfsd            state:S stack:    0 pid: 2045 ppid:     2 flags:0x00004000
[   65.505732] Call Trace:
[   65.505737]  __schedule+0x28b/0x870
[   65.505748]  schedule+0x3c/0xa0
[   65.505757]  schedule_timeout+0x1d2/0x260
[   65.505769]  ? del_timer_sync+0x40/0x40
[   65.505779]  svc_recv+0xc3/0x8a0 [sunrpc]
[   65.505889]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   65.505945]  nfsd+0xd6/0x150 [nfsd]
[   65.505998]  kthread+0x115/0x130
[   65.506008]  ? kthread_park+0x90/0x90
[   65.506017]  ret_from_fork+0x1f/0x30
[   65.506029] task:nfsd            state:S stack:    0 pid: 2046 ppid:     2 flags:0x00004000
[   65.506039] Call Trace:
[   65.506043]  __schedule+0x28b/0x870
[   65.506054]  schedule+0x3c/0xa0
[   65.506063]  schedule_timeout+0x1d2/0x260
[   65.506081]  ? del_timer_sync+0x40/0x40
[   65.506094]  svc_recv+0xc3/0x8a0 [sunrpc]
[   65.506202]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   65.506260]  nfsd+0xd6/0x150 [nfsd]
[   65.506314]  kthread+0x115/0x130
[   65.506327]  ? kthread_park+0x90/0x90
[   65.506339]  ret_from_fork+0x1f/0x30
[   65.506355] task:nfsd            state:S stack:    0 pid: 2051 ppid:     2 flags:0x00004000
[   65.506367] Call Trace:
[   65.506374]  __schedule+0x28b/0x870
[   65.506387]  schedule+0x3c/0xa0
[   65.506399]  schedule_timeout+0x1d2/0x260
[   65.506414]  ? del_timer_sync+0x40/0x40
[   65.506426]  svc_recv+0xc3/0x8a0 [sunrpc]
[   65.506530]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   65.506608]  nfsd+0xd6/0x150 [nfsd]
[   65.506662]  kthread+0x115/0x130
[   65.506675]  ? kthread_park+0x90/0x90
[   65.506687]  ret_from_fork+0x1f/0x30

[   65.506780] Restarting kernel threads ... done.
[   65.506968] OOM killer enabled.
[   65.506973] Restarting tasks ... done.
[   65.508047] PM: suspend exit
[   65.508081] PM: suspend entry (s2idle)
[   65.516062] Filesystems sync: 0.007 seconds
[   65.516186] Freezing user space processes ... (elapsed 0.007 seconds) done.
[   65.523663] OOM killer disabled.
[   65.523664] Freezing remaining freezable tasks ...
[   85.525885] Freezing of tasks failed after 20.004 seconds (9 tasks refusing to freeze, wq_busy=0):
[   85.525937] task:lockd           state:S stack:    0 pid: 1809 ppid:     2 flags:0x00004000
[   85.525955] Call Trace:
[   85.525966]  __schedule+0x28b/0x870
[   85.525989]  schedule+0x3c/0xa0
[   85.526001]  schedule_timeout+0x1d2/0x260
[   85.526017]  ? del_timer_sync+0x40/0x40
[   85.526033]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.526167]  ? grace_ender+0x10/0x10 [lockd]
[   85.526190]  lockd+0x8b/0x190 [lockd]
[   85.526210]  ? __kthread_parkme+0x4c/0x70
[   85.526225]  kthread+0x115/0x130
[   85.526237]  ? kthread_park+0x90/0x90
[   85.526250]  ret_from_fork+0x1f/0x30
[   85.526271] task:nfsd            state:S stack:    0 pid: 2044 ppid:     2 flags:0x00004000
[   85.526284] Call Trace:
[   85.526291]  __schedule+0x28b/0x870
[   85.526305]  schedule+0x3c/0xa0
[   85.526317]  schedule_timeout+0x1d2/0x260
[   85.526332]  ? del_timer_sync+0x40/0x40
[   85.526344]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.526451]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.526511]  nfsd+0xd6/0x150 [nfsd]
[   85.526563]  kthread+0x115/0x130
[   85.526575]  ? kthread_park+0x90/0x90
[   85.526588]  ret_from_fork+0x1f/0x30
[   85.526602] task:nfsd            state:S stack:    0 pid: 2045 ppid:     2 flags:0x00004000
[   85.526615] Call Trace:
[   85.526621]  __schedule+0x28b/0x870
[   85.526634]  schedule+0x3c/0xa0
[   85.526646]  schedule_timeout+0x1d2/0x260
[   85.526661]  ? del_timer_sync+0x40/0x40
[   85.526673]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.526776]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.526830]  nfsd+0xd6/0x150 [nfsd]
[   85.526879]  kthread+0x115/0x130
[   85.526892]  ? kthread_park+0x90/0x90
[   85.526904]  ret_from_fork+0x1f/0x30
[   85.526918] task:nfsd            state:S stack:    0 pid: 2046 ppid:     2 flags:0x00004000
[   85.526931] Call Trace:
[   85.526937]  __schedule+0x28b/0x870
[   85.526951]  schedule+0x3c/0xa0
[   85.526962]  schedule_timeout+0x1d2/0x260
[   85.526977]  ? del_timer_sync+0x40/0x40
[   85.526989]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.527089]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.527142]  nfsd+0xd6/0x150 [nfsd]
[   85.527192]  kthread+0x115/0x130
[   85.527204]  ? kthread_park+0x90/0x90
[   85.527217]  ret_from_fork+0x1f/0x30
[   85.527231] task:nfsd            state:S stack:    0 pid: 2047 ppid:     2 flags:0x00004000
[   85.527242] Call Trace:
[   85.527249]  __schedule+0x28b/0x870
[   85.527262]  schedule+0x3c/0xa0
[   85.527273]  schedule_timeout+0x1d2/0x260
[   85.527288]  ? del_timer_sync+0x40/0x40
[   85.527300]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.527397]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.527449]  nfsd+0xd6/0x150 [nfsd]
[   85.527497]  kthread+0x115/0x130
[   85.527510]  ? kthread_park+0x90/0x90
[   85.527522]  ret_from_fork+0x1f/0x30
[   85.527536] task:nfsd            state:S stack:    0 pid: 2048 ppid:     2 flags:0x00004000
[   85.527548] Call Trace:
[   85.527554]  __schedule+0x28b/0x870
[   85.527568]  schedule+0x3c/0xa0
[   85.527579]  schedule_timeout+0x1d2/0x260
[   85.527594]  ? del_timer_sync+0x40/0x40
[   85.527606]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.527698]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.527749]  nfsd+0xd6/0x150 [nfsd]
[   85.527798]  kthread+0x115/0x130
[   85.527810]  ? kthread_park+0x90/0x90
[   85.527822]  ret_from_fork+0x1f/0x30
[   85.527837] task:nfsd            state:S stack:    0 pid: 2049 ppid:     2 flags:0x00004000
[   85.527848] Call Trace:
[   85.527854]  __schedule+0x28b/0x870
[   85.527868]  schedule+0x3c/0xa0
[   85.527879]  schedule_timeout+0x1d2/0x260
[   85.527894]  ? del_timer_sync+0x40/0x40
[   85.527906]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.527999]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.528050]  nfsd+0xd6/0x150 [nfsd]
[   85.528098]  kthread+0x115/0x130
[   85.528110]  ? kthread_park+0x90/0x90
[   85.528122]  ret_from_fork+0x1f/0x30
[   85.528137] task:nfsd            state:S stack:    0 pid: 2050 ppid:     2 flags:0x00004000
[   85.528148] Call Trace:
[   85.528154]  __schedule+0x28b/0x870
[   85.528167]  schedule+0x3c/0xa0
[   85.528178]  schedule_timeout+0x1d2/0x260
[   85.528193]  ? del_timer_sync+0x40/0x40
[   85.528205]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.528298]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.528363]  nfsd+0xd6/0x150 [nfsd]
[   85.528412]  kthread+0x115/0x130
[   85.528424]  ? kthread_park+0x90/0x90
[   85.528436]  ret_from_fork+0x1f/0x30
[   85.528450] task:nfsd            state:S stack:    0 pid: 2051 ppid:     2 flags:0x00004000
[   85.528462] Call Trace:
[   85.528469]  __schedule+0x28b/0x870
[   85.528482]  schedule+0x3c/0xa0
[   85.528494]  schedule_timeout+0x1d2/0x260
[   85.528508]  ? del_timer_sync+0x40/0x40
[   85.528521]  svc_recv+0xc3/0x8a0 [sunrpc]
[   85.528616]  ? nfsd_shutdown_threads+0x80/0x80 [nfsd]
[   85.528668]  nfsd+0xd6/0x150 [nfsd]
[   85.528715]  kthread+0x115/0x130
[   85.528728]  ? kthread_park+0x90/0x90
[   85.528740]  ret_from_fork+0x1f/0x30

[   85.528831] Restarting kernel threads ... done.
[   85.529020] OOM killer enabled.
[   85.529024] Restarting tasks ... done.
[   85.529753] PM: suspend exit



* Re: v5.13 regression - suspend went belly up
From: Mel Gorman @ 2021-06-28 14:51 UTC
  To: Mike Galbraith; +Cc: lkml

On Mon, Jun 28, 2021 at 03:33:42PM +0200, Mike Galbraith wrote:
> Greetings,
> 
> A regression popping up post rc7 is a bit unusual, but hohum, I suppose
> they can bite whenever they damn well feel like it.  This one was
> bisected rc7..release, the (surprising to me) result then confirmed in
> four disgruntled local trees.
> 

Hi Mike, this may be NFS-related fallout given that it appears in the
stack. Can you try this please?

--8<---
mm/page_alloc: Correct return value of populated elements if bulk array is populated

Dave Jones reported the following

	This made it into 5.13 final, and completely breaks NFSD for me
	(Serving tcp v3 mounts).  Existing mounts on clients hang, as do
	new mounts from new clients.  Rebooting the server back to rc7
	everything recovers.

The commit b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after
checking populated elements") returns 0 rather than the number of
already-populated elements when the array is fully populated, which the
caller interprets as an allocation failure.  A sketch of the caller side
follows the diff below.

Fixes: b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after checking populated elements")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.13+]
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ef2265f86b91..04220581579c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5058,7 +5058,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 	/* Already populated array? */
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return 0;
+		return nr_populated;
 
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
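
To see why returning 0 wedges nfsd and lockd, consider the consumer
side.  The following is a rough sketch of the retry loop that the
5.13-era svc_alloc_arg() in sunrpc runs before svc_recv() can proceed;
it is simplified and written from memory, so treat the details as
approximate rather than the exact kernel code:

        /*
         * rqstp->rq_pages may still be fully or partially populated
         * from the previous request.  alloc_pages_bulk_array() returns
         * the total number of populated slots, so reaching the target
         * count means the array is ready.
         */
        for (filled = 0; filled < pages; filled = ret) {
                ret = alloc_pages_bulk_array(GFP_KERNEL, pages,
                                             rqstp->rq_pages);
                if (ret > filled)
                        /* Made progress, retry immediately. */
                        continue;

                /* No progress looks like OOM: nap, then retry. */
                set_current_state(TASK_INTERRUPTIBLE);
                schedule_timeout(msecs_to_jiffies(500));
        }

With the buggy return value, a fully populated array makes
alloc_pages_bulk_array() report 0, the loop never sees progress, and
the thread naps in schedule_timeout() indefinitely.  A kthread parked
there never reaches a freeze point, which lines up with the "refusing
to freeze" suspend failures in Mike's log above.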


* Re: v5.13 regression - suspend went belly up
From: Mike Galbraith @ 2021-06-28 15:11 UTC
  To: Mel Gorman; +Cc: lkml

On Mon, 2021-06-28 at 15:51 +0100, Mel Gorman wrote:
>
> Hi Mike, this may be NFS-related fallout given that it appears in the
> stack. Can you try this please?

Yup, all better.  Thanks.

	-Mike

