* [PATCH] mm/page_alloc: Correct return value of populated elements if bulk array is populated
@ 2021-06-28 15:02 Mel Gorman
From: Mel Gorman @ 2021-06-28 15:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Dave Jones, Dan Carpenter, Jesper Dangaard Brouer,
	Vlastimil Babka, Linux-MM, LKML, Linus Torvalds

Dave Jones reported the following:

	This made it into 5.13 final, and completely breaks NFSD for me
	(Serving tcp v3 mounts).  Existing mounts on clients hang, as do
	new mounts from new clients.  Rebooting the server back to rc7
	everything recovers.

Commit b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after
checking populated elements") returns 0 if the array is already fully
populated, which callers interpret as an allocation failure. Return the
number of populated elements instead. Dave reported that this fixes his
problem, and it also passed a test running dbench over NFS.

Fixes: b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after checking populated elements")
Reported-and-tested-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.13+]
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ef2265f86b91..04220581579c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5058,7 +5058,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 	/* Already populated array? */
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return 0;
+		return nr_populated;
 
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)


* Re: [PATCH] mm/page_alloc: Correct return value of populated elements if bulk array is populated
@ 2021-06-29 16:59   ` Geert Uytterhoeven
From: Geert Uytterhoeven @ 2021-06-29 16:59 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, Dave Jones, Dan Carpenter, Jesper Dangaard Brouer,
	Vlastimil Babka, Linux-MM, LKML, Linus Torvalds, Linux-Renesas

Hi Mel,

On Mon, Jun 28, 2021 at 5:29 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> Dave Jones reported the following
>
>         This made it into 5.13 final, and completely breaks NFSD for me
>         (Serving tcp v3 mounts).  Existing mounts on clients hang, as do
>         new mounts from new clients.  Rebooting the server back to rc7
>         everything recovers.
>
> The commit b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after
> checking populated elements") returns the wrong value if the array is
> already populated which is interpreted as an allocation failure. Dave
> reported this fixes his problem and it also passed a test running dbench
> over NFS.
>
> Fixes: b3b64ebd3822 ("mm/page_alloc: do bulk array bounds check after checking populated elements")
> Reported-and-tested-by: Dave Jones <davej@codemonkey.org.uk>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> Cc: <stable@vger.kernel.org> [5.13+]

I saw failures similar to Mike Galbraith's when doing s2idle or s2ram
on some boards with some configs:

    Freezing of tasks failed after 20.004 seconds (1 tasks refusing to freeze, wq_busy=0):
    task:NFSv4 callback  state:S stack:    0 pid:  280 ppid:     2 flags:0x00000000
    [<c094b634>] (__schedule) from [<c094b8d0>] (schedule+0xc0/0x110)
    [<c094b8d0>] (schedule) from [<c094faec>] (schedule_timeout+0xc8/0x108)
    [<c094faec>] (schedule_timeout) from [<c092e0a0>] (svc_recv+0x108/0xa30)
    [<c092e0a0>] (svc_recv) from [<c04c5990>] (nfs4_callback_svc+0x6c/0x84)
    [<c04c5990>] (nfs4_callback_svc) from [<c0244ddc>] (kthread+0x128/0x138)
    [<c0244ddc>] (kthread) from [<c0200114>] (ret_from_fork+0x14/0x20)

I've bisected it (twice, as I couldn't believe the result) to the
same commit, which helped me find the fix.

After cherry-picking commit 66d9282523b32281 ("mm/page_alloc: Correct
return value of populated elements if bulk array is populated"),
the problem went away.

Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

