From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752239AbcFCH5Z (ORCPT ); Fri, 3 Jun 2016 03:57:25 -0400
Received: from mail-it0-f68.google.com ([209.85.214.68]:35056 "EHLO
        mail-it0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932083AbcFCH5X (ORCPT );
        Fri, 3 Jun 2016 03:57:23 -0400
MIME-Version: 1.0
In-Reply-To: <20160602114341.e3b974640fc3f8cbcb54898b@linux-foundation.org>
References: <20160530155644.GP2527@techsingularity.net>
        <574E05B8.3060009@suse.cz>
        <20160601091921.GT2527@techsingularity.net>
        <574EB274.4030408@suse.cz>
        <20160602103936.GU2527@techsingularity.net>
        <0eb1f112-65d4-f2e5-911e-697b21324b9f@suse.cz>
        <20160602121936.GV2527@techsingularity.net>
        <20160602114341.e3b974640fc3f8cbcb54898b@linux-foundation.org>
From: Geert Uytterhoeven
Date: Fri, 3 Jun 2016 09:57:22 +0200
X-Google-Sender-Auth: kcPcvXBDSgQoKFvAws65UueukuU
Message-ID:
Subject: Re: BUG: scheduling while atomic: cron/668/0x10c9a0c0
To: Andrew Morton , Mel Gorman
Cc: Vlastimil Babka , Linux Kernel Mailing List , Linux MM , linux-m68k
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Andrew, Mel,

On Thu, Jun 2, 2016 at 8:43 PM, Andrew Morton wrote:
> On Thu, 2 Jun 2016 13:19:36 +0100 Mel Gorman wrote:
>> > >Signed-off-by: Mel Gorman
>> >
>> > Acked-by: Vlastimil Babka
>> >
>>
>> Thanks.
>
> I queued this.  A tested-by:Geert would be nice?
>
>
> From: Mel Gorman
> Subject: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies
>
> The optimistic fast path may use cpuset_current_mems_allowed instead of
> a NULL nodemask supplied by the caller for cpuset allocations.  The
> preferred zone is calculated on this basis for statistic purposes and as a
> starting point in the zonelist iterator.
>
> However, if the context can ignore memory policies due to being atomic or
> being able to ignore watermarks then the starting point in the zonelist
> iterator is no longer correct.  This patch resets the zonelist iterator in
> the allocator slowpath if the context can ignore memory policies.  This
> will alter the zone used for statistics but only after it is known that it
> makes sense for that context.  Resetting it before entering the slowpath
> would potentially allow an ALLOC_CPUSET allocation to be accounted for
> against the wrong zone.  Note that while nodemask is not explicitly set to
> the original nodemask, it would only have been overwritten if
> cpuset_enabled() and it was reset before the slowpath was entered.
>
> Link: http://lkml.kernel.org/r/20160602103936.GU2527@techsingularity.net
> Fixes: c33d6c06f60f710 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")

My understanding was that this was an additional patch, not fixing
the problem in-se?

Indeed, after applying this patch (without the other one that added
"z = ac->preferred_zoneref;" to the reset_fair block of
get_page_from_freelist()) I still get crashes...

Now testing with both applied...

> Signed-off-by: Mel Gorman
> Reported-by: Geert Uytterhoeven
> Acked-by: Vlastimil Babka
> Signed-off-by: Andrew Morton
> ---
>
>  mm/page_alloc.c |   23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff -puN mm/page_alloc.c~mm-page_alloc-recalculate-the-preferred-zoneref-if-the-context-can-ignore-memory-policies mm/page_alloc.c
> --- a/mm/page_alloc.c~mm-page_alloc-recalculate-the-preferred-zoneref-if-the-context-can-ignore-memory-policies
> +++ a/mm/page_alloc.c
> @@ -3604,6 +3604,17 @@ retry:
>  	 */
>  	alloc_flags = gfp_to_alloc_flags(gfp_mask);
>
> +	/*
> +	 * Reset the zonelist iterators if memory policies can be ignored.
> +	 * These allocations are high priority and system rather than user
> +	 * orientated.
> +	 */
> +	if ((alloc_flags & ALLOC_NO_WATERMARKS) || !(alloc_flags & ALLOC_CPUSET)) {
> +		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
> +		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
> +					ac->high_zoneidx, ac->nodemask);
> +	}
> +
>  	/* This is the last chance, in general, before the goto nopage. */
>  	page = get_page_from_freelist(gfp_mask, order,
>  				alloc_flags & ~ALLOC_NO_WATERMARKS, ac);
> @@ -3612,12 +3623,6 @@ retry:
>
>  	/* Allocate without watermarks if the context allows */
>  	if (alloc_flags & ALLOC_NO_WATERMARKS) {
> -		/*
> -		 * Ignore mempolicies if ALLOC_NO_WATERMARKS on the grounds
> -		 * the allocation is high priority and these type of
> -		 * allocations are system rather than user orientated
> -		 */
> -		ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>  		page = get_page_from_freelist(gfp_mask, order,
>  						ALLOC_NO_WATERMARKS, ac);
>  		if (page)
> @@ -3816,7 +3821,11 @@ retry_cpuset:
>  	/* Dirty zone balancing only done in the fast path */
>  	ac.spread_dirty_pages = (gfp_mask & __GFP_WRITE);
>
> -	/* The preferred zone is used for statistics later */
> +	/*
> +	 * The preferred zone is used for statistics but crucially it is
> +	 * also used as the starting point for the zonelist iterator. It
> +	 * may get reset for allocations that ignore memory policies.
> +	 */
>  	ac.preferred_zoneref = first_zones_zonelist(ac.zonelist,
>  					ac.high_zoneidx, ac.nodemask);
>  	if (!ac.preferred_zoneref) {
> _
> --

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds