Re: [PATCH] mm: page_alloc: avoid excessive IRQ disabled times in free_unref_page_list

From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Lucas Stach <l.stach@pengutronix.de>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, kernel@pengutronix.de,
	patchwork-lst@pengutronix.de
Subject: Re: [PATCH] mm: page_alloc: avoid excessive IRQ disabled times in free_unref_page_list
Date: Fri, 8 Dec 2017 10:21:30 +0000
Message-ID: <20171208102130.4d4rwpwkseziniug@techsingularity.net>
In-Reply-To: <20171207165317.9ef234b9f83cb62cdad72427@linux-foundation.org>

On Thu, Dec 07, 2017 at 04:53:17PM -0800, Andrew Morton wrote:
> On Fri, 8 Dec 2017 00:25:37 +0000 Mel Gorman <mgorman@techsingularity.net> wrote:
> 
> > Well, it's release_pages. From core VM and the block layer, not very long
> > but for drivers and filesystems, it can be arbitrarily long. Even from the
> > VM, the function can be called a lot but as it's from pagevec context so
> > it's naturally broken into small pieces anyway.
> 
> OK.
> 
> > > If "significantly" then there may be additional benefit in rearranging
> > > free_hot_cold_page_list() so it only walks a small number of list
> > > entries at a time.  So the data from the first loop is still in cache
> > > during execution of the second loop.  And that way this
> > > long-irq-off-time problem gets fixed automagically.
> > > 
> > 
> > I'm not sure it's worthwhile. In too many cases, the list of pages being
> > released are either cache cold or are so long that the cache data is
> > being thrashed anyway.
> 
> Well, whether the incoming data is cache-cold or very-long, doing that
> double pass in small bites would reduce thrashing.
> 
> > Once the core page allocator is involved, then
> > there will be further cache thrashing due to buddy page merging accessing
> > data that is potentially very close. I think it's unlikely there would be
> > much value in using alternative schemes unless we were willing to have
> > very large per-cpu lists -- something I prototyped for fast networking
> > but never heard back whether it's worthwhile or not.
> 
> I mean something like this....
> 

Ok yes, I see. That is a viable alternative to Lucas's patch that should
achieve the same result with the bonus of some of the entries still being
cache hot. Lucas, care to give it a spin and see whether it also addresses
your problem?
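
For anyone following along, the shape of the idea can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions,
not the patch under discussion: fake_page, irq_stats, fake_irq_disable(),
fake_irq_enable() and free_pages_batched() are all made-up names, and the
"IRQ" state is just a counter so the length of each IRQ-off section can be
observed. Each chunk gets its prepare pass first, then its commit pass with
"interrupts" off, so no single IRQ-off section covers more than one batch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of these are kernel APIs. */
struct fake_page {
    int prepared;   /* set by the first (prepare) pass */
    int freed;      /* set by the second (commit) pass */
};

struct irq_stats {
    int irqs_off;            /* 1 while simulated interrupts are disabled */
    size_t in_section;       /* pages handled in the current IRQ-off section */
    size_t longest_section;  /* most pages handled in any one section */
    size_t sections;         /* number of IRQ-off sections entered */
};

static void fake_irq_disable(struct irq_stats *st)
{
    st->irqs_off = 1;
    st->in_section = 0;
    st->sections++;
}

static void fake_irq_enable(struct irq_stats *st)
{
    if (st->in_section > st->longest_section)
        st->longest_section = st->in_section;
    st->irqs_off = 0;
}

/*
 * Free n pages in chunks of at most `batch` entries. For each chunk the
 * prepare pass runs with simulated IRQs enabled, then the commit pass runs
 * with them disabled, so the chunk is still likely to be cache hot for the
 * second pass and the IRQ-off time is bounded by the batch size.
 */
static void free_pages_batched(struct fake_page *pages, size_t n,
                               size_t batch, struct irq_stats *st)
{
    assert(batch > 0);

    for (size_t start = 0; start < n; start += batch) {
        size_t end = (n - start < batch) ? n : start + batch;

        /* First pass over this chunk only. */
        for (size_t i = start; i < end; i++)
            pages[i].prepared = 1;

        /* Second pass over the same chunk, inside one short IRQ-off section. */
        fake_irq_disable(st);
        for (size_t i = start; i < end; i++) {
            assert(pages[i].prepared);
            pages[i].freed = 1;
            st->in_section++;
        }
        fake_irq_enable(st);
    }
}
```

Calling free_pages_batched() on 10 zero-initialised pages with a batch of 4
enters three IRQ-off sections, none of which handles more than 4 pages. The
batch size itself is an arbitrary choice here; a real value would have to
come from measurement.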

-- 
Mel Gorman
SUSE Labs
