From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
David Rientjes <rientjes@google.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [LSF/MM TOPIC] wmark based pro-active compaction
Date: Thu, 19 Jan 2017 15:18:08 +0100 [thread overview]
Message-ID: <bfbc31ec-7400-6174-62c3-94d82667320d@suse.cz> (raw)
In-Reply-To: <20170113070331.GA7874@js1304-P5Q-DELUXE>
On 01/13/2017 08:03 AM, Joonsoo Kim wrote:
>>>> So what is the problem? The demand for high-order pages is growing and
>>>> that seems to be the general trend. The problem is that while they can
>>>> bring a performance benefit, they can be really expensive to allocate,
>>>> especially when we enter direct compaction. So we really want to
>>>> avoid the expensive path and defer as much as possible to the
>>>> background. A huge step forward was kcompactd, introduced by Vlastimil.
>>>> We are still not there yet though, because it might already be quite
>>>> late when we wakeup_kcompactd(). The memory might already be fragmented
>>>> by the time we get there.
>>
>> Right.
>
> Before we talk about pro-active compaction, I'd like to know the
> use case that really needs pro-active compaction. For THP, IMHO, it's
> better not to do pro-active compaction, because a high-order page made
> by pro-active compaction could be broken up before it is used. And,
I agree that THP should be given lower priority, but wouldn't rule it
out completely.
> a THP page can be set up later by the THP daemon. The benefit of pro-active
> compaction would not compensate for its overhead in this case.
khugepaged can only help in the longer term, but we can still help
shorter-lived processes.
> I guess
> that almost all cases that have a fallback would fall into this category.
Yes, ideally we can derive this info from the GFP flags and prioritize
accordingly.
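To make that concrete, here's a minimal sketch of what deriving a compaction priority from the allocation context could look like. The flag value, names, and mapping below are made up for illustration; they are not the real definitions from include/linux/gfp.h:

```c
#include <stdbool.h>

/* Illustrative stand-in for a GFP bit -- not the real kernel value. */
#define SKETCH_GFP_NORETRY 0x1u  /* caller has a cheap fallback (e.g. THP fault) */

enum compact_prio { COMPACT_PRIO_LOW, COMPACT_PRIO_HIGH };

/* An allocation that passes __GFP_NORETRY signals it can fall back to a
 * lower order, so background compaction on its behalf can be deprioritized;
 * allocations without such a fallback would get higher priority. */
static enum compact_prio compact_prio_from_flags(unsigned int gfp_flags)
{
	return (gfp_flags & SKETCH_GFP_NORETRY) ? COMPACT_PRIO_LOW
						: COMPACT_PRIO_HIGH;
}
```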
> For orders lower than the costly order, the system would usually have
> such a free page available anyway. So, my question is: is pro-active
> compaction really needed even if its cost is really high? The reason I
> ask is that I tested some patches to do pro-active compaction and found
> that the cost looks too high. I heard that someone wants this feature,
> but I'm not sure they will use it at this high cost. Anyway, I will post
> some patches for pro-active compaction soon.
David Rientjes mentioned their workloads benefit from background
compaction in the discussion about THP's "defrag" setting.
[...]
>> Parameters
>> - wake-up period for kcompactd
>> - target per-order goals for kcompactd
>> - lowest efficiency at which it's still considered worth compacting?
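To make the parameter discussion concrete, a sketch of what such tunables could look like. All names and fields here are hypothetical, not an existing kernel interface:

```c
#include <stdbool.h>

#define SKETCH_MAX_ORDER 11  /* stand-in for the kernel's MAX_ORDER */

/* Hypothetical tunables for wmark-based pro-active kcompactd. */
struct kcompactd_tunables {
	unsigned int wakeup_period_msec;             /* periodic wakeup */
	unsigned long target_free[SKETCH_MAX_ORDER]; /* per-order free-page goals */
	unsigned int min_efficiency_pct;             /* give up below this */
};

/* kcompactd would keep compacting while any per-order goal is unmet. */
static bool goals_unmet(const struct kcompactd_tunables *t,
			const unsigned long free_pages[SKETCH_MAX_ORDER])
{
	for (int order = 0; order < SKETCH_MAX_ORDER; order++)
		if (free_pages[order] < t->target_free[order])
			return true;
	return false;
}
```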
>>
>> An important question: how to evaluate this? Metrics should be feasible
>> (improved success rate, % of compaction that was handled by kcompactd
>> and not direct compaction...), but what are the good testcases?
>
> The use case should be defined first? Anyway, I hope that a new testcase
> will be finished soon. The stress-highalloc test takes too much time
> for testing various ideas.
Yeah, that too. But mainly it's too artificial.
>>
>> Ideally I would also revisit the topic of compaction mechanism (migrate
>> and free scanners) itself. It's been shown that they usually meet in the
>
> +1
>
>> first 1/3 to 1/2 of the zone, which means the rest of the zone is only
>> defragmented by "plugging free holes" with migrated pages, although it
>> might actually contain pageblocks more suitable for migrating from than
>> the first part of the zone. It's also expensive for the free scanner to
>> actually find free pages, according to the stats.
>
> The scalable approach would be [3], since it finds a free page in O(1)
> unlike the others, which are O(N).
There's however the issue that we need to skip (or potentially isolate
on a private list) free pages that lie in the area we are migrating from,
which is potentially O(N) where N is NR_FREE. This gets worse with
multiple compactors, so we might have to e.g. reuse the pageblock skip
bits to signal other compactors to go away, and rely on too_many_isolated()
or something similar to limit the number of concurrent compactors.
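A minimal sketch of that skip check, assuming a simple pfn-range test. This is an illustration of the idea, not the implementation from [3]:

```c
#include <stdbool.h>

/* When migration targets are taken straight from the freelist, a free
 * page whose pfn falls inside the range currently being migrated from
 * must not be used as a target, or we would migrate pages back into the
 * very area we are trying to empty. Hypothetical helper and parameter
 * names. */
static bool usable_as_migration_target(unsigned long free_pfn,
				       unsigned long migrate_start_pfn,
				       unsigned long migrate_end_pfn)
{
	return free_pfn < migrate_start_pfn || free_pfn >= migrate_end_pfn;
}
```

Doing this check per free page is what makes the naive version O(N) in NR_FREE; isolating the affected pages onto a private list up front would be one way to pay that cost once.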
>>
>> Some approaches were proposed in recent years, but none got far, as
>> there's always some kind of trade-off (this partially goes back to the
>> problem of evaluation, often limited to stress-highalloc from mmtests):
>>
>> - "pivot" based approach where scanners' starting point changes and
>> isn't always zone boundaries [1]
>> - both scanners scan whole zone moving in the same direction, just
>> making sure they don't operate on the same pageblock at the same time [2]
>> - replacing free scanner by directly taking free pages from freelist
>>
>> However, the problem with this subtopic is that it might be too much
>> specialized for the full MM room.
>
> Right. :)
>
> Thanks.
>
>>
>> [1] https://lkml.org/lkml/2015/1/19/158
>> [2] https://lkml.org/lkml/2015/6/24/706
>> [3] https://lkml.org/lkml/2015/12/3/63
>
Thread overview: 8+ messages
2016-12-30 13:14 [LSF/MM TOPIC] wmark based pro-active compaction Michal Hocko
2016-12-30 14:06 ` Mel Gorman
2017-01-05 9:53 ` Vlastimil Babka
2017-01-05 10:27 ` Michal Hocko
2017-01-06 8:57 ` Vlastimil Babka
2017-01-13 7:03 ` Joonsoo Kim
2017-01-19 14:18 ` Vlastimil Babka [this message]
2017-03-08 14:56 ` Vlastimil Babka