From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wj0-f198.google.com (mail-wj0-f198.google.com [209.85.210.198]) by kanga.kvack.org (Postfix) with ESMTP id 92A196B02A6 for ; Thu, 19 Jan 2017 09:18:14 -0500 (EST) Received: by mail-wj0-f198.google.com with SMTP id h7so8860988wjy.6 for ; Thu, 19 Jan 2017 06:18:14 -0800 (PST) Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id l45si4625789wrc.12.2017.01.19.06.18.12 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 19 Jan 2017 06:18:12 -0800 (PST) Subject: Re: [LSF/MM TOPIC] wmark based pro-active compaction References: <20161230131412.GI13301@dhcp22.suse.cz> <20161230140651.nud2ozpmvmziqyx4@suse.de> <20170113070331.GA7874@js1304-P5Q-DELUXE> From: Vlastimil Babka Message-ID: Date: Thu, 19 Jan 2017 15:18:08 +0100 MIME-Version: 1.0 In-Reply-To: <20170113070331.GA7874@js1304-P5Q-DELUXE> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Joonsoo Kim Cc: Mel Gorman , Michal Hocko , lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org, David Rientjes , Johannes Weiner On 01/13/2017 08:03 AM, Joonsoo Kim wrote: >>>> So what is the problem? The demand for high order pages is growing and >>>> that seems to be the general trend. The problem is that while they can >>>> bring performance benefit they can get be really expensive to allocate >>>> especially when we enter the direct compaction. So we really want to >>>> prevent from expensive path and defer as much as possible to the >>>> background. A huge step forward was kcompactd introduced by Vlastimil. >>>> We are still not there yet though, because it might be already quite >>>> late when we wakeup_kcompactd(). The memory might be already fragmented >>>> when we hit there. >> >> Right. > > Before we talk about pro-active compaction, I'd like to know the > usecase that really needs pro-active compaction. For THP, IMHO, it's > better not to do pro-active compaction, because high-order page made > by pro-active compaction could be broken before it is used. And, I agree that THP should be given lower priority, but wouldn't rule it out completely. > THP page can be setup lately by THP daemon. Benefit of pro-active > compaction would not compensate overhead of it in this case. khugepaged can only help in the longer term, but we can still help shorter-lived processes > I guess > that almost cases that have a fallback would hit this category. Yes, ideally we can derive this info from the GFP flags and prioritize accordingly. > For the order lower than costly order, system would have such a > freepage usually. So, my question is pro-active compaction is really > needed even if it's cost is really high? Reason I ask this question is > that I tested some patches to do pro-active compaction and found that > cost looks too much high. I heard that someone want this feature but > I'm not sure they will use it with this high cost. Anyway, I will post > some patches for pro-active compaction, soon. David Rientjes mentioned their workloads benefit from background compaction in the discussion about THP's "defrag" setting. [...] >> Parameters >> - wake up period for kcompactd >> - target per-order goals for kcompactd >> - lowest efficiency where it's still considered worth to compact? >> >> An important question: how to evaluate this? Metrics should be feasible >> (improved success rate, % of compaction that was handled by kcompactd >> and not direct compaction...), but what are the good testcases? > > Usecase should be defined first? Anyway, I hope that new testcase would > be finished in short time. stress-highalloc test takes too much time > to test various ideas. Yeah, that too. But mainly it's too artificial. >> >> Ideally I would also revisit the topic of compaction mechanism (migrate >> and free scanners) itself. It's been shown that they usually meet in the > > +1 > >> 1/3 or 1/2 of zone, which means the rest of the zone is only >> defragmented by "plugging free holes" by migrated pages, although it >> might actually contain pageblocks more suitable for migrating from, than >> the first part of the zone. It's also expensive for the free scanner to >> actually find free pages, according to the stats. > > Scalable approach would be [3] since it finds freepage by O(1) unlike > others that are O(N). There's however the issue that we need to skip (or potentially isolate on a private list) freepages that lie in the area we are migrating from, which is potentially O(N) where N is NR_FREE. This gets worse with multiple compactors so we might have to e.g. reuse the pageblock skip bits to indicate to others to go away, and rely on too_many_isolated() or something similar to limit the number of concurrent compactors. >> >> Some approaches were proposed in recent years, but never got far as it's >> always some kind of a trade-off (this partially goes back to the problem >> of evaluation, often limited to stress-highalloc from mmtests): >> >> - "pivot" based approach where scanners' starting point changes and >> isn't always zone boundaries [1] >> - both scanners scan whole zone moving in the same direction, just >> making sure they don't operate on the same pageblock at the same time [2] >> - replacing free scanner by directly taking free pages from freelist >> >> However, the problem with this subtopic is that it might be too much >> specialized for the full MM room. > > Right. :) > > Thanks. > >> >> [1] https://lkml.org/lkml/2015/1/19/158 >> [2] https://lkml.org/lkml/2015/6/24/706 >> [3] https://lkml.org/lkml/2015/12/3/63 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org