From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5176C43387 for ; Thu, 17 Jan 2019 15:16:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7F9FC20859 for ; Thu, 17 Jan 2019 15:16:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727024AbfAQPQ4 (ORCPT ); Thu, 17 Jan 2019 10:16:56 -0500 Received: from mx2.suse.de ([195.135.220.15]:37378 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725882AbfAQPQ4 (ORCPT ); Thu, 17 Jan 2019 10:16:56 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay1.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 11629ABE6; Thu, 17 Jan 2019 15:16:55 +0000 (UTC) Subject: Re: [PATCH 14/25] mm, compaction: Avoid rescanning the same pageblock multiple times To: Mel Gorman , Linux-MM Cc: David Rientjes , Andrea Arcangeli , ying.huang@intel.com, kirill@shutemov.name, Andrew Morton , Linux List Kernel Mailing References: <20190104125011.16071-1-mgorman@techsingularity.net> <20190104125011.16071-15-mgorman@techsingularity.net> From: Vlastimil Babka Message-ID: <67b95fef-6f9a-a91f-c1b2-1c3fbc9330ca@suse.cz> Date: Thu, 17 Jan 2019 16:16:54 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: <20190104125011.16071-15-mgorman@techsingularity.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 1/4/19 1:50 PM, Mel Gorman wrote: > Pageblocks are marked for skip when no pages are isolated after a scan. > However, it's possible to hit corner cases where the migration scanner > gets stuck near the boundary between the source and target scanner. Due > to pages being migrated in blocks of COMPACT_CLUSTER_MAX, pages that > are migrated can be reallocated before the pageblock is complete. The > pageblock is not necessarily skipped so it can be rescanned multiple > times. Similarly, a pageblock with some dirty/writeback pages may fail > to isolate and be rescanned until writeback completes which is wasteful. ^ migrate? If we failed to isolate, then it wouldn't bump nr_isolated. Wonder if we could do better checks and not isolate pages that cannot be at the moment migrated anyway. > > This patch tracks if a pageblock is being rescanned. If so, then the entire > pageblock will be migrated as one operation. This narrows the race window > during which pages can be reallocated during migration. Secondly, if there > are pages that cannot be isolated then the pageblock will still be fully > scanned and marked for skipping. On the second rescan, the pageblock skip > is set and the migration scanner makes progress. > > 4.20.0 4.20.0 > finishscan-v2r15 norescan-v2r15 > Amean fault-both-3 3729.80 ( 0.00%) 2872.13 * 23.00%* > Amean fault-both-5 5148.49 ( 0.00%) 4330.56 * 15.89%* > Amean fault-both-7 7393.24 ( 0.00%) 6496.63 ( 12.13%) > Amean fault-both-12 11709.32 ( 0.00%) 10280.59 ( 12.20%) > Amean fault-both-18 16626.82 ( 0.00%) 11079.19 * 33.37%* > Amean fault-both-24 19944.34 ( 0.00%) 17207.80 * 13.72%* > Amean fault-both-30 23435.53 ( 0.00%) 17736.13 * 24.32%* > Amean fault-both-32 23948.70 ( 0.00%) 18509.41 * 22.71%* > > 4.20.0 4.20.0 > finishscan-v2r15 norescan-v2r15 > Percentage huge-1 0.00 ( 0.00%) 0.00 ( 0.00%) > Percentage huge-3 88.39 ( 0.00%) 96.87 ( 9.60%) > Percentage huge-5 92.07 ( 0.00%) 94.63 ( 2.77%) > Percentage huge-7 91.96 ( 0.00%) 93.83 ( 2.03%) > Percentage huge-12 93.38 ( 0.00%) 92.65 ( -0.78%) > Percentage huge-18 91.89 ( 0.00%) 93.66 ( 1.94%) > Percentage huge-24 91.37 ( 0.00%) 93.15 ( 1.95%) > Percentage huge-30 92.77 ( 0.00%) 93.16 ( 0.42%) > Percentage huge-32 87.97 ( 0.00%) 92.58 ( 5.24%) > > The fault latency reduction is large and while the THP allocation > success rate is only slightly higher, it's already high at this > point of the series. > > Compaction migrate scanned 60718343.00 31772603.00 > Compaction free scanned 933061894.00 63267928.00 Hm I thought the order of magnitude difference between migrate and free scanned was already gone at this point as reported in the previous 2 patches. Or is this from different system/configuration? Anyway, encouraging result. I would expect that after "Keep migration source private to a single compaction instance" sets the skip bits much more early and aggressively, the rescans would not happen anymore thanks to those, even if cached pfns were not updated. > Migration scan rates are reduced by 48% and free scan rates are > also reduced as the same migration source block is not being selected > multiple times. The corner case where migration scan rates go through the > roof due to a dirty/writeback pageblock located at the boundary of the > migration/free scanner did not happen in this case. When it does happen, > the scan rates multiple by factors measured in the hundreds and would be > misleading to present. > > Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka