From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yu Zhao
Date: Fri, 23 Apr 2021 20:33:17 -0600
Subject: Re: [PATCH v2 00/16] Multigenerational LRU Framework
To: Michel Lespinasse
Cc: Rik van Riel, Ying Huang, Dave Chinner, Jens Axboe, SeongJae Park,
    Linux-MM, Andrew Morton, Benjamin Manes, Dave Hansen, Hillf Danton,
    Johannes Weiner, Jonathan Corbet, Joonsoo Kim, Matthew Wilcox,
    Mel Gorman, Miaohe Lin, Michael Larabel, Michal Hocko, Roman Gushchin,
    Rong Chen, SeongJae Park, Tim Chen, Vlastimil Babka, Yang Shi, Zi Yan,
    linux-kernel, lkp@lists.01.org, Kernel Page Reclaim v2, Andi Kleen
References: <20210413075155.32652-1-sjpark@amazon.de>
    <3ddd4f8a-8e51-662b-df11-a63a0e75b2bc@kernel.dk>
    <20210413231436.GF63242@dread.disaster.area>
    <20210414155130.GU3762101@tassilo.jf.intel.com>
    <20210415030002.GX3762101@tassilo.jf.intel.com>
    <20210415095708.GA6874@lespinasse.org>
In-Reply-To: <20210415095708.GA6874@lespinasse.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Apr 18, 2021 at 12:48 AM
Michel Lespinasse wrote:
> On Thu, Apr 15, 2021 at 01:13:13AM -0600, Yu Zhao wrote:
> > Page table scanning doesn't replace the existing rmap walk. It is
> > complementary and only happens when it is likely that most of the
> > pages on a system under pressure have been referenced, i.e., out of
> > *inactive* pages, by definition of the existing implementation. Under
> > such a condition, scanning *active* pages one by one with the rmap is
> > likely to cost more than scanning them all at once via page tables.
> > When we evict *inactive* pages, we still use the rmap and share a
> > common path with the existing code.
> >
> > Page table scanning falls back to the rmap walk if the page tables of
> > a process are apparently sparse, i.e., rss < size of the page tables.
>
> Could you expand a bit more as to how page table scanning and rmap
> scanning coexist? Say, there is some memory pressure and you want to
> identify good candidate pages to reclaim. You could scan processes with
> the page table scanning method, or you could scan the lru list through
> the rmap method. How do you mix the two - when you use the lru/rmap
> method, won't you encounter both pages that are mapped in "dense"
> processes where scanning page tables would have been better, and pages
> that are mapped in "sparse" processes where you are happy to be using
> rmap, and even pages that are mapped into both types of processes at
> once? Or, can you change the lru/rmap scan so that it will efficiently
> skip over all dense processes when you use it?

Hi Michel,

Sorry for the late reply. I was out of town and am still catching up
on emails.

That's a great question. Currently the page table scanning isn't smart
enough to know where dense regions are. My plan was to improve it
gradually, but it seems it can't wait because people have major
concerns over this. At the moment, the page table scanning decides
whether a process is worth scanning by checking its RSS against the
size of its page tables.
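
For illustration only, the check described above could look roughly
like the following userspace sketch. This is not code from the
patchset: struct proc_mem, its fields, and worth_page_table_scan() are
hypothetical names, and measuring both quantities in bytes is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process summary; the field names are illustrative. */
struct proc_mem {
	size_t rss_bytes;     /* resident set size of the process */
	size_t pgtable_bytes; /* memory consumed by its page tables */
};

/*
 * Return true if scanning the page tables of this process looks
 * worthwhile. Following the rule quoted above, a process whose RSS is
 * smaller than its page tables is treated as sparse, and the caller
 * would fall back to the rmap walk for it.
 */
static bool worth_page_table_scan(const struct proc_mem *p)
{
	assert(p != NULL);
	return p->rss_bytes >= p->pgtable_bytes;
}
```

A caller would evaluate worth_page_table_scan() once per process and
use the rmap for any process where it returns false. Because the test
is per process, it cannot tell dense and sparse regions apart inside
one address space.
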
This can only avoid extremely sparse regions, meaning that in some
worst-case scenarios the page table scanning will still scan regions
that ideally should be covered by the rmap. My next step is to add a
bloom filter so it can quickly identify dense regions and target only
them.

Given what I just said, the rmap is unlikely to encounter dense
regions, and that's why the perf profile shows its CPU usage dropping
from ~30% to ~5%.

Now the question is how we build the bloom filter. A simple answer is
to let the rmap do the legwork, i.e., when it encounters dense regions,
add them to the filter. Of course this means we'll have to use the rmap
more than we do now, which is not ideal for some workloads but
necessary to avoid worst-case scenarios.

Does it make sense?
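
To make the bloom filter idea concrete, here is a minimal userspace
sketch of one possible shape. It is an assumption, not the planned
implementation: the key is a hypothetical region identifier (for
example, the start address of a region the rmap walk judged dense),
the filter size and the two probes per key are arbitrary choices, and
the mixing constants are borrowed from the MurmurHash3 finalizer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOOM_BITS  1024u              /* filter size in bits, a power of two */
#define BLOOM_WORDS (BLOOM_BITS / 64u)

struct bloom {
	uint64_t bits[BLOOM_WORDS];
};

/* 64-bit mixing function so that nearby keys spread over the filter. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	x *= 0xc4ceb9fe1a85ec53ULL;
	x ^= x >> 33;
	return x;
}

/* Derive two bit positions from the low and high halves of one hash. */
static void bloom_positions(uint64_t key, unsigned int pos[2])
{
	uint64_t h = mix64(key);

	pos[0] = (unsigned int)(h & (BLOOM_BITS - 1));
	pos[1] = (unsigned int)((h >> 32) & (BLOOM_BITS - 1));
}

/* Called (hypothetically) by the rmap walk when it finds a dense region. */
static void bloom_add(struct bloom *b, uint64_t key)
{
	unsigned int pos[2];

	assert(b != NULL);
	bloom_positions(key, pos);
	for (int i = 0; i < 2; i++)
		b->bits[pos[i] / 64] |= 1ULL << (pos[i] % 64);
}

/* Called (hypothetically) by the page table scan before descending. */
static bool bloom_maybe_contains(const struct bloom *b, uint64_t key)
{
	unsigned int pos[2];

	assert(b != NULL);
	bloom_positions(key, pos);
	for (int i = 0; i < 2; i++)
		if (!(b->bits[pos[i] / 64] & (1ULL << (pos[i] % 64))))
			return false;
	return true;
}
```

A bloom filter never reports a region it was given as absent, so a
region added by the rmap is always picked up by the later query. It
can report false positives, which in this setting would only mean
occasionally scanning a region that turns out to be sparse.
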