From: Yu Zhao
Date: Tue, 7 Jun 2022 12:58:42 -0600
Subject: Re: [PATCH v11 07/14] mm: multi-gen LRU: exploit locality in rmap
To: Barry Song <21cnbao@gmail.com>
Cc: Andrew Morton, Linux-MM, Andi Kleen, Aneesh Kumar, Catalin Marinas,
    Dave Hansen, Hillf Danton, Jens Axboe, Johannes Weiner, Jonathan Corbet,
    Linus Torvalds, Matthew Wilcox, Mel Gorman, Michael Larabel, Michal Hocko,
    Mike Rapoport, Peter Zijlstra, Tejun Heo, Vlastimil Babka, Will Deacon,
    LAK, Linux Doc Mailing List, LKML, x86, Kernel Page Reclaim v2,
    Brian Geffon, Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
    Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
    Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain
References: <20220518014632.922072-1-yuzhao@google.com>
    <20220518014632.922072-8-yuzhao@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 6, 2022 at 3:25 AM Barry Song <21cnbao@gmail.com> wrote:
>
> On Wed, May 18, 2022 at 4:49 PM Yu Zhao wrote:
...
> > @@ -821,6 +822,12 @@ static bool folio_referenced_one(struct folio *folio,
> >  		}
> >
> >  		if (pvmw.pte) {
> > +			if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> > +			    !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> > +				lru_gen_look_around(&pvmw);
> > +				referenced++;
> > +			}
> > +
> >  			if (ptep_clear_flush_young_notify(vma, address,
>
> Hello, Yu.
> look_around() is calling ptep_test_and_clear_young(pvmw->vma, addr, pte + i)
> only without flush and notify. for flush, there is a tlb operation for arm64:
> static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>                                          unsigned long address, pte_t *ptep)
> {
>         int young = ptep_test_and_clear_young(vma, address, ptep);
>
>         if (young) {
>                 /*
>                  * We can elide the trailing DSB here since the worst that can
>                  * happen is that a CPU continues to use the young entry in its
>                  * TLB and we mistakenly reclaim the associated page. The
>                  * window for such an event is bounded by the next
>                  * context-switch, which provides a DSB to complete the TLB
>                  * invalidation.
>                  */
>                 flush_tlb_page_nosync(vma, address);
>         }
>
>         return young;
> }
>
> Does it mean the current kernel is over cautious?

Hi Barry,

This is up to individual archs. For x86, ptep_clear_flush_young() is
ptep_test_and_clear_young(). For arm64, I'd say yes, based on Figure 1
of Navarro, Juan, et al. "Practical, transparent operating system
support for superpages." [1].

int ptep_clear_flush_young(struct vm_area_struct *vma,
                           unsigned long address, pte_t *ptep)
{
        /*
         * On x86 CPUs, clearing the accessed bit without a TLB flush
         * doesn't cause data corruption. [ It could cause incorrect
         * page aging and the (mistaken) reclaim of hot pages, but the
         * chance of that should be relatively low. ]
         *
         * So as a performance optimization don't flush the TLB when
         * clearing the accessed bit, it will eventually be flushed by
         * a context switch or a VM operation anyway. [ In the rare
         * event of it not getting flushed for a long time the delay
         * shouldn't really matter because there's no real memory
         * pressure for swapout to react to. ]
         */
        return ptep_test_and_clear_young(vma, address, ptep);
}

[1] https://www.usenix.org/legacy/events/osdi02/tech/full_papers/navarro/navarro.pdf

> is it
> safe to call ptep_test_and_clear_young() only?

Yes. Though the h/w A-bit is designed to allow OSes to skip TLB
flushes when unmapping, the Linux kernel doesn't do this.

> btw, lru_gen_look_around() has already included 'address', are we doing
> pte check for 'address' twice here?

Yes for the host MMU, but not for the KVM MMU.
ptep_clear_flush_young_notify() goes into the MMU notifier. We don't
use the _notify variant in lru_gen_look_around() because GPA space
generally exhibits no memory locality.

Thanks.
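
To make the trade-off discussed above concrete, here is a minimal user-space sketch in C. It is not kernel code and not the patch's implementation: every toy_* name is hypothetical, and the model assumes, as a deliberate simplification, that the hardware sets the accessed bit only when it walks the page table on a TLB miss. Under that assumption, clearing the young bit without a flush can make a hot page look cold until its translation is evicted, while the flushing variant sees the next access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy page: one accessed bit plus a one-entry "TLB" flag. */
struct toy_page {
	bool pte_young;   /* accessed (young) bit in the page table entry */
	bool tlb_cached;  /* translation currently cached in the toy TLB */
};

/* Assumption: the A-bit is set only on a page-table walk (TLB miss). */
static void toy_access(struct toy_page *p)
{
	if (!p->tlb_cached) {
		p->pte_young = true;
		p->tlb_cached = true;
	}
}

/* Analogue of test-and-clear young: no TLB flush, no notifier. */
static bool toy_test_and_clear_young(struct toy_page *p)
{
	bool young = p->pte_young;

	p->pte_young = false;
	return young;
}

/* Analogue of the flushing variant: drop the cached translation too. */
static bool toy_clear_flush_young(struct toy_page *p)
{
	bool young = toy_test_and_clear_young(p);

	if (young)
		p->tlb_cached = false;
	return young;
}

/*
 * Scan up to `window` entries on each side of `hit` (including `hit`)
 * and clear their young bits without flushing. Returns how many
 * entries were found young.
 */
static size_t toy_look_around(struct toy_page *pages, size_t nr,
			      size_t hit, size_t window)
{
	size_t start, end, i, young = 0;

	assert(hit < nr);
	start = hit > window ? hit - window : 0;
	end = window >= nr - hit ? nr : hit + window + 1;

	for (i = start; i < end; i++)
		if (toy_test_and_clear_young(&pages[i]))
			young++;

	return young;
}
```

In this model a page touched again after toy_look_around() keeps a clear young bit as long as its translation stays cached, which mirrors the "incorrect page aging" case in the x86 comment; the same access after toy_clear_flush_young() sets the bit again. The cost of the flushing variant (the TLB invalidation itself) is not modeled here.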