To: Yu Zhao
Cc: Vlastimil Babka, akpm@linuxfoundation.org, Linux-MM, Heiko Carstens, Rafael Aquini, Vladimir Davydov, "Kirill A . Shutemov", Andrea Arcangeli, Donald Dutile, Matthew Wilcox, SeongJae Park
References: <20210612000714.775825-1-willy@infradead.org> <7b35885b-1413-5e08-3930-c8c4b66bcfe7@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH] mm: Mark idle page tracking as BROKEN
Message-ID: <749b99b5-f448-754f-e3bc-fe4486e4483c@redhat.com>
Date: Fri, 18 Jun 2021 14:48:42 +0200

On 16.06.21 21:23, Yu Zhao wrote:
> On Wed, Jun 16, 2021 at 2:43 AM David Hildenbrand wrote:
>>
>> On 16.06.21 10:36, Vlastimil Babka wrote:
>>> On 6/16/21 8:22 AM, Yu Zhao wrote:
>>>> On Tue, Jun 15, 2021 at 8:55 PM Matthew Wilcox wrote:
>>>>>
>>>>>
>>>>> I don't know.  I asked the others on the call and the answer I got was
>>>>> essentially "Just delete it".
>>>>>
>>>>> I'm kind of hoping the others speak up.
>>>>
>>>> I listed a couple of things when acking this patch. Being broken is
>>>> not a problem as long as there are users who care about it. What made
>>>> me think such users may not exist is that nobody ever complained about
>>>> those things until we stumbled on them -- I'm not insisting on
>>>> deleting this feature, just clarifying why I thought so.
>>>
>>> Similar feelings here. On the call it looked like the feature was abandoned by
>>> its creators, and it wasn't clear if the distros that had it enabled did so due
>>> to reasons that still apply for future versions. Sending the proposal and
>>> getting a feedback that there are users is one of the expected valid outcomes.
>>
>> For us (RH) it will be very interesting to know the exact things that
>> are "suboptimal" (I'm avoiding the terminology "broken" here), so we can
>> actually evaluate if this might affect customers and might be worth
>> "improving".
>
> I consider the examples I gave in my first email breakages -- others
> broke/break the idle page tracking -- and I think it's safe to assume
> they will continue to happen.

Right, just as with any other feature that has very bad (no?) upstream
test coverage and doesn't immediately blow up if not done 100% right.

So to summarize (thanks for the input!):

1. It was really broken on arm64 before we had 07509e10dcc7 ("arm64:
pgtable: Fix pte_accessible()") but should be working now.

2. Functions that call pte/pmd_mkold() but not test_and_clear_young()
are shaky.

3. MADV_FREE'ed pages won't actually get freed and are treated as if they
were reaccessed, because page_referenced() will return true upon seeing
PageYoung().

4. Huge page handling is suboptimal and requires proper care from user
space to get it right:
https://lore.kernel.org/linux-mm/20210614081610.16123-1-sjpark@amazon.de/

I suspect daemon will have similar interest in optimizing 2 and 3, right?

>
> If you are really looking for improvements, the page compaction has
> always been a good example. For the idle page tracking, with physical
> memory as little as 4GB, it needs to go thru one million PFNs, no
> matter how many compound or buddy pages there are. For THPs, it will
> try to get_page_unless_zero() on tail pages, which always fails. This
> is why we discussed it in the meeting.

Right, this sounds sub-optimal.

>
> What can't be improved is the memory locality of PFNs. They are not
> grouped by memcgs or processes. Two PFNs next to each other can be
> from two processes with two sets of five-level page tables. The cache
> misses simply outweigh any potential benefits one might get from this
> feature, speaking as one of the customers.

Right.

-- 
Thanks,

David / dhildenb
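
For readers who want to check the "one million PFNs" figure quoted above, here is a rough back-of-the-envelope sketch. It is not taken from the thread. It assumes 4 KiB base pages and the bitmap layout described in the kernel's idle page tracking admin guide (one bit per PFN, accessed in 8-byte words). The helper names are hypothetical and exist only for illustration.

```python
# Rough sketch, not kernel code. Assumptions: 4 KiB base pages, and a
# per-PFN bitmap accessed in 8-byte (64-bit) words, as the idle page
# tracking admin guide describes. Helper names are hypothetical.

PAGE_SIZE = 4096       # assumed base page size in bytes
BITS_PER_WORD = 64     # one 8-byte word covers 64 PFNs
BYTES_PER_WORD = 8


def pfn_count(mem_bytes, page_size=PAGE_SIZE):
    """Number of page frames needed to cover mem_bytes of physical memory."""
    return mem_bytes // page_size


def bitmap_words(mem_bytes, page_size=PAGE_SIZE):
    """Number of 8-byte words a full scan of the bitmap has to touch."""
    pfns = pfn_count(mem_bytes, page_size)
    return (pfns + BITS_PER_WORD - 1) // BITS_PER_WORD  # round up


def bitmap_location(pfn):
    """Return (byte offset of the word, bit index inside it) for one PFN."""
    return (pfn // BITS_PER_WORD) * BYTES_PER_WORD, pfn % BITS_PER_WORD


if __name__ == "__main__":
    four_gib = 4 << 30
    print("PFNs to visit:", pfn_count(four_gib))
    print("8-byte words to read:", bitmap_words(four_gib))
    print("location of PFN 65:", bitmap_location(65))
```

The count depends only on the amount of physical memory and the base page size, which matches the point made in the quoted text: the walk visits every PFN regardless of how many of them belong to compound or buddy pages.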