Date: Tue, 10 Nov 2020 23:50:50 -0800 (PST)
From: Hugh Dickins
To: Shakeel Butt
cc: Hugh Dickins, Jerome Glisse, Johannes Weiner, Andrew Morton, Vlastimil Babka, Michal Hocko, Linux MM, LKML, Balbir Singh
Subject: Re: [PATCH] mm/rmap: always do TTU_IGNORE_ACCESS
References: <20201104231928.1494083-1-shakeelb@google.com>

On Fri, 6 Nov 2020, Shakeel Butt wrote:
> On Thu, Nov 5, 2020 at 7:00 PM Hugh Dickins wrote:
> >
> > I don't know why this was addressed to me in particular (easy to imagine
> > I've made a mod at some time that bears on this, but I haven't found it);
> > but have spent longer considering the patch than I should have done -
> > apologies to everyone else I should be replying to.
> >
>
> I really appreciate your insights and historical anecdotes. I always
> learn something new. :)
>
> > On Wed, 4 Nov 2020, Shakeel Butt wrote:
> >
> > > Since the commit 369ea8242c0f ("mm/rmap: update to new mmu_notifier
> > > semantic v2"), the code to check the secondary MMU's page table access
> > > bit is broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from
> > > the secondary MMU's page table before the check. More specifically for
> > > those secondary MMUs which unmap the memory in
> > > mmu_notifier_invalidate_range_start() like kvm.
> >
> > Well, "broken" seems a bit unfair to 369ea8242c0f.
> > It put a warning
> > mmu_notifier_invalidate_range_start() at the beginning, and matching
> > mmu_notifier_invalidate_range_end() at the end of try_to_unmap_one();
> > with its mmu_notifier_invalidate_range() exactly where the
> > mmu_notifier_invalidate_page() was before (I think the story gets
> > more complicated later). Yes, if notifiee takes invalidate_range_start()
> > as signal to invalidate all their own range, then that will sometimes
> > cause them unnecessary invalidations.
> >
> > Not just for !TTU_IGNORE_ACCESS: there's also the !TTU_IGNORE_MLOCK
> > case meeting a VM_LOCKED vma and setting PageMlocked where that had
> > been missed earlier (and page_check_references() has intentionally but
> > confusingly marked this case as PAGEREF_RECLAIM, not to reclaim the page,
> > but to reach the try_to_unmap_one() which will recognize and fix it up -
> > historically easier to do there than in page_referenced_one()).
> >
> > But I think mmu_notifier is a diversion from what needs thinking about.
> >
> > >
> > > However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the
> > > absence of TTU_IGNORE_ACCESS and it explicitly performs the page table
> > > access check before trying to unmap the page. So, at worst the reclaim
> > > will miss accesses in a very short window if we remove page table access
> > > check in unmapping code.
> >
> > I agree with you and Johannes that the short race window when the page
> > might be re-referenced is no issue at all: the functional issue is the
> > one in your next paragraph. If that's agreed by memcg guys, great,
> > then this patch is a nice observation and a welcome cleanup.
> >
> > >
> > > There is an unintended consequence of !(TTU_IGNORE_ACCESS) for the memcg
> > > reclaim.
> > > From memcg reclaim the page_referenced() only account the
> > > accesses from the processes which are in the same memcg of the target
> > > page but the unmapping code is considering accesses from all the
> > > processes, so, decreasing the effectiveness of memcg reclaim.
> >
> > Are you sure it was unintended?
> >
> > Since the dawn of memcg reclaim, it has been the case that a recent
> > reference in a "foreign" vma has rescued that page from being reclaimed:
> > now you propose to change that. I expect some workflows will benefit
> > and others be disadvantaged. I have no objection myself to the change,
> > but I do think it needs to be better highlighted here, and explicitly
> > agreed by those more familiar with memcg reclaim.
>
> The reason I said unintended was due to bed7161a519a2 ("Memory
> controller: make page_referenced() cgroup aware"). From the commit
> message it seems like the intention was to not be influenced by
> foreign accesses during memcg reclaim but it missed to make
> try_to_unmap_one() memcg aware.

Oooh, that's a good reference (much better than the mmu_notifier one
you cited in the patch). Yes, I agree Balbir was explicit about the
intention then, and you're simply fixing it up.

>
> I agree with you that this is a behavior change and we have explicitly
> agree to not let memcg reclaim be influenced by foreign accesses.

I've not seen anyone else protesting, and Johannes and Andrew happy
with this: so no more protest from me, let's proceed with the nice
cleanup, and hope no regression surfaces.

Hugh