From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43262C4332F for ; Fri, 4 Nov 2022 17:42:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B803F6B0073; Fri, 4 Nov 2022 13:42:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B310B8E0002; Fri, 4 Nov 2022 13:42:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9837F8E0001; Fri, 4 Nov 2022 13:42:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 884166B0073 for ; Fri, 4 Nov 2022 13:42:54 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 3D005AB89D for ; Fri, 4 Nov 2022 17:42:54 +0000 (UTC) X-FDA: 80096480268.29.0CF9A00 Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) by imf24.hostedemail.com (Postfix) with ESMTP id D4BF4180003 for ; Fri, 4 Nov 2022 17:42:53 +0000 (UTC) Received: by mail-qt1-f181.google.com with SMTP id h24so3461414qta.7 for ; Fri, 04 Nov 2022 10:42:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=Y7JBbJ2Fvdhb3EiG7i7cT3Xt7aRcECjo+Ml9qZORLks=; b=QtMfbnehToVkf5LyUXFO65eJG6We/YS6qgzH8Z2QVSs2weEaaF7YugQA4+T+JfaIPw 6dimRZa67FxZZM+NBrxbn3JA/iGOuUs1OWmETpF/W40mXdVqlUFdgDVv7Dko3fTRTMqj 2E8UHpNacdU4QteBhPkPgS/IGURatVauQnjJM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Y7JBbJ2Fvdhb3EiG7i7cT3Xt7aRcECjo+Ml9qZORLks=; b=u6Xjhuz8ypYQtR9wmD9i3V99H6X4ZwAl75FBsjbm/4caenMYxHuno2dQZdrAOgwNT5 boSMJ/quQEWzc2p0+LOEajomjmPuT/65CYIsIMx1fwV5nuHCWOfvux7Q/Z/mjQpwmFNm QMcLTGB8UZlqkYE61v3FpATqKvprpil9+zL3HWnFd6Oey0xWndCsa5UHb+aB5vjydJSN 9YRPOadGAYN5cEYnqLQ/G0zub8REJ31PHWX/wyvmCXQIOWe0qNI6+VcDXoDLeLOYMio0 qJ4HYwvYSnnWVnD+s3WxiUQ6lrig3sHEqu8HCOVy7gMDd9K25Sbzmc7usvsMOsoP/zGb igJA== X-Gm-Message-State: ACrzQf2Um1M0uJPwywYHFIBNmVnRRVu0yqW1Y3VJW3WJlUvSeRWkHYpH G2eYh5hMi/07kxcEhCFu/Uqe6qsNHaOH7A== X-Google-Smtp-Source: AMsMyM4tQNORkVZ4X0dbRo5gpwuauYuRFBzBwSauPUaPthnR6XBMzJyjvZokkhjKRrvhKex8zT/7Tw== X-Received: by 2002:a05:622a:4083:b0:3a5:42f9:de32 with SMTP id cg3-20020a05622a408300b003a542f9de32mr13906462qtb.335.1667583772881; Fri, 04 Nov 2022 10:42:52 -0700 (PDT) Received: from mail-yb1-f181.google.com (mail-yb1-f181.google.com. [209.85.219.181]) by smtp.gmail.com with ESMTPSA id bn35-20020a05620a2ae300b006bb2cd2f6d1sm3180917qkb.127.2022.11.04.10.42.52 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 04 Nov 2022 10:42:52 -0700 (PDT) Received: by mail-yb1-f181.google.com with SMTP id 63so6637952ybq.4 for ; Fri, 04 Nov 2022 10:42:52 -0700 (PDT) X-Received: by 2002:a05:6902:124f:b0:66e:e3da:487e with SMTP id t15-20020a056902124f00b0066ee3da487emr37488759ybu.310.1667583334773; Fri, 04 Nov 2022 10:35:34 -0700 (PDT) MIME-Version: 1.0 References: <140B437E-B994-45B7-8DAC-E9B66885BEEF@gmail.com> In-Reply-To: From: Linus Torvalds Date: Fri, 4 Nov 2022 10:35:04 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: mm: delay rmap removal until after TLB flush To: Alexander Gordeev Cc: Peter Zijlstra , Will Deacon , Aneesh Kumar , Nick Piggin , Heiko Carstens , Vasily Gorbik , Christian Borntraeger , Sven Schnelle , Nadav Amit , Jann Horn , John Hubbard , X86 ML , Matthew Wilcox , Andrew Morton , kernel list , Linux-MM , Andrea Arcangeli , "Kirill A . Shutemov" , Joerg Roedel , Uros Bizjak , Alistair Popple , linux-arch Content-Type: text/plain; charset="UTF-8" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1667583773; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Y7JBbJ2Fvdhb3EiG7i7cT3Xt7aRcECjo+Ml9qZORLks=; b=IBl5aqhGCIr3I1JpAhj/qzMMV66SbWfo/Y1RvqYVxw5k4IldJn3eS1ERXxOL21OkTq9rHU leH+vBWriouxJH5YxyZ1/vLPnk8VVSkvyK6M+AOgUE8hahn3C8Ac4j9SnPPXliLjEu3fVy d8R4ljKEZYybENLkgf2+UQceSJ5vlLQ= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=google header.b=QtMfbneh; dmarc=none; spf=pass (imf24.hostedemail.com: domain of torvalds@linuxfoundation.org designates 209.85.160.181 as permitted sender) smtp.mailfrom=torvalds@linuxfoundation.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1667583773; a=rsa-sha256; cv=none; b=tw1Oyse4bshHveVoqhFd/4Y8+23mt+7Hc33yUSbz5HR7nAgSvjfFBmbfR0s2m53bJH2oQ9 avbA/71uO1VqbTQ2Bd1ioXaBQ8z1b72APsIZwFBr1JMiADBfquVNsFOXrXwVeTjA7gvqtZ 9LzOrrNw1x+EUM0ijWe+dELILF9m/r0= Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=google header.b=QtMfbneh; dmarc=none; spf=pass (imf24.hostedemail.com: domain of torvalds@linuxfoundation.org designates 209.85.160.181 as permitted sender) smtp.mailfrom=torvalds@linuxfoundation.org X-Rspam-User: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: D4BF4180003 X-Stat-Signature: se7fekqtwsen6i3xbej1racixojsp7o9 X-HE-Tag: 1667583773-105693 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, Nov 3, 2022 at 11:33 PM Alexander Gordeev wrote: > > I rather have a question to the generic part (had to master the code quotting). Sure. Although now I think the series in in Andrew's -mm tree, or just about to get moved in there, so I'm not going to touch my actual branch any more. > > static void clean_and_free_pages_and_swap_cache(struct encoded_page **pages, unsigned int nr) > > { > > for (unsigned int i = 0; i < nr; i++) { > > struct encoded_page *encoded = pages[i]; > > unsigned int flags = encoded_page_flags(encoded); > > if (flags) { > > /* Clean the flagged pointer in-place */ > > struct page *page = encoded_page_ptr(encoded); > > pages[i] = encode_page(page, 0); > > > > /* The flag bit being set means that we should zap the rmap */ > > Why TLB_ZAP_RMAP bit is not checked explicitly here, like in s390 version? > (I assume, when/if ENCODE_PAGE_BITS is not TLB_ZAP_RMAP only, calling > page_zap_pte_rmap() without such a check would be a bug). No major reason. This is basically the same issue as the naming, which I touched on in https://lore.kernel.org/all/CAHk-=wiDg_1up8K4PhK4+kzPN7xJG297=nw+tvgrGn7aVgZdqw@mail.gmail.com/ and the follow-up note about how I hope the "encoded page pointer with flags" thing gets used by the mlock code some day too. IOW, there's kind of a generic "I have extra flags associated with the pointer", and then the specific "this case uses this flag", and depending on which mindset you have at the time, you might do one or the other. So in that clean_and_free_pages_and_swap_cache(), the code basically knows "I have a pointer with extra flags", and it's written that way. And that's partly historical, because it actually started with the original code tracking the dirty bit as the extra piece of information, and then transformed into this "no, the information is TLB_ZAP_RMAP". So "unsigned int flags" at one point was "bool dirty" instead, but then became more of a "I think this whole page pointer with flags is general", and the naming changed, and I had both cases in mind, and then the code is perhaps not so specifically named. I'm not sure the zap_page_range() case will ever use more than one flag, but the mlock case already has two different flags. So the "encode_page" thing is very much written to be about more than just the zap_page_range() case. But yes, that code could (and maybe should) use "if (flags & TLB_ZAP_RMAP)" to make it clear that in this case, the single flags bit is that one bit. But the "if ()" there actually does two conceptually *separate* things: it needs to clean the pointer in-place (which is regardless of "which" flag bit is set, and then it does that page_zap_pte_rmap(), which is just for the TLB_ZAP_RMAP bit. So to be really explicit about it, you'd have two different tests: one for "do I have flags that need to be cleaned up" and then an inner test for each flag. And since there is only one flag in this particular use case, it's essentially that inner test that I dropped as pointless. In contrast, in the s390 version, that bit was never encoded as a a general "flags associated with a page pointer" in the first place, so there was never any such duality. There is only TLB_ZAP_RMAP. Hope that explains the thinking. Linus