Subject: Re: [PATCH v2 2/5] mm: avoid unnecessary flush on change_huge_pmd()
From: Nadav Amit
Date: Tue, 26 Oct 2021 12:06:40 -0700
To: Dave Hansen
Cc: Linux-MM, LKML, Andrea Arcangeli, Andrew Cooper, Andrew Morton,
 Andy Lutomirski, Dave Hansen, Peter Xu, Peter Zijlstra, Thomas Gleixner,
 Will Deacon, Yu Zhao, Nick Piggin, "x86@kernel.org"
In-Reply-To: <435f41f2-ffd4-0278-9f26-fbe2c2c7545c@intel.com>
References: <20211021122112.592634-1-namit@vmware.com>
 <20211021122112.592634-3-namit@vmware.com>
 <29E7E8A4-C400-40A5-ACEC-F15C976DDEE0@gmail.com>
 <435f41f2-ffd4-0278-9f26-fbe2c2c7545c@intel.com>
Message-Id: <8BC74789-FF33-403F-B5D7-19034CAC7EE6@gmail.com>

> On Oct 26, 2021, at 11:44 AM, Dave Hansen wrote:
>
> On 10/26/21 10:44 AM, Nadav Amit wrote:
>>> "If software on one logical processor writes to a page while software on
>>> another logical processor concurrently clears the R/W flag in the
>>> paging-structure entry that maps the page, execution on some processors may result
>>> in the entry’s dirty flag being set (due to the write on the first
>>> logical processor) and the entry’s R/W flag being clear (due to the update
>>> to the entry on the second logical processor). This will never occur on a
>>> processor that supports control-flow enforcement technology (CET)”
>>>
>>> So I guess that this optimization can only be enabled when CET is enabled.
>>>
>>> :(
>> I still wonder whether the SDM comment applies to present bit vs dirty
>> bit atomicity as well.
>
> I think it's implicit. From "4.8 ACCESSED AND DIRTY FLAGS":
>
>     "Whenever there is a write to a linear address, the processor
>      sets the dirty flag (if it is not already set) in the
>      paging-structure entry"
>
> There can't be a "write to a linear address" without a Present=1 PTE.
> If it were a Dirty=1,Present=1 PTE, there's no race because there might
> not be a write to the PTE at all.
>
> There's also this from the "4.10.4.3 Optional Invalidation" section:
>
>     "no TLB entry or paging-structure cache entry is created with
>      information from a paging-structure entry in which the P flag
>      is 0."
>
> That means that we don't have to worry about the TLB doing something
> bonkers like caching a Dirty=1 bit from a Present=0 PTE.
>
> Is that what you were worried about?

Thanks Dave, but no - that is not my concern.

To make it very clear - consider the following scenario, in which
a volatile pointer p is mapped using a certain PTE, which is RW
(i.e., *p is writable):

  CPU0                          CPU1
  ----                          ----
  x = *p
  [ PTE cached in TLB;
    PTE is not dirty ]
                                clear_pte(PTE)
  *p = x
  [ needs to set dirty ]

Note that there is no TLB flush in this scenario. The question
is whether the write access to *p would succeed, setting the
dirty bit on the clear, non-present entry.

I was under the impression that the hardware AD-assist would
recheck the PTE atomically as it sets the dirty bit. But, as I
said, I am not sure anymore whether this is defined architecturally
(or at least would work in practice on all CPUs modulo the
Knights Landing thingy).
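
For illustration only, the interleaving above can be written as a tiny
deterministic simulation. This is a toy sketch, not a model of any real CPU:
the names Memory, write_through_stale_tlb, run_scenario and the recheck
switch are all hypothetical, and the two "policies" are simply the two
possible answers to the question, not documented hardware behavior. Only the
flag bit positions (P = bit 0, R/W = bit 1, D = bit 6) are taken from the
x86 PTE layout.

```python
# Toy model of the CPU0/CPU1 interleaving; both policies are hypothetical.

PRESENT = 1 << 0  # x86 P flag
RW = 1 << 1       # x86 R/W flag
DIRTY = 1 << 6    # x86 D flag


class Memory:
    """Holds the single in-memory PTE shared by both CPUs."""

    def __init__(self, pte):
        self.pte = pte


def write_through_stale_tlb(mem, tlb_entry, recheck):
    """Model CPU0's '*p = x' using a cached, not-yet-dirty translation.

    recheck=True:  the dirty-bit assist re-reads the in-memory PTE and sets
                   D only if the entry is still present and writable
                   (treated here as one atomic step); otherwise it faults.
    recheck=False: the assist ORs D into memory based on the stale TLB
                   copy alone.
    Returns "ok" if the write proceeds, "fault" otherwise.
    """
    needed = PRESENT | RW
    if (tlb_entry & needed) != needed:
        return "fault"
    if tlb_entry & DIRTY:
        return "ok"  # cached entry already dirty: no PTE update needed
    if recheck and (mem.pte & needed) != needed:
        return "fault"
    mem.pte |= DIRTY
    return "ok"


def run_scenario(recheck):
    """Replay the scenario and return (result of the write, final PTE)."""
    mem = Memory(PRESENT | RW)  # clean, writable PTE
    tlb_entry = mem.pte         # CPU0: x = *p caches the PTE in the TLB
    mem.pte = 0                 # CPU1: clear_pte(PTE), no TLB flush
    result = write_through_stale_tlb(mem, tlb_entry, recheck)  # CPU0: *p = x
    return result, mem.pte


if __name__ == "__main__":
    for recheck in (True, False):
        result, pte = run_scenario(recheck)
        print(f"recheck={recheck}: write -> {result}, final PTE = {pte:#x}")
```

With recheck=True the write faults and the cleared PTE stays zero. With
recheck=False the write goes through and the cleared, non-present entry ends
up with only the dirty flag set, which is the outcome the question is about.
Which of the two behaviors is architecturally guaranteed is exactly the open
point.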