From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20210422054323.150993-1-aneesh.kumar@linux.ibm.com>
 <20210422054323.150993-8-aneesh.kumar@linux.ibm.com>
 <2eafd7df-65fd-1e2c-90b6-d143557a1fdc@linux.ibm.com>
In-Reply-To: <2eafd7df-65fd-1e2c-90b6-d143557a1fdc@linux.ibm.com>
From: Linus Torvalds
Date: Thu, 20 May 2021 16:40:41 -1000
Subject: Re: [PATCH v5 7/9] mm/mremap: Move TLB flush outside page table lock
To: "Aneesh Kumar K.V"
Cc: Linux-MM, Andrew Morton, Michael Ellerman, linuxppc-dev,
 Kalesh Singh, Nick Piggin, Joel Fernandes, Christophe Leroy
Content-Type: text/plain; charset="UTF-8"

On Thu, May 20, 2021 at 6:57 AM Aneesh Kumar K.V wrote:
>
> Wondering whether this is correct considering we are holding mmap_sem in
> write mode in mremap.

Right. So *normally* the rule is to EITHER

 - hold the mmap_sem for writing

OR

 - hold the page table lock

and that the TLB flush needs to happen before you release that lock.

But as that commit message of commit eb66ae030829 ("mremap: properly
flush TLB before releasing the page") says, "mremap()" is a bit
special. It's special because mremap() didn't take ownership of the
page - it only moved it somewhere else. So now the page-out logic -
that relies on the page table lock - can free the page immediately
after we've released the page table lock.

So basically, in order to delay the TLB flush after releasing the page
table lock, it's not really sufficient to _just_ hold the mmap_sem for
writing. You also need to guarantee that the lifetime of the page
itself is held until after the TLB flush.

For normal operations like "munmap()", this happens naturally, because
we remove the page from the page table, and add it to the list of
pages to be freed after the TLB flush.

But mremap never did that "remove the page and add it to a list to be
free'd later". Instead, it just moved the page somewhere else. And
thus there is no guarantee that the page that got moved will continue
to exist until a TLB flush is done.

So mremap does need to flush the TLB before releasing the page table
lock, because that's the lifetime boundary for the page that got
moved.

Linus