From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v2 01/11] powerpc/mm: Adds counting method to monitor lockless pgtable walks
To: Leonardo Bras, Linux-MM
CC: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Arnd Bergmann,
 Aneesh Kumar K.V, "Christophe Leroy", Andrew Morton, Dan Williams,
 Nicholas Piggin, Mahesh Salgaonkar, Thomas Gleixner, Richard Fontana,
 Ganesh Goudar, Allison Randal, "Greg Kroah-Hartman", Mike Rapoport,
 YueHaibing, Ira Weiny, Jason Gunthorpe, Keith Busch
References: <20190920195047.7703-1-leonardo@linux.ibm.com>
 <20190920195047.7703-2-leonardo@linux.ibm.com>
From: John Hubbard
Message-ID: <90ceb0ca-9f04-65a5-586c-e37c2ecc6e4e@nvidia.com>
Date: Mon, 23 Sep 2019 13:42:44 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <20190920195047.7703-2-leonardo@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 9/20/19 12:50 PM, Leonardo Bras wrote:
> It's necessary to monitor lockless pagetable walks, in order to avoid doing
> THP splitting/collapsing during them.
>
> Some methods rely on local_irq_{save,restore}, but that can be slow on
> cases with a lot of cpus are used for the process.
>
> In order to speedup some cases, I propose a refcount-based approach, that
> counts the number of lockless pagetable walks happening on the process.
>
> This method does not exclude the current irq-oriented method. It works as a
> complement to skip unnecessary waiting.
>
> start_lockless_pgtbl_walk(mm)
>	Insert before starting any lockless pgtable walk
> end_lockless_pgtbl_walk(mm)
>	Insert after the end of any lockless pgtable walk
>	(Mostly after the ptep is last used)
> running_lockless_pgtbl_walk(mm)
>	Returns the number of lockless pgtable walks running
>
> Signed-off-by: Leonardo Bras
> ---
>  arch/powerpc/include/asm/book3s/64/mmu.h |  3 +++
>  arch/powerpc/mm/book3s64/mmu_context.c   |  1 +
>  arch/powerpc/mm/book3s64/pgtable.c       | 17 +++++++++++++++++
>  3 files changed, 21 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index 23b83d3593e2..13b006e7dde4 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -116,6 +116,9 @@ typedef struct {
>  	/* Number of users of the external (Nest) MMU */
>  	atomic_t copros;
>
> +	/* Number of running instances of lockless pagetable walk*/
> +	atomic_t lockless_pgtbl_walk_count;
> +
>  	struct hash_mm_context *hash_context;
>
>  	unsigned long vdso_base;
> diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
> index 2d0cb5ba9a47..3dd01c0ca5be 100644
> --- a/arch/powerpc/mm/book3s64/mmu_context.c
> +++ b/arch/powerpc/mm/book3s64/mmu_context.c
> @@ -200,6 +200,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
>  #endif
>  	atomic_set(&mm->context.active_cpus, 0);
>  	atomic_set(&mm->context.copros, 0);
> +	atomic_set(&mm->context.lockless_pgtbl_walk_count, 0);
>
>  	return 0;
>  }
> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> index 7d0e0d0d22c4..13239b17a22c 100644
> --- a/arch/powerpc/mm/book3s64/pgtable.c
> +++ b/arch/powerpc/mm/book3s64/pgtable.c
> @@ -98,6 +98,23 @@ void serialize_against_pte_lookup(struct mm_struct *mm)
>  	smp_call_function_many(mm_cpumask(mm), do_nothing, NULL, 1);
>  }
>

Somewhere, there should be a short comment that explains how the following functions are
meant to be used. And it should include the interaction with irqs, so maybe
if you end up adding that combined wrapper function that does both, that's
where the documentation would go. If not, then here is probably where it goes.

> +void start_lockless_pgtbl_walk(struct mm_struct *mm)
> +{
> +	atomic_inc(&mm->context.lockless_pgtbl_walk_count);
> +}
> +EXPORT_SYMBOL(start_lockless_pgtbl_walk);
> +
> +void end_lockless_pgtbl_walk(struct mm_struct *mm)
> +{
> +	atomic_dec(&mm->context.lockless_pgtbl_walk_count);
> +}
> +EXPORT_SYMBOL(end_lockless_pgtbl_walk);
> +
> +int running_lockless_pgtbl_walk(struct mm_struct *mm)
> +{
> +	return atomic_read(&mm->context.lockless_pgtbl_walk_count);
> +}
> +

thanks,
--
John Hubbard
NVIDIA
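
To make the request for a usage comment more concrete, here is a minimal
user-space sketch of what such documentation plus the calling pattern might
look like. It is not the kernel code. `struct mm_sketch`,
`init_lockless_pgtbl_walk()` and `must_serialize()` are hypothetical names
invented for illustration, C11 atomics stand in for the kernel's atomic_t,
and the idea that a THP split/collapse path consults the count before waiting
is an assumption drawn from the "skip unnecessary waiting" wording in the
patch description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm context field added by the patch. */
struct mm_sketch {
	atomic_int lockless_pgtbl_walk_count;
};

/* Mirrors the atomic_set(..., 0) that the patch adds to init_new_context(). */
void init_lockless_pgtbl_walk(struct mm_sketch *mm)
{
	atomic_init(&mm->lockless_pgtbl_walk_count, 0);
}

/*
 * Sketch of the kind of usage comment asked for in the review:
 *
 *   start_lockless_pgtbl_walk(mm);
 *   ... walk the page tables without taking locks, use ptep ...
 *   end_lockless_pgtbl_walk(mm);     (after the last use of ptep)
 *
 * Per the patch description, the counter complements the irq-based
 * scheme rather than replacing it. How the two interact is exactly
 * what the real comment would need to spell out; it is not modelled here.
 */
void start_lockless_pgtbl_walk(struct mm_sketch *mm)
{
	atomic_fetch_add(&mm->lockless_pgtbl_walk_count, 1);
}

void end_lockless_pgtbl_walk(struct mm_sketch *mm)
{
	/* atomic_fetch_sub returns the value held before the decrement. */
	int prev = atomic_fetch_sub(&mm->lockless_pgtbl_walk_count, 1);

	/* An end without a matching start would drive the count negative. */
	assert(prev > 0);
	(void)prev;
}

int running_lockless_pgtbl_walk(struct mm_sketch *mm)
{
	return atomic_load(&mm->lockless_pgtbl_walk_count);
}

/*
 * Hypothetical consumer (assumption, not part of this patch): a path
 * that is about to split or collapse a THP only needs to wait when at
 * least one lockless walk is in flight.
 */
bool must_serialize(struct mm_sketch *mm)
{
	return running_lockless_pgtbl_walk(mm) != 0;
}
```

A caller brackets each walk with the start/end pair, and nested or concurrent
walks simply raise the count further. Whether irqs are also disabled around
the walk, and in which order, is left out on purpose, since that is the part
the requested comment (or a combined wrapper) would have to define.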