From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.4 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52EA5C352AA for ; Tue, 1 Oct 2019 19:04:32 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 03806215EA for ; Tue, 1 Oct 2019 19:04:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=nvidia.com header.i=@nvidia.com header.b="sFKmagTn" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 03806215EA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A334A6B0003; Tue, 1 Oct 2019 15:04:31 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9E4B16B0006; Tue, 1 Oct 2019 15:04:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8ABD38E0001; Tue, 1 Oct 2019 15:04:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0063.hostedemail.com [216.40.44.63]) by kanga.kvack.org (Postfix) with ESMTP id 6A2C66B0003 for ; Tue, 1 Oct 2019 15:04:31 -0400 (EDT) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with SMTP id E8523181AC9AE for ; Tue, 1 Oct 2019 19:04:30 +0000 (UTC) X-FDA: 75996141900.20.rifle71_618b8e8b53d4b X-HE-Tag: rifle71_618b8e8b53d4b X-Filterd-Recvd-Size: 5828 Received: from hqemgate16.nvidia.com (hqemgate16.nvidia.com [216.228.121.65]) by imf34.hostedemail.com (Postfix) with ESMTP for ; Tue, 1 Oct 2019 19:04:30 +0000 (UTC) Received: from hqpgpgate101.nvidia.com (Not Verified[216.228.121.13]) by hqemgate16.nvidia.com (using TLS: TLSv1.2, DES-CBC3-SHA) id ; Tue, 01 Oct 2019 12:04:29 -0700 Received: from hqmail.nvidia.com ([172.20.161.6]) by hqpgpgate101.nvidia.com (PGP Universal service); Tue, 01 Oct 2019 12:04:29 -0700 X-PGP-Universal: processed; by hqpgpgate101.nvidia.com on Tue, 01 Oct 2019 12:04:29 -0700 Received: from DRHQMAIL107.nvidia.com (10.27.9.16) by HQMAIL111.nvidia.com (172.20.187.18) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 1 Oct 2019 19:04:28 +0000 Received: from [10.110.48.28] (10.124.1.5) by DRHQMAIL107.nvidia.com (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1473.3; Tue, 1 Oct 2019 19:04:28 +0000 Subject: Re: [PATCH v4 03/11] mm/gup: Applies counting method to monitor gup_pgd_range To: Leonardo Bras , , , , , CC: Keith Busch , Thomas Gleixner , Arnd Bergmann , Greg Kroah-Hartman , YueHaibing , "Nicholas Piggin" , Mike Rapoport , "Mahesh Salgaonkar" , Jason Gunthorpe , "Paul Mackerras" , Aneesh Kumar K.V , Ganesh Goudar , Andrew Morton , Ira Weiny , Dan Williams , Allison Randal References: <20190927234008.11513-1-leonardo@linux.ibm.com> <20190927234008.11513-4-leonardo@linux.ibm.com> <2cebe169221ae9270963d4bc4fd8e43066745f98.camel@linux.ibm.com> X-Nvconfidentiality: public From: John Hubbard Message-ID: Date: Tue, 1 Oct 2019 12:04:28 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <2cebe169221ae9270963d4bc4fd8e43066745f98.camel@linux.ibm.com> X-Originating-IP: [10.124.1.5] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To DRHQMAIL107.nvidia.com (10.27.9.16) Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=nvidia.com; s=n1; t=1569956669; bh=mU2o66EzpiKq/VTOcPIA3Ya6nMDGuoufi0g71k1FY5k=; h=X-PGP-Universal:Subject:To:CC:References:X-Nvconfidentiality:From: Message-ID:Date:User-Agent:MIME-Version:In-Reply-To: X-Originating-IP:X-ClientProxiedBy:Content-Type:Content-Language: Content-Transfer-Encoding; b=sFKmagTnvrqulDybiAEutlUnORPR4yV0qiLmx94LqM0/kVZd+OSM9/3klZd40UgBO 7NsbvQGccOcm6Ar12WMXwvIKC6L9sj/9SYQrzJC+Dz+vB0rAMpZ0Onh/Ip9Xjzks8s mjkAk4LdHJ8dfB+Imob53rBQzTt+G+1Sbfs/MCPsqLLU141T5k2gPk/SFIwER/bars PSXpJw6U6xQKG1L+itgn9d8sn5ia8AYdBkIJFDpHuo+FFV5BQc7tgvcIDKvDhmx+4E hHTFu34yHmbOrnUfFk7G9KE5ZyKUA5uVBg9YKtMgQmstkM506MWpQasXlrHCO9cWRA fLwbBnh4IVlVg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 10/1/19 10:56 AM, Leonardo Bras wrote: > On Mon, 2019-09-30 at 14:51 -0700, John Hubbard wrote: >> On 9/27/19 4:40 PM, Leonardo Bras wrote: ... >>> diff --git a/mm/gup.c b/mm/gup.c >>> index 98f13ab37bac..7105c829cf44 100644 >>> --- a/mm/gup.c >>> +++ b/mm/gup.c >>> @@ -2325,6 +2325,7 @@ static bool gup_fast_permitted(unsigned long start, unsigned long end) >>> int __get_user_pages_fast(unsigned long start, int nr_pages, int write, >>> struct page **pages) >>> { >>> + struct mm_struct *mm; >> >> I don't think that this local variable adds any value, so let's not use it. >> Similar point in a few other patches too. > > It avoids 1 deference of current->mm, it's a little performance gain. > No, it isn't. :) Longer answer: at this level (by which I mean, "wrote the C code, haven't looked at the generated asm yet, and haven't done a direct perf test yet"), none of us C programmers are entitled to imagine that we can second guess both the compiler and the CPU well enough to claim that declaring a local pointer variable on the stack will even *affect* performance, much less know which way it will go! The compiler at -O2 will *absolutely* optimize away any local variables that it doesn't need. And that leads to how kernel programmers routinely decide about that kind of variable: "does the variable's added clarity compensate for the extra visual noise and for the need to manage the variable?" Here, and in most (all?) other points in the patchset where you've added an mm local variable, the answer is no. >> ... start_lockless_pgtbl_walk(mm); >> >> Minor: I'd like to rename this register_lockless_pgtable_walker(). >> >>> local_irq_disable(); >>> gup_pgd_range(addr, end, gup_flags, pages, &nr); >>> local_irq_enable(); >>> + end_lockless_pgtbl_walk(mm); >> >> ...and deregister_lockless_pgtable_walker(). >> > > I have no problem changing the name, but I don't register/deregister > are good terms for this. > > I would rather use start/finish, begin/end, and so on. Register sounds > like something more complicated than what we are trying to achieve > here. > OK, well, I don't want to bikeshed on naming more than I usually do, and what you have is reasonable, so I'll leave that alone. :) thanks, -- John Hubbard NVIDIA