From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00F51C433E0 for ; Wed, 27 May 2020 01:44:12 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9A433207CB for ; Wed, 27 May 2020 01:44:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9A433207CB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E7EC2800B6; Tue, 26 May 2020 21:44:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E2EB480010; Tue, 26 May 2020 21:44:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D1DFA800B6; Tue, 26 May 2020 21:44:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0075.hostedemail.com [216.40.44.75]) by kanga.kvack.org (Postfix) with ESMTP id B7FCA80010 for ; Tue, 26 May 2020 21:44:10 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 7D83A4DCD for ; Wed, 27 May 2020 01:44:10 +0000 (UTC) X-FDA: 76860803460.29.sky18_24110a826d4e Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin29.hostedemail.com (Postfix) with ESMTP id 65653180868D5 for ; Wed, 27 May 2020 01:44:10 +0000 (UTC) X-HE-Tag: sky18_24110a826d4e X-Filterd-Recvd-Size: 5728 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by imf31.hostedemail.com (Postfix) with ESMTP for ; Wed, 27 May 2020 01:44:08 +0000 (UTC) IronPort-SDR: y+MuBmHmZbToCYfSsWbGyJDp26G2y1XqHPQrWq9jB95QuD2WuTU/hgWLeoh0+pkifTQZxHJJb8 4uPJTpPz9AAQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 May 2020 18:44:04 -0700 IronPort-SDR: xF4NamqZOsUfFCvLw03Fj91Sa2iAEJ1DPzR8Hjbd2NynOMqmOFEDowii1ih4wd95HIlwHYfBHy 8YnMVI6dUygw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,439,1583222400"; d="scan'208";a="255326666" Received: from shbuild999.sh.intel.com (HELO localhost) ([10.239.146.107]) by fmsmga007.fm.intel.com with ESMTP; 26 May 2020 18:44:00 -0700 Date: Wed, 27 May 2020 09:43:59 +0800 From: Feng Tang To: Andi Kleen Cc: Qian Cai , Andrew Morton , Michal Hocko , Johannes Weiner , Stephen Rothwell , Matthew Wilcox , Mel Gorman , Kees Cook , Luis Chamberlain , Iurii Zaikin , tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 0/3] make vm_committed_as_batch aware of vm overcommit policy Message-ID: <20200527014359.GA93879@shbuild999.sh.intel.com> References: <1588922717-63697-1-git-send-email-feng.tang@intel.com> <20200521212726.GC6367@ovpn-112-192.phx2.redhat.com> <20200526181459.GD991@lca.pw> <20200527011413.GA610738@tassilo.jf.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200527011413.GA610738@tassilo.jf.intel.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Rspamd-Queue-Id: 65653180868D5 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam05 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, May 26, 2020 at 06:14:13PM -0700, Andi Kleen wrote: > On Tue, May 26, 2020 at 02:14:59PM -0400, Qian Cai wrote: > > On Thu, May 21, 2020 at 05:27:26PM -0400, Qian Cai wrote: > > > On Fri, May 08, 2020 at 03:25:14PM +0800, Feng Tang wrote: > > > > When checking a performance change for will-it-scale scalability > > > > mmap test [1], we found very high lock contention for spinlock of > > > > percpu counter 'vm_committed_as': > > > > > > > > 94.14% 0.35% [kernel.kallsyms] [k] _raw_spin_lock_irqsave > > > > 48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap; > > > > 45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap; > > > > > > > > Actually this heavy lock contention is not always necessary. The > > > > 'vm_committed_as' needs to be very precise when the strict > > > > OVERCOMMIT_NEVER policy is set, which requires a rather small batch > > > > number for the percpu counter. > > > > > > > > So lift the batch number to 16X for OVERCOMMIT_ALWAYS and > > > > OVERCOMMIT_GUESS policies, and add a sysctl handler to adjust it > > > > when the policy is reconfigured. > > > > > > > > Benchmark with the same testcase in [1] shows 53% improvement on a > > > > 8C/16T desktop, and 2097%(20X) on a 4S/72C/144T server. And no change > > > > for some platforms, due to the test mmap size of the case is bigger > > > > than the batch number computed, though the patch will help mmap/munmap > > > > generally. > > > > > > > > [1] https://lkml.org/lkml/2020/3/5/57 > > > > > > > > There are some style complain from checkpatch for patch 3, as it > > > > follows the similar format of sibling functions > > > > > > > > patch1: a cleanup for /proc/meminfo > > > > patch2: a preparation patch which also improve the accuracy of > > > > vm_memory_committed > > > > patch3: the main change > > > > > > > > Please help to review, thanks! > > > > > > Reverted this series fixed a warning under memory pressue. > > > > Andrew, Stephen, can you drop this series? > > Hope you're happy now. > > The warning is just not needed for the always case. The whole point > of the patch was to maintain these counters only approximately. > We could of course use _sum or _sum_positive in the warning, but that would > just be unnecessary overhead in this moderately hot path. > > So this patch should fix it: (untested) > > Possibly might need to move it even further down because the patch > also affected the _GUESS case. Thanks Andi for the patch. And I agree that it could be after _GUESS case, as existig check for _GUESS is loose: if (pages > totalram_pages() + total_swap_pages) goto error; Thanks, Feng > > > diff --git a/mm/util.c b/mm/util.c > index 988d11e6c17c..fa78f90e29a1 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -798,10 +798,6 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) > { > long allowed; > > - VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) < > - -(s64)vm_committed_as_batch * num_online_cpus(), > - "memory commitment underflow"); > - > vm_acct_memory(pages); > > /* > @@ -810,6 +806,10 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin) > if (sysctl_overcommit_memory == OVERCOMMIT_ALWAYS) > return 0; > > + VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) < > + -(s64)vm_committed_as_batch * num_online_cpus(), > + "memory commitment underflow"); > + > if (sysctl_overcommit_memory == OVERCOMMIT_GUESS) { > if (pages > totalram_pages() + total_swap_pages) > goto error;