Date: Wed, 27 May 2020 21:33:32 +0800
From: Feng Tang
To: Qian Cai
Cc: Andrew Morton, Michal Hocko, Johannes Weiner, Stephen Rothwell,
    Matthew Wilcox, Mel Gorman, Kees Cook, Luis Chamberlain, Iurii Zaikin,
    andi.kleen@intel.com, tim.c.chen@intel.com, dave.hansen@intel.com,
    ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3] make vm_committed_as_batch aware of vm overcommit policy
Message-ID: <20200527133332.GA20232@shbuild999.sh.intel.com>
References: <1588922717-63697-1-git-send-email-feng.tang@intel.com>
    <20200521212726.GC6367@ovpn-112-192.phx2.redhat.com>
    <20200526181459.GD991@lca.pw>
    <20200527014647.GB93879@shbuild999.sh.intel.com>
    <20200527022539.GK991@lca.pw>
    <20200527104606.GE93879@shbuild999.sh.intel.com>
    <20200527120549.GA741@lca.pw>
In-Reply-To: <20200527120549.GA741@lca.pw>

Hi Qian,

On Wed, May 27, 2020 at 08:05:49AM -0400, Qian Cai wrote:
> On Wed, May 27, 2020 at 06:46:06PM +0800, Feng Tang wrote:
> > Hi Qian,
> >
> > On Tue, May 26, 2020 at 10:25:39PM -0400, Qian Cai wrote:
> > > > > > > [1] https://lkml.org/lkml/2020/3/5/57
> > > > > >
> > > > > > Reverted this series fixed a warning under memory pressue.
> > > > >
> > > > > Andrew, Stephen, can you drop this series?
> > > > >
> > > > > >
> > > > > > [ 3319.257898] LTP: starting oom01
> > > > > > [ 3319.284417] ------------[ cut here ]------------
> > > > > > [ 3319.284439] memory commitment underflow
> > > >
> > > > Thanks for the catch!
> > > >
> > > > Could you share the info about the platform, like the CPU numbers
> > > > and RAM size, and what's the mmap test size of your test program.
> > > > It would be great if you can point me the link to the test program.
> > >
> > > I have been reproduced this on both AMD and Intel. The test just
> > > allocating memory and swapping.
> > >
> > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c
> > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/tunable/overcommit_memory.c
> > >
> > > It might be better to run the whole LTP mm tests if none of the above
> > > triggers it for you which has quite a few memory pressurers.
> > >
> > > /opt/ltp/runltp -f mm
> >
> > Thanks for sharing. I tried to reproduce this on 2 server plaforms,
> > but can't reproduce it, and they are still under testing.
> >
> > Meanwhile, could you help to try the below patch, which is based on
> > Andi's suggestion and have some debug info.
> > The warning is a little strange, as the condition is
> >
> >   (percpu_counter_read(&vm_committed_as) <
> >        -(s64)vm_committed_as_batch * num_online_cpus())
> >
> > while for your platform (48 CPU + 128 GB RAM), the
> > '-(s64)vm_committed_as_batch * num_online_cpus()'
> > is a s64 value: '-32G', which makes the condition hard to be true,
> > and when it is, it could be triggered by some magic for s32/s64
> > operations around the percpu-counter.
>
> Here is the information on AMD and powerpc below affected by this. It
> could need a bit patient to reproduce, but our usual daily CI would
> trigger it eventually after a few tries.
>
> # git clone https://github.com/cailca/linux-mm.git
> # cd linux-mm
> # ./compile.sh
> # systemctl reboot
> # ./test.sh

I just downloaded it, but it failed on my desktop machine during the
'yum' and 'grub2' setup steps.

The difficulty for me in reproducing this is that the test platforms
are behind the 0day framework, and I can hardly set up external test
suites there, though I have been trying all day today :)

So if possible, please help to try the patch in my last email.

Thanks!

- Feng