LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Feng Tang <feng.tang@intel.com>
To: "Huang, Ying" <ying.huang@intel.com>,
	Andi Kleen <andi.kleen@intel.com>, Qian Cai <cai@lca.pw>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	kernel test robot <rong.a.chen@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@suse.de>, Kees Cook <keescook@chromium.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Iurii Zaikin <yzaikin@google.com>,
	tim.c.chen@intel.com, dave.hansen@intel.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, lkp@lists.01.org
Subject: Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail
Date: Thu, 9 Jul 2020 12:55:54 +0800
Message-ID: <20200709045554.GA56190@shbuild999.sh.intel.com> (raw)
In-Reply-To: <20200707054120.GC21741@shbuild999.sh.intel.com>

On Tue, Jul 07, 2020 at 01:41:20PM +0800, Feng Tang wrote:
> On Tue, Jul 07, 2020 at 12:00:09PM +0800, Huang, Ying wrote:
> > Feng Tang <feng.tang@intel.com> writes:
> > 
> > > On Mon, Jul 06, 2020 at 06:34:34AM -0700, Andi Kleen wrote:
> > >> >  	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> > >> > -	if (ret == 0 && write)
> > >> > +	if (ret == 0 && write) {
> > >> > +		if (sysctl_overcommit_memory == OVERCOMMIT_NEVER)
> > >> > +			schedule_on_each_cpu(sync_overcommit_as);
> > >> 
> > >> The schedule_on_each_cpu is not atomic, so the problem could still happen
> > >> in that window.
> > >> 
> > >> I think it may be ok if it eventually resolves, but certainly needs
> > >> a comment explaining it. Can you do some stress testing toggling the
> > >> policy all the time on different CPUs and running the test on
> > >> other CPUs and see if the test fails?
> > >
> > > For the raw test case reported by 0day, this patch passed in 200 times
> > > run. And I will read the ltp code and try stress testing it as you
> > > suggested.
> > >
> > >
> > >> The other alternative would be to define some intermediate state
> > >> for the sysctl variable and only switch to never once the schedule_on_each_cpu
> > >> returned. But that's more complexity.
> > >
> > > One thought I had is to put this schedule_on_each_cpu() before
> > > the proc_dointvec_minmax() to do the sync before sysctl_overcommit_memory
> > > is really changed. But the window still exists, as the batch is
> > > still the larger one. 
> > 
> > Can we change the batch firstly, then sync the global counter, finally
> > change the overcommit policy?
> 
> These reorderings are really head scratching :)
> 
> I've thought about this before when Qian Cai first reported the warning
> message, as kernel had a check: 
> 
> 	VM_WARN_ONCE(percpu_counter_read(&vm_committed_as) <
> 			-(s64)vm_committed_as_batch * num_online_cpus(),
> 			"memory commitment underflow");
> 
> If the batch is decreased first, the warning will be easier/earlier to be
> triggered, so I didn't brought this up when handling the warning message.
> 
> But it might work now, as the warning has been removed.

I tested the reorder way, and the test could pass in 100 times run. The
new order when changing policy to OVERCOMMIT_NEVER:
  1. re-compute the batch ( to the smaller one)
  2. do the on_each_cpu sync
  3. really change the policy to NEVER.

It solves one of previous concern, that after the sync is done on cpuX,
but before the whole sync on all cpus are done, there is a window that
the percpu-counter could be enlarged again.

IIRC Andi had concern about read side cost when doing the sync, my
understanding is most of the readers (malloc/free/map/unmap) are using
percpu_counter_read_positive, which is a fast path without involving lock.

As for the problem itself, I agree with Michal's point, that usually there
is no normal case that will change the overcommit_policy too frequently.

The code logic is mainly in overcommit_policy_handler(), based on the
previous sync fix. please help to review, thanks!

int overcommit_policy_handler(struct ctl_table *table, int write, void *buffer,
		size_t *lenp, loff_t *ppos)
{
	int ret;

	if (write) {
		int new_policy;
		struct ctl_table t;

		t = *table;
		t.data = &new_policy;
		ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
		if (ret)
			return ret;

		mm_compute_batch(new_policy);
		if (new_policy == OVERCOMMIT_NEVER)
			schedule_on_each_cpu(sync_overcommit_as);
		sysctl_overcommit_memory = new_policy;
	} else {
		ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
	}

	return ret;
}

- Feng



  reply index

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-21  7:36 [PATCH v5 0/3] make vm_committed_as_batch aware of vm overcommit policy Feng Tang
2020-06-21  7:36 ` [PATCH v5 1/3] proc/meminfo: avoid open coded reading of vm_committed_as Feng Tang
2020-06-21  7:36 ` [PATCH v5 2/3] mm/util.c: make vm_memory_committed() more accurate Feng Tang
2020-06-21  7:36 ` [PATCH v5 3/3] mm: adjust vm_committed_as_batch according to vm overcommit policy Feng Tang
2020-06-22 13:25   ` [mm] 4e2c82a409: will-it-scale.per_process_ops 1894.6% improvement kernel test robot
     [not found]   ` <20200702063201.GG3874@shao2-debian>
2020-07-02  7:12     ` [mm] 4e2c82a409: ltp.overcommit_memory01.fail Feng Tang
2020-07-05  3:20       ` Qian Cai
2020-07-05  4:44       ` Feng Tang
2020-07-05 12:15         ` Qian Cai
2020-07-05 12:58           ` Feng Tang
2020-07-05 15:52             ` Qian Cai
2020-07-06  1:43               ` Feng Tang
2020-07-06  2:36                 ` Qian Cai
2020-07-06 13:24                   ` Feng Tang
2020-07-06 13:34                     ` Andi Kleen
2020-07-06 23:42                       ` Andrew Morton
2020-07-07  2:38                       ` Feng Tang
2020-07-07  4:00                         ` Huang, Ying
2020-07-07  5:41                           ` Feng Tang
2020-07-09  4:55                             ` Feng Tang [this message]
2020-07-09 13:40                               ` Qian Cai
2020-07-09 14:15                                 ` Feng Tang
2020-07-10  1:38                                   ` Feng Tang
2020-07-10  3:26                                     ` Qian Cai
2020-07-07  1:06                     ` Dennis Zhou
2020-07-07  3:24                       ` Feng Tang
2020-07-07 10:28               ` Michal Hocko
2020-06-24  9:45 ` [PATCH v5 0/3] make vm_committed_as_batch aware of vm overcommit policy Michal Hocko
2020-07-07 11:43 [mm] 4e2c82a409: ltp.overcommit_memory01.fail Qian Cai
2020-07-07 12:06 ` Michal Hocko
2020-07-07 13:04   ` Qian Cai
2020-07-07 13:56     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200709045554.GA56190@shbuild999.sh.intel.com \
    --to=feng.tang@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi.kleen@intel.com \
    --cc=cai@lca.pw \
    --cc=cl@linux.com \
    --cc=dave.hansen@intel.com \
    --cc=dennis@kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@lists.01.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=rong.a.chen@intel.com \
    --cc=tim.c.chen@intel.com \
    --cc=tj@kernel.org \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git