LKML Archive on lore.kernel.org
* [PATCH v5 0/3] make vm_committed_as_batch aware of vm overcommit policy
@ 2020-06-21  7:36 Feng Tang
  2020-06-21  7:36 ` [PATCH v5 1/3] proc/meminfo: avoid open coded reading of vm_committed_as Feng Tang
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Feng Tang @ 2020-06-21  7:36 UTC
  To: Andrew Morton, Michal Hocko, Johannes Weiner, Matthew Wilcox,
	Mel Gorman, Kees Cook, Luis Chamberlain, Iurii Zaikin,
	andi.kleen, tim.c.chen, dave.hansen, ying.huang, linux-mm,
	linux-kernel
  Cc: Feng Tang

When checking a performance change for the will-it-scale mmap
scalability test [1], we found very high contention on the spinlock
of the percpu counter 'vm_committed_as':

    94.14%     0.35%  [kernel.kallsyms]         [k] _raw_spin_lock_irqsave
    48.21% _raw_spin_lock_irqsave;percpu_counter_add_batch;__vm_enough_memory;mmap_region;do_mmap;
    45.91% _raw_spin_lock_irqsave;percpu_counter_add_batch;__do_munmap;

Actually this heavy lock contention is not always necessary. The
'vm_committed_as' counter only needs to be very precise when the strict
OVERCOMMIT_NEVER policy is set, which requires a rather small batch
number for the percpu counter.
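
To illustrate why the batch number trades precision against lock
contention, here is a simplified userspace model of a batched per-CPU
counter. It is only a sketch and not the kernel implementation: the
names model_counter, model_add, model_read_fast, model_sum and the
NR_CPUS value are hypothetical, and the "lock" is reduced to a counter
of how often the shared value would have to be touched.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Simplified model of a batched per-CPU counter (not kernel code). */
struct model_counter {
	int64_t global;			/* shared value, lock-protected in a real counter */
	int32_t local[NR_CPUS];		/* per-CPU deltas not yet folded into 'global' */
	unsigned long lock_taken;	/* how often the shared lock would be taken */
};

/* Add 'amount' on 'cpu'; fold into the shared value once |delta| reaches 'batch'. */
static void model_add(struct model_counter *c, int cpu, int64_t amount,
		      int32_t batch)
{
	int64_t count;

	assert(cpu >= 0 && cpu < NR_CPUS);
	count = c->local[cpu] + amount;

	if (llabs(count) >= batch) {
		/* Slow path: this is where the spinlock would be contended. */
		c->global += count;
		c->local[cpu] = 0;
		c->lock_taken++;
	} else {
		/* Fast path: purely CPU-local, no shared cache line touched. */
		c->local[cpu] = (int32_t)count;
	}
}

/* Cheap read: ignores per-CPU deltas, so it can be off by up to batch * NR_CPUS. */
static int64_t model_read_fast(const struct model_counter *c)
{
	return c->global;
}

/* Precise read: walks every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model, a larger batch means fewer trips through the slow path,
but the cheap read can drift further from the precise sum, which is
the imprecision a strict policy cannot tolerate.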

So keep the 'batch' number unchanged for the strict OVERCOMMIT_NEVER
policy, and enlarge it for the not-so-strict OVERCOMMIT_ALWAYS and
OVERCOMMIT_GUESS policies.

A benchmark with the same testcase as in [1] shows a 53% improvement on
an 8C/16T desktop, and 2097% (20X) on a 4S/72C/144T server. For that
testcase, whether an improvement shows up depends on whether the test's
mmap size is bigger than the computed batch number.
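
To make the relationship between policy, batch number and mapping size
concrete, here is a small sketch. It rests on assumptions that this
cover letter does not state: the base batch is taken to be
(total pages / #cpus / 256) with a floor of max(2 * #cpus, 32), and the
lift factor is the 64X discussed in this series. The function
sketch_compute_batch() and the LIFT_FACTOR macro are hypothetical names
for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Same numbering as the kernel's overcommit policies. */
enum overcommit_policy {
	OVERCOMMIT_GUESS,
	OVERCOMMIT_ALWAYS,
	OVERCOMMIT_NEVER,
};

#define LIFT_FACTOR 64	/* the 64X lift discussed in this series */

/*
 * Sketch: pick a batch (in pages) for the committed-memory counter.
 * The base formula and the floor are assumptions, see the lead-in.
 */
static int32_t sketch_compute_batch(uint64_t ram_pages, int32_t nr_cpus,
				    enum overcommit_policy policy)
{
	int32_t floor_batch;
	uint64_t mem_batch;

	assert(nr_cpus > 0);
	floor_batch = nr_cpus * 2 > 32 ? nr_cpus * 2 : 32;
	mem_batch = ram_pages / (uint64_t)nr_cpus / 256;

	/* Only the strict policy keeps the small, precise batch. */
	if (policy != OVERCOMMIT_NEVER)
		mem_batch *= LIFT_FACTOR;

	if (mem_batch > INT_MAX)
		mem_batch = INT_MAX;

	return (int32_t)mem_batch > floor_batch ? (int32_t)mem_batch
						: floor_batch;
}
```

For a hypothetical 16-CPU machine with 16 GiB of RAM (4194304 pages of
4 KiB), this yields a batch of 1024 pages for OVERCOMMIT_NEVER and
65536 pages otherwise. A test that maps, say, 32768 pages at a time
would exceed the small batch on every call but stay below the lifted
one, so only the lifted batch keeps it off the contended path.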

We tested 10+ platforms in 0day (server, desktop and laptop). With a
64X lift of the batch, 80%+ of the platforms show improvements; with a
16X lift, 1/3 of the platforms show improvements.

And generally it should help mmap/munmap usage, as Michal Hocko
mentioned:

: I believe that there are non-synthetic workloads which would benefit
: from a larger batch. E.g. large in memory databases which do large
: mmaps during startups from multiple threads.

Note: There are some style complaints from checkpatch for patch 3,
as the sysctl handler declaration follows the same format as its
sibling functions.

[1] https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/

patch1: a cleanup for /proc/meminfo
patch2: a preparation patch which also improves the accuracy of
        vm_memory_committed()
patch3: main change

This is against today's linux-mm git tree on github.

Please help to review, thanks!

- Feng

----------------------------------------------------------------
Changelog:

  v5:
    * rebase after 5.8-rc1
    * remove the 3/4 patch in v4, which was merged in v5.7
    * add code comments for vm_memory_committed()

  v4:
    * Remove the VM_WARN_ONCE check for vm_committed_as underflow,
      thanks to Qian Cai for finding and testing the warning

  v3:
    * refine commit log and cleanup code, according to comments
      from Michal Hocko and Matthew Wilcox
    * change the lift from 16X to 64X after testing
  
  v2:
    * add the sysctl handler to cover runtime overcommit policy
      change, as suggested by Andrew Morton
    * address the accuracy concern of vm_memory_committed()
      from Andi Kleen 

Feng Tang (3):
  proc/meminfo: avoid open coded reading of vm_committed_as
  mm/util.c: make vm_memory_committed() more accurate
  mm: adjust vm_committed_as_batch according to vm overcommit policy

 fs/proc/meminfo.c    |  2 +-
 include/linux/mm.h   |  2 ++
 include/linux/mman.h |  4 ++++
 kernel/sysctl.c      |  2 +-
 mm/mm_init.c         | 18 ++++++++++++++----
 mm/util.c            | 19 ++++++++++++++++++-
 6 files changed, 40 insertions(+), 7 deletions(-)

-- 
2.7.4


* Re: [mm] 4e2c82a409: ltp.overcommit_memory01.fail
@ 2020-07-07 11:43 Qian Cai
  2020-07-07 12:06 ` Michal Hocko
  0 siblings, 1 reply; 32+ messages in thread
From: Qian Cai @ 2020-07-07 11:43 UTC
  To: Michal Hocko
  Cc: Feng Tang, kernel test robot, Andrew Morton, Johannes Weiner,
	Matthew Wilcox, Mel Gorman, Kees Cook, Luis Chamberlain,
	Iurii Zaikin, andi.kleen, tim.c.chen, dave.hansen, ying.huang,
	linux-mm, linux-kernel, lkp



> On Jul 7, 2020, at 6:28 AM, Michal Hocko <mhocko@kernel.org> wrote:
> 
> Would you have any examples? Because I find this highly unlikely.
> OVERCOMMIT_NEVER only works when virtual memory is not largely
> overcommitted wrt real memory demand. And that tends to be more of
> an exception rather than a rule. "Modern" userspace (whatever that
> means) tends to be really hungry with virtual memory which is only used
> very sparsely.
> 
> I would argue that either somebody is running an "OVERCOMMIT_NEVER"
> friendly SW and this is a permanent setting or this is not used at all.
> At least this is my experience.
> 
> So I strongly suspect that LTP test failure is not something we should
> really lose sleep over. It would be nice to find a way to flush existing
> batches but I would rather see a real workload that would suffer from
> this imprecision.

I have heard you say many times that you don’t really care about these use cases unless you know of people actually using them in your world.

For example, last time you said the LTP oom tests are totally artificial and that you care little about whether they fail, yet I have found them very efficient at uncovering many issues, such as race conditions and bad error accumulation handling, that your “real world use cases” would take ages to flag, if they ever could.

There are just too many valid use cases in this wild world. The difference is that I admit I don’t know, or am not even aware of, all the use cases, and I don’t believe you do either.

If a patchset breaks existing behavior that is written exactly in the spec, then someone has to prove its innocence. For example, if nobody is going to rely on something like this now or in the future, then fix the spec and explain exactly what nobody should rely upon.


Thread overview: 32+ messages
2020-06-21  7:36 [PATCH v5 0/3] make vm_committed_as_batch aware of vm overcommit policy Feng Tang
2020-06-21  7:36 ` [PATCH v5 1/3] proc/meminfo: avoid open coded reading of vm_committed_as Feng Tang
2020-06-21  7:36 ` [PATCH v5 2/3] mm/util.c: make vm_memory_committed() more accurate Feng Tang
2020-06-21  7:36 ` [PATCH v5 3/3] mm: adjust vm_committed_as_batch according to vm overcommit policy Feng Tang
2020-06-22 13:25   ` [mm] 4e2c82a409: will-it-scale.per_process_ops 1894.6% improvement kernel test robot
     [not found]   ` <20200702063201.GG3874@shao2-debian>
2020-07-02  7:12     ` [mm] 4e2c82a409: ltp.overcommit_memory01.fail Feng Tang
2020-07-05  3:20       ` Qian Cai
2020-07-05  4:44       ` Feng Tang
2020-07-05 12:15         ` Qian Cai
2020-07-05 12:58           ` Feng Tang
2020-07-05 15:52             ` Qian Cai
2020-07-06  1:43               ` Feng Tang
2020-07-06  2:36                 ` Qian Cai
2020-07-06 13:24                   ` Feng Tang
2020-07-06 13:34                     ` Andi Kleen
2020-07-06 23:42                       ` Andrew Morton
2020-07-07  2:38                       ` Feng Tang
2020-07-07  4:00                         ` Huang, Ying
2020-07-07  5:41                           ` Feng Tang
2020-07-09  4:55                             ` Feng Tang
2020-07-09 13:40                               ` Qian Cai
2020-07-09 14:15                                 ` Feng Tang
2020-07-10  1:38                                   ` Feng Tang
2020-07-10  3:26                                     ` Qian Cai
2020-07-07  1:06                     ` Dennis Zhou
2020-07-07  3:24                       ` Feng Tang
2020-07-07 10:28               ` Michal Hocko
2020-06-24  9:45 ` [PATCH v5 0/3] make vm_committed_as_batch aware of vm overcommit policy Michal Hocko
2020-07-07 11:43 [mm] 4e2c82a409: ltp.overcommit_memory01.fail Qian Cai
2020-07-07 12:06 ` Michal Hocko
2020-07-07 13:04   ` Qian Cai
2020-07-07 13:56     ` Michal Hocko
