From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
Date: Thu, 10 Dec 2015 10:16:27 -0500
Message-ID: <20151210151627.GB1431@cmpxchg.org>
In-Reply-To: <20151210132833.GM19496@dhcp22.suse.cz>

On Thu, Dec 10, 2015 at 02:28:33PM +0100, Michal Hocko wrote:
> On Wed 09-12-15 14:30:38, Vladimir Davydov wrote:
> > From: Vladimir Davydov <vdavydov@virtuozzo.com>
> > Subject: [PATCH] mm: memcontrol: allow to disable kmem accounting for cgroup2
> > 
> > Kmem accounting might incur overhead that some users can't put up with.
> > Besides, the implementation is still considered unstable. So let's
> > provide a way to disable it for those users who aren't happy with it.
> 
> Yes there will be users who do not want to pay an additional overhead
> and still accomplish what they need.
> I haven't measured the overhead lately - especially after the opt-out ->
> opt-in change so it might be much lower than my previous ~5% for kbuild
> load.

I think switching from accounting *all* slab allocations to accounting
a list of, what, less than 20 select slabs, counts as a change
significant enough to entirely invalidate those measurements and never
bring up that number again in the context of kmem cost, don't you think?

There isn't that much that the kmem code is doing, but for posterity I
ran a kbuild test inside a cgroup2, with and without
cgroup.memory=nokmem, and these are the results:

default:
 Performance counter stats for 'make -j16 -s clean bzImage' (3 runs):

     715823.047005      task-clock (msec)         #    3.794 CPUs utilized          
           252,538      context-switches          #    0.353 K/sec                  
            32,018      cpu-migrations            #    0.045 K/sec                  
        16,678,202      page-faults               #    0.023 M/sec                  
 1,783,804,914,980      cycles                    #    2.492 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,346,424,021,728      instructions              #    0.75  insns per cycle        
   298,744,956,474      branches                  #  417.363 M/sec                  
    10,207,872,737      branch-misses             #    3.42% of all branches        

     188.667608149 seconds time elapsed                                          ( +-  0.66% )

cgroup.memory=nokmem:
 Performance counter stats for 'make -j16 -s clean bzImage' (3 runs):

     729028.322760      task-clock (msec)         #    3.805 CPUs utilized          
           258,775      context-switches          #    0.356 K/sec                  
            32,241      cpu-migrations            #    0.044 K/sec                  
        16,647,817      page-faults               #    0.023 M/sec                  
 1,816,827,061,194      cycles                    #    2.497 GHz                    
   <not supported>      stalled-cycles-frontend  
   <not supported>      stalled-cycles-backend   
 1,345,446,962,095      instructions              #    0.74  insns per cycle        
   298,461,034,326      branches                  #  410.277 M/sec                  
    10,215,145,963      branch-misses             #    3.42% of all branches        

     191.583957742 seconds time elapsed                                          ( +-  0.57% )

I would say the difference is solidly in the noise.
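
For anyone who wants to reproduce this, something along these lines
should do it. The cgroup2 mount point and group name below are just
examples:

  mount -t cgroup2 none /sys/fs/cgroup           # if not already mounted
  mkdir /sys/fs/cgroup/kbuild                    # example group name
  echo $$ > /sys/fs/cgroup/kbuild/cgroup.procs   # move this shell into it
  perf stat -r 3 -- make -j16 -s clean bzImage

booted once as usual and once with cgroup.memory=nokmem appended to
the kernel command line.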

I also profiled a silly find | xargs stat pipe to exercise the dentry
and inode accounting, and this was the highest kmem-specific entry:

     0.27%     0.27%  find             [kernel.kallsyms]        [k] __memcg_kmem_get_cache                    
                       |
                       ---__memcg_kmem_get_cache
                          __kmalloc
                          ext4_htree_store_dirent
                          htree_dirblock_to_tree
                          ext4_htree_fill_tree
                          ext4_readdir
                          iterate_dir
                          sys_getdents
                          entry_SYSCALL_64_fastpath
                          __getdents64
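
A rough way to repeat that one (the path and the perf options here are
just an example, not the exact invocation):

  perf record -g -- sh -c 'find / -xdev | xargs stat > /dev/null'
  perf report --children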

So can we *please* lay this whole "unreasonable burden to legacy and
power users" line of argument to rest and get on with our work? And
then tackle scalability problems as they show up in real workloads?

Thanks.
