From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Miller <davem@davemloft.net>, akpm@linux-foundation.org, vdavydov@virtuozzo.com, tj@kernel.org, netdev@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy
Date: Tue, 27 Oct 2015 17:15:54 +0100
Message-ID: <20151027161554.GJ9891@dhcp22.suse.cz>
In-Reply-To: <20151027154138.GA4665@cmpxchg.org>

On Tue 27-10-15 11:41:38, Johannes Weiner wrote:
> On Tue, Oct 27, 2015 at 01:26:47PM +0100, Michal Hocko wrote:
> > On Mon 26-10-15 12:56:19, Johannes Weiner wrote:
> > [...]
> > > Now you could argue that there might exist specialized workloads that
> > > need to account anonymous pages and page cache, but not socket memory
> > > buffers.
> >
> > Exactly, and there are loads doing this. Memcg groups are also created to
> > limit anon/page cache consumers to not affect the others running on
> > the system (basically in the root memcg context from memcg POV) which
> > don't care about tracking and they definitely do not want to pay for an
> > additional overhead. We should definitely be able to offer a global
> > disable knob for them. The same applies to kmem accounting in general.
>
> I don't see how you make such a clear distinction between, say, page
> cache and the dentry cache, and call one user memory and the other
> kernel memory.

Because the kernel memory footprint would be so small that it simply
doesn't change the picture at all. While the page cache or anonymous
memory consumption might be so large it might be disruptive. I am
talking about loads where good enough is better than "perfect" and
ephemeral global memory pressure when kmem goes over expectations is
better than a permanent cpu overhead. Whatever we do it will always be
non-zero.

Also kmem accounting will make the load more non-deterministic because
many of the resources are shared between tasks in separate cgroups
unless they are explicitly configured. E.g. [id]cache will be shared and
first to touch gets charged so you would end up with more false sharing.

Nevertheless, I do not want to shift the discussion from the topic. I
just think that one-fits-all simply won't work.

> That just doesn't make sense to me. They're both kernel
> memory allocated on behalf of the user, the only difference being that
> one is tracked on the page level and the other on the slab level, and
> we started accounting one before the other.
>
> IMO that's an implementation detail and a historical artifact that
> should not be exposed to the user. And that's the thing I hate about
> the current opt-out knob.
>
> > > I don't think there is a compelling case for an elaborate interface
> > > to make individual memory consumers configurable inside the memory
> > > controller.
> >
> > I do not think we need an elaborate interface. We just want to have
> > a global boot time knob to overwrite the default behavior. This is
> > few lines of code and it should give the sufficient flexibility.
>
> Okay, then let's add this for the socket memory to start with. I'll
> have to think more about how to distinguish the slab-based consumers.
> Or maybe you have an idea.

Isn't that as simple as enabling the jump label during the
initialization depending on the knob value? All the charging paths
should be disabled by default already.

> For now, something like this as a boot commandline?
>
> cgroup.memory=nosocket

That would work for me. I would even see a place to have
CONFIG_MEMCG_TCP_KMEM_ENABLED config option for the default and
[no]socket as a kernel parameter to override the configuration default.
This would allow distributions to define their policy without enforcing
it hard and those who compile the kernel to define their own policy.

-- 
Michal Hocko
SUSE Labs
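The knob discussed above (a build-time default that a [no]socket boot
parameter can override, with the charge path gated on a single switch)
can be illustrated with a small userspace sketch. This is not kernel
code and not the implementation from the patch series: the macro
SOCKET_ACCOUNTING_DEFAULT, the functions parse_cgroup_memory() and
charge_socket_pages(), and the plain bool standing in for a jump label
are all hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time default, standing in for the proposed
 * CONFIG_MEMCG_TCP_KMEM_ENABLED option (assumption, not a real symbol). */
#ifndef SOCKET_ACCOUNTING_DEFAULT
#define SOCKET_ACCOUNTING_DEFAULT 1
#endif

/* Userspace stand-in for a jump label: one switch that gates every
 * charge path. A real jump label patches the branch at runtime. */
static bool socket_accounting_enabled = SOCKET_ACCOUNTING_DEFAULT;

/* Parse the value of a "cgroup.memory=" style option, given as a
 * comma-separated token list. "nosocket" disables accounting, "socket"
 * enables it, unknown tokens are ignored, the last match wins. */
static void parse_cgroup_memory(const char *value)
{
	while (*value) {
		size_t len = strcspn(value, ",");

		if (len == 8 && strncmp(value, "nosocket", len) == 0)
			socket_accounting_enabled = false;
		else if (len == 6 && strncmp(value, "socket", len) == 0)
			socket_accounting_enabled = true;

		value += len;
		if (*value == ',')
			value++;
	}
}

/* Charge path: does nothing and returns false when accounting is
 * switched off, otherwise adds the pages to the counter. */
static bool charge_socket_pages(unsigned long *counter, unsigned long pages)
{
	if (!socket_accounting_enabled)
		return false;
	*counter += pages;
	return true;
}
```

With the default left at 1, calling parse_cgroup_memory("nosocket")
before the first charge turns every later charge_socket_pages() call
into a no-op, which is the behaviour the boot parameter is meant to
give workloads that do not want to pay for socket accounting.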