linux-mm.kvack.org archive mirror
From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: "David S. Miller" <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>, Tejun Heo <tj@kernel.org>,
	netdev@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8] mm: memcontrol: account socket memory in unified hierarchy
Date: Tue, 27 Oct 2015 11:43:21 +0300	[thread overview]
Message-ID: <20151027084320.GF13221@esperanza> (raw)
In-Reply-To: <20151026172216.GC2214@cmpxchg.org>

On Mon, Oct 26, 2015 at 01:22:16PM -0400, Johannes Weiner wrote:
> On Thu, Oct 22, 2015 at 09:45:10PM +0300, Vladimir Davydov wrote:
> > Hi Johannes,
> > 
> > On Thu, Oct 22, 2015 at 12:21:28AM -0400, Johannes Weiner wrote:
> > ...
> > > Patch #5 adds accounting and tracking of socket memory to the unified
> > > hierarchy memory controller, as described above. It uses the existing
> > > per-cpu charge caches and triggers high limit reclaim asynchronously.
> > > 
> > > Patch #8 uses the vmpressure extension to equalize pressure between
> > > the pages tracked natively by the VM and socket buffer pages. As the
> > > pool is shared, it makes sense that while natively tracked pages are
> > > under duress the network transmit windows are also not increased.
> > 
> > First of all, I've no experience in networking, so I'm likely to be
> > mistaken. Nevertheless I beg to disagree that this patch set is a step
> > in the right direction. Here goes why.
> > 
> > I admit that your idea to get rid of explicit tcp window control knobs
> > and size it dynamically based on memory pressure instead does sound
> > tempting, but I don't think it'd always work. The problem is that in
> > contrast to, say, dcache, we can't shrink tcp buffers AFAIU, we can only
> > stop growing them. Now suppose a system hasn't experienced memory
> > pressure for a while. If we don't have explicit tcp window limit, tcp
> > buffers on such a system might have eaten almost all available memory
> > (because of network load/problems). If a user workload that needs a
> > significant amount of memory is then started suddenly, the network code
> > will receive a notification and surely stop growing buffers, but all
> > those buffers accumulated won't disappear instantly. As a result, the
> > workload might be unable to find enough free memory and have no choice
> > but invoke OOM killer. This looks unexpected from the user POV.
> 
> I'm not getting rid of those knobs, I'm just reusing the old socket
> accounting infrastructure in an attempt to make the memory accounting
> feature useful to more people in cgroups v2 (unified hierarchy).
> 

My understanding is that, in the meantime, these patches effectively
break the existing per-memcg tcp window control logic.
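
For illustration only (plain userspace C, not kernel code, all names made
up): the problem described in the scenario quoted above — buffers whose
growth can be stopped but not reversed — can be sketched as a pool whose
charges are refused once a limit is hit, while nothing can force the
already-charged usage back down:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a pool that can refuse further growth under pressure,
 * but cannot reclaim what is already charged. Illustrative names only. */
struct buf_pool {
	size_t usage;	/* pages currently charged to socket buffers */
	size_t limit;	/* point at which further growth is refused */
};

/* Growth can be stopped... */
static bool buf_pool_try_charge(struct buf_pool *p, size_t pages)
{
	if (p->usage + pages > p->limit)
		return false;	/* refuse to grow the buffers further */
	p->usage += pages;
	return true;
}

/* ...but usage only drops when the owner releases pages voluntarily;
 * there is no shrinker that can force this to happen. */
static void buf_pool_uncharge(struct buf_pool *p, size_t pages)
{
	p->usage -= pages <= p->usage ? pages : p->usage;
}
```

Once usage has crept up during a quiet period, a newly started workload
can only prevent further growth; the pages already accumulated stay
pinned until the peers drain them.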

> We can always come back to think about per-cgroup tcp window limits in
> the unified hierarchy, my patches don't get in the way of this. I'm
> not removing the knobs in cgroups v1 and I'm not preventing them in v2.
> 
> But regardless of tcp window control, we need to account socket memory
> in the main memory accounting pool where pressure is shared (to the
> best of our abilities) between all accounted memory consumers.
> 

No objections to this point. However, I really don't like the idea of
charging the tcp window size to memory.current instead of charging the
individual pages consumed by the workload for storing socket buffers,
because it is inconsistent with what we have now. Can't we charge
individual skb pages as we do in the case of other kmem allocations?
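
A minimal sketch of the alternative suggested here — charging each skb
data page at allocation time, the way other kmem allocations are charged
(userspace C with hypothetical names; malloc() stands in for alloc_page()):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-page accounting, modeled after how other kmem
 * allocations are charged: every page backing a socket buffer is
 * charged when allocated and uncharged when freed. Not kernel code. */
struct mem_counter {
	size_t usage;	/* pages charged to this memcg */
	size_t limit;
};

static bool mem_try_charge(struct mem_counter *c, size_t pages)
{
	if (c->usage + pages > c->limit)
		return false;
	c->usage += pages;
	return true;
}

/* Allocate one page for skb data, charging it to the memcg first. */
static void *skb_page_alloc(struct mem_counter *memcg)
{
	if (!mem_try_charge(memcg, 1))
		return NULL;		/* over limit: allocation fails */
	return malloc(4096);		/* stand-in for alloc_page() */
}

/* Free the page and return its charge immediately. */
static void skb_page_free(struct mem_counter *memcg, void *page)
{
	free(page);
	memcg->usage -= 1;
}
```

The point of the comparison: usage then reflects pages that actually
exist, and releasing a buffer returns its charge at once, consistent
with how other kernel allocations are accounted.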

> From an interface standpoint alone, I don't think it's reasonable to
> ask users per default to limit different consumers on a case by case
> basis. I certainly have no problem with finetuning for scenarios you
> describe above, but with memory.current, memory.high, memory.max we
> are providing a generic interface to account and contain memory
> consumption of workloads. This has to include all major memory
> consumers to make semantical sense.

We can propose a reasonable default as we do in the global case.
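
For reference, the global defaults are derived from available memory at
boot. The sketch below reproduces their rough shape; the divisors follow
my reading of tcp_init() in kernels of this era and are an approximation,
not guaranteed to match any particular release:

```c
/* Approximate shape of the global tcp_mem default computation:
 * three thresholds in pages -- min (no pressure below this),
 * pressure (start throttling), max (hard cap). */
static void tcp_mem_defaults(unsigned long nr_pages, unsigned long mem[3])
{
	unsigned long limit = nr_pages / 16;	/* a fraction of total memory */

	if (limit < 128)
		limit = 128;			/* floor for tiny systems */

	mem[0] = limit / 4 * 3;		/* min */
	mem[1] = limit;			/* pressure */
	mem[2] = mem[0] * 2;		/* max */
}
```

With 4 GiB of 4 KiB pages (nr_pages = 1048576) this yields min = 49152,
pressure = 65536, max = 98304 pages. A per-memcg default could be
proposed along the same lines, scaled to the group's limit instead of
total memory.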

> 
> But also, there are people right now for whom the socket buffers cause
> system OOM, but the existing memcg's hard tcp window limit
> absolutely wrecks network performance for them. It's not usable
> the way it is. It'd be much better to have the socket buffers exert
> pressure on the shared pool, and then propagate the overall pressure
> back to individual consumers with reclaim, shrinkers, vmpressure etc.
> 

This might or might not work; I'm not expert enough to judge. But if you
do this only for memcg, leaving the global case as it is, networking
people won't budge IMO. So could you please start such a major rework
from the global case, i.e. try to deprecate the tcp window limits not
only in the legacy memcg hierarchy but also system-wide, in order to
attract the attention of networking experts?

Thanks,
Vladimir


Thread overview: 63+ messages
2015-10-22  4:21 [PATCH 0/8] mm: memcontrol: account socket memory in unified hierarchy Johannes Weiner
2015-10-22  4:21 ` [PATCH 1/8] mm: page_counter: let page_counter_try_charge() return bool Johannes Weiner
2015-10-23 11:31   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 2/8] mm: memcontrol: export root_mem_cgroup Johannes Weiner
2015-10-23 11:32   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 3/8] net: consolidate memcg socket buffer tracking and accounting Johannes Weiner
2015-10-22 18:46   ` Vladimir Davydov
2015-10-22 19:09     ` Johannes Weiner
2015-10-23 13:42       ` Vladimir Davydov
2015-10-23 12:38   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 4/8] mm: memcontrol: prepare for unified hierarchy socket accounting Johannes Weiner
2015-10-23 12:39   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy Johannes Weiner
2015-10-22 18:47   ` Vladimir Davydov
2015-10-23 13:19   ` Michal Hocko
2015-10-23 13:59     ` David Miller
2015-10-26 16:56       ` Johannes Weiner
2015-10-27 12:26         ` Michal Hocko
2015-10-27 13:49           ` David Miller
2015-10-27 15:41           ` Johannes Weiner
2015-10-27 16:15             ` Michal Hocko
2015-10-27 16:42               ` Johannes Weiner
2015-10-28  0:45                 ` David Miller
2015-10-28  3:05                   ` Johannes Weiner
2015-10-29 15:25                 ` Michal Hocko
2015-10-29 16:10                   ` Johannes Weiner
2015-11-04 10:42                     ` Michal Hocko
2015-11-04 19:50                       ` Johannes Weiner
2015-11-05 14:40                         ` Michal Hocko
2015-11-05 16:16                           ` David Miller
2015-11-05 16:28                             ` Michal Hocko
2015-11-05 16:30                               ` David Miller
2015-11-05 22:32                               ` Johannes Weiner
2015-11-06 12:51                                 ` Michal Hocko
2015-11-05 20:55                           ` Johannes Weiner
2015-11-05 22:52                             ` Johannes Weiner
2015-11-06 10:57                               ` Michal Hocko
2015-11-06 16:19                                 ` Johannes Weiner
2015-11-06 16:46                                   ` Michal Hocko
2015-11-06 17:45                                     ` Johannes Weiner
2015-11-07  3:45                                     ` David Miller
2015-11-12 18:36                                   ` Mel Gorman
2015-11-12 19:12                                     ` Johannes Weiner
2015-11-06  9:05                             ` Vladimir Davydov
2015-11-06 13:29                               ` Michal Hocko
2015-11-06 16:35                               ` Johannes Weiner
2015-11-06 13:21                             ` Michal Hocko
2015-10-22  4:21 ` [PATCH 6/8] mm: vmscan: simplify memcg vs. global shrinker invocation Johannes Weiner
2015-10-23 13:26   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 7/8] mm: vmscan: report vmpressure at the level of reclaim activity Johannes Weiner
2015-10-22 18:48   ` Vladimir Davydov
2015-10-23 13:49   ` Michal Hocko
2015-10-22  4:21 ` [PATCH 8/8] mm: memcontrol: hook up vmpressure to socket pressure Johannes Weiner
2015-10-22 18:57   ` Vladimir Davydov
2015-10-22 18:45 ` [PATCH 0/8] mm: memcontrol: account socket memory in unified hierarchy Vladimir Davydov
2015-10-26 17:22   ` Johannes Weiner
2015-10-27  8:43     ` Vladimir Davydov [this message]
2015-10-27 16:01       ` Johannes Weiner
2015-10-28  8:20         ` Vladimir Davydov
2015-10-28 18:58           ` Johannes Weiner
2015-10-29  9:27             ` Vladimir Davydov
2015-10-29 17:52               ` Johannes Weiner
2015-11-02 14:47                 ` Vladimir Davydov
