From: Michal Hocko <mhocko@kernel.org>
To: Roman Gushchin <guro@fb.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Christoph Lameter <cl@linux.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	kernel-team@fb.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 05/19] mm: memcontrol: decouple reference counting from page accounting
Date: Mon, 3 Aug 2020 11:00:33 +0200
Message-ID: <20200803090033.GE5174@dhcp22.suse.cz>
In-Reply-To: <20200623174037.3951353-6-guro@fb.com>

I am sorry for coming late here.

On Tue 23-06-20 10:40:23, Roman Gushchin wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> 
> The reference counting of a memcg is currently coupled directly to how
> many 4k pages are charged to it.  This doesn't work well with Roman's new
> slab controller, which maintains pools of objects and doesn't want to keep
> an extra balance sheet for the pages backing those objects.
> 
> This unusual refcounting design (reference counts usually track pointers
> to an object) is only for historical reasons: memcg used to not take any
> css references and simply stalled offlining until all charges had been
> reparented and the page counters had dropped to zero.  When we got rid of
> the reparenting requirement, the simple mechanical translation was to take
> a reference for every charge.
> 
> More historical context can be found in commit e8ea14cc6ead ("mm:
> memcontrol: take a css reference for each charged page"), commit
> 64f219938941 ("mm: memcontrol: remove obsolete kmemcg pinning tricks") and
> commit b2052564e66d ("mm: memcontrol: continue cache reclaim from offlined
> groups").
> 
> The new slab controller exposes the limitations in this scheme, so let's
> switch it to a more idiomatic reference counting model based on actual
> kernel pointers to the memcg:
> 
> - The per-cpu stock holds a reference to the memcg its caching
> 
> - User pages hold a reference for their page->mem_cgroup. Transparent
>   huge pages will no longer acquire tail references in advance, we'll
>   get them if needed during the split.
> 
> - Kernel pages hold a reference for their page->mem_cgroup
> 
> - Pages allocated in the root cgroup will acquire and release css
>   references for simplicity. css_get() and css_put() optimize that.
> 
> - The current memcg_charge_slab() already hacked around the per-charge
>   references; this change gets rid of that as well.

Just for completeness:
- TCP accounting will handle its reference in mem_cgroup_sk_{alloc,free}

As all those paths handle the reference count differently, it is
probably good to spell that out in a comment on the try_charge function:
/* Caller is responsible for holding a reference for the existence of the charged object */
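
To make the rule concrete, here is a minimal userspace sketch of the
pointer-based model. It is a toy, not kernel code: every toy_* name is
hypothetical, a plain int stands in for the css reference count, and a
long stands in for the page counter. It only shows that the charge path
does accounting while the caller takes one reference per stored pointer,
however many pages are charged.

```c
/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct toy_memcg {
	int refs;	/* stands in for the css reference count */
	long charged;	/* stands in for the page counter, in pages */
};

struct toy_page {
	struct toy_memcg *memcg;	/* stands in for page->mem_cgroup */
};

static inline void toy_get(struct toy_memcg *m)
{
	m->refs++;
}

static inline void toy_put(struct toy_memcg *m)
{
	assert(m->refs > 0);
	m->refs--;
}

/*
 * Caller is responsible for holding a reference for the existence of
 * the charged object. This function only does the accounting.
 */
static inline void toy_try_charge(struct toy_memcg *m, long nr_pages)
{
	m->charged += nr_pages;
}

/* Charge nr_pages and pin the memcg once for the pointer stored in the page. */
static inline void toy_commit_charge(struct toy_page *page,
				     struct toy_memcg *m, long nr_pages)
{
	toy_try_charge(m, nr_pages);
	toy_get(m);		/* one reference per pointer, not per page */
	page->memcg = m;
}

/* Undo the accounting, clear the pointer, then drop its reference. */
static inline void toy_uncharge(struct toy_page *page, long nr_pages)
{
	struct toy_memcg *m = page->memcg;

	m->charged -= nr_pages;
	page->memcg = NULL;
	toy_put(m);
}
```

In this sketch, charging a 512-page object raises the counter by 512 but
the reference count by only one, which is the decoupling the patch
description is after.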

We will need to be more careful (see e.g. http://lkml.kernel.org/r/alpine.LSU.2.11.2007302011450.2347@eggly.anvils),
but considering that the old model doesn't fit the new slab accounting,
as mentioned above, this is not really something terrible to live with.

[...]
> @@ -5456,7 +5460,10 @@ static int mem_cgroup_move_account(struct page *page,
>  	 */
>  	smp_mb();
>  
> -	page->mem_cgroup = to; 	/* caller should have done css_get */
> +	css_get(&to->css);
> +	css_put(&from->css);
> +
> +	page->mem_cgroup = to;
>  
>  	__unlock_page_memcg(from);

What prevents the "from" memcg from being released here?
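
To illustrate the concern, here is a toy userspace model of the ordering
in the hunk above. It is not kernel code: the toy_* names, the "released"
flag and the "locked" flag are all hypothetical stand-ins. It assumes
nothing about what else may pin the source memcg in the real caller; it
only shows that if the page's reference happened to be the last one, the
unlock would touch an object that has already been released.

```c
/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_memcg {
	int refs;	/* stands in for the css reference count */
	bool released;	/* set when the last reference is dropped */
	bool locked;	/* stands in for the page memcg lock */
};

struct toy_page {
	struct toy_memcg *memcg;	/* stands in for page->mem_cgroup */
};

static inline void toy_get(struct toy_memcg *m)
{
	m->refs++;
}

static inline void toy_put(struct toy_memcg *m)
{
	assert(m->refs > 0);
	if (--m->refs == 0)
		m->released = true;	/* real code could free the object here */
}

/*
 * Same order as the quoted hunk: get "to", put "from", switch the
 * pointer, then unlock "from". Returns true if "from" had already been
 * released by the time the unlock step touched it.
 */
static inline bool toy_move_account(struct toy_page *page,
				    struct toy_memcg *from,
				    struct toy_memcg *to)
{
	bool stale;

	from->locked = true;

	toy_get(to);
	toy_put(from);		/* may drop the last reference */

	page->memcg = to;

	stale = from->released;	/* the unlock below still uses "from" */
	from->locked = false;
	return stale;
}
```

In this model the move is only safe when something other than the page
still holds a reference on the source, so the question is what provides
that guarantee in the real caller.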

-- 
Michal Hocko
SUSE Labs
