patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Pekka Enberg <penberg@kernel.org>, <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>, <patches@lists.linux.dev>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	<cgroups@vger.kernel.org>
Subject: Re: [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab
Date: Wed, 5 Jan 2022 19:36:10 -0800	[thread overview]
Message-ID: <YdZjqhGqVH38WJIw@carbon.dhcp.thefacebook.com> (raw)
In-Reply-To: <56c7a92b-f830-93dd-3fd0-4b6f1eb723ef@suse.cz>

On Wed, Jan 05, 2022 at 06:08:45PM +0100, Vlastimil Babka wrote:
> On 1/5/22 03:41, Roman Gushchin wrote:
> > On Tue, Jan 04, 2022 at 01:10:37AM +0100, Vlastimil Babka wrote:
> >> page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages
> >> so convert all the related infrastructure to struct slab. Also use
> >> struct folio instead of struct page when resolving object pointers.
> >> 
> >> This is not just mechanistic changing of types and names. Now in
> >> mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret
> >> the folio as a real slab instead of a large kmalloc, instead of relying
> >> on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check().
> >> Similarly in memcg_slab_free_hook() where we can encounter
> >> kmalloc_large() pages (here the folio slab flag check is implied by
> >> virt_to_slab()). As a result, page_objcgs_check() can be dropped instead
> >> of converted.
> >> 
> >> To avoid include cycles, move the inline definition of slab_objcgs()
> >> from memcontrol.h to mm/slab.h.
> >> 
> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >> Cc: Johannes Weiner <hannes@cmpxchg.org>
> >> Cc: Michal Hocko <mhocko@kernel.org>
> >> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> >> Cc: <cgroups@vger.kernel.org>
> >>  	/*
> >>  	 * Slab objects are accounted individually, not per-page.
> >>  	 * Memcg membership data for each individual object is saved in
> >>  	 * the page->obj_cgroups.
> >                ^^^^^^^^^^^^^^^^^
> > 	       slab->memcg_data
> 
> Good catch, fixed.
>  
> >>  	 */
> >> -	if (page_objcgs_check(page)) {
> >> -		struct obj_cgroup *objcg;
> >> +	if (folio_test_slab(folio)) {
> >> +		struct obj_cgroup **objcgs;
> >> +		struct slab *slab;
> >>  		unsigned int off;
> >>  
> >> -		off = obj_to_index(page->slab_cache, page_slab(page), p);
> >> -		objcg = page_objcgs(page)[off];
> >> -		if (objcg)
> >> -			return obj_cgroup_memcg(objcg);
> >> +		slab = folio_slab(folio);
> >> +		objcgs = slab_objcgs(slab);
> >> +		if (!objcgs)
> >> +			return NULL;
> >> +
> >> +		off = obj_to_index(slab->slab_cache, slab, p);
> >> +		if (objcgs[off])
> >> +			return obj_cgroup_memcg(objcgs[off]);
> >>  
> >>  		return NULL;
> >>  	}
> > 
> > There is a comment below, which needs some changes:
> > 	/*
> > 	 * page_memcg_check() is used here, because page_has_obj_cgroups()
> > 	 * check above could fail because the object cgroups vector wasn't set
> > 	 * at that moment, but it can be set concurrently.
> > 	 * page_memcg_check(page) will guarantee that a proper memory
> > 	 * cgroup pointer or NULL will be returned.
> > 	 */
> > 
> > In reality the folio's slab flag can be cleared before releasing the objcgs \
> > vector. It seems that there is no such possibility at setting the flag,
> > it's always set before allocating and assigning the objcg vector.
> 
> You're right. I'm changing it to:
> 
>          * page_memcg_check() is used here, because in theory we can encounter
>          * a folio where the slab flag has been cleared already, but
>          * slab->memcg_data has not been freed yet
>          * page_memcg_check(page) will guarantee that a proper memory
>          * cgroup pointer or NULL will be returned.
> 
> I wrote "in theory" because AFAICS it implies a race as we would have to be
> freeing a slab and at the same time query an object address. We probably
> could have used the non-check version, but at this point I don't want to
> make any functional changes besides these comment fixes.

Sounds good to me.

> 
> I assume your patch on top would cover it?

I tried to master it and remembered why we have this bit in place: there is
a /proc/kpagecgroup interface which just scans over pages and reads their
memcg data. It has zero control over the lifetime of pages, so it's prone
to all kinds of races with setting and clearing the slab flag. So it's
probably better to leave the MEMCG_DATA_OBJCGS bit in place.

> 
> >> @@ -2896,7 +2901,7 @@ struct mem_cgroup *mem_cgroup_from_obj(void *p)
> >>  	 * page_memcg_check(page) will guarantee that a proper memory
> >>  	 * cgroup pointer or NULL will be returned.
> >>  	 */
> >> -	return page_memcg_check(page);
> >> +	return page_memcg_check(folio_page(folio, 0));
> >>  }
> >>  
> >>  __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
> >> diff --git a/mm/slab.h b/mm/slab.h
> >> index bca9181e96d7..36e0022d8267 100644
> >> --- a/mm/slab.h
> >> +++ b/mm/slab.h
> >> @@ -412,15 +412,36 @@ static inline bool kmem_cache_debug_flags(struct kmem_cache *s, slab_flags_t fla
> >>  }
> >>  
> >>  #ifdef CONFIG_MEMCG_KMEM
> >> -int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s,
> >> -				 gfp_t gfp, bool new_page);
> >> +/*
> >> + * slab_objcgs - get the object cgroups vector associated with a slab
> >> + * @slab: a pointer to the slab struct
> >> + *
> >> + * Returns a pointer to the object cgroups vector associated with the slab,
> >> + * or NULL. This function assumes that the slab is known to have an
> >> + * associated object cgroups vector. It's not safe to call this function
> >> + * against slabs with underlying pages, which might have an associated memory
> >> + * cgroup: e.g.  kernel stack pages.
> > 
> > Hm, is it still true? I don't think so. It must be safe to call it for any
> > slab now.
> 
> Right, forgot to update after removing the _check variant.
> Changing to:
> 
>   * Returns a pointer to the object cgroups vector associated with the slab,
>   * or NULL if no such vector has been associated yet.

Perfect!

Thanks!

  reply	other threads:[~2022-01-06  3:36 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-04  0:10 [PATCH v4 00/32] Separate struct slab from struct page Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 01/32] mm: add virt_to_folio() and folio_address() Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 02/32] mm/slab: Dissolve slab_map_pages() in its caller Vlastimil Babka
2022-01-06  6:40   ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 03/32] mm/slub: Make object_err() static Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 04/32] mm: Split slab into its own type Vlastimil Babka
2022-01-06 11:54   ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 05/32] mm: Convert [un]account_slab_page() to struct slab Vlastimil Babka
2022-01-06 13:04   ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 06/32] mm: Convert virt_to_cache() to use " Vlastimil Babka
2022-01-06  6:44   ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 07/32] mm: Convert __ksize() to " Vlastimil Babka
2022-01-06 13:42   ` Hyeonggon Yoo
2022-01-06 17:26     ` Vlastimil Babka
2022-01-08  6:21       ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 08/32] mm: Use struct slab in kmem_obj_info() Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 09/32] mm: Convert check_heap_object() to use struct slab Vlastimil Babka
2022-01-06 13:56   ` Hyeonggon Yoo
2022-01-04  0:10 ` [PATCH v4 10/32] mm/slub: Convert detached_freelist to use a " Vlastimil Babka
2022-01-05  0:58   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 11/32] mm/slub: Convert kfree() " Vlastimil Babka
2022-01-05  1:00   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 12/32] mm/slub: Convert __slab_lock() and __slab_unlock() to " Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 13/32] mm/slub: Convert print_page_info() to print_slab_info() Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 14/32] mm/slub: Convert alloc_slab_page() to return a struct slab Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 15/32] mm/slub: Convert __free_slab() to use " Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 16/32] mm/slub: Convert pfmemalloc_match() to take a " Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 17/32] mm/slub: Convert most struct page to struct slab by spatch Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 18/32] mm/slub: Finish struct page to struct slab conversion Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 19/32] mm/slab: Convert kmem_getpages() and kmem_freepages() to struct slab Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 20/32] mm/slab: Convert most struct page to struct slab by spatch Vlastimil Babka
2022-01-05  1:52   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 21/32] mm/slab: Finish struct page to struct slab conversion Vlastimil Babka
2022-01-05  2:05   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 22/32] mm: Convert struct page to struct slab in functions used by other subsystems Vlastimil Babka
2022-01-05  2:12   ` Roman Gushchin
2022-01-05 16:39     ` Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 23/32] mm/memcg: Convert slab objcgs from struct page to struct slab Vlastimil Babka
2022-01-05  2:41   ` Roman Gushchin
2022-01-05 17:08     ` Vlastimil Babka
2022-01-06  3:36       ` Roman Gushchin [this message]
2022-01-05  2:55   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 24/32] mm/slob: Convert SLOB to use struct slab and struct folio Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 25/32] mm/kasan: Convert to struct folio and struct slab Vlastimil Babka
2022-01-06  4:06   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 26/32] mm/kfence: Convert kfence_guarded_alloc() to " Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 27/32] mm/sl*b: Differentiate struct slab fields by sl*b implementations Vlastimil Babka
2022-01-06  4:12   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 28/32] mm/slub: Simplify struct slab slabs field definition Vlastimil Babka
2022-01-06  4:13   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 29/32] mm/slub: Define struct slab fields for CONFIG_SLUB_CPU_PARTIAL only when enabled Vlastimil Babka
2022-01-06  4:16   ` Roman Gushchin
2022-01-04  0:10 ` [PATCH v4 30/32] zsmalloc: Stop using slab fields in struct page Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 31/32] bootmem: Use page->index instead of page->freelist Vlastimil Babka
2022-01-04  0:10 ` [PATCH v4 32/32] mm/slob: Remove unnecessary page_mapcount_reset() function call Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YdZjqhGqVH38WJIw@carbon.dhcp.thefacebook.com \
    --to=guro@fb.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=cl@linux.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=vbabka@suse.cz \
    --cc=vdavydov.dev@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).