From: Roman Gushchin <guro@fb.com>
To: "Tobin C. Harding" <me@tobin.cc>
Cc: "Tobin C. Harding" <tobin@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <willy@infradead.org>,
"Alexander Viro" <viro@ftp.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
"Pekka Enberg" <penberg@cs.helsinki.fi>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Christopher Lameter <cl@linux.com>,
Miklos Szeredi <mszeredi@redhat.com>,
Andreas Dilger <adilger@dilger.ca>,
Waiman Long <longman@redhat.com>, Tycho Andersen <tycho@tycho.ws>,
"Theodore Ts'o" <tytso@mit.edu>, Andi Kleen <ak@linux.intel.com>,
David Chinner <david@fromorbit.com>,
Nick Piggin <npiggin@gmail.com>, Rik van Riel <riel@redhat.com>,
Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH v5 04/16] slub: Slab defrag core
Date: Tue, 21 May 2019 01:25:34 +0000
Message-ID: <20190521012525.GA15348@tower.DHCP.thefacebook.com>
In-Reply-To: <20190521011525.GA25898@eros.localdomain>
On Tue, May 21, 2019 at 11:15:25AM +1000, Tobin C. Harding wrote:
> On Tue, May 21, 2019 at 12:51:57AM +0000, Roman Gushchin wrote:
> > On Mon, May 20, 2019 at 03:40:05PM +1000, Tobin C. Harding wrote:
> > > Internal fragmentation can occur within pages used by the slub
> > > allocator. Under some workloads large numbers of pages can be used by
> > > partial slab pages. This under-utilisation is bad simply because it
> > > wastes memory but also because if the system is under memory pressure
> > > higher order allocations may become difficult to satisfy. If we can
> > > defrag slab caches we can alleviate these problems.
> > >
> > > Implement Slab Movable Objects in order to defragment slab caches.
> > >
> > > Slab defragmentation may occur:
> > >
> > > 1. Unconditionally when __kmem_cache_shrink() is called on a slab cache
> > > by the kernel calling kmem_cache_shrink().
> > >
> > > 2. Unconditionally through the use of the slabinfo command.
> > >
> > > 	slabinfo <cache> -s
> > >
> > > 3. Conditionally via the use of kmem_cache_defrag()
> > >
> > > - Use Slab Movable Objects when shrinking cache.
> > >
> > > Currently when the kernel calls kmem_cache_shrink() we curate the
> > > partial slabs list. If object migration is not enabled for the cache we
> > > still do this; if, however, SMO is enabled, we attempt to move objects in
> > > partially full slabs in order to defragment the cache. Shrink attempts
> > > to move all objects in order to reduce the cache to a single partial
> > > slab for each node.
> > >
> > > - Add conditional per node defrag via new function:
> > >
> > > 	kmem_defrag_slabs(int node).
> > >
> > > kmem_defrag_slabs() attempts to defragment all slab caches for
> > > node. Defragmentation is done conditionally dependent on MAX_PARTIAL
> > > _and_ defrag_used_ratio.
> > >
> > > Caches are only considered for defragmentation if the number of
> > > partial slabs exceeds MAX_PARTIAL (per node).
> > >
> > > Also, defragmentation only occurs if the usage ratio of the slab is
> > > lower than the configured percentage (sysfs field added in this
> > > patch). Fragmentation ratios are measured by calculating the
> > > percentage of objects in use compared to the total number of objects
> > > that the slab page can accommodate.
> > >
> > > The scanning of slab caches is optimized because the defragmentable
> > > slabs come first on the list. Thus we can terminate scans on the
> > > first slab encountered that does not support defragmentation.
> > >
> > > kmem_defrag_slabs() takes a node parameter. This can either be -1 if
> > > defragmentation should be performed on all nodes, or a node number.
> > >
> > > Defragmentation may be disabled by setting defrag ratio to 0
> > >
> > > 	echo 0 > /sys/kernel/slab/<cache>/defrag_used_ratio
> > >
> > > - Add a defrag ratio sysfs field and set it to 30% by default. A limit
> > > of 30% specifies that more than 3 out of 10 available object slots
> > > must be in use; otherwise slab defragmentation will be attempted on
> > > the remaining objects.
> > >
> > > In order for a cache to be defragmentable the cache must support object
> > > migration (SMO). Enabling SMO for a cache is done via a call to the
> > > recently added function:
> > >
> > > 	void kmem_cache_setup_mobility(struct kmem_cache *,
> > > 				       kmem_cache_isolate_func,
> > > 				       kmem_cache_migrate_func);
> > >
> > > Co-developed-by: Christoph Lameter <cl@linux.com>
> > > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > > ---
> > > Documentation/ABI/testing/sysfs-kernel-slab | 14 +
> > > include/linux/slab.h | 1 +
> > > include/linux/slub_def.h | 7 +
> > > mm/slub.c | 385 ++++++++++++++++----
> > > 4 files changed, 334 insertions(+), 73 deletions(-)
> >
> > Hi Tobin!
> >
> > Overall this looks very good to me! I'll take another look when you post
> > a non-RFC version, but so far I can't find any issues.
>
> Thanks for the reviews.
>
> > A generic question: as I understand it, you only support root kmemcaches
> > for now. Is kmemcg support planned?
>
> I know very little about cgroups, I have no plans for this work.
> However, I'm not the architect behind this - Christoph is guiding the
> direction on this one. Perhaps he will comment.
>
> > Without it the patchset isn't as attractive as it could be to anyone
> > using cgroups. Also, I hope it can solve (or mitigate) the memcg-specific
> > problem of scattering the vfs cache working set over multiple generations
> > of the same cgroup (their kmem_caches).
>
> I'm keen to work on anything that makes this more useful so I'll do some
> research. Thanks for the idea.

You're welcome! I'm happy to help, or even to do it myself, once
your patches are merged.

Thanks!
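
For readers following the quoted patch description, the two conditions
it gives for conditional defragmentation (more partial slabs than
MAX_PARTIAL on the node, and a slab usage ratio below defrag_used_ratio)
can be sketched in plain userspace C. This is only an illustration, not
the code from mm/slub.c: the names SKETCH_MAX_PARTIAL,
slab_used_percent(), cache_is_candidate() and slab_needs_defrag() are
hypothetical, the value 10 is an arbitrary stand-in, and the strict
"lower than" comparison is an assumption taken from the wording of the
description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-node MAX_PARTIAL limit (value assumed). */
#define SKETCH_MAX_PARTIAL 10

/* Percentage (0..100) of object slots in use in one slab page. */
static unsigned int slab_used_percent(unsigned int inuse, unsigned int objects)
{
	assert(inuse <= objects);
	if (objects == 0)
		return 100;	/* no slots at all: nothing to move */
	return inuse * 100 / objects;
}

/* A cache is only considered if it has more partial slabs than the limit. */
static bool cache_is_candidate(unsigned long nr_partial)
{
	return nr_partial > SKETCH_MAX_PARTIAL;
}

/*
 * A slab is defragmented only if its usage is strictly below the
 * configured ratio (assumption). A ratio of 0 therefore never matches,
 * which is how "echo 0 > defrag_used_ratio" disables defragmentation
 * in this sketch.
 */
static bool slab_needs_defrag(unsigned int inuse, unsigned int objects,
			      unsigned int defrag_used_ratio)
{
	return slab_used_percent(inuse, objects) < defrag_used_ratio;
}
```

With the default ratio of 30, a slab with 2 of 10 slots in use would be
selected under these assumptions, while one with 3 of 10 in use would
not.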