From: Jay Patel <jaypatel@linux.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>, linux-mm@kvack.org
Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com,
piyushs@linux.ibm.com
Subject: Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
Date: Thu, 20 Jul 2023 16:00:56 +0530 [thread overview]
Message-ID: <d841547b0bca28ee1ee7dd3b4dfc6a6dfa403755.camel@linux.ibm.com> (raw)
In-Reply-To: <a3bbb264-6d04-6917-f9b6-eade87a50707@suse.cz>
On Wed, 2023-07-12 at 15:06 +0200, Vlastimil Babka wrote:
> On 6/28/23 11:57, Jay Patel wrote:
> > In the previous version [1], we were able to reduce slub memory
> > wastage, but total memory consumption was also increasing, so to
> > solve this problem I have modified the patch as follows:
> >
> > 1) If min_objects * object_size > (PAGE_SIZE <<
> > PAGE_ALLOC_COSTLY_ORDER), then return PAGE_ALLOC_COSTLY_ORDER.
> > 2) Similarly, if min_objects * object_size <= PAGE_SIZE, then
> > return slub_min_order.
> > 3) Additionally, I changed slub_max_order to 2. There is no specific
> > reason for using the value 2, but it provided the best results in
> > terms of performance without any noticeable impact.
> >
> > [1]
> >
>
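The three rules above reduce to two early exits plus a capped search. A minimal userspace sketch, where the constants and the helper name order_shortcut() are illustrative assumptions rather than kernel code:

```c
#include <assert.h>

/* Illustrative values for a 4K-page system; real kernel values vary. */
#define SKETCH_PAGE_SIZE               4096u
#define SKETCH_PAGE_ALLOC_COSTLY_ORDER 3u
#define SKETCH_SLUB_MIN_ORDER          0u

/*
 * Hypothetical helper mirroring the two early returns described above:
 * cap the order at PAGE_ALLOC_COSTLY_ORDER when the minimum working set
 * of objects is large, fall back to slub_min_order when it fits in a
 * single page. Returns -1 when neither shortcut applies and the full
 * order search must run.
 */
static int order_shortcut(unsigned int min_objects, unsigned int size)
{
	if (min_objects * size > (SKETCH_PAGE_SIZE << SKETCH_PAGE_ALLOC_COSTLY_ORDER))
		return (int)SKETCH_PAGE_ALLOC_COSTLY_ORDER;
	if (min_objects * size <= SKETCH_PAGE_SIZE)
		return (int)SKETCH_SLUB_MIN_ORDER;
	return -1; /* no shortcut; run the calc_slab_order() loop */
}
```

For example, 16 objects of 4096 bytes exceed the costly-order budget and get capped, while 4 objects of 1024 bytes fit in one page and take the minimum order.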
> Hi,
>
> thanks for the v2. A process note: the changelog should be
> self-contained, as it will become the commit description in git log.
> What this would mean here is to take the v1 changelog, adjust the
> description to how v2 is implemented, and of course replace the v1
> measurements with new ones.
>
> The "what changed since v1" can be summarized in the area after the
> sign-off and "---", before the diffstat. This helps those who looked
> at v1 previously, but doesn't become part of the git log.
>
> Now, my impression is that v1 made a sensible tradeoff for 4K pages:
> the wastage was reduced, yet overall slab consumption didn't increase
> much. But for 64K the tradeoff looked rather bad. I think it's because
> with 64K pages and a certain object size you can e.g. get less waste
> with order-3 than order-2, but the difference will be a relatively
> tiny part of the 64KB, so it's not worth the increase in order, while
> with 4KB you can get a larger reduction of waste, both in absolute
> amount and especially relative to the 4KB page size.
>
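The effect described here is easy to check with arithmetic. A toy helper, where the 700-byte object size used below is just an example, not a size from the patch:

```c
#include <assert.h>

/*
 * Leftover bytes in an order-N slab for a given object size. Page size
 * is a parameter so the same sketch covers 4K and 64K systems.
 */
static unsigned int slab_waste(unsigned int page_size, unsigned int order,
			       unsigned int size)
{
	unsigned int slab_size = page_size << order;

	return slab_size % size;
}
```

For a hypothetical 700-byte object, a 4K order-0 slab wastes 596 of 4096 bytes (~15%), while an order-2 slab wastes only 284 of 16384 (~1.7%), so raising the order pays off. A 64K order-0 slab already wastes just 436 of 65536 bytes (~0.7%), so a higher order buys almost nothing.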
> So I think ideally the calculation would somehow take this into
> account. The changes done in v2 as described above are different. It
> seems as a result we can now calculate lower orders on 4K systems than
> before the patch, probably due to conditions 2) or 3)? I think it
> would be best if the patch resulted only in the same or higher order.
> It should be enough to tweak some thresholds for when it makes sense
> to pay the price of a higher order - whether the reduction of wastage
> is worth it, in a way that takes the page size into account.
>
> Thanks,
> Vlastimil
Hi Vlastimil,
Indeed, I aim to optimize memory allocation in the SLUB allocator [1]
by targeting larger page sizes with minimal modifications, resulting in
reduced memory consumption.

[1] https://lore.kernel.org/linux-mm/20230720102337.2069722-1-jaypatel@linux.ibm.com/

Thanks,
Jay Patel
>
> > I have conducted tests on systems with 160 CPUs and 16 CPUs using
> > 4K and 64K page sizes. The tests showed that the patch successfully
> > reduces both total slab memory and wastage without any noticeable
> > performance degradation in the hackbench test.
> >
> > Test Results are as follows:
> > 1) On 160 CPUs with 4K Page size
> >
> > +----------------+----------------+----------------+
> > > Total wastage in slub memory |
> > +----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 2090 Kb | 3204 Kb |
> > > With Patch | 1825 Kb | 3088 Kb |
> > > Wastage reduce | ~12% | ~4% |
> > +----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > > Total slub memory |
> > +-----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 500572 | 713568 |
> > > With Patch | 482036 | 688312 |
> > > Memory reduce | ~4% | ~3% |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > > | Normal |With Patch| |
> > +-------+-----+----------+----------+-----------+
> > > Amean | 1 | 1.3237 | 1.2737 | ( 3.78%) |
> > > Amean | 4 | 1.5923 | 1.6023 | ( -0.63%) |
> > > Amean | 7 | 2.3727 | 2.4260 | ( -2.25%) |
> > > Amean | 12 | 3.9813 | 4.1290 | ( -3.71%) |
> > > Amean | 21 | 6.9680 | 7.0630 | ( -1.36%) |
> > > Amean | 30 | 10.1480 | 10.2170 | ( -0.68%) |
> > > Amean | 48 | 16.7793 | 16.8780 | ( -0.59%) |
> > > Amean | 79 | 28.9537 | 28.8187 | ( 0.47%) |
> > > Amean | 110 | 39.5507 | 40.0157 | ( -1.18%) |
> > > Amean | 141 | 51.5670 | 51.8200 | ( -0.49%) |
> > > Amean | 172 | 62.8710 | 63.2540 | ( -0.61%) |
> > > Amean | 203 | 74.6417 | 75.2520 | ( -0.82%) |
> > > Amean | 234 | 86.0853 | 86.5653 | ( -0.56%) |
> > > Amean | 265 | 97.9203 | 98.4617 | ( -0.55%) |
> > > Amean | 296 | 108.6243 | 109.8770 | ( -1.15%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 2) On 160 CPUs with 64K Page size
> > +-----------------+----------------+----------------+
> > > Total wastage in slub memory |
> > +-----------------+----------------+----------------+
> > > | After Boot |After Hackbench |
> > > Normal | 919 Kb | 1880 Kb |
> > > With Patch | 807 Kb | 1684 Kb |
> > > Wastage reduce | ~12% | ~10% |
> > +-----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > > Total slub memory |
> > +-----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 1862592 | 3023744 |
> > > With Patch | 1644416 | 2675776 |
> > > Memory reduce | ~12% | ~11% |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > > | Normal |With Patch| |
> > +-------+-----+----------+----------+-----------+
> > > Amean | 1 | 1.2547 | 1.2677 | ( -1.04%) |
> > > Amean | 4 | 1.5523 | 1.5783 | ( -1.67%) |
> > > Amean | 7 | 2.4157 | 2.3883 | ( 1.13%) |
> > > Amean | 12 | 3.9807 | 3.9793 | ( 0.03%) |
> > > Amean | 21 | 6.9687 | 6.9703 | ( -0.02%) |
> > > Amean | 30 | 10.1403 | 10.1297 | ( 0.11%) |
> > > Amean | 48 | 16.7477 | 16.6893 | ( 0.35%) |
> > > Amean | 79 | 27.9510 | 28.0463 | ( -0.34%) |
> > > Amean | 110 | 39.6833 | 39.5687 | ( 0.29%) |
> > > Amean | 141 | 51.5673 | 51.4477 | ( 0.23%) |
> > > Amean | 172 | 62.9643 | 63.1647 | ( -0.32%) |
> > > Amean | 203 | 74.6220 | 73.7900 | ( 1.11%) |
> > > Amean | 234 | 85.1783 | 85.3420 | ( -0.19%) |
> > > Amean | 265 | 96.6627 | 96.7903 | ( -0.13%) |
> > > Amean | 296 | 108.2543 | 108.2253 | ( 0.03%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 3) On 16 CPUs with 4K Page size
> > +-----------------+----------------+------------------+
> > > Total wastage in slub memory |
> > +-----------------+----------------+------------------+
> > > | After Boot | After Hackbench |
> > > Normal | 491 Kb | 727 Kb |
> > > With Patch | 483 Kb | 670 Kb |
> > > Wastage reduce | ~1% | ~8% |
> > +-----------------+----------------+------------------+
> >
> > +-----------------+----------------+----------------+
> > > Total slub memory |
> > +-----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 105340 | 153116 |
> > > With Patch | 103620 | 147412 |
> > > Memory reduce | ~1.6% | ~4% |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+-----+----------+----------+-----------+
> > > | Normal |With Patch| |
> > +-------+-----+----------+----------+-----------+
> > > Amean | 1 | 1.0963 | 1.1070 | ( -0.97%) |
> > > Amean | 4 | 3.7963 | 3.7957 | ( 0.02%) |
> > > Amean | 7 | 6.5947 | 6.6017 | ( -0.11%) |
> > > Amean | 12 | 11.1993 | 11.1730 | ( 0.24%) |
> > > Amean | 21 | 19.4097 | 19.3647 | ( 0.23%) |
> > > Amean | 30 | 27.7023 | 27.6040 | ( 0.35%) |
> > > Amean | 48 | 44.1287 | 43.9630 | ( 0.38%) |
> > > Amean | 64 | 58.8147 | 58.5753 | ( 0.41%) |
> > +-------+-----+----------+----------+-----------+
> >
> > 4) On 16 CPUs with 64K Page size
> > +----------------+----------------+----------------+
> > > Total wastage in slub memory |
> > +----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 194 Kb | 349 Kb |
> > > With Patch | 191 Kb | 344 Kb |
> > > Wastage reduce | ~1% | ~1% |
> > +----------------+----------------+----------------+
> >
> > +-----------------+----------------+----------------+
> > > Total slub memory |
> > +-----------------+----------------+----------------+
> > > | After Boot | After Hackbench|
> > > Normal | 330304 | 472960 |
> > > With Patch | 319808 | 458944 |
> > > Memory reduce | ~3% | ~3% |
> > +-----------------+----------------+----------------+
> >
> > hackbench-process-sockets
> > +-------+----+----------+----------+----------+
> > > | Normal |With Patch| |
> > +-------+----+----------+----------+----------+
> > > Amean | 1 | 1.9030 | 1.8967 | ( 0.33%) |
> > > Amean | 4 | 7.2117 | 7.1283 | ( 1.16%) |
> > > Amean | 7 | 12.5247 | 12.3460 | ( 1.43%) |
> > > Amean | 12 | 21.7157 | 21.4753 | ( 1.11%) |
> > > Amean | 21 | 38.2693 | 37.6670 | ( 1.57%) |
> > > Amean | 30 | 54.5930 | 53.8657 | ( 1.33%) |
> > > Amean | 48 | 87.6700 | 86.3690 | ( 1.48%) |
> > > Amean | 64 | 117.1227 | 115.4893 | ( 1.39%) |
> > +-------+----+----------+----------+----------+
> >
> > Signed-off-by: Jay Patel <jaypatel@linux.ibm.com>
> > ---
> > mm/slub.c | 52 +++++++++++++++++++++++++---------------------------
> > 1 file changed, 25 insertions(+), 27 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index c87628cd8a9a..0a1090c528da 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4058,7 +4058,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
> > */
> > static unsigned int slub_min_order;
> > static unsigned int slub_max_order =
> > - IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : PAGE_ALLOC_COSTLY_ORDER;
> > + IS_ENABLED(CONFIG_SLUB_TINY) ? 1 : 2;
> > static unsigned int slub_min_objects;
> >
> > /*
> > @@ -4087,11 +4087,10 @@ static unsigned int slub_min_objects;
> > * the smallest order which will fit the object.
> > */
> > static inline unsigned int calc_slab_order(unsigned int size,
> > - unsigned int min_objects, unsigned int max_order,
> > - unsigned int fract_leftover)
> > + unsigned int min_objects, unsigned int max_order)
> > {
> > unsigned int min_order = slub_min_order;
> > - unsigned int order;
> > + unsigned int order, min_wastage = size, min_wastage_order = MAX_ORDER+1;
> >
> > if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> > return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> > @@ -4104,11 +4103,17 @@ static inline unsigned int calc_slab_order(unsigned int size,
> >
> > rem = slab_size % size;
> >
> > - if (rem <= slab_size / fract_leftover)
> > - break;
> > + if (rem < min_wastage) {
> > + min_wastage = rem;
> > + min_wastage_order = order;
> > + }
> > }
> >
> > - return order;
> > + if (min_wastage_order <= slub_max_order)
> > + return min_wastage_order;
> > + else
> > + return order;
> > +
> > }
> >
> > static inline int calculate_order(unsigned int size)
> > @@ -4142,35 +4147,28 @@ static inline int calculate_order(unsigned int size)
> > nr_cpus = nr_cpu_ids;
> > min_objects = 4 * (fls(nr_cpus) + 1);
> > }
> > +
> > + if ((min_objects * size) > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > + return PAGE_ALLOC_COSTLY_ORDER;
> > +
> > + if ((min_objects * size) <= PAGE_SIZE)
> > + return slub_min_order;
> > +
> > max_objects = order_objects(slub_max_order, size);
> > min_objects = min(min_objects, max_objects);
> >
> > - while (min_objects > 1) {
> > - unsigned int fraction;
> > -
> > - fraction = 16;
> > - while (fraction >= 4) {
> > - order = calc_slab_order(size, min_objects,
> > - slub_max_order, fraction);
> > - if (order <= slub_max_order)
> > - return order;
> > - fraction /= 2;
> > - }
> > + while (min_objects >= 1) {
> > + order = calc_slab_order(size, min_objects,
> > + slub_max_order);
> > + if (order <= slub_max_order)
> > + return order;
> > min_objects--;
> > }
> >
> > - /*
> > - * We were unable to place multiple objects in a slab. Now
> > - * lets see if we can place a single object there.
> > - */
> > - order = calc_slab_order(size, 1, slub_max_order, 1);
> > - if (order <= slub_max_order)
> > - return order;
> > -
> > /*
> > * Doh this slab cannot be placed using slub_max_order.
> > */
> > - order = calc_slab_order(size, 1, MAX_ORDER, 1);
> > + order = calc_slab_order(size, 1, MAX_ORDER);
> > if (order <= MAX_ORDER)
> > return order;
> > return -ENOSYS;
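Outside the kernel, the patch's reworked order search can be modeled roughly as follows. This is a simplified sketch under stated assumptions: it ignores the MAX_OBJS_PER_PAGE clamp and the fallback-to-last-order path of the real calc_slab_order(), and min_waste_order() is a made-up name:

```c
#include <assert.h>

/*
 * Pick the order in [min_order, max_order] whose slab leaves the least
 * leftover space for the given object size. Mirrors the v2 loop: track
 * the smallest remainder seen instead of accepting the first order
 * whose leftover falls under a fraction threshold. Ties go to the
 * lowest order, since only a strictly smaller remainder updates best.
 */
static unsigned int min_waste_order(unsigned int page_size, unsigned int size,
				    unsigned int min_order,
				    unsigned int max_order)
{
	unsigned int order, best_order = min_order;
	unsigned int min_wastage = size; /* leftover is always < one object */

	for (order = min_order; order <= max_order; order++) {
		unsigned int rem = (page_size << order) % size;

		if (rem < min_wastage) {
			min_wastage = rem;
			best_order = order;
		}
	}
	return best_order;
}
```

With a hypothetical 700-byte object this picks order 2 on a 4K-page system but only order 1 on a 64K-page system, which illustrates Vlastimil's point that the worthwhile order depends on the page size.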