linux-kernel.vger.kernel.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] mm, slab: avoid high-order slab pages when it does not reduce waste
Date: Tue, 16 Oct 2018 10:42:33 +0200
Message-ID: <a85917a2-199f-a2c1-da28-13f0420f0908@suse.cz>
In-Reply-To: <alpine.DEB.2.21.1810121424420.116562@chino.kir.corp.google.com>

On 10/12/18 11:24 PM, David Rientjes wrote:
> The slab allocator has a heuristic that checks whether the internal
> fragmentation is satisfactory and, if not, increases cachep->gfporder to
> try to improve this.
> 
> If the amount of waste is the same at higher cachep->gfporder values,
> there is no significant benefit to allocating higher order memory.  There
> will be fewer calls to the page allocator, but each call will require
> zone->lock and finding the page of best fit from the per-zone free areas.
> 
> Instead, it is better to allocate order-0 memory if possible so that pages
> can be returned from the per-cpu pagesets (pcp).
> 
> There are two reasons to prefer this over allocating high order memory:
> 
>  - allocating from the pcp lists does not require a per-zone lock, and
> 
>  - this reduces stranding of MIGRATE_UNMOVABLE pageblocks on pcp lists
>    that increases slab fragmentation across a zone.
> 
> We are particularly interested in the second point to eliminate cases
> where all other pages on a pageblock are movable (or free) and fallback to
> pageblocks of other migratetypes from the per-zone free areas causes
> high-order slab memory to be allocated from them rather than from free
> MIGRATE_UNMOVABLE pages on the pcp.
> 
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/slab.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1748,6 +1748,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
>  	for (gfporder = 0; gfporder <= KMALLOC_MAX_ORDER; gfporder++) {
>  		unsigned int num;
>  		size_t remainder;
> +		int order;
>  
>  		num = cache_estimate(gfporder, size, flags, &remainder);
>  		if (!num)
> @@ -1803,6 +1804,20 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
>  		 */
>  		if (left_over * 8 <= (PAGE_SIZE << gfporder))
>  			break;
> +
> +		/*
> +		 * If a higher gfporder would not reduce internal fragmentation,
> +		 * no need to continue.  The preference is to keep gfporder as
> +		 * small as possible so slab allocations can be served from
> +		 * MIGRATE_UNMOVABLE pcp lists to avoid stranding.
> +		 */
> +		for (order = gfporder + 1; order <= slab_max_order; order++) {
> +			cache_estimate(order, size, flags, &remainder);
> +			if (remainder < left_over)

I think this can be suboptimal when left_over is e.g. 500 for the lower
order and remainder is 800 for the higher order: the wasted memory per
page is lower even though the absolute value isn't. Can that happen?
Probably not for the order-0 vs. order-1 case, but for higher orders? In
that case left_over should be shifted left by (order - gfporder) in the
comparison?
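
To make the concern concrete, here is a minimal sketch of the two
comparisons. It assumes a simplified model: 4 KiB pages and no on-slab
freelist overhead, so the waste is just the slab size modulo the object
size. model_remainder() and the two helper names are hypothetical
stand-ins for illustration, not kernel functions, and the real
cache_estimate() may return different values depending on the cache flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Simplified stand-in for cache_estimate(): assumes no on-slab freelist
 * overhead, so the waste is what remains after packing whole objects.
 */
static size_t model_remainder(unsigned int order, size_t obj_size)
{
	assert(obj_size > 0);
	return ((size_t)MODEL_PAGE_SIZE << order) % obj_size;
}

/* Comparison as written in the patch: absolute bytes wasted per slab. */
static bool higher_order_helps_absolute(unsigned int gfporder,
					unsigned int order, size_t obj_size)
{
	return model_remainder(order, obj_size) <
	       model_remainder(gfporder, obj_size);
}

/*
 * Suggested variant: scale the lower-order waste up to the size of the
 * higher-order slab, so both sides describe the same number of pages.
 */
static bool higher_order_helps_scaled(unsigned int gfporder,
				      unsigned int order, size_t obj_size)
{
	assert(order > gfporder);
	return model_remainder(order, obj_size) <
	       (model_remainder(gfporder, obj_size) << (order - gfporder));
}
```

In this model a 1700-byte object wastes 696 bytes at order 0, 1392 bytes
at order 1 and 1084 bytes at order 2. The absolute comparison treats
order 2 as no improvement (1084 is not below 696), while the scaled
comparison does (1084 is below 696 << 2 = 2784). Order 1 is rejected by
both, since 1392 is exactly twice 696.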

> +				break;
> +		}
> +		if (order > slab_max_order)
> +			break;
>  	}
>  	return left_over;
>  }
> 


Thread overview: 10+ messages
2018-10-12 21:24 [patch] mm, slab: avoid high-order slab pages when it does not reduce waste David Rientjes
2018-10-12 22:13 ` Andrew Morton
2018-10-12 23:09   ` David Rientjes
2018-10-15 22:41   ` Christopher Lameter
2018-10-16  0:39     ` David Rientjes
2018-10-16 15:17       ` Christopher Lameter
2018-10-17  9:09         ` Vlastimil Babka
2018-10-17 15:38           ` Christopher Lameter
2018-10-15 22:42 ` Christopher Lameter
2018-10-16  8:42 ` Vlastimil Babka [this message]
