From: Mikulas Patocka <mpatocka@redhat.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Christopher Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, dm-devel@redhat.com,
	Mike Snitzer <msnitzer@redhat.com>
Subject: Re: [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE
Date: Wed, 21 Mar 2018 14:23:40 -0400 (EDT)
Message-ID: <alpine.LRH.2.02.1803211406180.26409@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <20180321174937.GF4780@bombadil.infradead.org>



On Wed, 21 Mar 2018, Matthew Wilcox wrote:

> On Wed, Mar 21, 2018 at 12:39:33PM -0500, Christopher Lameter wrote:
> > One other thought: If you want to improve the behavior for large scale
> > objects allocated through kmalloc/kmemcache then we would certainly be
> > glad to entertain those ideas.
> > 
> > F.e. you could optimize the allocations > 2x PAGE_SIZE so that they do not
> > allocate powers of two pages. It would be relatively easy to make
> > kmalloc_large round the allocation to the next page size and then allocate
> > N consecutive pages via alloc_pages_exact() and free the remainder unused
> > pages or some such thing.

alloc_pages_exact() has O(n*log n) complexity with respect to the number
of requested pages. It would have to be reworked and optimized if it were
to be used for the dm-bufio cache. (It could be optimized down to O(log n)
if, instead of splitting the compound page into many separate pages, it
split it into power-of-two clusters.)
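
To illustrate the power-of-two splitting mentioned above, here is a minimal
userspace sketch. It is not kernel code and not how alloc_pages_exact() is
implemented; struct cluster, order_for_pages() and split_tail() are
hypothetical names introduced only for this example. It assumes the block
starts at page index 0 and is aligned to its own size, and it describes the
unused tail of the block as naturally aligned power-of-two clusters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for this sketch: a run of (1 << order) pages. */
struct cluster {
	unsigned long start;	/* first page index within the block */
	unsigned int order;	/* run length is 1 << order pages */
};

/* Smallest order such that (1UL << order) >= nr_pages. */
static unsigned int order_for_pages(unsigned long nr_pages)
{
	unsigned int order = 0;

	assert(nr_pages > 0);
	while ((1UL << order) < nr_pages)
		order++;
	return order;
}

/*
 * Describe the unused tail [nr_pages, 1 << order) of a block as naturally
 * aligned power-of-two clusters. Each step takes the lowest set bit of the
 * current position as the cluster size, so the sizes strictly increase and
 * at most `order` clusters are produced.
 * Returns the number of clusters written to out[].
 */
static size_t split_tail(unsigned long nr_pages, unsigned int order,
			 struct cluster *out, size_t max)
{
	unsigned long pos = nr_pages;
	unsigned long end = 1UL << order;
	size_t n = 0;

	assert(nr_pages > 0 && nr_pages <= end);
	while (pos < end) {
		unsigned long size = pos & (~pos + 1);	/* lowest set bit */
		unsigned int o = 0;

		while ((1UL << o) < size)
			o++;
		assert(n < max);
		out[n].start = pos;
		out[n].order = o;
		n++;
		pos += size;
	}
	return n;
}
```

For a request of 160 pages, order_for_pages() gives order 8 (256 pages), and
split_tail() describes the 96 unused pages as two clusters (32 pages at
index 160 and 64 pages at index 192) rather than 96 separate pages.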

> I don't know if that's a good idea.  That will contribute to fragmentation
> if the allocation is held onto for a short-to-medium length of time.
> If the allocation is for a very long period of time then those pages
> would have been unavailable anyway, but if the user of the tail pages
> holds them beyond the lifetime of the large allocation, then this is
> probably a bad tradeoff to make.

The problem with alloc_pages_exact() is that it exhausts all the
high-order pages and leaves many free low-order pages around. So you'll
end up with a system that has a lot of free memory, but with all
high-order pages missing. Because there would be a lot of free memory,
the kswapd thread would not be woken up to free some high-order pages.

I think that using slab with high order is better, because it at least 
doesn't leave many low-order pages behind.

> I do see Mikulas' use case as interesting, I just don't know whether it's
> worth changing slab/slub to support it.  At first blush, other than the
> sheer size of the allocations, it's a good fit.

All I need is to increase the order of a specific slab cache - I think 
it's better to implement an interface that allows doing it than to 
duplicate the slab cache code.
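
As a rough illustration of why the order of a slab cache matters for large
objects, the userspace sketch below computes how many bytes of a slab are
left unused at each order and picks the order with the smallest wasted
fraction. It is only arithmetic under stated assumptions, not the SLUB order
calculation and not the proposed patch; slab_waste() and min_waste_order()
are hypothetical names, and the 640 KiB object size used in the note after
the code is just an example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes left over when a slab of (page_size << order) holds whole objects. */
static size_t slab_waste(size_t object_size, unsigned int order,
			 size_t page_size)
{
	size_t slab = page_size << order;

	assert(object_size > 0 && slab >= object_size);
	return slab % object_size;
}

/*
 * Return the lowest order in [0, max_order] whose slab has the smallest
 * wasted fraction (waste / slab size), or -1 if no slab of an allowed
 * order can hold even one object. Fractions are compared by
 * cross-multiplication to avoid floating point.
 */
static int min_waste_order(size_t object_size, unsigned int max_order,
			   size_t page_size)
{
	int best = -1;
	uint64_t best_waste = 0, best_slab = 0;
	unsigned int order;

	assert(object_size > 0);
	for (order = 0; order <= max_order; order++) {
		uint64_t slab = (uint64_t)page_size << order;
		uint64_t waste;

		if (slab < object_size)
			continue;
		waste = slab % object_size;
		if (best < 0 || waste * best_slab < best_waste * slab) {
			best = (int)order;
			best_waste = waste;
			best_slab = slab;
		}
	}
	return best;
}
```

With 4 KiB pages and a 640 KiB object, an order-8 slab (1 MiB) holds one
object and wastes 384 KiB, while an order-9 slab (2 MiB) holds three objects
and wastes 128 KiB, so min_waste_order() with max_order 10 returns 9.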

BTW, it would be possible to open the file
"/sys/kernel/slab/<cache>/order" from the dm-bufio kernel driver and write
the requested value there, but that seems very dirty. It would be better to
have a kernel interface for that.
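
For reference, the userspace equivalent of writing to that sysfs file could
look like the sketch below. It assumes a SLUB kernel with sysfs where the
"order" attribute is writable (this depends on the kernel version) and
sufficient privileges; slab_order_path() and write_slab_order() are
hypothetical helper names, and the cache name is supplied by the caller.
This is not the in-kernel approach described above, only an illustration of
the file interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "/sys/kernel/slab/<cache>/order" into buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int slab_order_path(char *buf, size_t len, const char *cache)
{
	int n;

	assert(buf != NULL && cache != NULL);
	n = snprintf(buf, len, "/sys/kernel/slab/%s/order", cache);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write the requested order to the cache's sysfs attribute.
 * Returns 0 on success, -1 on any failure (path too long, attribute
 * missing or read-only, insufficient privileges, value rejected).
 */
static int write_slab_order(const char *cache, unsigned int order)
{
	char path[256];
	FILE *f;
	int ok;

	if (slab_order_path(path, sizeof(path), cache) < 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", order) > 0;
	/* The kernel may reject the value only when the buffer is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would invoke write_slab_order() with the name of an existing cache
directory under /sys/kernel/slab and check the return value, since the write
can fail for any of the reasons listed in the comment.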

Mikulas
