All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mikulas Patocka <mpatocka@redhat.com>
To: Christopher Lameter <cl@linux.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, dm-devel@redhat.com,
	Mike Snitzer <msnitzer@redhat.com>
Subject: Re: [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE
Date: Wed, 21 Mar 2018 14:36:58 -0400 (EDT)	[thread overview]
Message-ID: <alpine.LRH.2.02.1803211425330.26409@file01.intranet.prod.int.rdu2.redhat.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1803211226350.3174@nuc-kabylake>



On Wed, 21 Mar 2018, Christopher Lameter wrote:

> On Wed, 21 Mar 2018, Mikulas Patocka wrote:
> 
> > > You should not be using the slab allocators for these. Allocate higher
> > > order pages or numbers of consecutive smaller pagess from the page
> > > allocator. The slab allocators are written for objects smaller than page
> > > size.
> >
> > So, do you argue that I need to write my own slab cache functionality
> > instead of using the existing slab code?
> 
> Just use the existing page allocator calls to allocate and free the
> memory you need.
> 
> > I can do it - but duplicating code is bad thing.
> 
> There is no need to duplicate anything. There is lots of infrastructure
> already in the kernel. You just need to use the right allocation / freeing
> calls.

So, what would you recommend for allocating 640KB objects while minimizing 
wasted space?
* alloc_pages - rounds up to the next power of two
* kmalloc - rounds up to the next power of two
* alloc_pages_exact - O(n*log n) complexity; and causes memory 
  fragmentation if used excesivelly
* vmalloc - horrible performance (modifies page tables and that causes 
  synchronization across all CPUs)

anything else?

The slab cache with large order seems as a best choice for this.

> > > What kind of problem could be caused here?
> >
> > Unlocked accesses are generally considered bad. For example, see this
> > piece of code in calculate_sizes:
> >         s->allocflags = 0;
> >         if (order)
> >                 s->allocflags |= __GFP_COMP;
> >
> >         if (s->flags & SLAB_CACHE_DMA)
> >                 s->allocflags |= GFP_DMA;
> >
> >         if (s->flags & SLAB_RECLAIM_ACCOUNT)
> >                 s->allocflags |= __GFP_RECLAIMABLE;
> >
> > If you are running this while the cache is in use (i.e. when the user
> > writes /sys/kernel/slab/<cache>/order), then other processes will see
> > invalid s->allocflags for a short time.
> 
> Calculating sizes is done when the slab has only a single accessor. Thus
> no locking is neeed.

The calculation is done whenever someone writes to 
"/sys/kernel/slab/*/order"

And you can obviously write to that file why the slab cache is in use. Try 
it.

So, the function calculate_sizes can actually race with allocation from 
the slab cache.

> Changing the size of objects in a slab cache when there is already a set
> of object allocated and under management by the slab cache would
> cause the allocator to fail and lead to garbled data.

I am not talking about changing the size of objects in a slab cache. I am 
talking about changing the allocation order of a slab cache while the 
cache is in use. This can be done with the sysfs interface.

Mikulas

WARNING: multiple messages have this Message-ID (diff)
From: Mikulas Patocka <mpatocka@redhat.com>
To: Christopher Lameter <cl@linux.com>
Cc: Mike Snitzer <msnitzer@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Pekka Enberg <penberg@kernel.org>,
	linux-mm@kvack.org, dm-devel@redhat.com,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE
Date: Wed, 21 Mar 2018 14:36:58 -0400 (EDT)	[thread overview]
Message-ID: <alpine.LRH.2.02.1803211425330.26409@file01.intranet.prod.int.rdu2.redhat.com> (raw)
In-Reply-To: <alpine.DEB.2.20.1803211226350.3174@nuc-kabylake>



On Wed, 21 Mar 2018, Christopher Lameter wrote:

> On Wed, 21 Mar 2018, Mikulas Patocka wrote:
> 
> > > You should not be using the slab allocators for these. Allocate higher
> > > order pages or numbers of consecutive smaller pagess from the page
> > > allocator. The slab allocators are written for objects smaller than page
> > > size.
> >
> > So, do you argue that I need to write my own slab cache functionality
> > instead of using the existing slab code?
> 
> Just use the existing page allocator calls to allocate and free the
> memory you need.
> 
> > I can do it - but duplicating code is bad thing.
> 
> There is no need to duplicate anything. There is lots of infrastructure
> already in the kernel. You just need to use the right allocation / freeing
> calls.

So, what would you recommend for allocating 640KB objects while minimizing 
wasted space?
* alloc_pages - rounds up to the next power of two
* kmalloc - rounds up to the next power of two
* alloc_pages_exact - O(n*log n) complexity; and causes memory 
  fragmentation if used excesivelly
* vmalloc - horrible performance (modifies page tables and that causes 
  synchronization across all CPUs)

anything else?

The slab cache with large order seems as a best choice for this.

> > > What kind of problem could be caused here?
> >
> > Unlocked accesses are generally considered bad. For example, see this
> > piece of code in calculate_sizes:
> >         s->allocflags = 0;
> >         if (order)
> >                 s->allocflags |= __GFP_COMP;
> >
> >         if (s->flags & SLAB_CACHE_DMA)
> >                 s->allocflags |= GFP_DMA;
> >
> >         if (s->flags & SLAB_RECLAIM_ACCOUNT)
> >                 s->allocflags |= __GFP_RECLAIMABLE;
> >
> > If you are running this while the cache is in use (i.e. when the user
> > writes /sys/kernel/slab/<cache>/order), then other processes will see
> > invalid s->allocflags for a short time.
> 
> Calculating sizes is done when the slab has only a single accessor. Thus
> no locking is neeed.

The calculation is done whenever someone writes to 
"/sys/kernel/slab/*/order"

And you can obviously write to that file why the slab cache is in use. Try 
it.

So, the function calculate_sizes can actually race with allocation from 
the slab cache.

> Changing the size of objects in a slab cache when there is already a set
> of object allocated and under management by the slab cache would
> cause the allocator to fail and lead to garbled data.

I am not talking about changing the size of objects in a slab cache. I am 
talking about changing the allocation order of a slab cache while the 
cache is in use. This can be done with the sysfs interface.

Mikulas

  parent reply	other threads:[~2018-03-21 18:37 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-20 17:25 [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE Mikulas Patocka
2018-03-20 17:35 ` Matthew Wilcox
2018-03-20 17:35   ` Matthew Wilcox
2018-03-20 17:54   ` Christopher Lameter
2018-03-20 17:54     ` Christopher Lameter
2018-03-20 19:22     ` Mikulas Patocka
2018-03-20 19:22       ` Mikulas Patocka
2018-03-20 20:42       ` Christopher Lameter
2018-03-20 20:42         ` Christopher Lameter
2018-03-20 22:02         ` Mikulas Patocka
2018-03-20 22:02           ` Mikulas Patocka
2018-03-21 15:35           ` Christopher Lameter
2018-03-21 15:35             ` Christopher Lameter
2018-03-21 16:25             ` Mikulas Patocka
2018-03-21 16:25               ` Mikulas Patocka
2018-03-21 17:10               ` Matthew Wilcox
2018-03-21 17:10                 ` Matthew Wilcox
2018-03-21 17:30               ` Christopher Lameter
2018-03-21 17:30                 ` Christopher Lameter
2018-03-21 17:39                 ` Christopher Lameter
2018-03-21 17:39                   ` Christopher Lameter
2018-03-21 17:49                   ` Matthew Wilcox
2018-03-21 17:49                     ` Matthew Wilcox
2018-03-21 18:01                     ` Christopher Lameter
2018-03-21 18:01                       ` Christopher Lameter
2018-03-21 18:23                     ` Mikulas Patocka
2018-03-21 18:23                       ` Mikulas Patocka
2018-03-21 18:40                       ` Christopher Lameter
2018-03-21 18:40                         ` Christopher Lameter
2018-03-21 18:55                         ` Mikulas Patocka
2018-03-21 18:55                           ` Mikulas Patocka
2018-03-21 18:55                         ` Matthew Wilcox
2018-03-21 18:55                           ` Matthew Wilcox
2018-03-21 18:58                           ` Christopher Lameter
2018-03-21 18:58                             ` Christopher Lameter
2018-03-21 19:25                   ` Mikulas Patocka
2018-03-21 19:25                     ` Mikulas Patocka
2018-03-21 18:36                 ` Mikulas Patocka [this message]
2018-03-21 18:36                   ` Mikulas Patocka
2018-03-21 18:57                   ` Christopher Lameter
2018-03-21 18:57                     ` Christopher Lameter
2018-03-21 19:19                     ` Mikulas Patocka
2018-03-21 19:19                       ` Mikulas Patocka
2018-03-21 20:09                       ` Christopher Lameter
2018-03-21 20:09                         ` Christopher Lameter
2018-03-21 20:37                         ` Mikulas Patocka
2018-03-21 20:37                           ` Mikulas Patocka
2018-03-23 15:10                           ` Christopher Lameter
2018-03-23 15:10                             ` Christopher Lameter
2018-03-23 15:31                             ` Mikulas Patocka
2018-03-23 15:31                               ` Mikulas Patocka
2018-03-23 15:48                               ` Christopher Lameter
2018-03-23 15:48                                 ` Christopher Lameter
2018-04-13  9:22                   ` Vlastimil Babka
2018-04-13  9:22                     ` Vlastimil Babka
2018-04-13 15:10                     ` Mike Snitzer
2018-04-13 15:10                       ` Mike Snitzer
2018-04-16 12:38                       ` Vlastimil Babka
2018-04-16 12:38                         ` Vlastimil Babka
2018-04-16 14:27                         ` Mike Snitzer
2018-04-16 14:27                           ` Mike Snitzer
2018-04-16 14:37                           ` Mikulas Patocka
2018-04-16 14:37                             ` Mikulas Patocka
2018-04-16 14:46                             ` Mike Snitzer
2018-04-16 14:46                               ` Mike Snitzer
2018-04-16 14:57                               ` Mikulas Patocka
2018-04-16 14:57                                 ` Mikulas Patocka
2018-04-16 15:18                                 ` Christopher Lameter
2018-04-16 15:18                                   ` Christopher Lameter
2018-04-16 15:25                                   ` Mikulas Patocka
2018-04-16 15:25                                     ` Mikulas Patocka
2018-04-16 15:45                                     ` Christopher Lameter
2018-04-16 15:45                                       ` Christopher Lameter
2018-04-16 19:36                                       ` Mikulas Patocka
2018-04-16 19:36                                         ` Mikulas Patocka
2018-04-16 19:53                                         ` Vlastimil Babka
2018-04-16 21:01                                           ` Mikulas Patocka
2018-04-17 14:40                                             ` Christopher Lameter
2018-04-17 18:53                                               ` Mikulas Patocka
2018-04-17 18:53                                                 ` Mikulas Patocka
2018-04-17 21:42                                                 ` Christopher Lameter
2018-04-17 14:49                                           ` Christopher Lameter
2018-04-17 14:49                                             ` Christopher Lameter
2018-04-17 14:47                                         ` Christopher Lameter
2018-04-17 14:47                                           ` Christopher Lameter
2018-04-16 19:32                               ` [PATCH RESEND] " Mikulas Patocka
2018-04-17 14:45                                 ` Christopher Lameter
2018-04-17 16:16                                   ` Vlastimil Babka
2018-04-17 16:38                                     ` Christopher Lameter
2018-04-17 19:09                                       ` Mikulas Patocka
2018-04-17 17:26                                     ` Mikulas Patocka
2018-04-17 19:13                                       ` Vlastimil Babka
2018-04-17 19:06                                   ` Mikulas Patocka
2018-04-17 19:06                                     ` Mikulas Patocka
2018-04-18 14:55                                     ` Christopher Lameter
2018-04-25 21:04                                       ` Mikulas Patocka
2018-04-25 23:24                                         ` Mikulas Patocka
2018-04-26 19:01                                           ` Christopher Lameter
2018-04-26 21:09                                             ` Mikulas Patocka
2018-04-27 16:41                                               ` Christopher Lameter
2018-04-27 19:19                                                 ` Mikulas Patocka
2018-06-13 17:01                                                   ` Mikulas Patocka
2018-06-13 18:16                                                     ` Christoph Hellwig
2018-06-13 18:53                                                       ` Mikulas Patocka
2018-04-26 18:51                                         ` Christopher Lameter
2018-04-16 19:38                             ` Vlastimil Babka
2018-04-16 19:38                               ` Vlastimil Babka
2018-04-16 21:04                         ` Mikulas Patocka
2018-04-16 21:04                           ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LRH.2.02.1803211425330.26409@file01.intranet.prod.int.rdu2.redhat.com \
    --to=mpatocka@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dm-devel@redhat.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-mm@kvack.org \
    --cc=msnitzer@redhat.com \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.