From: Mikulas Patocka <mpatocka@redhat.com>
To: Christopher Lameter <cl@linux.com>
Cc: Matthew Wilcox <willy@infradead.org>, Pekka Enberg <penberg@kernel.org>, David Rientjes <rientjes@google.com>, Joonsoo Kim <iamjoonsoo.kim@lge.com>, Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org, dm-devel@redhat.com, Mike Snitzer <msnitzer@redhat.com>
Subject: Re: [PATCH] slab: introduce the flag SLAB_MINIMIZE_WASTE
Date: Wed, 21 Mar 2018 14:36:58 -0400 (EDT)
Message-ID: <alpine.LRH.2.02.1803211425330.26409@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <alpine.DEB.2.20.1803211226350.3174@nuc-kabylake>

On Wed, 21 Mar 2018, Christopher Lameter wrote:

> On Wed, 21 Mar 2018, Mikulas Patocka wrote:
>
> > > You should not be using the slab allocators for these. Allocate higher
> > > order pages or numbers of consecutive smaller pages from the page
> > > allocator. The slab allocators are written for objects smaller than
> > > page size.
> >
> > So, do you argue that I need to write my own slab cache functionality
> > instead of using the existing slab code?
>
> Just use the existing page allocator calls to allocate and free the
> memory you need.
>
> > I can do it - but duplicating code is a bad thing.
>
> There is no need to duplicate anything. There is lots of infrastructure
> already in the kernel. You just need to use the right allocation /
> freeing calls.

So, what would you recommend for allocating 640KB objects while
minimizing wasted space?

* alloc_pages - rounds up to the next power of two
* kmalloc - rounds up to the next power of two
* alloc_pages_exact - O(n*log n) complexity; causes memory fragmentation
  if used excessively
* vmalloc - horrible performance (it modifies page tables, and that
  causes synchronization across all CPUs)

Anything else? A slab cache with a large order seems like the best choice
for this.

> > > What kind of problem could be caused here?
> >
> > Unlocked accesses are generally considered bad.
> > For example, see this piece of code in calculate_sizes:
> >
> > 	s->allocflags = 0;
> > 	if (order)
> > 		s->allocflags |= __GFP_COMP;
> >
> > 	if (s->flags & SLAB_CACHE_DMA)
> > 		s->allocflags |= GFP_DMA;
> >
> > 	if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > 		s->allocflags |= __GFP_RECLAIMABLE;
> >
> > If you are running this while the cache is in use (i.e. when the user
> > writes /sys/kernel/slab/<cache>/order), then other processes will see
> > invalid s->allocflags for a short time.
>
> Calculating sizes is done when the slab has only a single accessor. Thus
> no locking is needed.

The calculation is done whenever someone writes to
/sys/kernel/slab/*/order - and you can obviously write to that file while
the slab cache is in use. Try it. So the function calculate_sizes can
actually race with allocation from the slab cache.

> Changing the size of objects in a slab cache when there is already a set
> of objects allocated and under management by the slab cache would cause
> the allocator to fail and lead to garbled data.

I am not talking about changing the size of objects in a slab cache. I am
talking about changing the allocation order of a slab cache while the
cache is in use. This can be done with the sysfs interface.

Mikulas