* [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
@ 2019-04-11 12:52 Vlastimil Babka
  2019-04-11 13:28 ` Matthew Wilcox
  2019-04-12  7:14 ` James Bottomley
  0 siblings, 2 replies; 9+ messages in thread
From: Vlastimil Babka @ 2019-04-11 12:52 UTC (permalink / raw)
  To: lsf-pc
  Cc: Linux-FSDevel, linux-mm, linux-block, Michal Hocko,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

Hi,

here's a late topic for discussion that came out of my patchset [1]. It
would likely have to involve all three groups, as FS/IO people would
benefit, but it's an MM area.

Background:
The recent thread [2] inspired me to look into guaranteeing alignment
for kmalloc() for power-of-two sizes. IIUC some use cases (see [2]) don't
know the required sizes in advance in order to create named caches via
kmem_cache_create() with explicit alignment parameter (which is the only
way to guarantee alignment right now). Moreover, in most cases the
alignment happens naturally as the slab allocators split
power-of-two-sized pages into smaller power-of-two-sized objects.
kmalloc() users then might rely on the alignment even unknowingly, until
it breaks when e.g. SLUB debugging is enabled.
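
To make the failure mode concrete, here is a minimal userspace sketch (not
kernel code). It assumes objects are carved out of a naturally aligned slab
at a fixed stride; object_offset() and is_naturally_aligned() are
hypothetical helpers, and the padding only stands in for whatever metadata a
debugging configuration might place around each object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if size is a nonzero power of two. */
static bool is_power_of_two(size_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}

/* True if addr is a multiple of size; size must be a power of two. */
static bool is_naturally_aligned(uintptr_t addr, size_t size)
{
	assert(is_power_of_two(size));
	return (addr & (size - 1)) == 0;
}

/*
 * Offset of object number idx inside a slab whose base is aligned to at
 * least `size`, when every object is followed by `pad` bytes of
 * (hypothetical) debug metadata.
 */
static size_t object_offset(size_t idx, size_t size, size_t pad)
{
	return idx * (size + pad);
}
```

With pad == 0 every offset is a multiple of the object size. With size == 512
and pad == 8 the second object starts at offset 520, so a caller that
silently assumed 512-byte alignment stops working.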

Turns out it's not difficult to add the guarantees [1] and in the
production SLAB/SLUB configurations nothing really changes as explained
above. Then folks wouldn't have to come up with workarounds as in [2].
Technical downsides would be for SLUB debug mode (increased memory
fragmentation, should be acceptable in a bug hunting scenario?), and
SLOB (potentially worse performance due to increased packing effort, but
this slab variant is rather marginal).

In the session I hope to resolve the question whether this is indeed the
right thing to do for all kmalloc() users, without explicit alignment
requests, and if it's worth the potentially worse
performance/fragmentation it would impose on a hypothetical new slab
implementation for which it wouldn't be optimal to split power-of-two
sized pages into power-of-two-sized objects (or whether there are any
other downsides).

Thanks,
Vlastimil

[1] https://lore.kernel.org/lkml/20190319211108.15495-1-vbabka@suse.cz/T/#u
[2] https://lore.kernel.org/linux-fsdevel/20190225040904.5557-1-ming.lei@redhat.com/T/#u


* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-11 12:52 [LSF/MM TOPIC] guarantee natural alignment for kmalloc()? Vlastimil Babka
@ 2019-04-11 13:28 ` Matthew Wilcox
  2019-04-25 11:33   ` Matthew Wilcox
  2019-04-12  7:14 ` James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2019-04-11 13:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: lsf-pc, Linux-FSDevel, linux-mm, linux-block, Michal Hocko,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On Thu, Apr 11, 2019 at 02:52:08PM +0200, Vlastimil Babka wrote:
> In the session I hope to resolve the question whether this is indeed the
> right thing to do for all kmalloc() users, without explicit alignment
> requests, and if it's worth the potentially worse
> performance/fragmentation it would impose on a hypothetical new slab
> implementation for which it wouldn't be optimal to split power-of-two
> sized pages into power-of-two-sized objects (or whether there are any
> other downsides).

I think this is exactly the kind of discussion that LSFMM is for!  It's
really a whole-system question: is Linux better off having the flexibility
for allocators to return non-power-of-two aligned memory, or allowing
consumers of the kmalloc API to assume that "sufficiently large" memory
is naturally aligned?

Another possibility that should be considered is introducing a kmalloc()
variant like posix_memalign() that allows for specifying the alignment,
or just kmalloc_naturally_aligned().

And we probably need to reiterate for the benefit of those not following
the discussion that creating a slab cache (which does allow for alignment
to be specified) is impractical for this use case because the actual
allocations are of variable size, but always need to be 512-byte aligned.
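
For readers more familiar with userspace, the analogous interface can be
sketched as follows. posix_memalign() is the real POSIX call; the names
alloc_sector_aligned() and SECTOR_ALIGN are hypothetical, and nothing here
is a proposed kernel API.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SECTOR_ALIGN 512u	/* hypothetical: alignment the I/O path needs */

/*
 * Allocate len bytes (any length) whose start address is a multiple of
 * SECTOR_ALIGN. Returns NULL on failure; release the buffer with free().
 */
static void *alloc_sector_aligned(size_t len)
{
	void *buf = NULL;

	if (posix_memalign(&buf, SECTOR_ALIGN, len) != 0)
		return NULL;

	/* Postcondition guaranteed by posix_memalign() on success. */
	assert(((uintptr_t)buf & (SECTOR_ALIGN - 1)) == 0);
	return buf;
}
```

The point of the sketch is that the length and the alignment are independent
arguments, which is what a fixed-size slab cache cannot express.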


* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-11 12:52 [LSF/MM TOPIC] guarantee natural alignment for kmalloc()? Vlastimil Babka
  2019-04-11 13:28 ` Matthew Wilcox
@ 2019-04-12  7:14 ` James Bottomley
  2019-04-12  7:54   ` Vlastimil Babka
  1 sibling, 1 reply; 9+ messages in thread
From: James Bottomley @ 2019-04-12  7:14 UTC (permalink / raw)
  To: Vlastimil Babka, lsf-pc
  Cc: Linux-FSDevel, linux-mm, linux-block, Michal Hocko,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On Thu, 2019-04-11 at 14:52 +0200, Vlastimil Babka wrote:
> Hi,
> 
> here's a late topic for discussion that came out of my patchset [1].
> It would likely have to involve all three groups, as FS/IO people
> would benefit, but it's an MM area.
> 
> Background:
> The recent thread [2] inspired me to look into guaranteeing alignment
> for kmalloc() for power-of-two sizes. IIUC some use cases (see [2])
> don't know the required sizes in advance in order to create named
> caches via kmem_cache_create() with explicit alignment parameter
> (which is the only way to guarantee alignment right now). Moreover,
> in most cases the alignment happens naturally as the slab allocators
> split power-of-two-sized pages into smaller power-of-two-sized
> objects. kmalloc() users then might rely on the alignment even
> unknowingly, until it breaks when e.g. SLUB debugging is enabled.
> 
> Turns out it's not difficult to add the guarantees [1] and in the
> production SLAB/SLUB configurations nothing really changes as
> explained above. Then folks wouldn't have to come up with workarounds
> as in [2]. Technical downsides would be for SLUB debug mode
> (increased memory fragmentation, should be acceptable in a bug
> hunting scenario?), and SLOB (potentially worse performance due to
> increased packing effort, but this slab variant is rather marginal).
> 
> In the session I hope to resolve the question whether this is indeed
> the right thing to do for all kmalloc() users, without explicit
> alignment requests, and if it's worth the potentially worse
> performance/fragmentation it would impose on a hypothetical new slab
> implementation for which it wouldn't be optimal to split power-of-two
> sized pages into power-of-two-sized objects (or whether there are any
> other downsides).

I think so.  The question is how aligned?  Explicit-flushing arches
definitely need at least cache line alignment when using kmalloc for
I/O, and if allocations cross cache lines they have serious coherency
problems.  The question of how much more aligned than this is
interesting ... I've got to say that the power-of-two allocator implies
the same alignment as size and we seem to keep growing use cases that
assume this.  I'm not so keen on growing a separate API unless there's
a really useful mm efficiency in breaking the kmalloc alignment
assumptions.

James



* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-12  7:14 ` James Bottomley
@ 2019-04-12  7:54   ` Vlastimil Babka
  2019-04-16 15:38     ` Christopher Lameter
  0 siblings, 1 reply; 9+ messages in thread
From: Vlastimil Babka @ 2019-04-12  7:54 UTC (permalink / raw)
  To: James Bottomley, lsf-pc
  Cc: Linux-FSDevel, linux-mm, linux-block, Michal Hocko,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On 4/12/19 9:14 AM, James Bottomley wrote:
>> In the session I hope to resolve the question whether this is indeed
>> the right thing to do for all kmalloc() users, without explicit
>> alignment requests, and if it's worth the potentially worse
>> performance/fragmentation it would impose on a hypothetical new slab
>> implementation for which it wouldn't be optimal to split power-of-two
>> sized pages into power-of-two-sized objects (or whether there are any
>> other downsides).
> 
> I think so.  The question is how aligned?  Explicit-flushing arches
> definitely need at least cache line alignment when using kmalloc for
> I/O, and if allocations cross cache lines they have serious coherency
> problems.  The question of how much more aligned than this is
> interesting ... I've got to say that the power-of-two allocator implies
> the same alignment as size and we seem to keep growing use cases that
> assume this.

Right, by "natural alignment" I meant exactly that - align to size for
power-of-two sizes.

> I'm not so keen on growing a separate API unless there's
> a really useful mm efficiency in breaking the kmalloc alignment
> assumptions.

I'd argue there's not.

> James
> 



* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-12  7:54   ` Vlastimil Babka
@ 2019-04-16 15:38     ` Christopher Lameter
  2019-04-17  8:07       ` Vlastimil Babka
  0 siblings, 1 reply; 9+ messages in thread
From: Christopher Lameter @ 2019-04-16 15:38 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: James Bottomley, lsf-pc, Linux-FSDevel, linux-mm, linux-block,
	Michal Hocko, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On Fri, 12 Apr 2019, Vlastimil Babka wrote:

> On 4/12/19 9:14 AM, James Bottomley wrote:
> >> In the session I hope to resolve the question whether this is indeed
> >> the right thing to do for all kmalloc() users, without explicit
> >> alignment requests, and if it's worth the potentially worse
> >> performance/fragmentation it would impose on a hypothetical new slab
> >> implementation for which it wouldn't be optimal to split power-of-two
> >> sized pages into power-of-two-sized objects (or whether there are any
> >> other downsides).
> >
> > I think so.  The question is how aligned?  Explicit-flushing arches
> > definitely need at least cache line alignment when using kmalloc for
> > I/O, and if allocations cross cache lines they have serious coherency
> > problems.  The question of how much more aligned than this is
> > interesting ... I've got to say that the power-of-two allocator implies
> > the same alignment as size and we seem to keep growing use cases that
> > assume this.

Well, that can already be controlled at a per-arch level through
KMALLOC_MIN_ALIGN. There are architectures that align to cache line
boundaries. However, you sometimes have hardware with ridiculously large
cache line configurations, like VSMP with 4k.

> Right, by "natural alignment" I meant exactly that - align to size for
> power-of-two sizes.

Well for which sizes? Double word till PAGE_SIZE? This gets us into weird
and difficult to comprehend rules for how objects are aligned. Or do we
start on the cache line size to provide cacheline alignment and do word
alignment before?

Consistency is important I think and if you want something different then
you need to say so in one way or another.


> > I'm not so keen on growing a separate API unless there's
> > a really useful mm efficiency in breaking the kmalloc alignment
> > assumptions.
>
> I'd argue there's not.




* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-16 15:38     ` Christopher Lameter
@ 2019-04-17  8:07       ` Vlastimil Babka
  0 siblings, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2019-04-17  8:07 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: James Bottomley, lsf-pc, Linux-FSDevel, linux-mm, linux-block,
	Michal Hocko, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On 4/16/19 5:38 PM, Christopher Lameter wrote:
> On Fri, 12 Apr 2019, Vlastimil Babka wrote:
> 
>> On 4/12/19 9:14 AM, James Bottomley wrote:
>>>> In the session I hope to resolve the question whether this is indeed
>>>> the right thing to do for all kmalloc() users, without explicit
>>>> alignment requests, and if it's worth the potentially worse
>>>> performance/fragmentation it would impose on a hypothetical new slab
>>>> implementation for which it wouldn't be optimal to split power-of-two
>>>> sized pages into power-of-two-sized objects (or whether there are any
>>>> other downsides).
>>>
>>> I think so.  The question is how aligned?  Explicit-flushing arches
>>> definitely need at least cache line alignment when using kmalloc for
>>> I/O, and if allocations cross cache lines they have serious coherency
>>> problems.  The question of how much more aligned than this is
>>> interesting ... I've got to say that the power-of-two allocator implies
>>> the same alignment as size and we seem to keep growing use cases that
>>> assume this.
> 
> Well, that can already be controlled at a per-arch level through
> KMALLOC_MIN_ALIGN. There are architectures that align to cache line
> boundaries. However, you sometimes have hardware with ridiculously large
> cache line configurations, like VSMP with 4k.

The arch and cache line limits would be respected as well, of course.

>> Right, by "natural alignment" I meant exactly that - align to size for
>> power-of-two sizes.
> 
> Well for which sizes? Double word till PAGE_SIZE?

Basically, yes. Above page size this is also true thanks to the buddy
allocator scheme.

> This gets us into weird
> and difficult to comprehend rules for how objects are aligned.

I don't think the rules are really difficult to comprehend for kmalloc()
users when they can rely on these alignment guarantees:

- alignment is at least what the arch mandates (to prevent unaligned
access, which is either illegal, or slower, right?)
- alignment at least to allocation size, for power of two sizes
- alignment at least to cache line size for performance or coherency reasons

The point is that kmalloc() users do not ever need to know the exact
alignment! Why should they care? It's enough that the guarantees are
fulfilled, and thanks to the "at least" part, the alignment might be
e.g. twice the size sometimes (e.g. 64 instead of 32), but that's
obviously not a problem for the kmalloc() user as the larger alignment
still satisfies the need for the smaller alignment.

(Implementation-wise a simple max(KMALLOC_MIN_ALIGN, size, cache_line)
is enough if all three are power-of-two values; otherwise we need to
calculate the LCM, but IIRC existing code already uses max() for
KMALLOC_MIN_ALIGN and cache_line, at least in SLAB.)
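
The max()-versus-LCM remark can be checked with a small userspace sketch.
All names below (max_size, gcd_size, lcm_size, combined_align) are
hypothetical helpers written for illustration; they are not kernel
functions, and the parameters simply stand for the three "at least" rules
listed above.

```c
#include <assert.h>
#include <stddef.h>

static size_t max_size(size_t a, size_t b)
{
	return a > b ? a : b;
}

/* Greatest common divisor (Euclid). */
static size_t gcd_size(size_t a, size_t b)
{
	while (b != 0) {
		size_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Least common multiple; both arguments must be nonzero. */
static size_t lcm_size(size_t a, size_t b)
{
	assert(a != 0 && b != 0);
	return a / gcd_size(a, b) * b;
}

/* Smallest alignment that satisfies all three "at least" rules at once. */
static size_t combined_align(size_t arch_min, size_t size, size_t cache_line)
{
	return lcm_size(lcm_size(arch_min, size), cache_line);
}
```

When all three inputs are powers of two, the LCM is just the largest of
them, so max() gives the same answer. For a non-power-of-two input such as
96 (a purely arithmetic illustration, since the size rule is only proposed
for power-of-two sizes) the two differ.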

> Or do we
> start on the cache line size to provide cacheline alignment and do word
> alignment before?

I didn't intend to change how cache line alignment works, that's a
separate thing. Looks like on my system with SLAB and 64B cache line
size, I have kmalloc-32 aligned to 32, kmalloc-64 aligned to 64 and
kmalloc-96 aligned to 64, thus practically the same as kmalloc-128.
Adding the align-to-size-for-power-of-two guarantee would change nothing
here.
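
One heuristic that reproduces these numbers is sketched below. It is an
assumption for illustration, not something stated in this thread or quoted
from the kernel source: start from the cache line size and halve the
alignment while the object still fits into half of it. The function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Halve the cache-line alignment while the object still fits into half of
 * it, so that small objects are not padded out to a full cache line.
 */
static size_t hwcache_align_sketch(size_t size, size_t cache_line)
{
	size_t align = cache_line;

	assert(size != 0 && cache_line != 0);
	while (size <= align / 2)
		align /= 2;
	return align;
}

/* Round size up to a multiple of align; align must be a power of two. */
static size_t round_up_to(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

With a 64-byte cache line this yields 32 for a 32-byte object, 64 for a
64-byte object, and 64 for a 96-byte object, whose slot is then rounded up
to 128 bytes.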

> Consistency is important I think

I think using the three "at least" rules above is consistent enough, or
I'm not sure what kind of consistency you mean here?

> and if you want something different then
> you need to say so in one way or another.
> 
> 
>>> I'm not so keen on growing a separate API unless there's
>>> a really useful mm efficiency in breaking the kmalloc alignment
>>> assumptions.
>>
>> I'd argue there's not.
> 
> 



* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-11 13:28 ` Matthew Wilcox
@ 2019-04-25 11:33   ` Matthew Wilcox
  2019-04-25 12:03     ` Martin K. Petersen
  2019-04-25 12:03     ` Michal Hocko
  0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2019-04-25 11:33 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: lsf-pc, Linux-FSDevel, linux-mm, linux-block, Michal Hocko,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On Thu, Apr 11, 2019 at 06:28:19AM -0700, Matthew Wilcox wrote:
> On Thu, Apr 11, 2019 at 02:52:08PM +0200, Vlastimil Babka wrote:
> > In the session I hope to resolve the question whether this is indeed the
> > right thing to do for all kmalloc() users, without explicit alignment
> > requests, and if it's worth the potentially worse
> > performance/fragmentation it would impose on a hypothetical new slab
> > implementation for which it wouldn't be optimal to split power-of-two
> > sized pages into power-of-two-sized objects (or whether there are any
> > other downsides).
> 
> I think this is exactly the kind of discussion that LSFMM is for!  It's
> really a whole-system question: is Linux better off having the flexibility
> for allocators to return non-power-of-two aligned memory, or allowing
> consumers of the kmalloc API to assume that "sufficiently large" memory
> is naturally aligned?

This has been scheduled for only the MM track.  I think at least the
filesystem people should be involved in this discussion since it's for
their benefit.

Do we have an lsf-discuss mailing list this year?  Might be good to
coordinate arrivals / departures for taxi sharing purposes.


* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-25 11:33   ` Matthew Wilcox
@ 2019-04-25 12:03     ` Martin K. Petersen
  2019-04-25 12:03     ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Martin K. Petersen @ 2019-04-25 12:03 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Vlastimil Babka, lsf-pc, Linux-FSDevel, linux-mm, linux-block,
	Michal Hocko, Christoph Lameter, David Rientjes, Pekka Enberg,
	Joonsoo Kim, Ming Lei, linux-xfs, Christoph Hellwig,
	Dave Chinner, Darrick J . Wong


Matthew,

> Do we have an lsf-discuss mailing list this year?  Might be good to
> coordinate arrivals / departures for taxi sharing purposes.

lsf@lists.linux-foundation.org

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [LSF/MM TOPIC] guarantee natural alignment for kmalloc()?
  2019-04-25 11:33   ` Matthew Wilcox
  2019-04-25 12:03     ` Martin K. Petersen
@ 2019-04-25 12:03     ` Michal Hocko
  1 sibling, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2019-04-25 12:03 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Vlastimil Babka, lsf-pc, Linux-FSDevel, linux-mm, linux-block,
	Christoph Lameter, David Rientjes, Pekka Enberg, Joonsoo Kim,
	Ming Lei, linux-xfs, Christoph Hellwig, Dave Chinner,
	Darrick J . Wong

On Thu 25-04-19 04:33:59, Matthew Wilcox wrote:
> On Thu, Apr 11, 2019 at 06:28:19AM -0700, Matthew Wilcox wrote:
> > On Thu, Apr 11, 2019 at 02:52:08PM +0200, Vlastimil Babka wrote:
> > > In the session I hope to resolve the question whether this is indeed the
> > > right thing to do for all kmalloc() users, without explicit alignment
> > > requests, and if it's worth the potentially worse
> > > performance/fragmentation it would impose on a hypothetical new slab
> > > implementation for which it wouldn't be optimal to split power-of-two
> > > sized pages into power-of-two-sized objects (or whether there are any
> > > other downsides).
> > 
> > I think this is exactly the kind of discussion that LSFMM is for!  It's
> > really a whole-system question: is Linux better off having the flexibility
> > for allocators to return non-power-of-two aligned memory, or allowing
> > consumers of the kmalloc API to assume that "sufficiently large" memory
> > is naturally aligned?
> 
> This has been scheduled for only the MM track.  I think at least the
> filesystem people should be involved in this discussion since it's for
> their benefit.

Agreed. I have marked it as an MM/IO/FS track; we just haven't added it
to the schedule that way. I still plan to go over all topics again and
consolidate the current (very preliminary) schedule. Thanks for catching
this.

> Do we have an lsf-discuss mailing list this year?  Might be good to
> coordinate arrivals / departures for taxi sharing purposes.

Yes, AFAIK the list should be set up, at the same address as last
year.

-- 
Michal Hocko
SUSE Labs
