* [PATCH] slab.h: Avoid using & for logical and of booleans
@ 2018-11-05 20:40 Bart Van Assche
2018-11-05 21:13 ` Andrew Morton
` (3 more replies)
0 siblings, 4 replies; 29+ messages in thread
From: Bart Van Assche @ 2018-11-05 20:40 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Bart Van Assche, Vlastimil Babka, Mel Gorman,
Christoph Lameter, Roman Gushchin
This patch suppresses the following sparse warning:
./include/linux/slab.h:332:43: warning: dubious: x & !y
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
include/linux/slab.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..97d0599ddb7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
}
/*
--
2.19.1.930.g4563a0d9d0-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 20:40 [PATCH] slab.h: Avoid using & for logical and of booleans Bart Van Assche
@ 2018-11-05 21:13 ` Andrew Morton
2018-11-05 21:48 ` Bart Van Assche
2018-11-06 9:45 ` William Kucharski
2018-11-06 8:40 ` Vlastimil Babka
` (2 subsequent siblings)
3 siblings, 2 replies; 29+ messages in thread
From: Andrew Morton @ 2018-11-05 21:13 UTC (permalink / raw)
To: Bart Van Assche
Cc: linux-kernel, Vlastimil Babka, Mel Gorman, Christoph Lameter,
Roman Gushchin, Pekka Enberg, David Rientjes, Joonsoo Kim,
linux-mm
On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <bvanassche@acm.org> wrote:
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> ...
>
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>
> /*
I suppose so.
That function seems too clever for its own good :(. I wonder if these
branch-avoiding tricks are really worthwhile.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 21:13 ` Andrew Morton
@ 2018-11-05 21:48 ` Bart Van Assche
2018-11-05 22:14 ` Rasmus Villemoes
2018-11-06 9:45 ` William Kucharski
1 sibling, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2018-11-05 21:48 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, Vlastimil Babka, Mel Gorman, Christoph Lameter,
Roman Gushchin, Pekka Enberg, David Rientjes, Joonsoo Kim,
linux-mm
On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
> On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <bvanassche@acm.org> wrote:
>
> > This patch suppresses the following sparse warning:
> >
> > ./include/linux/slab.h:332:43: warning: dubious: x & !y
> >
> > ...
> >
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> > * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > */
> > - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> > }
> >
> > /*
>
> I suppose so.
>
> That function seems too clever for its own good :(. I wonder if these
> branch-avoiding tricks are really worthwhile.
From what I have seen in gcc disassembly it seems to me like gcc uses the
cmov instruction to implement e.g. the ternary operator (?:). So I think none
of the cleverness in kmalloc_type() is really necessary to avoid conditional
branches. I think this function would become much more readable when using a
switch statement or when rewriting it as follows (untested):
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ static const enum kmalloc_cache_type flags_to_type[2][2] = {
+ { 0, KMALLOC_RECLAIM },
+ { KMALLOC_DMA, KMALLOC_DMA },
+ };
+#ifdef CONFIG_ZONE_DMA
+ bool is_dma = !!(flags & __GFP_DMA);
+#endif
+ bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+
+ return flags_to_type[is_dma][is_reclaimable];
}
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 21:48 ` Bart Van Assche
@ 2018-11-05 22:14 ` Rasmus Villemoes
2018-11-05 22:40 ` Bart Van Assche
0 siblings, 1 reply; 29+ messages in thread
From: Rasmus Villemoes @ 2018-11-05 22:14 UTC (permalink / raw)
To: Bart Van Assche, Andrew Morton
Cc: linux-kernel, Vlastimil Babka, Mel Gorman, Christoph Lameter,
Roman Gushchin, Pekka Enberg, David Rientjes, Joonsoo Kim,
linux-mm
On 2018-11-05 22:48, Bart Van Assche wrote:
> On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
>> On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <bvanassche@acm.org> wrote:
>>
>>> This patch suppresses the following sparse warning:
>>>
>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>
>>> ...
>>>
>>> --- a/include/linux/slab.h
>>> +++ b/include/linux/slab.h
>>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>> */
>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>> }
>>>
>>> /*
>>
>> I suppose so.
>>
>> That function seems too clever for its own good :(. I wonder if these
>> branch-avoiding tricks are really worthwhile.
>
> From what I have seen in gcc disassembly it seems to me like gcc uses the
> cmov instruction to implement e.g. the ternary operator (?:). So I think none
> of the cleverness in kmalloc_type() is really necessary to avoid conditional
> branches. I think this function would become much more readable when using a
> switch statement or when rewriting it as follows (untested):
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> -
> -#ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> -#endif
> -
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + static const enum kmalloc_cache_type flags_to_type[2][2] = {
> + { 0, KMALLOC_RECLAIM },
> + { KMALLOC_DMA, KMALLOC_DMA },
> + };
> +#ifdef CONFIG_ZONE_DMA
> + bool is_dma = !!(flags & __GFP_DMA);
> +#endif
> + bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> +
> + return flags_to_type[is_dma][is_reclaimable];
> }
>
Won't that pessimize the cases where gfp is a constant to actually do
the table lookup, and add 16 bytes to every translation unit?
Another option is to add a fake KMALLOC_DMA_RECLAIM so the
kmalloc_caches[] array has size 4, then assign the same dma
kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
dozen pointers in .data), and then just compute kmalloc_type() as
((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
someothershift).
Perhaps one could even shuffle the GFP flags so the two shifts are the same.
Rasmus
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 22:14 ` Rasmus Villemoes
@ 2018-11-05 22:40 ` Bart Van Assche
2018-11-05 22:48 ` Alexander Duyck
0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2018-11-05 22:40 UTC (permalink / raw)
To: Rasmus Villemoes, Andrew Morton
Cc: linux-kernel, Vlastimil Babka, Mel Gorman, Christoph Lameter,
Roman Gushchin, Pekka Enberg, David Rientjes, Joonsoo Kim,
linux-mm
On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> Won't that pessimize the cases where gfp is a constant to actually do
> the table lookup, and add 16 bytes to every translation unit?
>
> Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> kmalloc_caches[] array has size 4, then assign the same dma
> kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> dozen pointers in .data), and then just compute kmalloc_type() as
>
> ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> someothershift).
>
> Perhaps one could even shuffle the GFP flags so the two shifts are the same.
How about this version, still untested? My compiler is able to evaluate
the switch expression if the argument is constant.
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
+ unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
+ dr |= !!(flags & __GFP_DMA) << 1;
#endif
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ switch (dr) {
+ default:
+ case 0:
+ return 0;
+ case 1:
+ return KMALLOC_RECLAIM;
+ case 2:
+ case 3:
+ return KMALLOC_DMA;
+ }
}
Bart.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 22:40 ` Bart Van Assche
@ 2018-11-05 22:48 ` Alexander Duyck
2018-11-06 0:01 ` Bart Van Assche
0 siblings, 1 reply; 29+ messages in thread
From: Alexander Duyck @ 2018-11-05 22:48 UTC (permalink / raw)
To: bvanassche
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> > Won't that pessimize the cases where gfp is a constant to actually do
> > the table lookup, and add 16 bytes to every translation unit?
> >
> > Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> > kmalloc_caches[] array has size 4, then assign the same dma
> > kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> > dozen pointers in .data), and then just compute kmalloc_type() as
> >
> > ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> > someothershift).
> >
> > Perhaps one could even shuffle the GFP flags so the two shifts are the same.
>
> How about this version, still untested? My compiler is able to evaluate
> the switch expression if the argument is constant.
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> + unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
>
> #ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> + dr |= !!(flags & __GFP_DMA) << 1;
> #endif
>
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + switch (dr) {
> + default:
> + case 0:
> + return 0;
> + case 1:
> + return KMALLOC_RECLAIM;
> + case 2:
> + case 3:
> + return KMALLOC_DMA;
> + }
> }
>
> Bart.
Doesn't this defeat the whole point of the code which I thought was to
avoid conditional jumps and branches? Also why would you bother with
the "dr" value when you could just mask the flags value and switch on
that directly?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 22:48 ` Alexander Duyck
@ 2018-11-06 0:01 ` Bart Van Assche
2018-11-06 0:11 ` Alexander Duyck
0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2018-11-06 0:01 UTC (permalink / raw)
To: Alexander Duyck
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <bvanassche@acm.org> wrote:
> > How about this version, still untested? My compiler is able to evaluate
> > the switch expression if the argument is constant.
> >
> > static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > {
> > - int is_dma = 0;
> > - int type_dma = 0;
> > - int is_reclaimable;
> > + unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> >
> > #ifdef CONFIG_ZONE_DMA
> > - is_dma = !!(flags & __GFP_DMA);
> > - type_dma = is_dma * KMALLOC_DMA;
> > + dr |= !!(flags & __GFP_DMA) << 1;
> > #endif
> >
> > - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > -
> > /*
> > * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> > * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > */
> > - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > + switch (dr) {
> > + default:
> > + case 0:
> > + return 0;
> > + case 1:
> > + return KMALLOC_RECLAIM;
> > + case 2:
> > + case 3:
> > + return KMALLOC_DMA;
> > + }
> > }
>
> Doesn't this defeat the whole point of the code which I thought was to
> avoid conditional jumps and branches? Also why would you bother with
> the "dr" value when you could just mask the flags value and switch on
> that directly?
Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
bit selection expressions have to be repeated and allows to use a switch
statement instead of multiple if / else statements.
Most kmalloc() calls pass a constant to the gfp argument. That allows the
compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
and branches only appear when the gfp argument is not a constant. What makes
you think it is important to optimize for that case?
Bart.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 0:01 ` Bart Van Assche
@ 2018-11-06 0:11 ` Alexander Duyck
2018-11-06 0:32 ` Bart Van Assche
0 siblings, 1 reply; 29+ messages in thread
From: Alexander Duyck @ 2018-11-06 0:11 UTC (permalink / raw)
To: bvanassche
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Mon, Nov 5, 2018 at 4:01 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <bvanassche@acm.org> wrote:
> > > How about this version, still untested? My compiler is able to evaluate
> > > the switch expression if the argument is constant.
> > >
> > > static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > > {
> > > - int is_dma = 0;
> > > - int type_dma = 0;
> > > - int is_reclaimable;
> > > + unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > >
> > > #ifdef CONFIG_ZONE_DMA
> > > - is_dma = !!(flags & __GFP_DMA);
> > > - type_dma = is_dma * KMALLOC_DMA;
> > > + dr |= !!(flags & __GFP_DMA) << 1;
> > > #endif
> > >
> > > - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > > -
> > > /*
> > > * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> > > * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > > */
> > > - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > > + switch (dr) {
> > > + default:
> > > + case 0:
> > > + return 0;
> > > + case 1:
> > > + return KMALLOC_RECLAIM;
> > > + case 2:
> > > + case 3:
> > > + return KMALLOC_DMA;
> > > + }
> > > }
> >
> > Doesn't this defeat the whole point of the code which I thought was to
> > avoid conditional jumps and branches? Also why would you bother with
> > the "dr" value when you could just mask the flags value and switch on
> > that directly?
>
> Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
> bit selection expressions have to be repeated and allows to use a switch
> statement instead of multiple if / else statements.
Really they shouldn't have to be repeated. You essentially have just 3
cases. 0, __GFP_RECLAIMABLE, and the default case.
> Most kmalloc() calls pass a constant to the gfp argument. That allows the
> compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
> and branches only appear when the gfp argument is not a constant. What makes
> you think it is important to optimize for that case?
>
> Bart.
I didn't really think it was all that important to optimize, but I
thought that was what you were trying to maintain with the earlier
patch since it was converting things to a table lookup.
If we really don't care then why even bother with the switch statement
anyway? It seems like you could just do one ternary operator and be
done with it. Basically all you need is:
return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
(flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
Why bother with all the extra complexity of the switch statement?
Thanks.
- Alex
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 0:11 ` Alexander Duyck
@ 2018-11-06 0:32 ` Bart Van Assche
2018-11-06 17:20 ` Alexander Duyck
0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2018-11-06 0:32 UTC (permalink / raw)
To: Alexander Duyck
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> If we really don't care then why even bother with the switch statement
> anyway? It seems like you could just do one ternary operator and be
> done with it. Basically all you need is:
> return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
>
> Why bother with all the extra complexity of the switch statement?
I don't think that defined() can be used in a C expression. Hence the
IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
test your patch, post that patch and cc me then I will add my Reviewed-by.
Bart.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 20:40 [PATCH] slab.h: Avoid using & for logical and of booleans Bart Van Assche
2018-11-05 21:13 ` Andrew Morton
@ 2018-11-06 8:40 ` Vlastimil Babka
2018-11-06 10:08 ` David Laight
2018-11-19 11:04 ` Pavel Machek
3 siblings, 0 replies; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-06 8:40 UTC (permalink / raw)
To: Bart Van Assche, Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
On 11/5/18 9:40 PM, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Roman Gushchin <guro@fb.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Thanks.
Acked-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>
> /*
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 21:13 ` Andrew Morton
2018-11-05 21:48 ` Bart Van Assche
@ 2018-11-06 9:45 ` William Kucharski
1 sibling, 0 replies; 29+ messages in thread
From: William Kucharski @ 2018-11-06 9:45 UTC (permalink / raw)
To: Andrew Morton
Cc: Bart Van Assche, linux-kernel, Vlastimil Babka, Mel Gorman,
Christoph Lameter, Roman Gushchin, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
> On Nov 5, 2018, at 14:13, Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> On Mon, 5 Nov 2018 12:40:00 -0800 Bart Van Assche <bvanassche@acm.org> wrote:
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>>
>> /*
>
> I suppose so.
>
> That function seems too clever for its own good :(. I wonder if these
> branch-avoiding tricks are really worthwhile.
At the very least I'd like to see some comments added as to why that approach was taken for the sake of future maintainers.
William Kucharski
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 20:40 [PATCH] slab.h: Avoid using & for logical and of booleans Bart Van Assche
2018-11-05 21:13 ` Andrew Morton
2018-11-06 8:40 ` Vlastimil Babka
@ 2018-11-06 10:08 ` David Laight
2018-11-06 10:22 ` Vlastimil Babka
2018-11-19 11:04 ` Pavel Machek
3 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2018-11-06 10:08 UTC (permalink / raw)
To: 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Vlastimil Babka, Mel Gorman, Christoph Lameter,
Roman Gushchin
From: Bart Van Assche
> Sent: 05 November 2018 20:40
>
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Roman Gushchin <guro@fb.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
It is also strange that this code is trying so hard here to avoid conditional instructions
and then uses several to generate the boolean values in the first place.
OTOH I'd probably write:
int gfp_dma = 0;
#ifdef CONFIG_ZONE_DMA
gfp_dma = __GFP_DMA;
#endif
return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : 0;
That might generate cmovs, but is may be better to put unlikely() around both
conditional expressions. Or redo as:
return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 10:08 ` David Laight
@ 2018-11-06 10:22 ` Vlastimil Babka
2018-11-06 11:07 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-06 10:22 UTC (permalink / raw)
To: David Laight, 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
On 11/6/18 11:08 AM, David Laight wrote:
> From: Bart Van Assche
>> Sent: 05 November 2018 20:40
>>
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
BTW, I wonder why the warnings appeared only now, after maybe months in
linux-next. Don't the various automated testing bots run sparse also on
linux-next?
>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Roman Gushchin <guro@fb.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> include/linux/slab.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>> */
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>
> ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
>
> It is also strange that this code is trying so hard here to avoid conditional instructions
I primarily wanted to avoid branches in a hot path, not cmovs. Note
those are also not "free" (latency-wise) if the result of cmov is
immediately used for further computation.
> and then uses several to generate the boolean values in the first place.
I'm not sure where exactly?
> OTOH I'd probably write:
> int gfp_dma = 0;
>
> #ifdef CONFIG_ZONE_DMA
> gfp_dma = __GFP_DMA;
> #endif
>
> return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : 0;
I'm not opposed to this. Christoph might :)
>
> That might generate cmovs, but is may be better to put unlikely() around both
> conditional expressions. Or redo as:
>
> return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
I guess it should be structured so that the fast path is for gfp without
both __GFP_DMA and __GFP_RECLAIMABLE, with a single test+branch. IIRC
that's what Christoph originally requested.
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 10:22 ` Vlastimil Babka
@ 2018-11-06 11:07 ` David Laight
2018-11-06 12:51 ` Vlastimil Babka
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2018-11-06 11:07 UTC (permalink / raw)
To: 'Vlastimil Babka', 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
From: Vlastimil Babka [mailto:vbabka@suse.cz]
> Sent: 06 November 2018 10:22
...
> >> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >
> > ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
> >
> > It is also strange that this code is trying so hard here to avoid conditional instructions
I've done some experiments, compiled with gcc 4.7.3 and -O2
The constants match those from the kernel headers.
It is noticable that there isn't a cmov in sight.
The code would also be better if the KMALLOC constants matched the GFP ones.
#define __GFP_DMA 1u
#define __GFP_RECLAIM 0x10u
#define KMALLOC_DMA 2
#define KMALLOC_RECLAIM 1
unsigned int f(unsigned int flags)
{
return flags & __GFP_DMA ? KMALLOC_DMA : flags & __GFP_RECLAIM ? KMALLOC_RECLAIM : 0;
}
unsigned int f1(unsigned int flags)
{
return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
}
unsigned int f2(unsigned int flags)
{
int is_dma, type_dma, is_rec;
is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);
return type_dma + (is_rec & !is_dma) * KMALLOC_RECLAIM;
}
unsigned int f3(unsigned int flags)
{
int is_dma, type_dma, is_rec;
is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);
return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
}
Disassembly of section .text:
0000000000000000 <f>:
0: 40 f6 c7 01 test $0x1,%dil
4: b8 02 00 00 00 mov $0x2,%eax
9: 74 05 je 10 <f+0x10>
b: f3 c3 repz retq
d: 0f 1f 00 nopl (%rax)
10: 89 f8 mov %edi,%eax
12: c1 e8 04 shr $0x4,%eax
15: 83 e0 01 and $0x1,%eax
18: c3 retq
This one has a misprediced branch in the common path.
0000000000000020 <f1>:
20: 40 f6 c7 11 test $0x11,%dil
24: 75 03 jne 29 <f1+0x9>
26: 31 c0 xor %eax,%eax
28: c3 retq
29: 83 e7 01 and $0x1,%edi
2c: 83 ff 01 cmp $0x1,%edi
2f: 19 c0 sbb %eax,%eax
31: 83 c0 02 add $0x2,%eax
34: c3 retq
The jne will be predicted not taken and the retq predicted.
So this might only be 1 clock in the normal case.
0000000000000040 <f2>:
40: 89 f8 mov %edi,%eax
42: c1 ef 04 shr $0x4,%edi
45: 83 e0 01 and $0x1,%eax
48: 89 c2 mov %eax,%edx
4a: 83 f2 01 xor $0x1,%edx
4d: 21 d7 and %edx,%edi
4f: 8d 04 47 lea (%rdi,%rax,2),%eax
52: c3 retq
No conditionals, but a 4 deep dependency chain.
0000000000000060 <f3>:
60: 89 fa mov %edi,%edx
62: c1 ef 04 shr $0x4,%edi
65: 83 e2 01 and $0x1,%edx
68: 83 e7 01 and $0x1,%edi
6b: 89 d0 mov %edx,%eax
6d: 83 f0 01 xor $0x1,%eax
70: 0f af c7 imul %edi,%eax
73: 8d 04 50 lea (%rax,%rdx,2),%eax
76: c3 retq
You really don't want the multiply.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 11:07 ` David Laight
@ 2018-11-06 12:51 ` Vlastimil Babka
2018-11-07 10:41 ` David Laight
0 siblings, 1 reply; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-06 12:51 UTC (permalink / raw)
To: David Laight, 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
On 11/6/18 12:07 PM, David Laight wrote:
> From: Vlastimil Babka [mailto:vbabka@suse.cz]
>> Sent: 06 November 2018 10:22
> ...
>>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>
>>> ISTM that changing is_dma and is_reclaimable from int to bool will stop the bleating.
>>>
>>> It is also strange that this code is trying so hard here to avoid conditional instructions
>
> I've done some experiments, compiled with gcc 4.7.3 and -O2
> The constants match those from the kernel headers.
>
> It is noticable that there isn't a cmov in sight.
There is with newer gcc: https://godbolt.org/z/qFdByQ
But even that didn't remove the imul in f3() so that's indeed a bust.
> The code would also be better if the KMALLOC constants matched the GFP ones.
That would be hard, as __GFP flags have also other constraints
(especially __GFP_DMA relative to other zone restricting __GFP flags)
and KMALLOC_* are used as array index.
> unsigned int f1(unsigned int flags)
> {
> return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> }
>
...
> 0000000000000020 <f1>:
> 20: 40 f6 c7 11 test $0x11,%dil
> 24: 75 03 jne 29 <f1+0x9>
> 26: 31 c0 xor %eax,%eax
> 28: c3 retq
> 29: 83 e7 01 and $0x1,%edi
> 2c: 83 ff 01 cmp $0x1,%edi
> 2f: 19 c0 sbb %eax,%eax
> 31: 83 c0 02 add $0x2,%eax
> 34: c3 retq
>
> The jne will be predicted not taken and the retq predicted.
> So this might only be 1 clock in the normal case.
I think this is the winner. It's also a single branch and not two,
because the compiler could figure out some of the "clever arithmetics"
itself. Care to send a full patch?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 0:32 ` Bart Van Assche
@ 2018-11-06 17:20 ` Alexander Duyck
2018-11-06 17:48 ` Bart Van Assche
0 siblings, 1 reply; 29+ messages in thread
From: Alexander Duyck @ 2018-11-06 17:20 UTC (permalink / raw)
To: bvanassche
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > If we really don't care then why even bother with the switch statement
> > anyway? It seems like you could just do one ternary operator and be
> > done with it. Basically all you need is:
> > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> >
> > Why bother with all the extra complexity of the switch statement?
>
> I don't think that defined() can be used in a C expression. Hence the
> IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> test your patch, post that patch and cc me then I will add my Reviewed-by.
>
> Bart.
Actually the defined macro is used multiple spots in if statements
throughout the kernel.
The reason for IS_ENABLED is to address the fact that we can be
dealing with macros that indicate if they are built in or a module
since those end up being two different defines depending on if you
select 'y' or 'm'.
Thanks.
- Alex
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 17:20 ` Alexander Duyck
@ 2018-11-06 17:48 ` Bart Van Assche
2018-11-06 18:17 ` Alexander Duyck
0 siblings, 1 reply; 29+ messages in thread
From: Bart Van Assche @ 2018-11-06 17:48 UTC (permalink / raw)
To: Alexander Duyck
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > If we really don't care then why even bother with the switch statement
> > > anyway? It seems like you could just do one ternary operator and be
> > > done with it. Basically all you need is:
> > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > >
> > > Why bother with all the extra complexity of the switch statement?
> >
> > I don't think that defined() can be used in a C expression. Hence the
> > IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> > test your patch, post that patch and cc me then I will add my Reviewed-by.
>
> Actually the defined macro is used multiple spots in if statements
> throughout the kernel.
The only 'if (defined(' matches I found in the kernel tree that are not
preprocessor statements occur in Perl code. Maybe I overlooked something?
> The reason for IS_ENABLED is to address the fact that we can be
> dealing with macros that indicate if they are built in or a module
> since those end up being two different defines depending on if you
> select 'y' or 'm'.
From Documentation/process/coding-style.rst:
Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:
.. code-block:: c
if (IS_ENABLED(CONFIG_SOMETHING)) {
...
}
Bart.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 17:48 ` Bart Van Assche
@ 2018-11-06 18:17 ` Alexander Duyck
0 siblings, 0 replies; 29+ messages in thread
From: Alexander Duyck @ 2018-11-06 18:17 UTC (permalink / raw)
To: bvanassche
Cc: linux, Andrew Morton, LKML, Vlastimil Babka, Mel Gorman,
Christoph Lameter, guro, Pekka Enberg, David Rientjes,
Joonsoo Kim, linux-mm
On Tue, Nov 6, 2018 at 9:48 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche <bvanassche@acm.org> wrote:
> > >
> > > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > > If we really don't care then why even bother with the switch statement
> > > > anyway? It seems like you could just do one ternary operator and be
> > > > done with it. Basically all you need is:
> > > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > >
> > > > Why bother with all the extra complexity of the switch statement?
> > >
> > > I don't think that defined() can be used in a C expression. Hence the
> > > IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> > > test your patch, post that patch and cc me then I will add my Reviewed-by.
> >
> > Actually the defined macro is used multiple spots in if statements
> > throughout the kernel.
>
> The only 'if (defined(' matches I found in the kernel tree that are not
> preprocessor statements occur in Perl code. Maybe I overlooked something?
You may be right. I think I was thinking of "__is_defined", not "defined".
> > The reason for IS_ENABLED is to address the fact that we can be
> > dealing with macros that indicate if they are built in or a module
> > since those end up being two different defines depending on if you
> > select 'y' or 'm'.
>
> From Documentation/process/coding-style.rst:
>
> Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
> symbol into a C boolean expression, and use it in a normal C conditional:
>
> .. code-block:: c
>
> if (IS_ENABLED(CONFIG_SOMETHING)) {
> ...
> }
>
> Bart.
Right. Part of the reason for suggesting that is that depending on how
you define "CONFIG_SOMETHING" it can actually be defined as
"CONFIG_SOMETHING" or "CONFIG_SOMETHING_MODULE". I was operating
under the assumption that CONFIG_ZONE_DMA wasn't ever going to be
built as a module.
Thanks.
- Alex
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-06 12:51 ` Vlastimil Babka
@ 2018-11-07 10:41 ` David Laight
2018-11-09 8:12 ` Vlastimil Babka
0 siblings, 1 reply; 29+ messages in thread
From: David Laight @ 2018-11-07 10:41 UTC (permalink / raw)
To: 'Vlastimil Babka', 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
From: Vlastimil Babka
> Sent: 06 November 2018 12:51
>
> On 11/6/18 12:07 PM, David Laight wrote:
> > From: Vlastimil Babka [mailto:vbabka@suse.cz]
> >> Sent: 06 November 2018 10:22
> > ...
> >>>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >>>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
...
> > I've done some experiments, compiled with gcc 4.7.3 and -O2
> > The constants match those from the kernel headers.
> >
> > It is noticable that there isn't a cmov in sight.
>
> There is with newer gcc: https://godbolt.org/z/qFdByQ
>
> But even that didn't remove the imul in f3() so that's indeed a bust.
>
> > The code would also be better if the KMALLOC constants matched the GFP ones.
>
> That would be hard, as __GFP flags have also other constraints
> (especially __GFP_DMA relative to other zone restricting __GFP flags)
> and KMALLOC_* are used as array index.
Hmmm...
With only 2 or three values conditionals are probably better than
table lookups - especially if they are function pointers.
>
> > unsigned int f1(unsigned int flags)
> > {
> > return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : flags & __GFP_DMA ?
> KMALLOC_DMA : KMALLOC_RECLAIM;
> > }
> >
>
> ...
>
> > 0000000000000020 <f1>:
> > 20: 40 f6 c7 11 test $0x11,%dil
> > 24: 75 03 jne 29 <f1+0x9>
> > 26: 31 c0 xor %eax,%eax
> > 28: c3 retq
> > 29: 83 e7 01 and $0x1,%edi
> > 2c: 83 ff 01 cmp $0x1,%edi
> > 2f: 19 c0 sbb %eax,%eax
> > 31: 83 c0 02 add $0x2,%eax
> > 34: c3 retq
> >
> > The jne will be predicted not taken and the retq predicted.
> > So this might only be 1 clock in the normal case.
>
> I think this is the winner. It's also a single branch and not two,
> because the compiler could figure out some of the "clever arithmetics"
> itself. Care to send a full patch?
I've not got a suitable source tree lurking.
So someone else would need to do it.
I'll waive any copyright that could plausibly be assigned to the above!
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-07 10:41 ` David Laight
@ 2018-11-09 8:12 ` Vlastimil Babka
2018-11-09 19:00 ` Andrew Morton
0 siblings, 1 reply; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-09 8:12 UTC (permalink / raw)
To: David Laight, 'Bart Van Assche', Andrew Morton
Cc: linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli
On 11/7/18 11:41 AM, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 06 November 2018 12:51
>>
>> On 11/6/18 12:07 PM, David Laight wrote:
>>> From: Vlastimil Babka [mailto:vbabka@suse.cz]
>>> 0000000000000020 <f1>:
>>> 20: 40 f6 c7 11 test $0x11,%dil
>>> 24: 75 03 jne 29 <f1+0x9>
>>> 26: 31 c0 xor %eax,%eax
>>> 28: c3 retq
>>> 29: 83 e7 01 and $0x1,%edi
>>> 2c: 83 ff 01 cmp $0x1,%edi
>>> 2f: 19 c0 sbb %eax,%eax
>>> 31: 83 c0 02 add $0x2,%eax
>>> 34: c3 retq
>>>
>>> The jne will be predicted not taken and the retq predicted.
>>> So this might only be 1 clock in the normal case.
>>
>> I think this is the winner. It's also a single branch and not two,
>> because the compiler could figure out some of the "clever arithmetics"
>> itself. Care to send a full patch?
>
> I've not got a suitable source tree lurking.
> So someone else would need to do it.
> I'll waive any copyright that could plausibly be assigned to the above!
There we go. This is to replace the current fix by Bart (sorry) which seems
to add an extra IMUL. Apparently current mainline is spamming anyone running
sparse with lots of warning, so it should be merged soon.
----8<----
From ddd2fc6fcba425733f8320413a1451410687c9c3 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()
Multiple people have reported the following sparse warning:
./include/linux/slab.h:332:43: warning: dubious: x & !y
The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.
The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.
The result should be more readable, without a sparse warning and probably also
faster for the common case.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..18c6920c2803 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
#ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
+#else
+ KMALLOC_DMA = KMALLOC_NORMAL,
#endif
NR_KMALLOC_TYPES
};
@@ -314,22 +316,20 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
+ int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;
/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
}
/*
--
2.19.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-09 8:12 ` Vlastimil Babka
@ 2018-11-09 19:00 ` Andrew Morton
2018-11-09 19:16 ` Vlastimil Babka
0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2018-11-09 19:00 UTC (permalink / raw)
To: Vlastimil Babka
Cc: David Laight, 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli
On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> Multiple people have reported the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> The minimal fix would be to change the logical & to boolean &&, which emits the
> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> not worthwile. David Laight provided a nice comparison of disassembly of
> multiple variants, which shows that the current version produces a 4 deep
> dependency chain, and fixing the sparse warning by changing logical and to
> multiplication emits an IMUL, making it even more expensive.
>
> The code as rewritten by this patch yielded the best disassembly, with a single
> predictable branch for the most common case, and a ternary operator for the
> rest, which gcc seems to compile without a branch or cmov by itself.
>
> The result should be more readable, without a sparse warning and probably also
> faster for the common case.
>
> Reported-by: Bart Van Assche <bvanassche@acm.org>
> Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> Suggested-by: David Laight <David.Laight@ACULAB.COM>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/slab.h | 24 ++++++++++++------------
> 1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..18c6920c2803 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> KMALLOC_RECLAIM,
> #ifdef CONFIG_ZONE_DMA
> KMALLOC_DMA,
> +#else
> + KMALLOC_DMA = KMALLOC_NORMAL,
> #endif
> NR_KMALLOC_TYPES
> };
I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
cause NR_KMALLOC_TYPES to have value 1.
enum foo {
a = 0,
b,
c = 0,
d
}
main()
{
printf("%d %d %d %d\n", a, b, c, d);
}
akpm3> ./a.out
0 1 0 1
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-09 19:00 ` Andrew Morton
@ 2018-11-09 19:16 ` Vlastimil Babka
2018-11-09 19:47 ` Darryl T. Agostinelli
2018-11-12 9:55 ` David Laight
0 siblings, 2 replies; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-09 19:16 UTC (permalink / raw)
To: Andrew Morton
Cc: David Laight, 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli
On 11/9/18 8:00 PM, Andrew Morton wrote:
> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
>
>> Multiple people have reported the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> The minimal fix would be to change the logical & to boolean &&, which emits the
>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>> not worthwile. David Laight provided a nice comparison of disassembly of
>> multiple variants, which shows that the current version produces a 4 deep
>> dependency chain, and fixing the sparse warning by changing logical and to
>> multiplication emits an IMUL, making it even more expensive.
>>
>> The code as rewritten by this patch yielded the best disassembly, with a single
>> predictable branch for the most common case, and a ternary operator for the
>> rest, which gcc seems to compile without a branch or cmov by itself.
>>
>> The result should be more readable, without a sparse warning and probably also
>> faster for the common case.
>>
>> Reported-by: Bart Van Assche <bvanassche@acm.org>
>> Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
>> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
>> Suggested-by: David Laight <David.Laight@ACULAB.COM>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>> include/linux/slab.h | 24 ++++++++++++------------
>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..18c6920c2803 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>> KMALLOC_RECLAIM,
>> #ifdef CONFIG_ZONE_DMA
>> KMALLOC_DMA,
>> +#else
>> + KMALLOC_DMA = KMALLOC_NORMAL,
>> #endif
>> NR_KMALLOC_TYPES
>> };
>
> I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
> cause NR_KMALLOC_TYPES to have value 1.
Doh, right! Thanks for catching this.
This? Not terribly elegant, but I don't see a nicer way right now...
----8<----
From 40b84707e1b5aeccff11bd5f0563bb938e2c22d6 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()
Multiple people have reported the following sparse warning:
./include/linux/slab.h:332:43: warning: dubious: x & !y
The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.
The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.
The result should be more readable, without a sparse warning and probably also
faster for the common case.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/slab.h | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..ca113d4ecc6f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,24 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
+ int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
-#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;
/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+#ifdef CONFIG_ZONE_DMA
+ return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return KMALLOC_RECLAIM;
+#endif
}
/*
--
2.19.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-09 19:16 ` Vlastimil Babka
@ 2018-11-09 19:47 ` Darryl T. Agostinelli
2018-11-09 21:31 ` Vlastimil Babka
2018-11-12 9:55 ` David Laight
1 sibling, 1 reply; 29+ messages in thread
From: Darryl T. Agostinelli @ 2018-11-09 19:47 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, David Laight, 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
> On 11/9/18 8:00 PM, Andrew Morton wrote:
> > On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> >> Multiple people have reported the following sparse warning:
> >>
> >> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> >>
> >> The minimal fix would be to change the logical & to boolean &&, which emits the
> >> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> >> not worthwile. David Laight provided a nice comparison of disassembly of
> >> multiple variants, which shows that the current version produces a 4 deep
> >> dependency chain, and fixing the sparse warning by changing logical and to
> >> multiplication emits an IMUL, making it even more expensive.
> >>
> >> The code as rewritten by this patch yielded the best disassembly, with a single
> >> predictable branch for the most common case, and a ternary operator for the
> >> rest, which gcc seems to compile without a branch or cmov by itself.
> >>
> >> The result should be more readable, without a sparse warning and probably also
> >> faster for the common case.
> >>
> >> Reported-by: Bart Van Assche <bvanassche@acm.org>
> >> Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
> >> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
> >> Suggested-by: David Laight <David.Laight@ACULAB.COM>
> >> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >> ---
> >> include/linux/slab.h | 24 ++++++++++++------------
> >> 1 file changed, 12 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> index 918f374e7156..18c6920c2803 100644
> >> --- a/include/linux/slab.h
> >> +++ b/include/linux/slab.h
> >> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> >> KMALLOC_RECLAIM,
> >> #ifdef CONFIG_ZONE_DMA
> >> KMALLOC_DMA,
> >> +#else
> >> + KMALLOC_DMA = KMALLOC_NORMAL,
> >> #endif
> >> NR_KMALLOC_TYPES
> >> };
> >
> > I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
> > cause NR_KMALLOC_TYPES to have value 1.
>
> Doh, right! Thanks for catching this.
>
> This? Not terribly elegant, but I don't see a nicer way right now...
>
How about the solution I proposed yesterday?
https://lkml.org/lkml/2018/11/9/750
It doesn't involve any tricks.
As it is, this sparse warning is begging for a trick. Let's not
oblidge it to much.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-09 19:47 ` Darryl T. Agostinelli
@ 2018-11-09 21:31 ` Vlastimil Babka
0 siblings, 0 replies; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-09 21:31 UTC (permalink / raw)
To: Darryl T. Agostinelli
Cc: Andrew Morton, David Laight, 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin
On 11/9/18 8:47 PM, Darryl T. Agostinelli wrote:
> On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
>> On 11/9/18 8:00 PM, Andrew Morton wrote:
>>> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
>>>
>>>> Multiple people have reported the following sparse warning:
>>>>
>>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>>
>>>> The minimal fix would be to change the logical & to boolean &&, which emits the
>>>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>>>> not worthwile. David Laight provided a nice comparison of disassembly of
>>>> multiple variants, which shows that the current version produces a 4 deep
>>>> dependency chain, and fixing the sparse warning by changing logical and to
>>>> multiplication emits an IMUL, making it even more expensive.
>>>>
>>>> The code as rewritten by this patch yielded the best disassembly, with a single
>>>> predictable branch for the most common case, and a ternary operator for the
>>>> rest, which gcc seems to compile without a branch or cmov by itself.
>>>>
>>>> The result should be more readable, without a sparse warning and probably also
>>>> faster for the common case.
>>>>
>>>> Reported-by: Bart Van Assche <bvanassche@acm.org>
>>>> Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
>>>> Suggested-by: Andrew Morton <akpm@linux-foundation.org>
>>>> Suggested-by: David Laight <David.Laight@ACULAB.COM>
>>>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>>>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>>>> ---
>>>> include/linux/slab.h | 24 ++++++++++++------------
>>>> 1 file changed, 12 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>>>> index 918f374e7156..18c6920c2803 100644
>>>> --- a/include/linux/slab.h
>>>> +++ b/include/linux/slab.h
>>>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>>>> KMALLOC_RECLAIM,
>>>> #ifdef CONFIG_ZONE_DMA
>>>> KMALLOC_DMA,
>>>> +#else
>>>> + KMALLOC_DMA = KMALLOC_NORMAL,
>>>> #endif
>>>> NR_KMALLOC_TYPES
>>>> };
>>>
>>> I don't think this works correctly. Resetting KMALLOC_DMA to 0 will
>>> cause NR_KMALLOC_TYPES to have value 1.
>>
>> Doh, right! Thanks for catching this.
>>
>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>
> How about the solution I proposed yesterday?
>
> https://lkml.org/lkml/2018/11/9/750
>
> It doesn't involve any tricks.
It doesn't remove the "trick" that calculates return value as a sum of
booleans multiplying constants. The patch converts one part of the
expression of those booleans to a ternary operator. I think the result
is even harder to follow and meanwhile Andrew's suggestion was to remove
all the tricks.
> As it is, this sparse warning is begging for a trick. Let's not
> oblidge it to much.
The sparse warning could be silenced just by changing '&' to '&&' which
would emit the same code. But we decided to untrick the code.
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-09 19:16 ` Vlastimil Babka
2018-11-09 19:47 ` Darryl T. Agostinelli
@ 2018-11-12 9:55 ` David Laight
2018-11-13 18:22 ` Vlastimil Babka
1 sibling, 1 reply; 29+ messages in thread
From: David Laight @ 2018-11-12 9:55 UTC (permalink / raw)
To: 'Vlastimil Babka', Andrew Morton
Cc: 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli
From: Vlastimil Babka [mailto:vbabka@suse.cz]
> Sent: 09 November 2018 19:16
...
> This? Not terribly elegant, but I don't see a nicer way right now...
Maybe just have two copies of the function body?
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifndef CONFIG_ZONE_DMA
return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#else
if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
return KMALLOC_NORMAL;
return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#endif
}
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-12 9:55 ` David Laight
@ 2018-11-13 18:22 ` Vlastimil Babka
2018-11-21 13:22 ` Vlastimil Babka
0 siblings, 1 reply; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-13 18:22 UTC (permalink / raw)
To: David Laight, Andrew Morton
Cc: 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli
On 11/12/18 10:55 AM, David Laight wrote:
> From: Vlastimil Babka [mailto:vbabka@suse.cz]
>> Sent: 09 November 2018 19:16
> ...
>> This? Not terribly elegant, but I don't see a nicer way right now...
>
> Maybe just have two copies of the function body?
>
> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> #ifndef CONFIG_ZONE_DMA
> return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
> #else
> if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
> return KMALLOC_NORMAL;
> return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> #endif
> }
OK that's probably the most straightforward to follow, thanks.
Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
or cmovs or whatnot.
----8<----
From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()
Multiple people have reported the following sparse warning:
./include/linux/slab.h:332:43: warning: dubious: x & !y
The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.
The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.
The result should be more readable, without a sparse warning and probably also
faster for the common case.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;
/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
}
/*
--
2.19.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-05 20:40 [PATCH] slab.h: Avoid using & for logical and of booleans Bart Van Assche
` (2 preceding siblings ...)
2018-11-06 10:08 ` David Laight
@ 2018-11-19 11:04 ` Pavel Machek
2018-11-19 12:51 ` Vlastimil Babka
3 siblings, 1 reply; 29+ messages in thread
From: Pavel Machek @ 2018-11-19 11:04 UTC (permalink / raw)
To: Bart Van Assche
Cc: Andrew Morton, linux-kernel, Vlastimil Babka, Mel Gorman,
Christoph Lameter, Roman Gushchin
[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]
On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
>
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Roman Gushchin <guro@fb.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> include/linux/slab.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> */
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> }
>
What is wrong with && ? If logical and is better done by multiply,
that's compiler job, and compiler should be fixed to do it...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-19 11:04 ` Pavel Machek
@ 2018-11-19 12:51 ` Vlastimil Babka
0 siblings, 0 replies; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-19 12:51 UTC (permalink / raw)
To: Pavel Machek, Bart Van Assche
Cc: Andrew Morton, linux-kernel, Mel Gorman, Christoph Lameter,
Roman Gushchin
On 11/19/18 12:04 PM, Pavel Machek wrote:
> On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Christoph Lameter <cl@linux.com>
>> Cc: Roman Gushchin <guro@fb.com>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>> include/linux/slab.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>> * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>> */
>> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>>
>
> What is wrong with && ?
Nothing, it would work and generate the same assembly as '&'. But Andrew
noted that this code is probably too clever for its own good, and he has
a point. The single predictable branch is also likely faster than the
chain of arithmetic calculations anyway. Nobody has actually measured
it, so I'd go with the easier-to-read variant.
> If logical and is better done by multiply,
> that's compiler job, and compiler should be fixed to do it...
Multiply was just another way (equivalent to '&&' semantically) to shut
up sparse warning. But gcc actually emits IMUL in that case, which is
wasteful, so yeah there's a bug report now:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87954
>
> Pavel
>
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] slab.h: Avoid using & for logical and of booleans
2018-11-13 18:22 ` Vlastimil Babka
@ 2018-11-21 13:22 ` Vlastimil Babka
0 siblings, 0 replies; 29+ messages in thread
From: Vlastimil Babka @ 2018-11-21 13:22 UTC (permalink / raw)
To: David Laight, Andrew Morton
Cc: 'Bart Van Assche',
linux-kernel, Mel Gorman, Christoph Lameter, Roman Gushchin,
Darryl T. Agostinelli, linux-mm, Masahiro Yamada, Dan Carpenter
On 11/13/18 7:22 PM, Vlastimil Babka wrote:
> On 11/12/18 10:55 AM, David Laight wrote:
>> From: Vlastimil Babka [mailto:vbabka@suse.cz]
>>> Sent: 09 November 2018 19:16
>> ...
>>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>> Maybe just have two copies of the function body?
>>
>> static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> {
>> #ifndef CONFIG_ZONE_DMA
>> return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
>> #else
>> if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>> return KMALLOC_NORMAL;
>> return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
>> #endif
>> }
>
> OK that's probably the most straightforward to follow, thanks.
> Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
> all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
> or cmovs or whatnot.
Ping? Seems like people will report duplicates until the sparse warning
is gone in mainline...
Also CC linux-mm which was somehow lost.
----8<----
From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()
Multiple people have reported the following sparse warning:
./include/linux/slab.h:332:43: warning: dubious: x & !y
The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.
The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.
The result should be more readable, without a sparse warning and probably also
faster for the common case.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Darryl T. Agostinelli <dagostinelli@gmail.com>
Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
include/linux/slab.h | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
- int is_dma = 0;
- int type_dma = 0;
- int is_reclaimable;
-
#ifdef CONFIG_ZONE_DMA
- is_dma = !!(flags & __GFP_DMA);
- type_dma = is_dma * KMALLOC_DMA;
-#endif
-
- is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+ /*
+ * The most common case is KMALLOC_NORMAL, so test for it
+ * with a single branch for both flags.
+ */
+ if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+ return KMALLOC_NORMAL;
/*
- * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
- * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+ * At least one of the flags has to be set. If both are, __GFP_DMA
+ * is more important.
*/
- return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+ return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+ return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
}
/*
--
2.19.1
^ permalink raw reply related [flat|nested] 29+ messages in thread
end of thread, other threads:[~2018-11-21 13:22 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-05 20:40 [PATCH] slab.h: Avoid using & for logical and of booleans Bart Van Assche
2018-11-05 21:13 ` Andrew Morton
2018-11-05 21:48 ` Bart Van Assche
2018-11-05 22:14 ` Rasmus Villemoes
2018-11-05 22:40 ` Bart Van Assche
2018-11-05 22:48 ` Alexander Duyck
2018-11-06 0:01 ` Bart Van Assche
2018-11-06 0:11 ` Alexander Duyck
2018-11-06 0:32 ` Bart Van Assche
2018-11-06 17:20 ` Alexander Duyck
2018-11-06 17:48 ` Bart Van Assche
2018-11-06 18:17 ` Alexander Duyck
2018-11-06 9:45 ` William Kucharski
2018-11-06 8:40 ` Vlastimil Babka
2018-11-06 10:08 ` David Laight
2018-11-06 10:22 ` Vlastimil Babka
2018-11-06 11:07 ` David Laight
2018-11-06 12:51 ` Vlastimil Babka
2018-11-07 10:41 ` David Laight
2018-11-09 8:12 ` Vlastimil Babka
2018-11-09 19:00 ` Andrew Morton
2018-11-09 19:16 ` Vlastimil Babka
2018-11-09 19:47 ` Darryl T. Agostinelli
2018-11-09 21:31 ` Vlastimil Babka
2018-11-12 9:55 ` David Laight
2018-11-13 18:22 ` Vlastimil Babka
2018-11-21 13:22 ` Vlastimil Babka
2018-11-19 11:04 ` Pavel Machek
2018-11-19 12:51 ` Vlastimil Babka
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).