* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
@ 2014-11-26 18:34 Alexei Starovoitov
2014-11-26 18:41 ` Joe Perches
2014-11-26 20:00 ` Joe Perches
0 siblings, 2 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2014-11-26 18:34 UTC (permalink / raw)
To: Joe Perches
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
> On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
>> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
>
>> > Is there any value in reordering these tests for frequency
>> > or maybe using | instead of || to avoid multiple jumps?
>>
>> probably not. It's not a critical path.
>> compiler may fuse conditions depending on values anyway.
>> If it was a critical path, we could have used
>> (1 << reg) & mask trick.
>> I picked explicit 'return true' else 'return false' here,
>> because it felt easier to read. Just a matter of taste.
>
> There is a size difference though: (allyesconfig)
>
> $ size arch/x86/net/built-in.o*
> text data bss dec hex filename
> 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
> 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
interesting. Compiler obviously thinks that 178 byte increase
with -O2 is the right trade off. Which I agree with :)
If I think dropping 'inline' and using -Os will give bigger savings...
but I suspect 'tinification' folks will compile JIT out anyway...
> ---
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 3f62734..09e2cea 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -135,11 +135,11 @@ static const int reg2hex[] = {
> */
> static inline bool is_ereg(u32 reg)
> {
> - if (reg == BPF_REG_5 || reg == AUX_REG ||
> - (reg >= BPF_REG_7 && reg <= BPF_REG_9))
> - return true;
> - else
> - return false;
> + return (1 << reg) & (BIT(BPF_REG_5) |
> + BIT(AUX_REG) |
> + BIT(BPF_REG_7) |
> + BIT(BPF_REG_8) |
> + BIT(BPF_REG_9));
thanks for giving it a shot :)
That's exactly what I had in mind.
imo it's less readable, but we probably not going
to mess much with this piece of code anyway.
Though to be safe in the future, we'd need to
add BUILD_BUG_ON that largest value (AUX_REG)
fits in 32bit (or 64bit) and add a comment that
verifier goes before the JIT and checks that
insn->src_reg, insn->dst_reg are less than MAX_BPF_REG,
so argument 'reg' also doesn't trigger too large shift.
Perfectionists r us. :)
... or just leave it as-is ;)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 18:34 [PATCH] x86: bpf_jit_comp: simplify trivial boolean return Alexei Starovoitov
@ 2014-11-26 18:41 ` Joe Perches
2014-11-26 20:00 ` Joe Perches
1 sibling, 0 replies; 17+ messages in thread
From: Joe Perches @ 2014-11-26 18:41 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, 2014-11-26 at 10:34 -0800, Alexei Starovoitov wrote:
> On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
> > On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
> >> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> >
> >> > Is there any value in reordering these tests for frequency
> >> > or maybe using | instead of || to avoid multiple jumps?
> >>
> >> probably not. It's not a critical path.
> >> compiler may fuse conditions depending on values anyway.
> >> If it was a critical path, we could have used
> >> (1 << reg) & mask trick.
> >> I picked explicit 'return true' else 'return false' here,
> >> because it felt easier to read. Just a matter of taste.
> >
> > There is a size difference though: (allyesconfig)
> >
> > $ size arch/x86/net/built-in.o*
> > text data bss dec hex filename
> > 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
> > 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
>
> interesting. Compiler obviously thinks that 178 byte increase
> with -O2 is the right trade off. Which I agree with :)
498 overall.
> If I think dropping 'inline' and using -Os will give bigger savings...
> but I suspect 'tinification' folks will compile JIT out anyway...
Smaller is generally better/faster in any case.
> thanks for giving it a shot :)
> That's exactly what I had in mind.
> imo it's less readable, but we probably not going
> to mess much with this piece of code anyway.
> Though to be safe in the future, we'd need to
> add BUILD_BUG_ON that largest value (AUX_REG)
> fits in 32bit (or 64bit) and add a comment that
> verifier goes before the JIT and checks that
> insn->src_reg, insn->dst_reg are less than MAX_BPF_REG,
> so argument 'reg' also doesn't trigger too large shift.
> Perfectionists r us. :)
> ... or just leave it as-is ;)
16 registers max anyway as it's stored
in a :4
No worries, it was just playtime anyway.
cheers, Joe
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 18:34 [PATCH] x86: bpf_jit_comp: simplify trivial boolean return Alexei Starovoitov
2014-11-26 18:41 ` Joe Perches
@ 2014-11-26 20:00 ` Joe Perches
2014-11-27 12:25 ` David Laight
1 sibling, 1 reply; 17+ messages in thread
From: Joe Perches @ 2014-11-26 20:00 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, 2014-11-26 at 10:34 -0800, Alexei Starovoitov wrote:
> On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
> > On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
> >> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> >
> >> > Is there any value in reordering these tests for frequency
> >> > or maybe using | instead of || to avoid multiple jumps?
> >>
> >> probably not. It's not a critical path.
> >> compiler may fuse conditions depending on values anyway.
> >> If it was a critical path, we could have used
> >> (1 << reg) & mask trick.
> >> I picked explicit 'return true' else 'return false' here,
> >> because it felt easier to read. Just a matter of taste.
> >
> > There is a size difference though: (allyesconfig)
> >
> > $ size arch/x86/net/built-in.o*
> > text data bss dec hex filename
> > 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
> > 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
>
> interesting. Compiler obviously thinks that 178 byte increase
> with -O2 is the right trade off. Which I agree with :)
>
> If I think dropping 'inline' and using -Os will give bigger savings...
This was allyesconfig which already uses -Os
Using -O2, there is no difference using inline
or not, but the size delta with the bitmask is
much larger
$ size arch/x86/net/built-in.o* (allyesconfig, but not -Os)
text data bss dec hex filename
13410 820 3624 17854 45be arch/x86/net/built-in.o.new
16130 884 4200 21214 52de arch/x86/net/built-in.o.old
16130 884 4200 21214 52de arch/x86/net/built-in.o.static
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 20:00 ` Joe Perches
@ 2014-11-27 12:25 ` David Laight
2014-11-27 14:36 ` Quentin Lambert
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: David Laight @ 2014-11-27 12:25 UTC (permalink / raw)
To: 'Joe Perches', Alexei Starovoitov
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
From: Joe Perches
> On Wed, 2014-11-26 at 10:34 -0800, Alexei Starovoitov wrote:
> > On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
> > > On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
> > >> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> > >
> > >> > Is there any value in reordering these tests for frequency
> > >> > or maybe using | instead of || to avoid multiple jumps?
> > >>
> > >> probably not. It's not a critical path.
> > >> compiler may fuse conditions depending on values anyway.
> > >> If it was a critical path, we could have used
> > >> (1 << reg) & mask trick.
> > >> I picked explicit 'return true' else 'return false' here,
> > >> because it felt easier to read. Just a matter of taste.
> > >
> > > There is a size difference though: (allyesconfig)
> > >
> > > $ size arch/x86/net/built-in.o*
> > > text data bss dec hex filename
> > > 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
> > > 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
> >
> > interesting. Compiler obviously thinks that 178 byte increase
> > with -O2 is the right trade off. Which I agree with :)
> >
> > If I think dropping 'inline' and using -Os will give bigger savings...
>
> This was allyesconfig which already uses -Os
>
> Using -O2, there is no difference using inline
> or not, but the size delta with the bitmask is
> much larger
>
> $ size arch/x86/net/built-in.o* (allyesconfig, but not -Os)
> text data bss dec hex filename
> 13410 820 3624 17854 45be arch/x86/net/built-in.o.new
> 16130 884 4200 21214 52de arch/x86/net/built-in.o.old
> 16130 884 4200 21214 52de arch/x86/net/built-in.o.static
That is quite a big % change in the code size.
Why the change in data?
David
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-27 12:25 ` David Laight
@ 2014-11-27 14:36 ` Quentin Lambert
2014-11-27 18:35 ` Joe Perches
2014-11-27 18:49 ` Joe Perches
2 siblings, 0 replies; 17+ messages in thread
From: Quentin Lambert @ 2014-11-27 14:36 UTC (permalink / raw)
To: David Laight, 'Joe Perches', Alexei Starovoitov
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On 27/11/2014 13:25, David Laight wrote:
> From: Joe Perches
>> On Wed, 2014-11-26 at 10:34 -0800, Alexei Starovoitov wrote:
>>> On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
>>>> On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
>>>>> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
>>>>>> Is there any value in reordering these tests for frequency
>>>>>> or maybe using | instead of || to avoid multiple jumps?
>>>>> probably not. It's not a critical path.
>>>>> compiler may fuse conditions depending on values anyway.
>>>>> If it was a critical path, we could have used
>>>>> (1 << reg) & mask trick.
>>>>> I picked explicit 'return true' else 'return false' here,
>>>>> because it felt easier to read. Just a matter of taste.
>>>> There is a size difference though: (allyesconfig)
>>>>
>>>> $ size arch/x86/net/built-in.o*
>>>> text data bss dec hex filename
>>>> 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
>>>> 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
>>> interesting. Compiler obviously thinks that 178 byte increase
>>> with -O2 is the right trade off. Which I agree with :)
>>>
>>> If I think dropping 'inline' and using -Os will give bigger savings...
>> This was allyesconfig which already uses -Os
>>
>> Using -O2, there is no difference using inline
>> or not, but the size delta with the bitmask is
>> much larger
>>
>> $ size arch/x86/net/built-in.o* (allyesconfig, but not -Os)
>> text data bss dec hex filename
>> 13410 820 3624 17854 45be arch/x86/net/built-in.o.new
>> 16130 884 4200 21214 52de arch/x86/net/built-in.o.old
>> 16130 884 4200 21214 52de arch/x86/net/built-in.o.static
> That is quite a big % change in the code size.
> Why the change in data?
>
> David
>
>
>
Do you want me to propose a second version, or should I just
drop it all together ?
I am a new contributor so I have no experience in that sort of thing.
Quentin
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-27 12:25 ` David Laight
2014-11-27 14:36 ` Quentin Lambert
@ 2014-11-27 18:35 ` Joe Perches
2014-11-27 18:49 ` Joe Perches
2 siblings, 0 replies; 17+ messages in thread
From: Joe Perches @ 2014-11-27 18:35 UTC (permalink / raw)
To: David Laight
Cc: Alexei Starovoitov, Quentin Lambert, David S. Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, netdev, linux-kernel
On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
> From: Joe Perches
> > On Wed, 2014-11-26 at 10:34 -0800, Alexei Starovoitov wrote:
> > > On Wed, Nov 26, 2014 at 10:02 AM, Joe Perches <joe@perches.com> wrote:
> > > > On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
> > > >> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> > > >
> > > >> > Is there any value in reordering these tests for frequency
> > > >> > or maybe using | instead of || to avoid multiple jumps?
> > > >>
> > > >> probably not. It's not a critical path.
> > > >> compiler may fuse conditions depending on values anyway.
> > > >> If it was a critical path, we could have used
> > > >> (1 << reg) & mask trick.
> > > >> I picked explicit 'return true' else 'return false' here,
> > > >> because it felt easier to read. Just a matter of taste.
> > > >
> > > > There is a size difference though: (allyesconfig)
> > > >
> > > > $ size arch/x86/net/built-in.o*
> > > > text data bss dec hex filename
> > > > 12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
> > > > 13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
> > >
> > > interesting. Compiler obviously thinks that 178 byte increase
> > > with -O2 is the right trade off. Which I agree with :)
> > >
> > > If I think dropping 'inline' and using -Os will give bigger savings...
> >
> > This was allyesconfig which already uses -Os
> >
> > Using -O2, there is no difference using inline
> > or not, but the size delta with the bitmask is
> > much larger
> >
> > $ size arch/x86/net/built-in.o* (allyesconfig, but not -Os)
> > text data bss dec hex filename
> > 13410 820 3624 17854 45be arch/x86/net/built-in.o.new
> > 16130 884 4200 21214 52de arch/x86/net/built-in.o.old
> > 16130 884 4200 21214 52de arch/x86/net/built-in.o.static
>
> That is quite a big % change in the code size.
> Why the change in data?
$ objdump -t arch/x86/net/bpf_jit_comp.o.new > new
$ objdump -t arch/x86/net/bpf_jit_comp.o.old > old
$ diff -urN old new
--- old 2014-11-27 10:31:36.654373756 -0800
+++ new 2014-11-27 10:31:31.254373453 -0800
@@ -1,5 +1,5 @@
-arch/x86/net/bpf_jit_comp.o.old: file format elf64-x86-64
+arch/x86/net/bpf_jit_comp.o.new: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 bpf_jit_comp.c
@@ -8,28 +8,26 @@
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.unlikely 0000000000000000 .text.unlikely
0000000000000000 l F .text 000000000000001f jit_fill_hole
-0000000000000098 l O .bss 0000000000000008 __gcov0.jit_fill_hole
+0000000000000060 l O .bss 0000000000000008 __gcov0.jit_fill_hole
0000000000000000 l d .rodata.str1.1 0000000000000000 .rodata.str1.1
0000000000000000 l d .rodata.str1.8 0000000000000000 .rodata.str1.8
-0000000000000020 l F .text 00000000000030a2 do_jit
-00000000000000c0 l O .bss 0000000000000b68 __gcov0.do_jit
+0000000000000020 l F .text 000000000000260b do_jit
+0000000000000080 l O .bss 0000000000000970 __gcov0.do_jit
00000000000006e0 l O .rodata 0000000000000034 reg2hex
0000000000000000 l d .rodata 0000000000000000 .rodata
-0000000000000060 l O .bss 0000000000000038 __gcov0.add_2mod
-0000000000000c28 l O .bss 0000000000000008 __gcov0.bpf_jit_compile
-0000000000000c40 l O .bss 00000000000003f0 __gcov0.bpf_int_jit_compile
+00000000000009f0 l O .bss 0000000000000008 __gcov0.bpf_jit_compile
+0000000000000a00 l O .bss 00000000000003f0 __gcov0.bpf_int_jit_compile
0000000000000040 l O .bss 0000000000000020 __gcov0.bpf_jit_dump
-0000000000001040 l O .bss 0000000000000028 __gcov0.bpf_jit_free
+0000000000000e00 l O .bss 0000000000000028 __gcov0.bpf_jit_free
0000000000000010 l O .bss 0000000000000018 __gcov0.bpf_prog_unlock_free
0000000000000000 l d .text.startup 0000000000000000 .text.startup
0000000000000000 l F .text.startup 0000000000000012 _GLOBAL__sub_I_65535_0_bpf_jit_compile
0000000000000000 l d .init_array 0000000000000000 .init_array
-0000000000000340 l O .data 0000000000000028 __gcov_.bpf_jit_free
-0000000000000260 l O .data 0000000000000028 __gcov_.bpf_int_jit_compile
-0000000000000220 l O .data 0000000000000028 __gcov_.bpf_jit_compile
-00000000000001e0 l O .data 0000000000000028 __gcov_.do_jit
-00000000000001a0 l O .data 0000000000000028 __gcov_.jit_fill_hole
-0000000000000160 l O .data 0000000000000028 __gcov_.add_2mod
+0000000000000300 l O .data 0000000000000028 __gcov_.bpf_jit_free
+0000000000000220 l O .data 0000000000000028 __gcov_.bpf_int_jit_compile
+00000000000001e0 l O .data 0000000000000028 __gcov_.bpf_jit_compile
+00000000000001a0 l O .data 0000000000000028 __gcov_.do_jit
+0000000000000160 l O .data 0000000000000028 __gcov_.jit_fill_hole
0000000000000120 l O .data 0000000000000028 __gcov_.bpf_jit_dump
00000000000000e0 l O .data 0000000000000028 __gcov_.bpf_prog_unlock_free
00000000000000a0 l O .data 0000000000000028 __gcov_.__get_order
@@ -43,17 +41,17 @@
0000000000000000 *UND* 0000000000000000 memset
0000000000000000 *UND* 0000000000000000 sk_load_half
0000000000000000 *UND* 0000000000000000 printk
+0000000000000000 *UND* 0000000000000000 sk_load_byte
+0000000000000000 *UND* 0000000000000000 sk_load_word
0000000000000000 *UND* 0000000000000000 sk_load_half_positive_offset
0000000000000000 *UND* 0000000000000000 sk_load_half_negative_offset
-0000000000000000 *UND* 0000000000000000 sk_load_word_positive_offset
-0000000000000000 *UND* 0000000000000000 sk_load_byte
0000000000000000 *UND* 0000000000000000 sk_load_byte_positive_offset
0000000000000000 *UND* 0000000000000000 sk_load_byte_negative_offset
-0000000000000000 *UND* 0000000000000000 sk_load_word
+0000000000000000 *UND* 0000000000000000 sk_load_word_positive_offset
0000000000000000 *UND* 0000000000000000 __bpf_call_base
0000000000000000 *UND* 0000000000000000 sk_load_word_negative_offset
-00000000000030d0 g F .text 0000000000000013 bpf_jit_compile
-00000000000030f0 g F .text 0000000000000352 bpf_int_jit_compile
+0000000000002630 g F .text 0000000000000013 bpf_jit_compile
+0000000000002650 g F .text 0000000000000352 bpf_int_jit_compile
0000000000000000 g O .data..read_mostly 0000000000000004 bpf_jit_enable
0000000000000000 *UND* 0000000000000000 __kmalloc
0000000000000000 *UND* 0000000000000000 bpf_jit_binary_alloc
@@ -62,7 +60,7 @@
0000000000000000 *UND* 0000000000000000 set_memory_ro
0000000000000000 *UND* 0000000000000000 kfree
0000000000000000 *UND* 0000000000000000 bpf_jit_binary_free
-0000000000003450 g F .text 000000000000008c bpf_jit_free
+00000000000029b0 g F .text 000000000000008c bpf_jit_free
0000000000000000 *UND* 0000000000000000 set_memory_rw
0000000000000000 *UND* 0000000000000000 __bpf_prog_free
0000000000000000 *UND* 0000000000000000 __gcov_init
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-27 12:25 ` David Laight
2014-11-27 14:36 ` Quentin Lambert
2014-11-27 18:35 ` Joe Perches
@ 2014-11-27 18:49 ` Joe Perches
2014-12-04 9:26 ` Joe Perches
2 siblings, 1 reply; 17+ messages in thread
From: Joe Perches @ 2014-11-27 18:49 UTC (permalink / raw)
To: David Laight
Cc: Alexei Starovoitov, Quentin Lambert, David S. Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, netdev, linux-kernel
On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
> Why the change in data?
btw: without gcov and using -O2
$ size arch/x86/net/bpf_jit_comp.o*
text data bss dec hex filename
9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.new
10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.old
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-27 18:49 ` Joe Perches
@ 2014-12-04 9:26 ` Joe Perches
2014-12-04 15:56 ` Alexei Starovoitov
0 siblings, 1 reply; 17+ messages in thread
From: Joe Perches @ 2014-12-04 9:26 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: David Laight, Quentin Lambert, David S. Miller, Alexey Kuznetsov,
James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, netdev,
linux-kernel
On Thu, 2014-11-27 at 10:49 -0800, Joe Perches wrote:
> On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
> > Why the change in data?
>
> btw: without gcov and using -O2
>
> $ size arch/x86/net/bpf_jit_comp.o*
> text data bss dec hex filename
> 9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.new
> 10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.old
Alexei?
Is this 10% reduction in size a good reason to change the code?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-12-04 9:26 ` Joe Perches
@ 2014-12-04 15:56 ` Alexei Starovoitov
2014-12-04 18:05 ` Joe Perches
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2014-12-04 15:56 UTC (permalink / raw)
To: Joe Perches
Cc: David Laight, Quentin Lambert, David S. Miller, Alexey Kuznetsov,
James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, netdev,
linux-kernel
On Thu, Dec 4, 2014 at 1:26 AM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2014-11-27 at 10:49 -0800, Joe Perches wrote:
>> On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
>> > Why the change in data?
>>
>> btw: without gcov and using -O2
>>
>> $ size arch/x86/net/bpf_jit_comp.o*
>> text data bss dec hex filename
>> 9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.new
>> 10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.old
>
> Alexei?
>
> Is this 10% reduction in size a good reason to change the code?
yes.
I believe you're seeing it with gcc 4.9. I wanted to double
check what 4.6 and 4.7 are doing. If they're not suddenly
increase code size then resubmit it for inclusion please.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-12-04 15:56 ` Alexei Starovoitov
@ 2014-12-04 18:05 ` Joe Perches
0 siblings, 0 replies; 17+ messages in thread
From: Joe Perches @ 2014-12-04 18:05 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: David Laight, Quentin Lambert, David S. Miller, Alexey Kuznetsov,
James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, netdev,
linux-kernel
On Thu, 2014-12-04 at 07:56 -0800, Alexei Starovoitov wrote:
> On Thu, Dec 4, 2014 at 1:26 AM, Joe Perches <joe@perches.com> wrote:
> > On Thu, 2014-11-27 at 10:49 -0800, Joe Perches wrote:
> >> On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
> >> > Why the change in data?
> >>
> >> btw: without gcov and using -O2
> >>
> >> $ size arch/x86/net/bpf_jit_comp.o*
> >> text data bss dec hex filename
> >> 9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.new
> >> 10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.old
> >
> > Alexei?
> >
> > Is this 10% reduction in size a good reason to change the code?
>
> yes.
> I believe you're seeing it with gcc 4.9. I wanted to double
> check what 4.6 and 4.7 are doing. If they're not suddenly
> increase code size then resubmit it for inclusion please.
I get these sizes for these compilers
(x86-64, -O2, without profiling)
$ size arch/x86/net/bpf_jit_comp.o*
text data bss dec hex filename
9266 4 0 9270 2436 arch/x86/net/bpf_jit_comp.o.4.4.new
10042 4 0 10046 273e arch/x86/net/bpf_jit_comp.o.4.4.old
9109 4 0 9113 2399 arch/x86/net/bpf_jit_comp.o.4.6.new
9717 4 0 9721 25f9 arch/x86/net/bpf_jit_comp.o.4.6.old
8789 4 0 8793 2259 arch/x86/net/bpf_jit_comp.o.4.7.new
10245 4 0 10249 2809 arch/x86/net/bpf_jit_comp.o.4.7.old
9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.4.9.new
10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.4.9.old
I am a bit surprised by the size variations
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
@ 2014-12-04 22:44 Alexei Starovoitov
0 siblings, 0 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2014-12-04 22:44 UTC (permalink / raw)
To: Joe Perches
Cc: David Laight, Quentin Lambert, David S. Miller, Alexey Kuznetsov,
James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, netdev,
linux-kernel
On Thu, Dec 4, 2014 at 10:05 AM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2014-12-04 at 07:56 -0800, Alexei Starovoitov wrote:
>> On Thu, Dec 4, 2014 at 1:26 AM, Joe Perches <joe@perches.com> wrote:
>> > On Thu, 2014-11-27 at 10:49 -0800, Joe Perches wrote:
>> >> On Thu, 2014-11-27 at 12:25 +0000, David Laight wrote:
>> >> > Why the change in data?
>> >>
>> >> btw: without gcov and using -O2
>> >>
>> >> $ size arch/x86/net/bpf_jit_comp.o*
>> >> text data bss dec hex filename
>> >> 9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.new
>> >> 10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.old
>> >
>> > Alexei?
>> >
>> > Is this 10% reduction in size a good reason to change the code?
>>
>> yes.
>> I believe you're seeing it with gcc 4.9. I wanted to double
>> check what 4.6 and 4.7 are doing. If they're not suddenly
>> increase code size then resubmit it for inclusion please.
>
> I get these sizes for these compilers
> (x86-64, -O2, without profiling)
>
> $ size arch/x86/net/bpf_jit_comp.o*
> text data bss dec hex filename
> 9266 4 0 9270 2436 arch/x86/net/bpf_jit_comp.o.4.4.new
> 10042 4 0 10046 273e arch/x86/net/bpf_jit_comp.o.4.4.old
> 9109 4 0 9113 2399 arch/x86/net/bpf_jit_comp.o.4.6.new
> 9717 4 0 9721 25f9 arch/x86/net/bpf_jit_comp.o.4.6.old
> 8789 4 0 8793 2259 arch/x86/net/bpf_jit_comp.o.4.7.new
> 10245 4 0 10249 2809 arch/x86/net/bpf_jit_comp.o.4.7.old
> 9671 4 0 9675 25cb arch/x86/net/bpf_jit_comp.o.4.9.new
> 10679 4 0 10683 29bb arch/x86/net/bpf_jit_comp.o.4.9.old
>
> I am a bit surprised by the size variations
yeah. the difference is surprising.
Just tried with 4.7 and my regular config and I see the same difference.
Looks like gcc wasn't able to fold conditions into cmov
and used a bunch of cmp/jmp
Since is_ereg() was inlined ~70 times on its own and as
part of other functions, the difference of 3-4 instructions
may a large difference in total size.
test_bpf also passes, so please resubmit properly.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 17:23 Alexei Starovoitov
@ 2014-11-26 18:02 ` Joe Perches
0 siblings, 0 replies; 17+ messages in thread
From: Joe Perches @ 2014-11-26 18:02 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, 2014-11-26 at 09:23 -0800, Alexei Starovoitov wrote:
> On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> > Is there any value in reordering these tests for frequency
> > or maybe using | instead of || to avoid multiple jumps?
>
> probably not. It's not a critical path.
> compiler may fuse conditions depending on values anyway.
> If it was a critical path, we could have used
> (1 << reg) & mask trick.
> I picked explicit 'return true' else 'return false' here,
> because it felt easier to read. Just a matter of taste.
There is a size difference though: (allyesconfig)
$ size arch/x86/net/built-in.o*
text data bss dec hex filename
12999 1012 4336 18347 47ab arch/x86/net/built-in.o.new
13177 1076 4592 18845 499d arch/x86/net/built-in.o.old
---
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 3f62734..09e2cea 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -135,11 +135,11 @@ static const int reg2hex[] = {
*/
static inline bool is_ereg(u32 reg)
{
- if (reg == BPF_REG_5 || reg == AUX_REG ||
- (reg >= BPF_REG_7 && reg <= BPF_REG_9))
- return true;
- else
- return false;
+ return (1 << reg) & (BIT(BPF_REG_5) |
+ BIT(AUX_REG) |
+ BIT(BPF_REG_7) |
+ BIT(BPF_REG_8) |
+ BIT(BPF_REG_9));
}
/* add modifiers if 'reg' maps to x64 registers r8..r15 */
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 16:58 ` Joe Perches
@ 2014-11-26 17:55 ` Daniel Borkmann
0 siblings, 0 replies; 17+ messages in thread
From: Daniel Borkmann @ 2014-11-26 17:55 UTC (permalink / raw)
To: Joe Perches
Cc: Alexei Starovoitov, Quentin Lambert, David S. Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, netdev, linux-kernel
On 11/26/2014 05:58 PM, Joe Perches wrote:
...
>> imo existing code is fine and I don't think the time spent
>> reviewing such changes is worth it when there is no
>> improvement in readability.
+1
> Is there any value in reordering these tests for frequency
> or maybe using | instead of || to avoid multiple jumps?
No, it's not a fast-path.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
@ 2014-11-26 17:23 Alexei Starovoitov
2014-11-26 18:02 ` Joe Perches
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2014-11-26 17:23 UTC (permalink / raw)
To: Joe Perches
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, Nov 26, 2014 at 8:58 AM, Joe Perches <joe@perches.com> wrote:
> On Wed, 2014-11-26 at 08:42 -0800, Alexei Starovoitov wrote:
>> On Wed, Nov 26, 2014 at 1:18 AM, Quentin Lambert
>> <lambert.quentin@gmail.com> wrote:
>> > Remove if then else statements preceding
>> > boolean return.
> []
>> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> []
>> > @@ -135,11 +135,9 @@ static const int reg2hex[] = {
>> > */
>> > static inline bool is_ereg(u32 reg)
>> > {
>> > - if (reg == BPF_REG_5 || reg == AUX_REG ||
>> > - (reg >= BPF_REG_7 && reg <= BPF_REG_9))
>> > - return true;
>> > - else
>> > - return false;
>> > + return (reg == BPF_REG_5 ||
>> > + reg == AUX_REG ||
>> > + (reg >= BPF_REG_7 && reg <= BPF_REG_9));
>>
>> please remove extra () around the whole expression, and
>> align in properly, and
>> don't move reg==AUX_REG check to a different line.
>> Subject is not warranted. I don't think it's a simplification.
>
> It's not really a simplification,
> gcc should emit the same object code.
exactly.
>> imo existing code is fine and I don't think the time spent
>> reviewing such changes is worth it when there is no
>> improvement in readability.
>
> Is there any value in reordering these tests for frequency
> or maybe using | instead of || to avoid multiple jumps?
probably not. It's not a critical path.
compiler may fuse conditions depending on values anyway.
If it was a critical path, we could have used
(1 << reg) & mask trick.
I picked explicit 'return true' else 'return false' here,
because it felt easier to read. Just a matter of taste.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
2014-11-26 16:42 Alexei Starovoitov
@ 2014-11-26 16:58 ` Joe Perches
2014-11-26 17:55 ` Daniel Borkmann
0 siblings, 1 reply; 17+ messages in thread
From: Joe Perches @ 2014-11-26 16:58 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Quentin Lambert, David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, 2014-11-26 at 08:42 -0800, Alexei Starovoitov wrote:
> On Wed, Nov 26, 2014 at 1:18 AM, Quentin Lambert
> <lambert.quentin@gmail.com> wrote:
> > Remove if then else statements preceding
> > boolean return.
[]
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
[]
> > @@ -135,11 +135,9 @@ static const int reg2hex[] = {
> > */
> > static inline bool is_ereg(u32 reg)
> > {
> > - if (reg == BPF_REG_5 || reg == AUX_REG ||
> > - (reg >= BPF_REG_7 && reg <= BPF_REG_9))
> > - return true;
> > - else
> > - return false;
> > + return (reg == BPF_REG_5 ||
> > + reg == AUX_REG ||
> > + (reg >= BPF_REG_7 && reg <= BPF_REG_9));
>
> please remove extra () around the whole expression, and
> align in properly, and
> don't move reg==AUX_REG check to a different line.
> Subject is not warranted. I don't think it's a simplification.
It's not really a simplification,
gcc should emit the same object code.
> imo existing code is fine and I don't think the time spent
> reviewing such changes is worth it when there is no
> improvement in readability.
Is there any value in reordering these tests for frequency
or maybe using | instead of || to avoid multiple jumps?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
@ 2014-11-26 16:42 Alexei Starovoitov
2014-11-26 16:58 ` Joe Perches
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2014-11-26 16:42 UTC (permalink / raw)
To: Quentin Lambert
Cc: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
On Wed, Nov 26, 2014 at 1:18 AM, Quentin Lambert
<lambert.quentin@gmail.com> wrote:
> Remove if then else statements preceding
> boolean return. Occurences were found using
> Coccinelle.
>
> The semantic patch used was:
>
> @@
> expression expr;
> @@
>
>
> - if ( expr )
> - return true;
> - else
> - return false;
> + return expr;
>
> Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
>
> ---
> arch/x86/net/bpf_jit_comp.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 3f62734..1542f39 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -135,11 +135,9 @@ static const int reg2hex[] = {
> */
> static inline bool is_ereg(u32 reg)
> {
> - if (reg == BPF_REG_5 || reg == AUX_REG ||
> - (reg >= BPF_REG_7 && reg <= BPF_REG_9))
> - return true;
> - else
> - return false;
> + return (reg == BPF_REG_5 ||
> + reg == AUX_REG ||
> + (reg >= BPF_REG_7 && reg <= BPF_REG_9));
please remove extra () around the whole expression, and
align in properly, and
don't move reg==AUX_REG check to a different line.
Subject is not warranted. I don't think it's a simplification.
imo existing code is fine and I don't think the time spent
reviewing such changes is worth it when there is no
improvement in readability.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] x86: bpf_jit_comp: simplify trivial boolean return
@ 2014-11-26 9:18 Quentin Lambert
0 siblings, 0 replies; 17+ messages in thread
From: Quentin Lambert @ 2014-11-26 9:18 UTC (permalink / raw)
To: David S. Miller
Cc: Quentin Lambert, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, netdev, linux-kernel
Remove if then else statements preceding
boolean return. Occurences were found using
Coccinelle.
The semantic patch used was:
@@
expression expr;
@@
- if ( expr )
- return true;
- else
- return false;
+ return expr;
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
---
arch/x86/net/bpf_jit_comp.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 3f62734..1542f39 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -135,11 +135,9 @@ static const int reg2hex[] = {
*/
static inline bool is_ereg(u32 reg)
{
- if (reg == BPF_REG_5 || reg == AUX_REG ||
- (reg >= BPF_REG_7 && reg <= BPF_REG_9))
- return true;
- else
- return false;
+ return (reg == BPF_REG_5 ||
+ reg == AUX_REG ||
+ (reg >= BPF_REG_7 && reg <= BPF_REG_9));
}
/* add modifiers if 'reg' maps to x64 registers r8..r15 */
^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-12-04 22:44 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-26 18:34 [PATCH] x86: bpf_jit_comp: simplify trivial boolean return Alexei Starovoitov
2014-11-26 18:41 ` Joe Perches
2014-11-26 20:00 ` Joe Perches
2014-11-27 12:25 ` David Laight
2014-11-27 14:36 ` Quentin Lambert
2014-11-27 18:35 ` Joe Perches
2014-11-27 18:49 ` Joe Perches
2014-12-04 9:26 ` Joe Perches
2014-12-04 15:56 ` Alexei Starovoitov
2014-12-04 18:05 ` Joe Perches
-- strict thread matches above, loose matches on Subject: below --
2014-12-04 22:44 Alexei Starovoitov
2014-11-26 17:23 Alexei Starovoitov
2014-11-26 18:02 ` Joe Perches
2014-11-26 16:42 Alexei Starovoitov
2014-11-26 16:58 ` Joe Perches
2014-11-26 17:55 ` Daniel Borkmann
2014-11-26 9:18 Quentin Lambert
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.