* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
       [not found]       ` <CAHgaXdKZ_v+iO7uqEDx7PA7D+xcp1FngGvJ1SRSsGXNQ-iWWDQ@mail.gmail.com>
  2017-05-11  9:32           ` Shubham Bansal
@ 2017-05-11  9:32           ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-11  9:32 UTC (permalink / raw)
  To: Kees Cook
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast, Daniel Borkmann

Hi Kees & Daniel,

David suggested the following:

"""
eBPF has registers 0 through 10 plus you need to allocate another
temporary register for constant blinding (this is BPF_REG_AX).

I would put all of BPF_REG_0 through BPF_REG_5 in registers if
possible.  BPF_REG_FP is the frame pointer which you don't have to
really allocate.  That leaves BPF_REG_6 through BPF_REG_9, which
are callee saved, for perhaps stack slot allocation.

You seem to have R0 through R10 on ARM plus a separate frame pointer.
And then I see something called "LR" which is probably the function
return address register.  Why can't you just use R0 through R9
for BPF_REG_0 through BPF_REG_9, BPF_REG_10 is just FP and then you
have R10 for BPF_REG_AX?
"""

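For readers unfamiliar with constant blinding: the idea is that an
immediate operand is never emitted verbatim into the JIT image. It is
XORed with a random value first, and the original is recovered at run
time in a scratch register, which is the role BPF_REG_AX plays above.
A minimal sketch of just that arithmetic follows; the helper names are
hypothetical and this is not the kernel's rewriting code.

```c
#include <stdint.h>

/*
 * Hypothetical helpers: they model only the arithmetic of blinding,
 * not the actual instruction rewriting done before JIT'ing.
 */

/* Value that would be embedded in the instruction stream instead of imm. */
static uint32_t blind_imm(uint32_t imm, uint32_t rnd)
{
	return imm ^ rnd;
}

/* "ADD dst, imm" rewritten as: AX = imm ^ rnd; AX ^= rnd; dst += AX. */
static uint32_t add_imm_blinded(uint32_t dst, uint32_t imm, uint32_t rnd)
{
	uint32_t ax = blind_imm(imm, rnd);	/* MOV AX, (imm ^ rnd) */

	ax ^= rnd;				/* XOR AX, rnd  -> AX == imm */
	return dst + ax;			/* ADD dst, AX */
}
```

Because the blinded sequence needs somewhere to rebuild the immediate,
the JIT has to reserve one extra location (register or stack slot) on
top of the ten program-visible registers.
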
"""
static const u8 bpf2a32[][2] = {
        /* return value from in-kernel function, and exit value from eBPF */
        [BPF_REG_0] = {ARM_R1, ARM_R0},
        /* arguments from eBPF program to in-kernel function */
        [BPF_REG_1] = {ARM_R1, ARM_R0},
        [BPF_REG_2] = {ARM_R3, ARM_R2},
        /* Stored on stack */
        [BPF_REG_3] = {STACK_OFFSET(0), STACK_OFFSET(4)},
        [BPF_REG_4] = {STACK_OFFSET(8), STACK_OFFSET(12)},
        [BPF_REG_5] = {STACK_OFFSET(16), STACK_OFFSET(20)},
        /* callee saved registers that in-kernel function will preserve */
        [BPF_REG_6] = {ARM_R5, ARM_R4},
        [BPF_REG_7] = {STACK_OFFSET(24), STACK_OFFSET(28)},
        /* Stored on stack */
        [BPF_REG_8] = {STACK_OFFSET(32), STACK_OFFSET(36)},
        [BPF_REG_9] = {STACK_OFFSET(40), STACK_OFFSET(44)},
        /* Read only Frame Pointer to access Stack */
        [BPF_REG_FP] = {ARM_FP},
        /* Temporary Register for internal BPF JIT, can be used
         * for constant blinding and others. */
        [TMP_REG_1] = {ARM_R7, ARM_R6},
        [TMP_REG_2] = {ARM_R10, ARM_R8},
        /* Tail call count. */
        [TCALL_CNT] = {STACK_OFFSET(48), STACK_OFFSET(52)},

        [BPF_REG_AX] = {STACK_OFFSET(56), STACK_OFFSET(60)},
};

> How register starved are you?
Super Starved.
>
> eBPF has registers 0 through 10 plus you need to allocate another
> temporary register for constant blinding (this is BPF_REG_AX).
I am storing BPF_REG_AX on stack as of now.
>
> I would put all of BPF_REG_0 through BPF_REG_5 in registers if
> possible.  BPF_REG_FP is the frame pointer which you don't have to
> really allocate.  That leaves BPF_REG_6 through BPF_REG_9, which
> are callee saved, for perhaps stack slot allocation.
>
> You seem to have R0 through R10 on ARM plus a separate frame pointer.
> And then I see something called "LR" which is probably the function
> return address register.  Why can't you just use R0 through R9
> for BPF_REG_0 through BPF_REG_9, BPF_REG_10 is just FP and then you
> have R10 for BPF_REG_AX?
I can't do that. BPF registers are 64 bits and ARM registers are 32
bit. So I have to map each BPF register with 2 arm registers.
Also, I need 4 temp registers which I am currently using.
"""

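To illustrate why each 64-bit BPF register occupies two ARM registers
in the table above, here is a minimal sketch of a 64-bit add done on
32-bit halves, the way an ADDS/ADC instruction pair would compute it.
The struct and function names are hypothetical, and the {high, low}
ordering of each pair is an assumption.

```c
#include <stdint.h>

/* One 64-bit BPF register held as two 32-bit halves (hypothetical type). */
struct reg_pair {
	uint32_t hi;
	uint32_t lo;
};

/* dst += src, computed half by half as ADDS followed by ADC would. */
static void add64(struct reg_pair *dst, const struct reg_pair *src)
{
	uint32_t lo = dst->lo + src->lo;	/* ADDS: add the low halves */
	uint32_t carry = lo < dst->lo;		/* carry out of the low add */

	dst->hi = dst->hi + src->hi + carry;	/* ADC: high halves plus carry */
	dst->lo = lo;
}
```

Every 64-bit ALU operation needs both halves of both operands live at
once, which is where the register pressure described above comes from.
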
"""
>> I can't do that. BPF registers are 64 bits and ARM registers are 32
>> bit. So I have to map each BPF register with 2 arm registers.
>> Also, I need 4 temp registers which I am currently using.
>
> Ummm, no you don't.
>
> You can do proper data flow analysis on the register values and you
> can just use plain 32-bit registers when that is all that the data
> flow tells you the register is used for.
I don't understand. Can you explain that with example?

>
> This is what the netronome driver does, it is in the same situation
> you are.  The NPU cpus on their networking card are 32-bits, and
> they have to do 32-bit value analysis while JIT'ing into their
> device.
As far as I know, their ISA is more like cBPF, isn't it?
>
> It is actually rare for full 64-bit values to be used.  Those usually
> come from pointers.  But on arm32, pointers will be 32-bits therefore
> any pointer relative value will be 32-bits as well.
Well, in that case I have to rewrite the whole code. I asked what
mapping I should use when I started and nobody replied so I went ahead
and started implementing. :(
>
> When you actually have to fabricate a full 64-bit operation, yeah
> use a stack slot or something like that.
So you are telling me to store the low 32 bits in registers and the high 32
bits in scratch memory?
"""

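One possible reading of that suggestion, as a minimal sketch: the low
half of each BPF register stays in an ARM register, the high half lives
in a stack slot, and only genuine 64-bit operations touch the slot. The
type and function names are hypothetical. The sketch relies on BPF's
32-bit ALU operations zeroing the upper 32 bits of the destination.

```c
#include <stdint.h>

/* Hypothetical model: low half in a register, high half in a stack slot. */
struct split_reg {
	uint32_t lo;		/* would be kept in an ARM register */
	uint32_t *hi_slot;	/* points at the scratch stack slot */
};

/* 32-bit ALU add: only the register half is computed; the upper
 * 32 bits of the result are defined to be zero. */
static void add32(struct split_reg *dst, uint32_t src_lo)
{
	dst->lo += src_lo;
	*dst->hi_slot = 0;
}

/* 64-bit add: the high half must be loaded, updated and stored back. */
static void add64_split(struct split_reg *dst, uint32_t src_lo, uint32_t src_hi)
{
	uint32_t lo = dst->lo + src_lo;
	uint32_t hi = *dst->hi_slot;		/* load from the stack slot */

	hi += src_hi + (lo < dst->lo);		/* add with carry from low half */
	dst->lo = lo;
	*dst->hi_slot = hi;			/* store back to the stack slot */
}
```

If data flow analysis shows that a register only ever holds 32-bit
values, the store of zero in add32() could be dropped as well, since
the high half would never be read.
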
How do you guys suggest I should implement it? I am almost done with
my current implementation, but if you think I should change it to the
way David suggested, it's better to suggest it now, before I send the
patch.

Let me know if you have any questions.
Best,
Shubham Bansal


On Thu, May 11, 2017 at 7:23 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Okay. My mistake.
>
> -Shubham
>
> On May 11, 2017 7:22 AM, "David Miller" <davem@davemloft.net> wrote:
>>
>>
>> Please keep this discussion on the mailing list.
>>
>> When you drop the CC:, you exclude the entire world from contributing
>> and continuing to help you.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-11  9:32           ` Shubham Bansal
  (?)
@ 2017-05-11 15:30             ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-11 15:30 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast, Daniel Borkmann

On Thu, May 11, 2017 at 2:32 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> How do you guys suggest I should implement it? I am almost done with
> my current implementation, but if you think I should change it to the
> way David suggested, it's better to suggest it now, before I send the
> patch.

I'd say send what you have right now, as it's a good starting point
for future work. I'll be curious to see the benchmarks, etc. It can be
a base for further optimization.

Thanks for chipping away at this!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-11 15:30             ` Kees Cook
  (?)
@ 2017-05-13 21:38               ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-13 21:38 UTC (permalink / raw)
  To: Kees Cook
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast, Daniel Borkmann

Finally finished testing.

"test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"

Will send the patch after code refactoring. Thanks for all the help
you guys. I really really appreciate it.

Special thanks to Kees and Daniel. :)

Best,
Shubham Bansal


On Thu, May 11, 2017 at 9:00 PM, Kees Cook <keescook@chromium.org> wrote:
> On Thu, May 11, 2017 at 2:32 AM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> How do you guys suggest I should implement it? I am almost done with
>> my current implementation, but if you think I should change it to the
>> way David suggested, it's better to suggest it now, before I send the
>> patch.
>
> I'd say send what you have right now, as it's a good starting point
> for future work. I'll be curious to see the benchmarks, etc. It can be
> a base for further optimization.
>
> Thanks for chipping away at this!
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-13 21:38               ` Shubham Bansal
  (?)
@ 2017-05-15 17:44                 ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-15 17:44 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast, Daniel Borkmann

On Sat, May 13, 2017 at 2:38 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Finally finished testing.
>
> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"

Nice work! Glad you've been chipping away at this. Thanks!

-Kees

>
> Will send the patch after code refactoring. Thanks for all the help
> you guys. I really really appreciate it.
>
> Special thanks to Kees and Daniel. :)
>
> Best,
> Shubham Bansal
>
>
> On Thu, May 11, 2017 at 9:00 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Thu, May 11, 2017 at 2:32 AM, Shubham Bansal
>> <illusionist.neo@gmail.com> wrote:
>>> How do you guys suggest I should implement it? I am almost done with
>>> my current implementation, but if you think I should change it to the
>>> way David suggested, it's better to suggest it now, before I send the
>>> patch.
>>
>> I'd say send what you have right now, as it's a good starting point
>> for future work. I'll be curious to see the benchmarks, etc. It can be
>> a base for further optimization.
>>
>> Thanks for chipping away at this!
>>
>> -Kees
>>
>> --
>> Kees Cook
>> Pixel Security



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-13 21:38               ` Shubham Bansal
  (?)
@ 2017-05-15 19:55                 ` Daniel Borkmann
  -1 siblings, 0 replies; 99+ messages in thread
From: Daniel Borkmann @ 2017-05-15 19:55 UTC (permalink / raw)
  To: Shubham Bansal, Kees Cook
  Cc: David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

On 05/13/2017 11:38 PM, Shubham Bansal wrote:
> Finally finished testing.
>
> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"

What are the missing pieces and how is the performance compared
to the interpreter?

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-15 19:55                 ` Daniel Borkmann
  (?)
@ 2017-05-20 20:01                   ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-20 20:01 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Kees Cook, David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel and Kees,

Before I send the patch: I have tested the JIT compiler on ARMv7 but
not on ARMv5 or ARMv6. Can you tell me which arch versions I should
test it on?
Also, in my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
are both disabled, but I need to test the JIT with these options as well.
Whenever I set these options in the .config file, the ARM kernel is not
getting compiled with them. Can you tell me why? If you need
more information regarding this, please let me know.

With the current config for ARMv7, the benchmarks are:

[root@vexpress modules]# insmod test_bpf.ko
[   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
[   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
[   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
[   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
[   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
[   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
[   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
[   25.840970] test_bpf: #7 LD_ABS jited:1 343 304 289 PASS
[   25.851005] test_bpf: #8 LD_ABS_LL jited:1 362 300 PASS
[   25.858119] test_bpf: #9 LD_IND_LL jited:1 244 241 245 PASS
[   25.865994] test_bpf: #10 LD_ABS_NET jited:1 318 316 PASS
[   25.872829] test_bpf: #11 LD_IND_NET jited:1 243 196 196 PASS
[   25.879717] test_bpf: #12 LD_PKTTYPE jited:1 129 140 PASS
[   25.883034] test_bpf: #13 LD_MARK jited:1 113 88 PASS
[   25.885545] test_bpf: #14 LD_RXHASH jited:1 81 79 PASS
[   25.887506] test_bpf: #15 LD_QUEUE jited:1 88 85 PASS
[   25.889593] test_bpf: #16 LD_PROTOCOL jited:1 322 353 PASS
[   25.896894] test_bpf: #17 LD_VLAN_TAG jited:1 92 92 PASS
[   25.899173] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 85 88 PASS
[   25.901310] test_bpf: #19 LD_IFINDEX jited:1 94 130 PASS
[   25.904110] test_bpf: #20 LD_HATYPE jited:1 98 91 PASS
[   25.906393] test_bpf: #21 LD_CPU
[   25.906651] bpf_jit: *** NOT YET: opcode 85 ***
[   25.906795] jited:0 705 691 PASS
[   25.921007] test_bpf: #22 LD_NLATTR jited:0 577 668 PASS
[   25.933870] test_bpf: #23 LD_NLATTR_NEST jited:0 2253 3006 PASS
[   25.987020] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3840 4922 PASS
[   26.075119] test_bpf: #25 LD_ANC_XOR jited:1 107 94 PASS
[   26.077583] test_bpf: #26 SPILL_FILL jited:1 159 148 173 PASS
[   26.083259] test_bpf: #27 JEQ jited:1 274 183 181 PASS
[   26.090383] test_bpf: #28 JGT jited:1 255 194 165 PASS
[   26.097163] test_bpf: #29 JGE jited:1 187 190 246 PASS
[   26.103932] test_bpf: #30 JSET jited:1 178 184 192 PASS
[   26.110229] test_bpf: #31 tcpdump port 22 jited:1 266 698 717 PASS
[   26.127698] test_bpf: #32 tcpdump complex jited:1 267 729 1129 PASS
[   26.149727] test_bpf: #33 RET_A jited:1 94 88 PASS
[   26.152114] test_bpf: #34 INT: ADD trivial jited:1 87 PASS
[   26.153900] test_bpf: #35 INT: MUL_X jited:1 95 PASS
[   26.155384] test_bpf: #36 INT: MUL_X2 jited:1 82 PASS
[   26.156606] test_bpf: #37 INT: MUL32_X jited:1 91 PASS
[   26.157846] test_bpf: #38 INT: ADD 64-bit jited:1 1055 PASS
[   26.169280] test_bpf: #39 INT: ADD 32-bit jited:1 701 PASS
[   26.177039] test_bpf: #40 INT: SUB jited:1 931 PASS
[   26.187108] test_bpf: #41 INT: XOR jited:1 355 PASS
[   26.191364] test_bpf: #42 INT: MUL jited:1 389 PASS
[   26.196286] test_bpf: #43 MOV REG64 jited:1 267 PASS
[   26.199759] test_bpf: #44 MOV REG32 jited:1 176 PASS
[   26.202060] test_bpf: #45 LD IMM64 jited:1 194 PASS
[   26.204607] test_bpf: #46 INT: ALU MIX jited:0 1174 PASS
[   26.216896] test_bpf: #47 INT: shifts by register jited:1 211 PASS
[   26.219956] test_bpf: #48 INT: DIV + ABS jited:1 559 517 PASS
[   26.231347] test_bpf: #49 INT: DIV by zero jited:1 395 277 PASS
[   26.238862] test_bpf: #50 check: missing ret PASS
[   26.239288] test_bpf: #51 check: div_k_0 PASS
[   26.239492] test_bpf: #52 check: unknown insn PASS
[   26.239640] test_bpf: #53 check: out of range spill/fill PASS
[   26.239803] test_bpf: #54 JUMPS + HOLES jited:1 295 PASS
[   26.243343] test_bpf: #55 check: RET X PASS
[   26.244065] test_bpf: #56 check: LDX + RET X PASS
[   26.244224] test_bpf: #57 M[]: alt STX + LDX jited:1 433 PASS
[   26.249126] test_bpf: #58 M[]: full STX + full LDX jited:1 427 PASS
[   26.254123] test_bpf: #59 check: SKF_AD_MAX PASS
[   26.254509] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 298 PASS
[   26.257882] test_bpf: #61 load 64-bit immediate jited:1 128 PASS
[   26.259813] test_bpf: #62 nmap reduced jited:1 655 PASS
[   26.267216] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 89 PASS
[   26.268766] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 72 PASS
[   26.270126] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 94 PASS
[   26.271768] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 145 PASS
[   26.274152] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 93 PASS
[   26.275673] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 103 PASS
[   26.277371] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 99 PASS
[   26.278966] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 110 PASS
[   26.280440] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 96 PASS
[   26.281843] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 103 PASS
[   26.283682] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 96 PASS
[   26.285147] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 85 PASS
[   26.286373] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 108 PASS
[   26.288079] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 112 PASS
[   26.289653] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 70 PASS
[   26.290666] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 85 PASS
[   26.291897] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 95 PASS
[   26.293429] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 96 PASS
[   26.294794] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 79 PASS
[   26.295956] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 70 PASS
[   26.297026] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 77 PASS
[   26.298109] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 120 PASS
[   26.299705] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 85 PASS
[   26.300902] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 121 PASS
[   26.302578] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 115 PASS
[   26.304134] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 136 PASS
[   26.305881] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 119 PASS
[   26.307481] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 90 PASS
[   26.308784] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 83 PASS
[   26.310091] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 82 PASS
[   26.311534] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 71 PASS
[   26.312842] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 143 PASS
[   26.315010] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 116 PASS
[   26.317106] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 119 PASS
[   26.318834] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 119 PASS
[   26.320484] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 110 PASS
[   26.322003] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 117 PASS
[   26.323841] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 84 PASS
[   26.325043] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 84 PASS
[   26.326300] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 97 PASS
[   26.327661] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 74 PASS
[   26.328760] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 80 PASS
[   26.329880] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 93 PASS
[   26.331166] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 81 PASS
[   26.332348] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 89 PASS
[   26.333616] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 73 PASS
[   26.334796] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 75 PASS
[   26.335880] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 88 PASS
[   26.337138] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 113 PASS
[   26.338609] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 99 PASS
[   26.339983] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 70 PASS
[   26.341036] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 75 PASS
[   26.342242] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 91 PASS
[   26.343719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 83 PASS
[   26.344945] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 84 PASS
[   26.346135] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 69 PASS
[   26.347240] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 99 PASS
[   26.348596] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 73 PASS
[   26.349749] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 89 PASS
[   26.350992] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 101 PASS
[   26.352436] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 112 PASS
[   26.354144] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 145 PASS
[   26.356392] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 204 PASS
[   26.359242] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 232 PASS
[   26.362516] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 593 PASS
[   26.368978] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 517 PASS
[   26.374539] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 666 PASS
[   26.381642] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 225 PASS
[   26.384418] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   26.386820] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 195 PASS
[   26.389428] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 354 PASS
[   26.393537] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 403 PASS
[   26.398414] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 388 PASS
[   26.403006] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 387 PASS
[   26.407619] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 577 PASS
[   26.413875] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 284 PASS
[   26.417106] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 298 PASS
[   26.420489] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 609 PASS
[   26.426958] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 588 PASS
[   26.433454] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 301 PASS
[   26.436831] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   26.437152] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 316 PASS
[   26.440713] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 612 PASS
[   26.447535] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   26.448057] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 612 PASS
[   26.454579] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 112 PASS
[   26.456065] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 76 PASS
[   26.457168] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 84 PASS
[   26.458350] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 78 PASS
[   26.459582] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 81 PASS
[   26.460724] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 89 PASS
[   26.462005] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 99 PASS
[   26.463622] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 86 PASS
[   26.464833] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 100 PASS
[   26.466244] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 126 PASS
[   26.467904] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 105 PASS
[   26.469357] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.470510] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 72 PASS
[   26.471626] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.472684] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 87 PASS
[   26.473892] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 94 PASS
[   26.475174] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 84 PASS
[   26.476385] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   26.477586] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 80 PASS
[   26.478723] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 124 PASS
[   26.480417] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 94 PASS
[   26.481820] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 150 PASS
[   26.483952] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 72 PASS
[   26.485195] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 89 PASS
[   26.486648] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   26.488214] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 71 PASS
[   26.489566] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 79 PASS
[   26.490791] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 105 PASS
[   26.492548] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 128 PASS
[   26.494713] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 86 PASS
[   26.496072] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 102 PASS
[   26.497612] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 88 PASS
[   26.498906] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 98 PASS
[   26.500256] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 107 PASS
[   26.501668] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 70 PASS
[   26.502690] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   26.504077] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.505197] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 76 PASS
[   26.506268] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 71 PASS
[   26.507301] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 69 PASS
[   26.508374] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.509494] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 76 PASS
[   26.510665] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 77 PASS
[   26.511787] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 79 PASS
[   26.513033] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   26.514382] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 80 PASS
[   26.515648] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 72 PASS
[   26.516778] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 86 PASS
[   26.517971] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 78 PASS
[   26.519188] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 90 PASS
[   26.520458] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 69 PASS
[   26.521509] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 79 PASS
[   26.522692] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 101 PASS
[   26.524066] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 69 PASS
[   26.525152] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 69 PASS
[   26.526264] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 112 PASS
[   26.527879] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 75 PASS
[   26.529323] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 88 PASS
[   26.530801] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 103 PASS
[   26.532789] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 121 PASS
[   26.534881] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 74 PASS
[   26.536388] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 139 PASS
[   26.538618] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 93 PASS
[   26.540104] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 87 PASS
[   26.541494] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 104 PASS
[   26.543038] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 111 PASS
[   26.544554] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 92 PASS
[   26.545885] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 124 PASS
[   26.547523] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 83 PASS
[   26.548753] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 87 PASS
[   26.550001] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 91 PASS
[   26.551660] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 117 PASS
[   26.553320] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 103 PASS
[   26.554733] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   26.555984] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 289 PASS
[   26.559204] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   26.559438] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 242 PASS
[   26.562189] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
141847 PASS
[   27.981183] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 374 PASS
[   27.985739] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   27.985990] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 274 PASS
[   27.989010] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
340041 PASS
[   31.389811] test_bpf: #230 JMP_EXIT jited:1 73 PASS
[   31.391325] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 90 PASS
[   31.392672] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 101 PASS
[   31.394242] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 76 PASS
[   31.395380] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 83 PASS
[   31.396628] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 80 PASS
[   31.397766] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 79 PASS
[   31.398935] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 124 PASS
[   31.400772] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 181 PASS
[   31.403241] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 97 PASS
[   31.404772] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 80 PASS
[   31.405965] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 79 PASS
[   31.407146] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 88 PASS
[   31.408357] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 115 PASS
[   31.409855] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 89 PASS
[   31.411190] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 93 PASS
[   31.412513] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 90 PASS
[   31.413820] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 93 PASS
[   31.415252] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 94 PASS
[   31.416629] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 88 PASS
[   31.417834] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 77 PASS
[   31.419077] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 95 PASS
[   31.420402] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 97 PASS
[   31.421841] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 110 PASS
[   31.423894] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 94 PASS
[   31.425370] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 107 PASS
[   31.427259] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 101 PASS
[   31.428931] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 94 PASS
[   31.430449] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 96 PASS
[   31.431873] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 217 PASS
[   31.434897] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 145 PASS
[   31.437097] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 184 PASS
[   31.460982] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 250 PASS
[   31.487721] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 6762 PASS
[   31.569308] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   31.569438] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 141 PASS
[   31.585479] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 31891 25898 PASS
[   32.183199] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 822248 831950 PASS
[   48.732702] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 136555 PASS
[   50.117439] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 168 PASS
[   50.127426] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 255760 PASS
[   52.690619] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1146 PASS
[   52.724995] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 562650 PASS
[   58.361997] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 527825 PASS
[   63.641315] test_bpf: #274 LD_IND byte frag jited:1 649 PASS
[   63.649443] test_bpf: #275 LD_IND halfword frag jited:1 640 PASS
[   63.656527] test_bpf: #276 LD_IND word frag jited:1 668 PASS
[   63.664376] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 990 PASS
[   63.675745] test_bpf: #278 LD_IND word mixed head/frag jited:1 971 PASS
[   63.686341] test_bpf: #279 LD_ABS byte frag jited:1 778 PASS
[   63.694965] test_bpf: #280 LD_ABS halfword frag jited:1 982 PASS
[   63.705562] test_bpf: #281 LD_ABS word frag jited:1 737 PASS
[   63.714005] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 997 PASS
[   63.724868] test_bpf: #283 LD_ABS word mixed head/frag jited:1 1037 PASS
[   63.736399] test_bpf: #284 LD_IND byte default X jited:1 270 PASS
[   63.740296] test_bpf: #285 LD_IND byte positive offset jited:1 214 PASS
[   63.743487] test_bpf: #286 LD_IND byte negative offset jited:1 251 PASS
[   63.747173] test_bpf: #287 LD_IND halfword positive offset jited:1 191 PASS
[   63.749669] test_bpf: #288 LD_IND halfword negative offset jited:1 188 PASS
[   63.752163] test_bpf: #289 LD_IND halfword unaligned jited:1 212 PASS
[   63.754834] test_bpf: #290 LD_IND word positive offset jited:1 187 PASS
[   63.757257] test_bpf: #291 LD_IND word negative offset jited:1 180 PASS
[   63.759764] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 211 PASS
[   63.762621] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 276 PASS
[   63.766332] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 270 PASS
[   63.769931] test_bpf: #295 LD_ABS byte jited:1 253 PASS
[   63.773492] test_bpf: #296 LD_ABS halfword jited:1 251 PASS
[   63.777622] test_bpf: #297 LD_ABS halfword unaligned jited:1 259 PASS
[   63.781554] test_bpf: #298 LD_ABS word jited:1 216 PASS
[   63.784630] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 198 PASS
[   63.787174] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 177 PASS
[   63.789367] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 235 PASS
[   63.792235] test_bpf: #302 ADD default X jited:1 85 PASS
[   63.793953] test_bpf: #303 ADD default A jited:1 87 PASS
[   63.795439] test_bpf: #304 SUB default X jited:1 113 PASS
[   63.797128] test_bpf: #305 SUB default A jited:1 94 PASS
[   63.798801] test_bpf: #306 MUL default X jited:1 95 PASS
[   63.800387] test_bpf: #307 MUL default A jited:1 83 PASS
[   63.801704] test_bpf: #308 DIV default X jited:1 100 PASS
[   63.803305] test_bpf: #309 DIV default A jited:1 216 PASS
[   63.806105] test_bpf: #310 MOD default X jited:1 94 PASS
[   63.807447] test_bpf: #311 MOD default A jited:1 270 PASS
[   63.810667] test_bpf: #312 JMP EQ default A jited:1 94 PASS
[   63.812321] test_bpf: #313 JMP EQ default X jited:1 163 PASS
[   63.814754] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
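
A log of this size is easier to review when the tests that fell back to the
interpreter are pulled out automatically. The following is a minimal sketch,
assuming the log format matches the lines above (an optional timestamp, then
"test_bpf: #N name jited:0|1 timings PASS|FAIL", possibly wrapped by the mail
client or interleaved with a bpf_jit message). The names parse_test_bpf,
not_jited and SAMPLE are hypothetical helpers for illustration and are not part
of the kernel or of test_bpf.

```python
import re

# Kernel log timestamp such as "[   25.797766] " at the start of a line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")

# One test record after timestamps are stripped and wrapped lines are rejoined:
#   "#<num> <name> [jited:<0|1>] [<timing> ...] PASS|FAIL"
RECORD = re.compile(
    r"#(?P<num>\d+) (?P<name>.*?)"
    r"(?: jited:(?P<jited>[01]))?"
    r"(?P<times>(?: \d+)*) (?P<result>PASS|FAIL)$"
)


def parse_test_bpf(log):
    """Return one dict per test found in a test_bpf kernel log (assumed format)."""
    parts = []
    for line in log.splitlines():
        line = TIMESTAMP.sub("", line).strip()
        # Skip blank lines and bpf_jit messages that interleave with a test line.
        if not line or line.startswith("bpf_jit:"):
            continue
        parts.append(line)

    records = []
    # Joining first undoes mail line wrapping; each test then starts at "test_bpf: ".
    for chunk in " ".join(parts).split("test_bpf: ")[1:]:
        m = RECORD.match(chunk.strip())
        if m is None:  # for example the final "Summary:" line
            continue
        jited = m.group("jited")
        records.append({
            "num": int(m.group("num")),
            "name": m.group("name"),
            "jited": None if jited is None else int(jited),  # None: no JIT attempt logged
            "times": [int(t) for t in m.group("times").split()],
            "result": m.group("result"),
        })
    return records


def not_jited(records):
    """List (number, name) of tests that ran in the interpreter (jited:0)."""
    return [(r["num"], r["name"]) for r in records if r["jited"] == 0]


# A few lines copied from the log above, including a wrapped and an interleaved entry.
SAMPLE = """\
[   25.906393] test_bpf: #21 LD_CPU
[   25.906651] bpf_jit: *** NOT YET: opcode 85 ***
[   25.906795] jited:0 705 691 PASS
[   26.238862] test_bpf: #50 check: missing ret PASS
[   26.286373] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 108 PASS
[   26.362516] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 593 PASS
[   63.814754] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
"""

for num, name in not_jited(parse_test_bpf(SAMPLE)):
    print(f"#{num} {name}")
```

Feeding the full dmesg output to parse_test_bpf instead of SAMPLE would give the
complete list of jited:0 tests. The regular expression is a guess at the format
based only on the lines shown here, so it may need adjusting for other kernels.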

Let me know if you need more information.

Best,
Shubham Bansal


On Tue, May 16, 2017 at 1:25 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/13/2017 11:38 PM, Shubham Bansal wrote:
>>
>> Finally finished testing.
>>
>> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"
>
>
> What are the missing pieces and how is the performance compared
> to the interpreter?
>
> Thanks,
> Daniel

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-20 20:01                   ` Shubham Bansal
From: Shubham Bansal @ 2017-05-20 20:01 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel and Kees,

I have tested the JIT compiler on ARMv7, but not on ARMv5 or ARMv6.
Before I send the patch, can you tell me which architecture versions I
should test it on?
Also, for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
are both disabled, but I need to test the JIT with these flags as well.
Whenever I put these flags in the .config file, the ARM kernel is not
getting compiled with these flags. Can you tell me why? If you need
more information regarding this, please let me know.

With the current config for ARMv7, the benchmarks are:

[root@vexpress modules]# insmod test_bpf.ko
[   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
[   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
[   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
[   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
[   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
[   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
[   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
[   25.840970] test_bpf: #7 LD_ABS jited:1 343 304 289 PASS
[   25.851005] test_bpf: #8 LD_ABS_LL jited:1 362 300 PASS
[   25.858119] test_bpf: #9 LD_IND_LL jited:1 244 241 245 PASS
[   25.865994] test_bpf: #10 LD_ABS_NET jited:1 318 316 PASS
[   25.872829] test_bpf: #11 LD_IND_NET jited:1 243 196 196 PASS
[   25.879717] test_bpf: #12 LD_PKTTYPE jited:1 129 140 PASS
[   25.883034] test_bpf: #13 LD_MARK jited:1 113 88 PASS
[   25.885545] test_bpf: #14 LD_RXHASH jited:1 81 79 PASS
[   25.887506] test_bpf: #15 LD_QUEUE jited:1 88 85 PASS
[   25.889593] test_bpf: #16 LD_PROTOCOL jited:1 322 353 PASS
[   25.896894] test_bpf: #17 LD_VLAN_TAG jited:1 92 92 PASS
[   25.899173] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 85 88 PASS
[   25.901310] test_bpf: #19 LD_IFINDEX jited:1 94 130 PASS
[   25.904110] test_bpf: #20 LD_HATYPE jited:1 98 91 PASS
[   25.906393] test_bpf: #21 LD_CPU
[   25.906651] bpf_jit: *** NOT YET: opcode 85 ***
[   25.906795] jited:0 705 691 PASS
[   25.921007] test_bpf: #22 LD_NLATTR jited:0 577 668 PASS
[   25.933870] test_bpf: #23 LD_NLATTR_NEST jited:0 2253 3006 PASS
[   25.987020] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3840 4922 PASS
[   26.075119] test_bpf: #25 LD_ANC_XOR jited:1 107 94 PASS
[   26.077583] test_bpf: #26 SPILL_FILL jited:1 159 148 173 PASS
[   26.083259] test_bpf: #27 JEQ jited:1 274 183 181 PASS
[   26.090383] test_bpf: #28 JGT jited:1 255 194 165 PASS
[   26.097163] test_bpf: #29 JGE jited:1 187 190 246 PASS
[   26.103932] test_bpf: #30 JSET jited:1 178 184 192 PASS
[   26.110229] test_bpf: #31 tcpdump port 22 jited:1 266 698 717 PASS
[   26.127698] test_bpf: #32 tcpdump complex jited:1 267 729 1129 PASS
[   26.149727] test_bpf: #33 RET_A jited:1 94 88 PASS
[   26.152114] test_bpf: #34 INT: ADD trivial jited:1 87 PASS
[   26.153900] test_bpf: #35 INT: MUL_X jited:1 95 PASS
[   26.155384] test_bpf: #36 INT: MUL_X2 jited:1 82 PASS
[   26.156606] test_bpf: #37 INT: MUL32_X jited:1 91 PASS
[   26.157846] test_bpf: #38 INT: ADD 64-bit jited:1 1055 PASS
[   26.169280] test_bpf: #39 INT: ADD 32-bit jited:1 701 PASS
[   26.177039] test_bpf: #40 INT: SUB jited:1 931 PASS
[   26.187108] test_bpf: #41 INT: XOR jited:1 355 PASS
[   26.191364] test_bpf: #42 INT: MUL jited:1 389 PASS
[   26.196286] test_bpf: #43 MOV REG64 jited:1 267 PASS
[   26.199759] test_bpf: #44 MOV REG32 jited:1 176 PASS
[   26.202060] test_bpf: #45 LD IMM64 jited:1 194 PASS
[   26.204607] test_bpf: #46 INT: ALU MIX jited:0 1174 PASS
[   26.216896] test_bpf: #47 INT: shifts by register jited:1 211 PASS
[   26.219956] test_bpf: #48 INT: DIV + ABS jited:1 559 517 PASS
[   26.231347] test_bpf: #49 INT: DIV by zero jited:1 395 277 PASS
[   26.238862] test_bpf: #50 check: missing ret PASS
[   26.239288] test_bpf: #51 check: div_k_0 PASS
[   26.239492] test_bpf: #52 check: unknown insn PASS
[   26.239640] test_bpf: #53 check: out of range spill/fill PASS
[   26.239803] test_bpf: #54 JUMPS + HOLES jited:1 295 PASS
[   26.243343] test_bpf: #55 check: RET X PASS
[   26.244065] test_bpf: #56 check: LDX + RET X PASS
[   26.244224] test_bpf: #57 M[]: alt STX + LDX jited:1 433 PASS
[   26.249126] test_bpf: #58 M[]: full STX + full LDX jited:1 427 PASS
[   26.254123] test_bpf: #59 check: SKF_AD_MAX PASS
[   26.254509] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 298 PASS
[   26.257882] test_bpf: #61 load 64-bit immediate jited:1 128 PASS
[   26.259813] test_bpf: #62 nmap reduced jited:1 655 PASS
[   26.267216] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 89 PASS
[   26.268766] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 72 PASS
[   26.270126] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 94 PASS
[   26.271768] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 145 PASS
[   26.274152] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 93 PASS
[   26.275673] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 103 PASS
[   26.277371] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 99 PASS
[   26.278966] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 110 PASS
[   26.280440] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 96 PASS
[   26.281843] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 103 PASS
[   26.283682] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 96 PASS
[   26.285147] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 85 PASS
[   26.286373] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 108 PASS
[   26.288079] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 112 PASS
[   26.289653] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 70 PASS
[   26.290666] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 85 PASS
[   26.291897] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 95 PASS
[   26.293429] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 96 PASS
[   26.294794] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 79 PASS
[   26.295956] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 70 PASS
[   26.297026] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 77 PASS
[   26.298109] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 120 PASS
[   26.299705] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 85 PASS
[   26.300902] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 121 PASS
[   26.302578] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 115 PASS
[   26.304134] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 136 PASS
[   26.305881] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 119 PASS
[   26.307481] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 90 PASS
[   26.308784] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 83 PASS
[   26.310091] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 82 PASS
[   26.311534] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 71 PASS
[   26.312842] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 143 PASS
[   26.315010] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 116 PASS
[   26.317106] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 119 PASS
[   26.318834] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 119 PASS
[   26.320484] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 110 PASS
[   26.322003] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 117 PASS
[   26.323841] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 84 PASS
[   26.325043] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 84 PASS
[   26.326300] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 97 PASS
[   26.327661] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 74 PASS
[   26.328760] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 80 PASS
[   26.329880] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 93 PASS
[   26.331166] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 81 PASS
[   26.332348] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 89 PASS
[   26.333616] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 73 PASS
[   26.334796] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 75 PASS
[   26.335880] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 88 PASS
[   26.337138] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 113 PASS
[   26.338609] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 99 PASS
[   26.339983] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 70 PASS
[   26.341036] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 75 PASS
[   26.342242] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 91 PASS
[   26.343719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 83 PASS
[   26.344945] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 84 PASS
[   26.346135] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 69 PASS
[   26.347240] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 99 PASS
[   26.348596] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 73 PASS
[   26.349749] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 89 PASS
[   26.350992] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 101 PASS
[   26.352436] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 112 PASS
[   26.354144] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 145 PASS
[   26.356392] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 204 PASS
[   26.359242] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 232 PASS
[   26.362516] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 593 PASS
[   26.368978] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 517 PASS
[   26.374539] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 666 PASS
[   26.381642] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 225 PASS
[   26.384418] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   26.386820] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 195 PASS
[   26.389428] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 354 PASS
[   26.393537] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 403 PASS
[   26.398414] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 388 PASS
[   26.403006] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 387 PASS
[   26.407619] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 577 PASS
[   26.413875] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 284 PASS
[   26.417106] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 298 PASS
[   26.420489] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 609 PASS
[   26.426958] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 588 PASS
[   26.433454] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 301 PASS
[   26.436831] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   26.437152] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 316 PASS
[   26.440713] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 612 PASS
[   26.447535] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   26.448057] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 612 PASS
[   26.454579] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 112 PASS
[   26.456065] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 76 PASS
[   26.457168] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 84 PASS
[   26.458350] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 78 PASS
[   26.459582] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 81 PASS
[   26.460724] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 89 PASS
[   26.462005] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 99 PASS
[   26.463622] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 86 PASS
[   26.464833] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 100 PASS
[   26.466244] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 126 PASS
[   26.467904] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 105 PASS
[   26.469357] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.470510] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 72 PASS
[   26.471626] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.472684] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 87 PASS
[   26.473892] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 94 PASS
[   26.475174] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 84 PASS
[   26.476385] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   26.477586] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 80 PASS
[   26.478723] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 124 PASS
[   26.480417] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 94 PASS
[   26.481820] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 150 PASS
[   26.483952] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 72 PASS
[   26.485195] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 89 PASS
[   26.486648] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   26.488214] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 71 PASS
[   26.489566] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 79 PASS
[   26.490791] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 105 PASS
[   26.492548] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 128 PASS
[   26.494713] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 86 PASS
[   26.496072] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 102 PASS
[   26.497612] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 88 PASS
[   26.498906] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 98 PASS
[   26.500256] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 107 PASS
[   26.501668] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 70 PASS
[   26.502690] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   26.504077] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.505197] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 76 PASS
[   26.506268] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 71 PASS
[   26.507301] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 69 PASS
[   26.508374] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.509494] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 76 PASS
[   26.510665] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 77 PASS
[   26.511787] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 79 PASS
[   26.513033] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   26.514382] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 80 PASS
[   26.515648] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 72 PASS
[   26.516778] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 86 PASS
[   26.517971] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 78 PASS
[   26.519188] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 90 PASS
[   26.520458] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 69 PASS
[   26.521509] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 79 PASS
[   26.522692] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 101 PASS
[   26.524066] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 69 PASS
[   26.525152] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 69 PASS
[   26.526264] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 112 PASS
[   26.527879] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 75 PASS
[   26.529323] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 88 PASS
[   26.530801] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 103 PASS
[   26.532789] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 121 PASS
[   26.534881] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 74 PASS
[   26.536388] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 139 PASS
[   26.538618] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 93 PASS
[   26.540104] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 87 PASS
[   26.541494] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 104 PASS
[   26.543038] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 111 PASS
[   26.544554] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 92 PASS
[   26.545885] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 124 PASS
[   26.547523] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 83 PASS
[   26.548753] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 87 PASS
[   26.550001] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 91 PASS
[   26.551660] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 117 PASS
[   26.553320] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 103 PASS
[   26.554733] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   26.555984] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 289 PASS
[   26.559204] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   26.559438] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 242 PASS
[   26.562189] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
141847 PASS
[   27.981183] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 374 PASS
[   27.985739] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   27.985990] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 274 PASS
[   27.989010] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
340041 PASS
[   31.389811] test_bpf: #230 JMP_EXIT jited:1 73 PASS
[   31.391325] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 90 PASS
[   31.392672] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 101 PASS
[   31.394242] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 76 PASS
[   31.395380] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 83 PASS
[   31.396628] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 80 PASS
[   31.397766] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 79 PASS
[   31.398935] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 124 PASS
[   31.400772] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 181 PASS
[   31.403241] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 97 PASS
[   31.404772] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 80 PASS
[   31.405965] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 79 PASS
[   31.407146] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 88 PASS
[   31.408357] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 115 PASS
[   31.409855] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 89 PASS
[   31.411190] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 93 PASS
[   31.412513] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 90 PASS
[   31.413820] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 93 PASS
[   31.415252] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 94 PASS
[   31.416629] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 88 PASS
[   31.417834] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 77 PASS
[   31.419077] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 95 PASS
[   31.420402] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 97 PASS
[   31.421841] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 110 PASS
[   31.423894] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 94 PASS
[   31.425370] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 107 PASS
[   31.427259] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 101 PASS
[   31.428931] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 94 PASS
[   31.430449] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 96 PASS
[   31.431873] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 217 PASS
[   31.434897] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 145 PASS
[   31.437097] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 184 PASS
[   31.460982] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 250 PASS
[   31.487721] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 6762 PASS
[   31.569308] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   31.569438] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 141 PASS
[   31.585479] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 31891 25898 PASS
[   32.183199] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 822248 831950 PASS
[   48.732702] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 136555 PASS
[   50.117439] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 168 PASS
[   50.127426] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 255760 PASS
[   52.690619] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1146 PASS
[   52.724995] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 562650 PASS
[   58.361997] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 527825 PASS
[   63.641315] test_bpf: #274 LD_IND byte frag jited:1 649 PASS
[   63.649443] test_bpf: #275 LD_IND halfword frag jited:1 640 PASS
[   63.656527] test_bpf: #276 LD_IND word frag jited:1 668 PASS
[   63.664376] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 990 PASS
[   63.675745] test_bpf: #278 LD_IND word mixed head/frag jited:1 971 PASS
[   63.686341] test_bpf: #279 LD_ABS byte frag jited:1 778 PASS
[   63.694965] test_bpf: #280 LD_ABS halfword frag jited:1 982 PASS
[   63.705562] test_bpf: #281 LD_ABS word frag jited:1 737 PASS
[   63.714005] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 997 PASS
[   63.724868] test_bpf: #283 LD_ABS word mixed head/frag jited:1 1037 PASS
[   63.736399] test_bpf: #284 LD_IND byte default X jited:1 270 PASS
[   63.740296] test_bpf: #285 LD_IND byte positive offset jited:1 214 PASS
[   63.743487] test_bpf: #286 LD_IND byte negative offset jited:1 251 PASS
[   63.747173] test_bpf: #287 LD_IND halfword positive offset jited:1 191 PASS
[   63.749669] test_bpf: #288 LD_IND halfword negative offset jited:1 188 PASS
[   63.752163] test_bpf: #289 LD_IND halfword unaligned jited:1 212 PASS
[   63.754834] test_bpf: #290 LD_IND word positive offset jited:1 187 PASS
[   63.757257] test_bpf: #291 LD_IND word negative offset jited:1 180 PASS
[   63.759764] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 211 PASS
[   63.762621] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 276 PASS
[   63.766332] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 270 PASS
[   63.769931] test_bpf: #295 LD_ABS byte jited:1 253 PASS
[   63.773492] test_bpf: #296 LD_ABS halfword jited:1 251 PASS
[   63.777622] test_bpf: #297 LD_ABS halfword unaligned jited:1 259 PASS
[   63.781554] test_bpf: #298 LD_ABS word jited:1 216 PASS
[   63.784630] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 198 PASS
[   63.787174] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 177 PASS
[   63.789367] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 235 PASS
[   63.792235] test_bpf: #302 ADD default X jited:1 85 PASS
[   63.793953] test_bpf: #303 ADD default A jited:1 87 PASS
[   63.795439] test_bpf: #304 SUB default X jited:1 113 PASS
[   63.797128] test_bpf: #305 SUB default A jited:1 94 PASS
[   63.798801] test_bpf: #306 MUL default X jited:1 95 PASS
[   63.800387] test_bpf: #307 MUL default A jited:1 83 PASS
[   63.801704] test_bpf: #308 DIV default X jited:1 100 PASS
[   63.803305] test_bpf: #309 DIV default A jited:1 216 PASS
[   63.806105] test_bpf: #310 MOD default X jited:1 94 PASS
[   63.807447] test_bpf: #311 MOD default A jited:1 270 PASS
[   63.810667] test_bpf: #312 JMP EQ default A jited:1 94 PASS
[   63.812321] test_bpf: #313 JMP EQ default X jited:1 163 PASS
[   63.814754] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
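
A minimal sketch, not part of the patch, of one way to list the tests that fell back to the interpreter in a mail-wrapped log like the one above. It assumes the log format shown here (a dmesg timestamp, then `test_bpf: #N name ... jited:0|1 ... PASS`, possibly wrapped onto a second line or interrupted by a `bpf_jit:` notice); the function name `non_jited_tests` is made up for illustration.

```python
import re

STAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")        # dmesg timestamp prefix
TEST = re.compile(r"^test_bpf: #(\d+)\s+(.*)$")   # start of one test record


def non_jited_tests(log_text):
    """Return (test number, test name) for each test reported as jited:0."""
    records = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        stamped = STAMP.match(line) is not None
        body = STAMP.sub("", line)
        m = TEST.match(body) if stamped else None
        if m:
            # New test record: remember its number and the text so far.
            current = [int(m.group(1)), m.group(2)]
            records.append(current)
        elif stamped and body.startswith("test_bpf:"):
            # Other test_bpf output (e.g. the Summary line) ends the record.
            current = None
        elif stamped and not body.startswith("jited:"):
            # Unrelated printk between name and result (e.g. a bpf_jit notice).
            continue
        elif current is not None:
            # Mail-wrapped tail or a separately printed "jited:..." result.
            current[1] += " " + body
    return [(n, t.split(" jited:")[0]) for n, t in records if "jited:0" in t]
```

Feeding the full log to `non_jited_tests()` gives the test numbers and names that were not JITed, which is one way to enumerate the missing pieces.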

Let me know if you need more information.

Best,
Shubham Bansal


On Tue, May 16, 2017 at 1:25 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/13/2017 11:38 PM, Shubham Bansal wrote:
>>
>> Finally finished testing.
>>
>> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"
>
>
> What are the missing pieces and how is the performance compared
> to the interpreter?
>
> Thanks,
> Daniel

* [kernel-hardening] Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-20 20:01                   ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-20 20:01 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Kees Cook, David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel and Kees,

Before I send the patch: I have tested the JIT compiler on ARMv7 but
not on ARMv5 or ARMv6. Can you tell me which arch versions I should
test it on?
Also, in my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
are both disabled, but I need to test the JIT with these options
enabled as well. Whenever I set these options in the .config file, the
ARM kernel is not getting compiled with them. Can you tell me why? If
you need more information regarding this, please let me know.

With the current config for ARMv7, the benchmarks are:

[root@vexpress modules]# insmod test_bpf.ko
[   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
[   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
[   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
[   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
[   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
[   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
[   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
[   25.840970] test_bpf: #7 LD_ABS jited:1 343 304 289 PASS
[   25.851005] test_bpf: #8 LD_ABS_LL jited:1 362 300 PASS
[   25.858119] test_bpf: #9 LD_IND_LL jited:1 244 241 245 PASS
[   25.865994] test_bpf: #10 LD_ABS_NET jited:1 318 316 PASS
[   25.872829] test_bpf: #11 LD_IND_NET jited:1 243 196 196 PASS
[   25.879717] test_bpf: #12 LD_PKTTYPE jited:1 129 140 PASS
[   25.883034] test_bpf: #13 LD_MARK jited:1 113 88 PASS
[   25.885545] test_bpf: #14 LD_RXHASH jited:1 81 79 PASS
[   25.887506] test_bpf: #15 LD_QUEUE jited:1 88 85 PASS
[   25.889593] test_bpf: #16 LD_PROTOCOL jited:1 322 353 PASS
[   25.896894] test_bpf: #17 LD_VLAN_TAG jited:1 92 92 PASS
[   25.899173] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 85 88 PASS
[   25.901310] test_bpf: #19 LD_IFINDEX jited:1 94 130 PASS
[   25.904110] test_bpf: #20 LD_HATYPE jited:1 98 91 PASS
[   25.906393] test_bpf: #21 LD_CPU
[   25.906651] bpf_jit: *** NOT YET: opcode 85 ***
[   25.906795] jited:0 705 691 PASS
[   25.921007] test_bpf: #22 LD_NLATTR jited:0 577 668 PASS
[   25.933870] test_bpf: #23 LD_NLATTR_NEST jited:0 2253 3006 PASS
[   25.987020] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3840 4922 PASS
[   26.075119] test_bpf: #25 LD_ANC_XOR jited:1 107 94 PASS
[   26.077583] test_bpf: #26 SPILL_FILL jited:1 159 148 173 PASS
[   26.083259] test_bpf: #27 JEQ jited:1 274 183 181 PASS
[   26.090383] test_bpf: #28 JGT jited:1 255 194 165 PASS
[   26.097163] test_bpf: #29 JGE jited:1 187 190 246 PASS
[   26.103932] test_bpf: #30 JSET jited:1 178 184 192 PASS
[   26.110229] test_bpf: #31 tcpdump port 22 jited:1 266 698 717 PASS
[   26.127698] test_bpf: #32 tcpdump complex jited:1 267 729 1129 PASS
[   26.149727] test_bpf: #33 RET_A jited:1 94 88 PASS
[   26.152114] test_bpf: #34 INT: ADD trivial jited:1 87 PASS
[   26.153900] test_bpf: #35 INT: MUL_X jited:1 95 PASS
[   26.155384] test_bpf: #36 INT: MUL_X2 jited:1 82 PASS
[   26.156606] test_bpf: #37 INT: MUL32_X jited:1 91 PASS
[   26.157846] test_bpf: #38 INT: ADD 64-bit jited:1 1055 PASS
[   26.169280] test_bpf: #39 INT: ADD 32-bit jited:1 701 PASS
[   26.177039] test_bpf: #40 INT: SUB jited:1 931 PASS
[   26.187108] test_bpf: #41 INT: XOR jited:1 355 PASS
[   26.191364] test_bpf: #42 INT: MUL jited:1 389 PASS
[   26.196286] test_bpf: #43 MOV REG64 jited:1 267 PASS
[   26.199759] test_bpf: #44 MOV REG32 jited:1 176 PASS
[   26.202060] test_bpf: #45 LD IMM64 jited:1 194 PASS
[   26.204607] test_bpf: #46 INT: ALU MIX jited:0 1174 PASS
[   26.216896] test_bpf: #47 INT: shifts by register jited:1 211 PASS
[   26.219956] test_bpf: #48 INT: DIV + ABS jited:1 559 517 PASS
[   26.231347] test_bpf: #49 INT: DIV by zero jited:1 395 277 PASS
[   26.238862] test_bpf: #50 check: missing ret PASS
[   26.239288] test_bpf: #51 check: div_k_0 PASS
[   26.239492] test_bpf: #52 check: unknown insn PASS
[   26.239640] test_bpf: #53 check: out of range spill/fill PASS
[   26.239803] test_bpf: #54 JUMPS + HOLES jited:1 295 PASS
[   26.243343] test_bpf: #55 check: RET X PASS
[   26.244065] test_bpf: #56 check: LDX + RET X PASS
[   26.244224] test_bpf: #57 M[]: alt STX + LDX jited:1 433 PASS
[   26.249126] test_bpf: #58 M[]: full STX + full LDX jited:1 427 PASS
[   26.254123] test_bpf: #59 check: SKF_AD_MAX PASS
[   26.254509] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 298 PASS
[   26.257882] test_bpf: #61 load 64-bit immediate jited:1 128 PASS
[   26.259813] test_bpf: #62 nmap reduced jited:1 655 PASS
[   26.267216] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 89 PASS
[   26.268766] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 72 PASS
[   26.270126] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 94 PASS
[   26.271768] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 145 PASS
[   26.274152] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 93 PASS
[   26.275673] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 103 PASS
[   26.277371] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 99 PASS
[   26.278966] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 110 PASS
[   26.280440] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 96 PASS
[   26.281843] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 103 PASS
[   26.283682] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 96 PASS
[   26.285147] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 85 PASS
[   26.286373] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 108 PASS
[   26.288079] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 112 PASS
[   26.289653] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 70 PASS
[   26.290666] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 85 PASS
[   26.291897] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 95 PASS
[   26.293429] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 96 PASS
[   26.294794] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 79 PASS
[   26.295956] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 70 PASS
[   26.297026] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 77 PASS
[   26.298109] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 120 PASS
[   26.299705] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 85 PASS
[   26.300902] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 121 PASS
[   26.302578] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 115 PASS
[   26.304134] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 136 PASS
[   26.305881] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 119 PASS
[   26.307481] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 90 PASS
[   26.308784] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 83 PASS
[   26.310091] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 82 PASS
[   26.311534] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 71 PASS
[   26.312842] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 143 PASS
[   26.315010] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 116 PASS
[   26.317106] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 119 PASS
[   26.318834] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 119 PASS
[   26.320484] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 110 PASS
[   26.322003] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 117 PASS
[   26.323841] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 84 PASS
[   26.325043] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 84 PASS
[   26.326300] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 97 PASS
[   26.327661] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 74 PASS
[   26.328760] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 80 PASS
[   26.329880] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 93 PASS
[   26.331166] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 81 PASS
[   26.332348] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 89 PASS
[   26.333616] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 73 PASS
[   26.334796] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 75 PASS
[   26.335880] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 88 PASS
[   26.337138] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 113 PASS
[   26.338609] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 99 PASS
[   26.339983] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 70 PASS
[   26.341036] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 75 PASS
[   26.342242] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 91 PASS
[   26.343719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 83 PASS
[   26.344945] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 84 PASS
[   26.346135] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 69 PASS
[   26.347240] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 99 PASS
[   26.348596] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 73 PASS
[   26.349749] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 89 PASS
[   26.350992] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 101 PASS
[   26.352436] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 112 PASS
[   26.354144] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 145 PASS
[   26.356392] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 204 PASS
[   26.359242] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 232 PASS
[   26.362516] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 593 PASS
[   26.368978] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 517 PASS
[   26.374539] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 666 PASS
[   26.381642] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 225 PASS
[   26.384418] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   26.386820] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 195 PASS
[   26.389428] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 354 PASS
[   26.393537] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 403 PASS
[   26.398414] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 388 PASS
[   26.403006] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 387 PASS
[   26.407619] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 577 PASS
[   26.413875] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 284 PASS
[   26.417106] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 298 PASS
[   26.420489] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 609 PASS
[   26.426958] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 588 PASS
[   26.433454] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 301 PASS
[   26.436831] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   26.437152] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 316 PASS
[   26.440713] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 612 PASS
[   26.447535] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   26.448057] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 612 PASS
[   26.454579] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 112 PASS
[   26.456065] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 76 PASS
[   26.457168] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 84 PASS
[   26.458350] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 78 PASS
[   26.459582] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 81 PASS
[   26.460724] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 89 PASS
[   26.462005] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 99 PASS
[   26.463622] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 86 PASS
[   26.464833] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 100 PASS
[   26.466244] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 126 PASS
[   26.467904] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 105 PASS
[   26.469357] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.470510] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 72 PASS
[   26.471626] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 71 PASS
[   26.472684] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 87 PASS
[   26.473892] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 94 PASS
[   26.475174] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 84 PASS
[   26.476385] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   26.477586] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 80 PASS
[   26.478723] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 124 PASS
[   26.480417] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 94 PASS
[   26.481820] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 150 PASS
[   26.483952] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 72 PASS
[   26.485195] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 89 PASS
[   26.486648] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   26.488214] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 71 PASS
[   26.489566] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 79 PASS
[   26.490791] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 105 PASS
[   26.492548] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 128 PASS
[   26.494713] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 86 PASS
[   26.496072] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 102 PASS
[   26.497612] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 88 PASS
[   26.498906] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 98 PASS
[   26.500256] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 107 PASS
[   26.501668] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 70 PASS
[   26.502690] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   26.504077] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.505197] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 76 PASS
[   26.506268] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 71 PASS
[   26.507301] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 69 PASS
[   26.508374] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 78 PASS
[   26.509494] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 76 PASS
[   26.510665] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 77 PASS
[   26.511787] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 79 PASS
[   26.513033] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   26.514382] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 80 PASS
[   26.515648] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 72 PASS
[   26.516778] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 86 PASS
[   26.517971] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 78 PASS
[   26.519188] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 90 PASS
[   26.520458] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 69 PASS
[   26.521509] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 79 PASS
[   26.522692] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 101 PASS
[   26.524066] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 69 PASS
[   26.525152] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 69 PASS
[   26.526264] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 112 PASS
[   26.527879] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 75 PASS
[   26.529323] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 88 PASS
[   26.530801] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 103 PASS
[   26.532789] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 121 PASS
[   26.534881] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 74 PASS
[   26.536388] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 139 PASS
[   26.538618] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 93 PASS
[   26.540104] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 87 PASS
[   26.541494] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 104 PASS
[   26.543038] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 111 PASS
[   26.544554] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 92 PASS
[   26.545885] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 124 PASS
[   26.547523] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 83 PASS
[   26.548753] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 87 PASS
[   26.550001] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 91 PASS
[   26.551660] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 117 PASS
[   26.553320] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 103 PASS
[   26.554733] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   26.555984] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 289 PASS
[   26.559204] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   26.559438] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 242 PASS
[   26.562189] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
141847 PASS
[   27.981183] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 374 PASS
[   27.985739] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   27.985990] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 274 PASS
[   27.989010] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
340041 PASS
[   31.389811] test_bpf: #230 JMP_EXIT jited:1 73 PASS
[   31.391325] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 90 PASS
[   31.392672] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 101 PASS
[   31.394242] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 76 PASS
[   31.395380] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 83 PASS
[   31.396628] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 80 PASS
[   31.397766] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 79 PASS
[   31.398935] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 124 PASS
[   31.400772] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 181 PASS
[   31.403241] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 97 PASS
[   31.404772] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 80 PASS
[   31.405965] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 79 PASS
[   31.407146] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 88 PASS
[   31.408357] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 115 PASS
[   31.409855] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 89 PASS
[   31.411190] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 93 PASS
[   31.412513] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 90 PASS
[   31.413820] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 93 PASS
[   31.415252] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 94 PASS
[   31.416629] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 88 PASS
[   31.417834] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 77 PASS
[   31.419077] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 95 PASS
[   31.420402] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 97 PASS
[   31.421841] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 110 PASS
[   31.423894] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 94 PASS
[   31.425370] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 107 PASS
[   31.427259] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 101 PASS
[   31.428931] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 94 PASS
[   31.430449] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 96 PASS
[   31.431873] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 217 PASS
[   31.434897] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 145 PASS
[   31.437097] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 184 PASS
[   31.460982] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 250 PASS
[   31.487721] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 6762 PASS
[   31.569308] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   31.569438] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 141 PASS
[   31.585479] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 31891 25898 PASS
[   32.183199] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 822248 831950 PASS
[   48.732702] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 136555 PASS
[   50.117439] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 168 PASS
[   50.127426] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 255760 PASS
[   52.690619] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1146 PASS
[   52.724995] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 562650 PASS
[   58.361997] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 527825 PASS
[   63.641315] test_bpf: #274 LD_IND byte frag jited:1 649 PASS
[   63.649443] test_bpf: #275 LD_IND halfword frag jited:1 640 PASS
[   63.656527] test_bpf: #276 LD_IND word frag jited:1 668 PASS
[   63.664376] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 990 PASS
[   63.675745] test_bpf: #278 LD_IND word mixed head/frag jited:1 971 PASS
[   63.686341] test_bpf: #279 LD_ABS byte frag jited:1 778 PASS
[   63.694965] test_bpf: #280 LD_ABS halfword frag jited:1 982 PASS
[   63.705562] test_bpf: #281 LD_ABS word frag jited:1 737 PASS
[   63.714005] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 997 PASS
[   63.724868] test_bpf: #283 LD_ABS word mixed head/frag jited:1 1037 PASS
[   63.736399] test_bpf: #284 LD_IND byte default X jited:1 270 PASS
[   63.740296] test_bpf: #285 LD_IND byte positive offset jited:1 214 PASS
[   63.743487] test_bpf: #286 LD_IND byte negative offset jited:1 251 PASS
[   63.747173] test_bpf: #287 LD_IND halfword positive offset jited:1 191 PASS
[   63.749669] test_bpf: #288 LD_IND halfword negative offset jited:1 188 PASS
[   63.752163] test_bpf: #289 LD_IND halfword unaligned jited:1 212 PASS
[   63.754834] test_bpf: #290 LD_IND word positive offset jited:1 187 PASS
[   63.757257] test_bpf: #291 LD_IND word negative offset jited:1 180 PASS
[   63.759764] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 211 PASS
[   63.762621] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 276 PASS
[   63.766332] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 270 PASS
[   63.769931] test_bpf: #295 LD_ABS byte jited:1 253 PASS
[   63.773492] test_bpf: #296 LD_ABS halfword jited:1 251 PASS
[   63.777622] test_bpf: #297 LD_ABS halfword unaligned jited:1 259 PASS
[   63.781554] test_bpf: #298 LD_ABS word jited:1 216 PASS
[   63.784630] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 198 PASS
[   63.787174] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 177 PASS
[   63.789367] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 235 PASS
[   63.792235] test_bpf: #302 ADD default X jited:1 85 PASS
[   63.793953] test_bpf: #303 ADD default A jited:1 87 PASS
[   63.795439] test_bpf: #304 SUB default X jited:1 113 PASS
[   63.797128] test_bpf: #305 SUB default A jited:1 94 PASS
[   63.798801] test_bpf: #306 MUL default X jited:1 95 PASS
[   63.800387] test_bpf: #307 MUL default A jited:1 83 PASS
[   63.801704] test_bpf: #308 DIV default X jited:1 100 PASS
[   63.803305] test_bpf: #309 DIV default A jited:1 216 PASS
[   63.806105] test_bpf: #310 MOD default X jited:1 94 PASS
[   63.807447] test_bpf: #311 MOD default A jited:1 270 PASS
[   63.810667] test_bpf: #312 JMP EQ default A jited:1 94 PASS
[   63.812321] test_bpf: #313 JMP EQ default X jited:1 163 PASS
[   63.814754] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
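
To see which tests account for the gap in the "[278/306 JIT'ed]" summary, the log can be filtered for entries that ran with jited:0. Below is a minimal sketch, assuming the dmesg output has been saved as text and any "> " quote prefixes have been stripped. The helper names (unwrap, not_jited) are made up for illustration and are not part of test_bpf.

```python
import re

# One un-wrapped test_bpf result line: number, name, jited flag, timings, verdict.
ENTRY = re.compile(r"test_bpf: #(\d+) (.*?) jited:([01])((?: \d+)*) (PASS|FAIL)")

def unwrap(log_text):
    """Re-join mail-wrapped lines onto the '[ timestamp]' line they continue."""
    entries, open_entry = [], False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            entries.append(line)
            open_entry = True
        elif line and open_entry:
            entries[-1] += " " + line
        else:
            # A blank line (or stray text after one) ends the current entry.
            open_entry = False
    return entries

def not_jited(log_text):
    """Return (test number, test name) for every entry that ran with jited:0."""
    found = []
    for entry in unwrap(log_text):
        m = ENTRY.search(entry)
        if m and m.group(3) == "0":
            found.append((int(m.group(1)), m.group(2)))
    return found
```

Run against the JIT-enabled log above, this should pick out entries such as the STX_XADD_W and STX_XADD_DW tests, which still report jited:0.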

Let me know if you need more information.

Best,
Shubham Bansal


On Tue, May 16, 2017 at 1:25 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/13/2017 11:38 PM, Shubham Bansal wrote:
>>
>> Finally finished testing.
>>
>> "test_bpf: Summary: 314 PASSED, 0 FAILED, [274/306 JIT'ed]"
>
>
> What are the missing pieces and how is the performance compared
> to the interpreter?
>
> Thanks,
> Daniel

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-20 20:01                   ` Shubham Bansal
  (?)
@ 2017-05-22 13:01                     ` Daniel Borkmann
  -1 siblings, 0 replies; 99+ messages in thread
From: Daniel Borkmann @ 2017-05-22 13:01 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Kees Cook, David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

On 05/20/2017 10:01 PM, Shubham Bansal wrote:
[...]
> Before I send the patch: I have tested the JIT compiler on ARMv7 but
> not on ARMv5 or ARMv6. Can you tell me which arch versions I should
> test it on?
> Also, in my testing CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
> are both disabled, but I need to test the JIT with these options as well.
> Whenever I put these options in the .config file, the ARM kernel is not
> getting compiled with them. Can you tell me why? If you need
> more information regarding this, please let me know.

Maybe Mircea, Kees or someone from linux-arm-kernel can help you out
on that.

With regard to the benchmark below, I was asking how it compares
to the interpreter. With only the numbers for the JIT it's hard to compare.
So it would be great to see the output for the following three cases:

1) Interpreter:

echo 0 > /proc/sys/net/core/bpf_jit_enable

2) JIT enabled:

echo 1 > /proc/sys/net/core/bpf_jit_enable

3) JIT + blinding enabled:

echo 1 > /proc/sys/net/core/bpf_jit_enable
echo 2 > /proc/sys/net/core/bpf_jit_harden
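
Once the output for these cases has been captured, the runs can be compared test by test. The following is a rough sketch only, assuming each run's dmesg output is saved as plain text without "> " quote prefixes, and that the numbers after "jited:" are per-run durations (lower is better). The function names parse and speedups are hypothetical and not part of test_bpf.

```python
import re
from statistics import mean

# A passing result line: test number, name, jited flag, timing columns.
ENTRY = re.compile(r"test_bpf: #(\d+) (.*?) jited:([01])((?: \d+)*) PASS")

def parse(log_text):
    """Map test number -> (was it JITed, mean of its timing columns)."""
    results = {}
    # Each entry starts at a "[ timestamp]"; this also re-joins wrapped lines.
    for entry in re.split(r"\n(?=\[)", log_text):
        m = ENTRY.search(" ".join(entry.split()))
        if m and m.group(4):  # skip entries that carry no timing at all
            times = [int(t) for t in m.group(4).split()]
            results[int(m.group(1))] = (m.group(3) == "1", mean(times))
    return results

def speedups(interp_log, jit_log):
    """Interpreter time divided by JIT time, for tests the JIT run compiled."""
    interp, jit = parse(interp_log), parse(jit_log)
    return {
        n: interp[n][1] / t
        for n, (jited, t) in jit.items()
        if jited and n in interp
    }
```

For example, test #0 TAX shows 757 645 650 under the interpreter and 180 170 169 with the JIT in this thread. That averages to 684 against 173, a ratio of roughly 3.95. Tests that stay at jited:0 in the JIT run are left out of the comparison.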

> With the current config for ARMv7, the benchmarks are:
>
> [root@vexpress modules]# insmod test_bpf.ko
> [   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
> [   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
> [   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
> [   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
> [   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
> [   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
> [   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
[...]

Thanks,
Daniel

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-22 13:01                     ` Daniel Borkmann
  (?)
@ 2017-05-22 17:04                       ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-22 17:04 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Kees Cook, David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel,

Here are the benchmarks.

1) Interpreter:

[root@vexpress modules]# insmod test_bpf.ko
[   37.244999] test_bpf: #0 TAX jited:0 757 645 650 PASS
[   37.272577] test_bpf: #1 TXA jited:0 366 334 336 PASS
[   37.283507] test_bpf: #2 ADD_SUB_MUL_K jited:0 543 PASS
[   37.289542] test_bpf: #3 DIV_MOD_KX jited:0 1509 PASS
[   37.305374] test_bpf: #4 AND_OR_LSH_K jited:0 539 559 PASS
[   37.317209] test_bpf: #5 LD_IMM_0 jited:0 412 PASS
[   37.321820] test_bpf: #6 LD_IND jited:0 428 376 389 PASS
[   37.334327] test_bpf: #7 LD_ABS jited:0 509 405 358 PASS
[   37.350596] test_bpf: #8 LD_ABS_LL jited:0 542 783 PASS
[   37.364340] test_bpf: #9 LD_IND_LL jited:0 524 496 723 PASS
[   37.382352] test_bpf: #10 LD_ABS_NET jited:0 527 545 PASS
[   37.393642] test_bpf: #11 LD_IND_NET jited:0 650 495 647 PASS
[   37.412228] test_bpf: #12 LD_PKTTYPE jited:0 686 901 PASS
[   37.428818] test_bpf: #13 LD_MARK jited:0 305 291 PASS
[   37.435349] test_bpf: #14 LD_RXHASH jited:0 257 259 PASS
[   37.440850] test_bpf: #15 LD_QUEUE jited:0 255 254 PASS
[   37.446254] test_bpf: #16 LD_PROTOCOL jited:0 593 603 PASS
[   37.458570] test_bpf: #17 LD_VLAN_TAG jited:0 288 292 PASS
[   37.464821] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:0 335 421 PASS
[   37.472817] test_bpf: #19 LD_IFINDEX jited:0 8568 606 PASS
[   37.565163] test_bpf: #20 LD_HATYPE jited:0 618 695 PASS
[   37.579457] test_bpf: #21 LD_CPU jited:0 1200 1172 PASS
[   37.604424] test_bpf: #22 LD_NLATTR jited:0 979 1124 PASS
[   37.626345] test_bpf: #23 LD_NLATTR_NEST jited:0 12232 3593 PASS
[   37.785251] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3697 4834 PASS
[   37.871224] test_bpf: #25 LD_ANC_XOR jited:0 314 344 PASS
[   37.878210] test_bpf: #26 SPILL_FILL jited:0 757 850 903 PASS
[   37.903954] test_bpf: #27 JEQ jited:0 380 420 426 PASS
[   37.916756] test_bpf: #28 JGT jited:0 376 467 448 PASS
[   37.930276] test_bpf: #29 JGE jited:0 446 590 498 PASS
[   37.946729] test_bpf: #30 JSET jited:0 571 787 1003 PASS
[   37.970896] test_bpf: #31 tcpdump port 22 jited:0 358 1079 1190 PASS
[   37.997814] test_bpf: #32 tcpdump complex jited:0 319 1061 2324 PASS
[   38.035596] test_bpf: #33 RET_A jited:0 253 249 PASS
[   38.041262] test_bpf: #34 INT: ADD trivial jited:0 414 PASS
[   38.045777] test_bpf: #35 INT: MUL_X jited:0 336 PASS
[   38.049402] test_bpf: #36 INT: MUL_X2 jited:0 431 PASS
[   38.054178] test_bpf: #37 INT: MUL32_X jited:0 523 PASS
[   38.059902] test_bpf: #38 INT: ADD 64-bit jited:0 5263 PASS
[   38.113069] test_bpf: #39 INT: ADD 32-bit jited:0 4127 PASS
[   38.154754] test_bpf: #40 INT: SUB jited:0 4218 PASS
[   38.197294] test_bpf: #41 INT: XOR jited:0 2252 PASS
[   38.220159] test_bpf: #42 INT: MUL jited:0 1986 PASS
[   38.240410] test_bpf: #43 MOV REG64 jited:0 1103 PASS
[   38.251796] test_bpf: #44 MOV REG32 jited:0 1140 PASS
[   38.263614] test_bpf: #45 LD IMM64 jited:0 1182 PASS
[   38.276031] test_bpf: #46 INT: ALU MIX jited:0 1068 PASS
[   38.287319] test_bpf: #47 INT: shifts by register jited:0 1125 PASS
[   38.298913] test_bpf: #48 INT: DIV + ABS jited:0 570 850 PASS
[   38.313745] test_bpf: #49 INT: DIV by zero jited:0 350 305 PASS
[   38.320829] test_bpf: #50 check: missing ret PASS
[   38.321186] test_bpf: #51 check: div_k_0 PASS
[   38.321350] test_bpf: #52 check: unknown insn PASS
[   38.321492] test_bpf: #53 check: out of range spill/fill PASS
[   38.321665] test_bpf: #54 JUMPS + HOLES jited:0 863 PASS
[   38.330763] test_bpf: #55 check: RET X PASS
[   38.331060] test_bpf: #56 check: LDX + RET X PASS
[   38.331292] test_bpf: #57 M[]: alt STX + LDX jited:0 3990 PASS
[   38.373667] test_bpf: #58 M[]: full STX + full LDX jited:0 2819 PASS
[   38.410225] test_bpf: #59 check: SKF_AD_MAX PASS
[   38.410461] test_bpf: #60 LD [SKF_AD_OFF-1] jited:0 313 PASS
[   38.413785] test_bpf: #61 load 64-bit immediate jited:0 579 PASS
[   38.419764] test_bpf: #62 nmap reduced jited:0 1860 PASS
[   38.439016] test_bpf: #63 ALU_MOV_X: dst = 2 jited:0 249 PASS
[   38.441990] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:0 264 PASS
[   38.445000] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:0 229 PASS
[   38.447602] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:0 213 PASS
[   38.450011] test_bpf: #67 ALU_MOV_K: dst = 2 jited:0 167 PASS
[   38.451963] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:0 149 PASS
[   38.453694] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:0 358 PASS
[   38.457572] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:0 158 PASS
[   38.459546] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:0 156 PASS
[   38.461364] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:0 306 PASS
[   38.464652] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:0 327 PASS
[   38.468154] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:0 212 PASS
[   38.470551] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:0 231 PASS
[   38.473187] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:0 309 PASS
[   38.476618] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:0 280 PASS
[   38.479675] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:0 286 PASS
[   38.482755] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:0 460 PASS
[   38.487670] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:0 210 PASS
[   38.490042] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:0 208 PASS
[   38.492331] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:0 205 PASS
[   38.494604] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:0 323 PASS
[   38.498071] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:0 338 PASS
[   38.501674] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:0 347 PASS
[   38.505355] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 360 PASS
[   38.509197] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:0 345 PASS
[   38.512873] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:0 377 PASS
[   38.516924] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 184 PASS
[   38.519053] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 185 PASS
[   38.521246] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:0 186 PASS
[   38.523414] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:0 353 PASS
[   38.527276] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:0 182 PASS
[   38.529353] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:0 311 PASS
[   38.532680] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:0 339 PASS
[   38.536308] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:0 310 PASS
[   38.539652] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 313 PASS
[   38.543022] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:0 340 PASS
[   38.546651] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:0 311 PASS
[   38.549994] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:0 213 PASS
[   38.552326] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:0 212 PASS
[   38.554661] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:0 237 PASS
[   38.557278] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:0 221 PASS
[   38.559713] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:0 177 PASS
[   38.561682] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:0 179 PASS
[   38.563692] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:0 195 PASS
[   38.565891] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:0 183 PASS
[   38.567926] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:0 177 PASS
[   38.569901] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:0 181 PASS
[   38.571925] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:0 177 PASS
[   38.573910] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:0 241 PASS
[   38.576535] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 220 PASS
[   38.578948] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:0 224 PASS
[   38.581387] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:0 213 PASS
[   38.583715] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:0 230 PASS
[   38.586253] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:0 191 PASS
[   38.588392] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:0 189 PASS
[   38.590487] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 192 PASS
[   38.592616] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:0 333 PASS
[   38.596172] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:0 185 PASS
[   38.598224] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:0 185 PASS
[   38.600287] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:0 184 PASS
[   38.602369] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:0 183 PASS
[   38.604421] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:0 336 PASS
[   38.608002] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:0 316 PASS
[   38.611394] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:0 315 PASS
[   38.614753] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 439 PASS
[   38.619370] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 420 PASS
[   38.623844] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 604 PASS
[   38.630156] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:0 249 PASS
[   38.632858] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:0 240 PASS
[   38.635647] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:0 254 PASS
[   38.638408] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:0 379 PASS
[   38.642450] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 346 PASS
[   38.646123] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 323 PASS
[   38.649558] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 329 PASS
[   38.653061] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 477 PASS
[   38.658065] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:0 421 PASS
[   38.662580] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:0 453 PASS
[   38.667414] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 553 PASS
[   38.673235] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 583 PASS
[   38.679343] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:0 380 PASS
[   38.683374] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.683586] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:0 467 PASS
[   38.688672] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 492 PASS
[   38.694058] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.694359] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 571 PASS
[   38.700389] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:0 225 PASS
[   38.702952] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 261 PASS
[   38.705982] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:0 273 PASS
[   38.709194] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 251 PASS
[   38.712213] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:0 201 PASS
[   38.714638] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 240 PASS
[   38.717477] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:0 209 PASS
[   38.720125] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 319 PASS
[   38.724356] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:0 384 PASS
[   38.729293] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:0 367 PASS
[   38.733598] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:0 375 PASS
[   38.737966] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:0 271 PASS
[   38.741274] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:0 280 PASS
[   38.744653] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:0 253 PASS
[   38.747717] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:0 263 PASS
[   38.750830] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:0 216 PASS
[   38.753357] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 187 PASS
[   38.755553] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:0 183 PASS
[   38.757693] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 195 PASS
[   38.759975] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:0 338 PASS
[   38.763728] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:0 324 PASS
[   38.767311] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:0 309 PASS
[   38.770633] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:0 216 PASS
[   38.776135] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:0 414 PASS
[   38.780950] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:0 320 PASS
[   38.784540] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:0 223 PASS
[   38.787037] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:0 203 PASS
[   38.789359] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:0 205 PASS
[   38.791707] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:0 205 PASS
[   38.794045] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:0 186 PASS
[   38.796180] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:0 352 PASS
[   38.800050] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:0 353 PASS
[   38.803970] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:0 362 PASS
[   38.808102] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:0 211 PASS
[   38.810517] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:0 216 PASS
[   38.812957] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:0 224 PASS
[   38.815480] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:0 223 PASS
[   38.818057] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:0 208 PASS
[   38.820559] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:0 210 PASS
[   38.823011] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:0 211 PASS
[   38.825737] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:0 182 PASS
[   38.828021] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:0 226 PASS
[   38.830655] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:0 225 PASS
[   38.833287] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:0 289 PASS
[   38.836535] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:0 253 PASS
[   38.839501] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:0 207 PASS
[   38.842025] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:0 210 PASS
[   38.844570] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:0 232 PASS
[   38.847341] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:0 208 PASS
[   38.849849] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 252 PASS
[   38.852728] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 197 PASS
[   38.855165] test_bpf: #199 ALU_NEG: -(3) = -3 jited:0 189 PASS
[   38.857410] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:0 171 PASS
[   38.859380] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:0 179 PASS
[   38.861411] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:0 180 PASS
[   38.863491] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:0 202 PASS
[   38.865978] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:0 368 PASS
[   38.869957] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:0 244 PASS
[   38.872708] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:0 274 PASS
[   38.875930] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:0 319 PASS
[   38.879417] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:0 193 PASS
[   38.881653] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:0 219 PASS
[   38.884143] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:0 227 PASS
[   38.886902] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:0 251 PASS
[   38.889691] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:0 218 PASS
[   38.892132] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:0 208 PASS
[   38.894448] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:0 259 PASS
[   38.897504] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:0 253 PASS
[   38.900355] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:0 244 PASS
[   38.903051] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:0 297 PASS
[   38.906372] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:0 257 PASS
[   38.909268] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:0 392 PASS
[   38.913520] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:0 292 PASS
[   38.916792] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:0 259 PASS
[   38.919654] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 262 PASS
[   38.922517] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   38.922764] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 221 PASS
[   38.925373] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142719 PASS
[   40.352892] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 345 PASS
[   40.356940] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   40.357188] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 254 PASS
[   40.359954] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
349891 PASS
[   43.859287] test_bpf: #230 JMP_EXIT jited:0 127 PASS
[   43.861346] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:0 194 PASS
[   43.863538] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:0 262 PASS
[   43.866400] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:0 249 PASS
[   43.869132] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:0 262 PASS
[   43.872046] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:0 260 PASS
[   43.874890] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:0 260 PASS
[   43.877701] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:0 278 PASS
[   43.880801] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:0 255 PASS
[   43.883637] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:0 321 PASS
[   43.887202] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:0 340 PASS
[   43.891306] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:0 310 PASS
[   43.895036] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:0 310 PASS
[   43.898963] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:0 276 PASS
[   43.902034] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:0 312 PASS
[   43.905679] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:0 346 PASS
[   43.909500] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:0 292 PASS
[   43.912696] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:0 318 PASS
[   43.916115] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:0 287 PASS
[   43.919236] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:0 316 PASS
[   43.922749] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:0 400 PASS
[   43.927178] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:0 287 PASS
[   43.930323] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:0 287 PASS
[   43.933432] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:0 323 PASS
[   43.936912] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:0 298 PASS
[   43.940168] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:0 263 PASS
[   43.943062] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:0 313 PASS
[   43.946483] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:0 308 PASS
[   43.949817] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:0 359 PASS
[   43.953715] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:0 421 PASS
[   43.958350] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:0 309 PASS
[   43.961783] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:0 251 PASS
[   43.969019] test_bpf: #262 BPF_MAXINSNS: Single literal jited:0 286 PASS
[   43.976250] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:0
254969 PASS
[   46.530754] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   46.531227] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:0 284 PASS
[   46.538925] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:0 548311 560800 PASS
[   57.635685] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 949505 881276 PASS
[   75.951893] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 480796 PASS
[   80.765143] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:0 193 PASS
[   80.767750] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:0 114304 PASS
[   81.911103] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:0 1884 PASS
[   81.935374] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 546269 PASS
[   87.405760] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 594906 PASS
[   93.356075] test_bpf: #274 LD_IND byte frag jited:0 695 PASS
[   93.364087] test_bpf: #275 LD_IND halfword frag jited:0 818 PASS
[   93.372861] test_bpf: #276 LD_IND word frag jited:0 837 PASS
[   93.381738] test_bpf: #277 LD_IND halfword mixed head/frag jited:0 1170 PASS
[   93.394096] test_bpf: #278 LD_IND word mixed head/frag jited:0 950 PASS
[   93.404149] test_bpf: #279 LD_ABS byte frag jited:0 953 PASS
[   93.414270] test_bpf: #280 LD_ABS halfword frag jited:0 754 PASS
[   93.422281] test_bpf: #281 LD_ABS word frag jited:0 1133 PASS
[   93.434166] test_bpf: #282 LD_ABS halfword mixed head/frag jited:0 1079 PASS
[   93.445353] test_bpf: #283 LD_ABS word mixed head/frag jited:0 718 PASS
[   93.452901] test_bpf: #284 LD_IND byte default X jited:0 297 PASS
[   93.456118] test_bpf: #285 LD_IND byte positive offset jited:0 300 PASS
[   93.459342] test_bpf: #286 LD_IND byte negative offset jited:0 296 PASS
[   93.462553] test_bpf: #287 LD_IND halfword positive offset jited:0 333 PASS
[   93.466116] test_bpf: #288 LD_IND halfword negative offset jited:0 306 PASS
[   93.469402] test_bpf: #289 LD_IND halfword unaligned jited:0 307 PASS
[   93.472711] test_bpf: #290 LD_IND word positive offset jited:0 337 PASS
[   93.476296] test_bpf: #291 LD_IND word negative offset jited:0 312 PASS
[   93.479676] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:0 309 PASS
[   93.482987] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:0 335 PASS
[   93.486601] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:0 305 PASS
[   93.489878] test_bpf: #295 LD_ABS byte jited:0 269 PASS
[   93.492784] test_bpf: #296 LD_ABS halfword jited:0 294 PASS
[   93.495950] test_bpf: #297 LD_ABS halfword unaligned jited:0 271 PASS
[   93.498895] test_bpf: #298 LD_ABS word jited:0 265 PASS
[   93.501756] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:0 267 PASS
[   93.504667] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:0 269 PASS
[   93.507584] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:0 281 PASS
[   93.510665] test_bpf: #302 ADD default X jited:0 296 PASS
[   93.513830] test_bpf: #303 ADD default A jited:0 309 PASS
[   93.517144] test_bpf: #304 SUB default X jited:0 290 PASS
[   93.520249] test_bpf: #305 SUB default A jited:0 252 PASS
[   93.522974] test_bpf: #306 MUL default X jited:0 322 PASS
[   93.526403] test_bpf: #307 MUL default A jited:0 267 PASS
[   93.529277] test_bpf: #308 DIV default X jited:0 293 PASS
[   93.532414] test_bpf: #309 DIV default A jited:0 336 PASS
[   93.535988] test_bpf: #310 MOD default X jited:0 284 PASS
[   93.539032] test_bpf: #311 MOD default A jited:0 435 PASS
[   93.543608] test_bpf: #312 JMP EQ default A jited:0 352 PASS
[   93.547355] test_bpf: #313 JMP EQ default X jited:0 357 PASS
[   93.551176] test_bpf: Summary: 314 PASSED, 0 FAILED, [0/306 JIT'ed]

2) JIT enabled

[root@vexpress modules]# insmod test_bpf.ko
[   53.785470] test_bpf: #0 TAX jited:1 234 171 195 PASS
[   53.794856] test_bpf: #1 TXA jited:1 81 79 77 PASS
[   53.803927] test_bpf: #2 ADD_SUB_MUL_K jited:1 89 PASS
[   53.805542] test_bpf: #3 DIV_MOD_KX jited:1 939 PASS
[   53.816227] test_bpf: #4 AND_OR_LSH_K jited:1 116 114 PASS
[   53.821088] test_bpf: #5 LD_IMM_0 jited:1 93 PASS
[   53.822900] test_bpf: #6 LD_IND jited:1 371 279 274 PASS
[   53.833030] test_bpf: #7 LD_ABS jited:1 408 402 272 PASS
[   53.844767] test_bpf: #8 LD_ABS_LL jited:1 387 346 PASS
[   53.852730] test_bpf: #9 LD_IND_LL jited:1 239 248 217 PASS
[   53.860410] test_bpf: #10 LD_ABS_NET jited:1 356 332 PASS
[   53.867897] test_bpf: #11 LD_IND_NET jited:1 223 212 320 PASS
[   53.876076] test_bpf: #12 LD_PKTTYPE jited:1 102 90 PASS
[   53.878660] test_bpf: #13 LD_MARK jited:1 80 80 PASS
[   53.880695] test_bpf: #14 LD_RXHASH jited:1 73 71 PASS
[   53.882488] test_bpf: #15 LD_QUEUE jited:1 120 121 PASS
[   53.885266] test_bpf: #16 LD_PROTOCOL jited:1 256 247 PASS
[   53.890918] test_bpf: #17 LD_VLAN_TAG jited:1 82 84 PASS
[   53.893002] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 80 77 PASS
[   53.894946] test_bpf: #19 LD_IFINDEX jited:1 87 98 PASS
[   53.897261] test_bpf: #20 LD_HATYPE jited:1 95 90 PASS
[   53.899466] test_bpf: #21 LD_CPU
[   53.899663] bpf_jit: *** NOT YET: opcode 85 ***
[   53.899796] jited:0 722 837 PASS
[   53.915645] test_bpf: #22 LD_NLATTR jited:0 593 659 PASS
[   53.928662] test_bpf: #23 LD_NLATTR_NEST jited:0 2186 2964 PASS
[   53.980966] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3891 5637 PASS
[   54.076878] test_bpf: #25 LD_ANC_XOR jited:1 86 100 PASS
[   54.079241] test_bpf: #26 SPILL_FILL jited:1 131 137 123 PASS
[   54.084092] test_bpf: #27 JEQ jited:1 266 189 216 PASS
[   54.091500] test_bpf: #28 JGT jited:1 301 211 192 PASS
[   54.099467] test_bpf: #29 JGE jited:1 191 200 223 PASS
[   54.106275] test_bpf: #30 JSET jited:1 211 210 214 PASS
[   54.113660] test_bpf: #31 tcpdump port 22 jited:1 314 722 711 PASS
[   54.131943] test_bpf: #32 tcpdump complex jited:1 291 707 1068 PASS
[   54.153409] test_bpf: #33 RET_A jited:1 83 88 PASS
[   54.155617] test_bpf: #34 INT: ADD trivial jited:1 162 PASS
[   54.158387] test_bpf: #35 INT: MUL_X jited:1 176 PASS
[   54.161075] test_bpf: #36 INT: MUL_X2 jited:1 84 PASS
[   54.162483] test_bpf: #37 INT: MUL32_X jited:1 99 PASS
[   54.163849] test_bpf: #38 INT: ADD 64-bit jited:1 1066 PASS
[   54.175468] test_bpf: #39 INT: ADD 32-bit jited:1 666 PASS
[   54.182860] test_bpf: #40 INT: SUB jited:1 3236 PASS
[   54.215932] test_bpf: #41 INT: XOR jited:1 308 PASS
[   54.219704] test_bpf: #42 INT: MUL jited:1 376 PASS
[   54.224452] test_bpf: #43 MOV REG64 jited:1 227 PASS
[   54.227383] test_bpf: #44 MOV REG32 jited:1 171 PASS
[   54.229618] test_bpf: #45 LD IMM64 jited:1 163 PASS
[   54.231875] test_bpf: #46 INT: ALU MIX jited:0 1277 PASS
[   54.245188] test_bpf: #47 INT: shifts by register jited:1 208 PASS
[   54.248151] test_bpf: #48 INT: DIV + ABS jited:1 659 601 PASS
[   54.261395] test_bpf: #49 INT: DIV by zero jited:1 317 169 PASS
[   54.266949] test_bpf: #50 check: missing ret PASS
[   54.267418] test_bpf: #51 check: div_k_0 PASS
[   54.267631] test_bpf: #52 check: unknown insn PASS
[   54.267804] test_bpf: #53 check: out of range spill/fill PASS
[   54.268008] test_bpf: #54 JUMPS + HOLES jited:1 358 PASS
[   54.272201] test_bpf: #55 check: RET X PASS
[   54.273054] test_bpf: #56 check: LDX + RET X PASS
[   54.273226] test_bpf: #57 M[]: alt STX + LDX jited:1 456 PASS
[   54.278359] test_bpf: #58 M[]: full STX + full LDX jited:1 438 PASS
[   54.283300] test_bpf: #59 check: SKF_AD_MAX PASS
[   54.283576] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 198 PASS
[   54.285812] test_bpf: #61 load 64-bit immediate jited:1 125 PASS
[   54.287556] test_bpf: #62 nmap reduced jited:1 1054 PASS
[   54.298630] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 81 PASS
[   54.300079] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 85 PASS
[   54.301462] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 96 PASS
[   54.303048] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 71 PASS
[   54.304115] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 70 PASS
[   54.305148] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 71 PASS
[   54.306222] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 97 PASS
[   54.307659] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 75 PASS
[   54.308750] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 66 PASS
[   54.309773] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 92 PASS
[   54.311093] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 94 PASS
[   54.312383] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 66 PASS
[   54.313388] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 66 PASS
[   54.314430] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 87 PASS
[   54.315756] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 77 PASS
[   54.316892] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 72 PASS
[   54.318015] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 79 PASS
[   54.319181] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 75 PASS
[   54.320261] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 71 PASS
[   54.321307] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 67 PASS
[   54.322342] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 82 PASS
[   54.323600] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 86 PASS
[   54.325898] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 99 PASS
[   54.327242] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 113 PASS
[   54.328684] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 123 PASS
[   54.330224] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 85 PASS
[   54.331395] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 66 PASS
[   54.332375] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 66 PASS
[   54.333381] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 69 PASS
[   54.334397] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 109 PASS
[   54.335818] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 72 PASS
[   54.336873] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 126 PASS
[   54.338484] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 107 PASS
[   54.340100] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 98 PASS
[   54.341569] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 87 PASS
[   54.342794] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 98 PASS
[   54.344142] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 92 PASS
[   54.345399] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 77 PASS
[   54.346726] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 72 PASS
[   54.347794] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 72 PASS
[   54.348826] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 71 PASS
[   54.349843] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 120 PASS
[   54.351486] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 82 PASS
[   54.352814] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 103 PASS
[   54.354550] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 140 PASS
[   54.356822] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 117 PASS
[   54.359156] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 83 PASS
[   54.360401] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 77 PASS
[   54.361515] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 68 PASS
[   54.362528] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 70 PASS
[   54.363572] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 73 PASS
[   54.364644] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 70 PASS
[   54.365655] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 75 PASS
[   54.366719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 67 PASS
[   54.367707] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 71 PASS
[   54.368726] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 70 PASS
[   54.369733] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 153 PASS
[   54.371617] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 101 PASS
[   54.373505] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 108 PASS
[   54.375362] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 106 PASS
[   54.377242] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 92 PASS
[   54.379044] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 122 PASS
[   54.380863] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 220 PASS
[   54.383591] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 208 PASS
[   54.386292] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 736 PASS
[   54.394242] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 464 PASS
[   54.399433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 743 PASS
[   54.407799] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 246 PASS
[   54.410964] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   54.413410] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 192 PASS
[   54.415782] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 215 PASS
[   54.418414] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 364 PASS
[   54.422379] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 369 PASS
[   54.426692] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 380 PASS
[   54.430875] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 623 PASS
[   54.437429] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 235 PASS
[   54.440177] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 262 PASS
[   54.443183] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 1524 PASS
[   54.458988] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 720 PASS
[   54.466677] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 231 PASS
[   54.469383] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   54.469685] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 257 PASS
[   54.472650] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 481 PASS
[   54.477765] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   54.478042] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 513 PASS
[   54.483455] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 100 PASS
[   54.484786] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 106 PASS
[   54.486335] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 86 PASS
[   54.487738] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 118 PASS
[   54.489623] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 117 PASS
[   54.491645] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 72 PASS
[   54.493119] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 72 PASS
[   54.494195] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 70 PASS
[   54.495330] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 99 PASS
[   54.496721] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 97 PASS
[   54.498106] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 86 PASS
[   54.499343] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 73 PASS
[   54.500447] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 72 PASS
[   54.501546] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 89 PASS
[   54.502779] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 91 PASS
[   54.504154] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   54.505223] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 116 PASS
[   54.506916] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 77 PASS
[   54.508328] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 80 PASS
[   54.509666] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 86 PASS
[   54.511012] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 99 PASS
[   54.512432] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 147 PASS
[   54.514401] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   54.515668] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 73 PASS
[   54.516794] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 71 PASS
[   54.517879] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 72 PASS
[   54.518998] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 71 PASS
[   54.520120] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 67 PASS
[   54.521181] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 70 PASS
[   54.522292] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 104 PASS
[   54.523741] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 96 PASS
[   54.525269] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 119 PASS
[   54.526875] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 116 PASS
[   54.528421] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   54.529848] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 73 PASS
[   54.530965] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 119 PASS
[   54.532667] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 110 PASS
[   54.534257] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 147 PASS
[   54.536290] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 116 PASS
[   54.538165] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 154 PASS
[   54.540668] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 92 PASS
[   54.542464] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 86 PASS
[   54.543937] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 148 PASS
[   54.545995] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 108 PASS
[   54.547759] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 96 PASS
[   54.549178] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 68 PASS
[   54.550175] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 74 PASS
[   54.551208] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 66 PASS
[   54.552193] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   54.553449] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 74 PASS
[   54.554566] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 96 PASS
[   54.555984] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 84 PASS
[   54.557335] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 72 PASS
[   54.558442] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 74 PASS
[   54.559596] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 68 PASS
[   54.560664] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 74 PASS
[   54.561814] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 101 PASS
[   54.563242] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 93 PASS
[   54.564578] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 73 PASS
[   54.565750] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 76 PASS
[   54.566879] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 78 PASS
[   54.568009] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 72 PASS
[   54.569258] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 79 PASS
[   54.570402] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 79 PASS
[   54.571541] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 81 PASS
[   54.572896] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 100 PASS
[   54.574521] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 110 PASS
[   54.576159] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 75 PASS
[   54.577570] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 89 PASS
[   54.579195] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 122 PASS
[   54.581267] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   54.582954] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 123 PASS
[   54.584677] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 78 PASS
[   54.585879] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   54.587106] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 328 PASS
[   54.590869] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   54.591178] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 285 PASS
[   54.594489] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
158746 PASS
[   56.182499] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 343 PASS
[   56.186642] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   56.186926] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 272 PASS
[   56.190021] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
194997 PASS
[   58.140569] test_bpf: #230 JMP_EXIT jited:1 82 PASS
[   58.142427] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 86 PASS
[   58.155637] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 86 PASS
[   58.157334] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 82 PASS
[   58.158533] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 72 PASS
[   58.159560] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 73 PASS
[   58.160538] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 71 PASS
[   58.161457] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 72 PASS
[   58.162407] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 77 PASS
[   58.163411] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 76 PASS
[   58.164416] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 74 PASS
[   58.165391] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 74 PASS
[   58.166375] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 78 PASS
[   58.167382] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 109 PASS
[   58.168822] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 71 PASS
[   58.170396] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 75 PASS
[   58.171568] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 78 PASS
[   58.172804] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 134 PASS
[   58.175486] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 102 PASS
[   58.177403] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 83 PASS
[   58.178806] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 80 PASS
[   58.180104] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 78 PASS
[   58.181230] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 116 PASS
[   58.182751] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 81 PASS
[   58.183951] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 79 PASS
[   58.185334] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 78 PASS
[   58.186505] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 108 PASS
[   58.187991] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 102 PASS
[   58.189496] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 133 PASS
[   58.191644] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 128 PASS
[   58.193631] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 108 PASS
[   58.195981] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 111 PASS
[   58.211020] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 115 PASS
[   58.226185] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 8481 PASS
[   58.322910] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   58.323076] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 123 PASS
[   58.339381] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 28166 29032 PASS
[   58.931050] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 903498 894192 PASS
[   76.916296] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 132663 PASS
[   78.260490] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 148 PASS
[   78.269590] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 277097 PASS
[   81.046383] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1041 PASS
[   81.076916] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 566894 PASS
[   86.754024] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 602040 PASS
[   92.775504] test_bpf: #274 LD_IND byte frag jited:1 574 PASS
[   92.782876] test_bpf: #275 LD_IND halfword frag jited:1 641 PASS
[   92.790062] test_bpf: #276 LD_IND word frag jited:1 731 PASS
[   92.798321] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 741 PASS
[   92.806601] test_bpf: #278 LD_IND word mixed head/frag jited:1 972 PASS
[   92.817542] test_bpf: #279 LD_ABS byte frag jited:1 601 PASS
[   92.824156] test_bpf: #280 LD_ABS halfword frag jited:1 603 PASS
[   92.830806] test_bpf: #281 LD_ABS word frag jited:1 688 PASS
[   92.838273] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 657 PASS
[   92.845562] test_bpf: #283 LD_ABS word mixed head/frag jited:1 748 PASS
[   92.853678] test_bpf: #284 LD_IND byte default X jited:1 178 PASS
[   92.856290] test_bpf: #285 LD_IND byte positive offset jited:1 187 PASS
[   92.858954] test_bpf: #286 LD_IND byte negative offset jited:1 178 PASS
[   92.861592] test_bpf: #287 LD_IND halfword positive offset jited:1 161 PASS
[   92.863726] test_bpf: #288 LD_IND halfword negative offset jited:1 195 PASS
[   92.866372] test_bpf: #289 LD_IND halfword unaligned jited:1 183 PASS
[   92.868821] test_bpf: #290 LD_IND word positive offset jited:1 170 PASS
[   92.871096] test_bpf: #291 LD_IND word negative offset jited:1 198 PASS
[   92.873832] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 281 PASS
[   92.877321] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 172 PASS
[   92.879493] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 171 PASS
[   92.881590] test_bpf: #295 LD_ABS byte jited:1 162 PASS
[   92.883535] test_bpf: #296 LD_ABS halfword jited:1 160 PASS
[   92.885486] test_bpf: #297 LD_ABS halfword unaligned jited:1 180 PASS
[   92.887650] test_bpf: #298 LD_ABS word jited:1 166 PASS
[   92.889661] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 157 PASS
[   92.891595] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 170 PASS
[   92.893662] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 163 PASS
[   92.895660] test_bpf: #302 ADD default X jited:1 84 PASS
[   92.896895] test_bpf: #303 ADD default A jited:1 79 PASS
[   92.898143] test_bpf: #304 SUB default X jited:1 82 PASS
[   92.899284] test_bpf: #305 SUB default A jited:1 85 PASS
[   92.900529] test_bpf: #306 MUL default X jited:1 76 PASS
[   92.901642] test_bpf: #307 MUL default A jited:1 83 PASS
[   92.903045] test_bpf: #308 DIV default X jited:1 93 PASS
[   92.904524] test_bpf: #309 DIV default A jited:1 203 PASS
[   92.906955] test_bpf: #310 MOD default X jited:1 100 PASS
[   92.908398] test_bpf: #311 MOD default A jited:1 249 PASS
[   92.911232] test_bpf: #312 JMP EQ default A jited:1 83 PASS
[   92.912593] test_bpf: #313 JMP EQ default X jited:1 95 PASS
[   92.913931] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

3) JIT + blinding enabled:

[root@vexpress modules]# insmod test_bpf.ko
[   56.044720] test_bpf: #0 TAX jited:1 239 218 229 PASS
[   56.054736] test_bpf: #1 TXA jited:1 89 119 85 PASS
[   56.064598] test_bpf: #2 ADD_SUB_MUL_K jited:1 213 PASS
[   56.067415] test_bpf: #3 DIV_MOD_KX jited:1 1190 PASS
[   56.080569] test_bpf: #4 AND_OR_LSH_K jited:1 200 149 PASS
[   56.084764] test_bpf: #5 LD_IMM_0 jited:1 101 PASS
[   56.086832] test_bpf: #6 LD_IND jited:1 314 310 283 PASS
[   56.096521] test_bpf: #7 LD_ABS jited:1 376 460 397 PASS
[   56.109604] test_bpf: #8 LD_ABS_LL jited:1 608 415 PASS
[   56.120753] test_bpf: #9 LD_IND_LL jited:1 248 256 268 PASS
[   56.129296] test_bpf: #10 LD_ABS_NET jited:1 435 420 PASS
[   56.138666] test_bpf: #11 LD_IND_NET jited:1 240 228 215 PASS
[   56.146039] test_bpf: #12 LD_PKTTYPE jited:1 211 274 PASS
[   56.151632] test_bpf: #13 LD_MARK jited:1 119 76 PASS
[   56.154522] test_bpf: #14 LD_RXHASH jited:1 78 70 PASS
[   56.156535] test_bpf: #15 LD_QUEUE jited:1 77 73 PASS
[   56.158482] test_bpf: #16 LD_PROTOCOL jited:1 326 320 PASS
[   56.165778] test_bpf: #17 LD_VLAN_TAG jited:1 129 86 PASS
[   56.168783] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 87 88 PASS
[   56.170990] test_bpf: #19 LD_IFINDEX jited:1 97 95 PASS
[   56.173444] test_bpf: #20 LD_HATYPE jited:1 94 118 PASS
[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS
[   56.210242] test_bpf: #22 LD_NLATTR jited:0 685 2101 PASS
[   56.238881] test_bpf: #23 LD_NLATTR_NEST jited:0 2323 3752 PASS
[   56.300600] test_bpf: #24 LD_PAYLOAD_OFF jited:0 4543 6842 PASS
[   56.415022] test_bpf: #25 LD_ANC_XOR jited:1 168 156 PASS
[   56.419429] test_bpf: #26 SPILL_FILL jited:1 232 212 219 PASS
[   56.427785] test_bpf: #27 JEQ jited:1 362 352 230 PASS
[   56.438180] test_bpf: #28 JGT jited:1 334 236 197 PASS
[   56.446672] test_bpf: #29 JGE jited:1 260 318 307 PASS
[   56.456301] test_bpf: #30 JSET jited:1 274 339 410 PASS
[   56.467681] test_bpf: #31 tcpdump port 22 jited:1 355 951 968 PASS
[   56.492091] test_bpf: #32 tcpdump complex jited:1 318 798 1308 PASS
[   56.517843] test_bpf: #33 RET_A jited:1 83 76 PASS
[   56.520000] test_bpf: #34 INT: ADD trivial jited:1 152 PASS
[   56.522183] test_bpf: #35 INT: MUL_X jited:1 192 PASS
[   56.524626] test_bpf: #36 INT: MUL_X2 jited:1 165 PASS
[   56.526762] test_bpf: #37 INT: MUL32_X jited:1 163 PASS
[   56.528828] test_bpf: #38 INT: ADD 64-bit jited:1 1507 PASS
[   56.544862] test_bpf: #39 INT: ADD 32-bit jited:1 954 PASS
[   56.555409] test_bpf: #40 INT: SUB jited:1 1159 PASS
[   56.567960] test_bpf: #41 INT: XOR jited:1 480 PASS
[   56.573431] test_bpf: #42 INT: MUL jited:1 486 PASS
[   56.579305] test_bpf: #43 MOV REG64 jited:1 274 PASS
[   56.583045] test_bpf: #44 MOV REG32 jited:1 253 PASS
[   56.586138] test_bpf: #45 LD IMM64 jited:1 578 PASS
[   56.592580] test_bpf: #46 INT: ALU MIX jited:0 1199 PASS
[   56.605346] test_bpf: #47 INT: shifts by register jited:1 381 PASS
[   56.610159] test_bpf: #48 INT: DIV + ABS jited:1 588 482 PASS
[   56.621545] test_bpf: #49 INT: DIV by zero jited:1 276 199 PASS
[   56.626894] test_bpf: #50 check: missing ret PASS
[   56.627249] test_bpf: #51 check: div_k_0 PASS
[   56.627403] test_bpf: #52 check: unknown insn PASS
[   56.627518] test_bpf: #53 check: out of range spill/fill PASS
[   56.627639] test_bpf: #54 JUMPS + HOLES jited:1 371 PASS
[   56.632295] test_bpf: #55 check: RET X PASS
[   56.632615] test_bpf: #56 check: LDX + RET X PASS
[   56.632748] test_bpf: #57 M[]: alt STX + LDX jited:1 621 PASS
[   56.639774] test_bpf: #58 M[]: full STX + full LDX jited:1 586 PASS
[   56.646535] test_bpf: #59 check: SKF_AD_MAX PASS
[   56.646837] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 195 PASS
[   56.649245] test_bpf: #61 load 64-bit immediate jited:1 220 PASS
[   56.652259] test_bpf: #62 nmap reduced jited:1 816 PASS
[   56.661508] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 76 PASS
[   56.662760] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.663905] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 80 PASS
[   56.665158] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.666297] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 75 PASS
[   56.667389] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 73 PASS
[   56.668504] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 195 PASS
[   56.670934] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 77 PASS
[   56.672115] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 104 PASS
[   56.673550] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 215 PASS
[   56.676139] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 173 PASS
[   56.687141] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 114 PASS
[   56.688839] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 112 PASS
[   56.690248] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 186 PASS
[   56.692428] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 159 PASS
[   56.694388] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 109 PASS
[   56.696115] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 218 PASS
[   56.698754] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 120 PASS
[   56.700479] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 118 PASS
[   56.702378] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 121 PASS
[   56.704284] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 139 PASS
[   56.706363] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 176 PASS
[   56.708715] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 190 PASS
[   56.711155] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 228 PASS
[   56.713878] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 198 PASS
[   56.716318] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 189 PASS
[   56.718657] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 112 PASS
[   56.720152] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 111 PASS
[   56.721639] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 138 PASS
[   56.723403] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 151 PASS
[   56.725349] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 115 PASS
[   56.726923] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 206 PASS
[   56.729436] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 211 PASS
[   56.731988] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 250 PASS
[   56.735291] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 199 PASS
[   56.737871] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 177 PASS
[   56.740193] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 243 PASS
[   56.743126] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 108 PASS
[   56.744676] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 133 PASS
[   56.746386] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 110 PASS
[   56.747835] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 111 PASS
[   56.749292] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 110 PASS
[   56.750766] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 123 PASS
[   56.752371] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 124 PASS
[   56.754095] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 116 PASS
[   56.755687] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 133 PASS
[   56.757418] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 148 PASS
[   56.759295] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 145 PASS
[   56.761137] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 172 PASS
[   56.763380] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 117 PASS
[   56.764943] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 109 PASS
[   56.766424] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 115 PASS
[   56.767999] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 119 PASS
[   56.769584] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 111 PASS
[   56.771124] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 118 PASS
[   56.772961] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 109 PASS
[   56.774431] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 201 PASS
[   56.776888] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 116 PASS
[   56.778460] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 115 PASS
[   56.779993] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 278 PASS
[   56.783229] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 125 PASS
[   56.785228] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 208 PASS
[   56.787912] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 246 PASS
[   56.790983] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 291 PASS
[   56.794583] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 449 PASS
[   56.799521] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 462 PASS
[   56.804433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 602 PASS
[   56.810815] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 234 PASS
[   56.813585] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 240 PASS
[   56.816466] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 276 PASS
[   56.819790] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 373 PASS
[   56.824311] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 367 PASS
[   56.828509] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 354 PASS
[   56.832439] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 358 PASS
[   56.836360] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 563 PASS
[   56.842408] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 293 PASS
[   56.845744] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 289 PASS
[   56.849070] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 660 PASS
[   56.856100] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 692 PASS
[   56.863515] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 311 PASS
[   56.867145] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   56.867640] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 319 PASS
[   56.871208] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 539 PASS
[   56.876982] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   56.877292] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 499 PASS
[   56.882591] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 109 PASS
[   56.884070] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 130 PASS
[   56.885807] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 106 PASS
[   56.887288] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 102 PASS
[   56.888746] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 114 PASS
[   56.890232] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 138 PASS
[   56.891967] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 110 PASS
[   56.893502] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 148 PASS
[   56.895413] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 206 PASS
[   56.897993] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 176 PASS
[   56.900294] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 271 PASS
[   56.903712] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 108 PASS
[   56.905547] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 118 PASS
[   56.907467] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 103 PASS
[   56.909247] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 143 PASS
[   56.911219] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 123 PASS
[   56.913042] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 110 PASS
[   56.914579] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 120 PASS
[   56.916390] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 119 PASS
[   56.918118] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 212 PASS
[   56.920808] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 221 PASS
[   56.923458] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 198 PASS
[   56.925881] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 138 PASS
[   56.927678] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 130 PASS
[   56.929353] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 114 PASS
[   56.930850] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 106 PASS
[   56.932277] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 112 PASS
[   56.933790] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 116 PASS
[   56.935371] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 114 PASS
[   56.936942] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 112 PASS
[   56.938503] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 201 PASS
[   56.940978] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 242 PASS
[   56.943908] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 208 PASS
[   56.946575] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 112 PASS
[   56.948252] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 137 PASS
[   56.950466] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 163 PASS
[   56.953176] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 145 PASS
[   56.955105] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 92 PASS
[   56.956400] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 94 PASS
[   56.957700] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 94 PASS
[   56.959086] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 127 PASS
[   56.960779] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 135 PASS
[   56.962532] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 109 PASS
[   56.964027] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 123 PASS
[   56.965961] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 117 PASS
[   56.967517] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 95 PASS
[   56.968874] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 103 PASS
[   56.970261] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 124 PASS
[   56.971879] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 107 PASS
[   56.973346] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 125 PASS
[   56.975022] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 105 PASS
[   56.976479] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 76 PASS
[   56.977591] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 106 PASS
[   56.979068] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 104 PASS
[   56.980508] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 135 PASS
[   56.982223] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 115 PASS
[   56.984458] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 101 PASS
[   56.985991] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 103 PASS
[   56.987477] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 107 PASS
[   56.988937] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 93 PASS
[   56.990256] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 108 PASS
[   56.991728] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 168 PASS
[   56.993878] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 105 PASS
[   56.995386] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 140 PASS
[   56.997188] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 98 PASS
[   56.998563] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 109 PASS
[   57.000045] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 134 PASS
[   57.001803] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 148 PASS
[   57.003666] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 136 PASS
[   57.006376] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 205 PASS
[   57.009004] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 124 PASS
[   57.011164] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 222 PASS
[   57.014281] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 110 PASS
[   57.016138] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 194 PASS
[   57.018614] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 292 PASS
[   57.022064] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   57.022356] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 241 PASS
[   57.025099] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142752 PASS
[   58.454867] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 370 PASS
[   58.459675] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   58.460082] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 268 PASS
[   58.463093] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
224885 PASS
[   60.713635] test_bpf: #230 JMP_EXIT jited:1 77 PASS
[   60.715476] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 84 PASS
[   60.716748] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 128 PASS
[   60.718617] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 126 PASS
[   60.720303] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 179 PASS
[   60.722889] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 125 PASS
[   60.724674] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 142 PASS
[   60.726577] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 161 PASS
[   60.728695] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 163 PASS
[   60.730807] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 143 PASS
[   60.733042] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 179 PASS
[   60.735513] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 144 PASS
[   60.737586] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 144 PASS
[   60.739896] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 149 PASS
[   60.741813] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 153 PASS
[   60.743773] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 162 PASS
[   60.745798] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 162 PASS
[   60.747921] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 178 PASS
[   60.750577] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 192 PASS
[   60.753315] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 205 PASS
[   60.756115] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 154 PASS
[   60.758287] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 177 PASS
[   60.760611] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 160 PASS
[   60.762901] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 204 PASS
[   60.765394] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 201 PASS
[   60.767884] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 184 PASS
[   60.770228] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 168 PASS
[   60.772331] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 197 PASS
[   60.774754] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 192 PASS
[   60.777384] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 181 PASS
[   60.779641] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 97 PASS
[   60.781022] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 125 PASS
[   61.242879] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 105 PASS
[   61.835125] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1
121315 PASS
[   63.362129] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   63.362231] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 131 PASS
[   63.879679] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 217030 181848 PASS
[   68.492725] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 1018683 930359 PASS
[   88.007480] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 440621 PASS
[   93.074379] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 154 PASS
[   93.358458] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 302835 PASS
[   96.392483] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1008 PASS
[   96.501153] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 597855 PASS
[  102.759854] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 626616 PASS
[  109.247312] test_bpf: #274 LD_IND byte frag jited:1 1453 PASS
[  109.263829] test_bpf: #275 LD_IND halfword frag jited:1 600 PASS
[  109.270433] test_bpf: #276 LD_IND word frag jited:1 719 PASS
[  109.278159] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 705 PASS
[  109.285898] test_bpf: #278 LD_IND word mixed head/frag jited:1 732 PASS
[  109.293879] test_bpf: #279 LD_ABS byte frag jited:1 683 PASS
[  109.301360] test_bpf: #280 LD_ABS halfword frag jited:1 595 PASS
[  109.307841] test_bpf: #281 LD_ABS word frag jited:1 672 PASS
[  109.315579] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 775 PASS
[  109.323890] test_bpf: #283 LD_ABS word mixed head/frag jited:1 725 PASS
[  109.331927] test_bpf: #284 LD_IND byte default X jited:1 274 PASS
[  109.335451] test_bpf: #285 LD_IND byte positive offset jited:1 302 PASS
[  109.339511] test_bpf: #286 LD_IND byte negative offset jited:1 311 PASS
[  109.343448] test_bpf: #287 LD_IND halfword positive offset jited:1 218 PASS
[  109.346282] test_bpf: #288 LD_IND halfword negative offset jited:1 193 PASS
[  109.348832] test_bpf: #289 LD_IND halfword unaligned jited:1 190 PASS
[  109.351330] test_bpf: #290 LD_IND word positive offset jited:1 200 PASS
[  109.353993] test_bpf: #291 LD_IND word negative offset jited:1 216 PASS
[  109.356739] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 195 PASS
[  109.359225] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 196 PASS
[  109.361713] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 221 PASS
[  109.364417] test_bpf: #295 LD_ABS byte jited:1 195 PASS
[  109.366896] test_bpf: #296 LD_ABS halfword jited:1 170 PASS
[  109.369093] test_bpf: #297 LD_ABS halfword unaligned jited:1 167 PASS
[  109.371399] test_bpf: #298 LD_ABS word jited:1 182 PASS
[  109.373724] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 185 PASS
[  109.376064] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 162 PASS
[  109.381701] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 231 PASS
[  109.384839] test_bpf: #302 ADD default X jited:1 105 PASS
[  109.386839] test_bpf: #303 ADD default A jited:1 101 PASS
[  109.388677] test_bpf: #304 SUB default X jited:1 106 PASS
[  109.390267] test_bpf: #305 SUB default A jited:1 119 PASS
[  109.391992] test_bpf: #306 MUL default X jited:1 131 PASS
[  109.394020] test_bpf: #307 MUL default A jited:1 116 PASS
[  109.395766] test_bpf: #308 DIV default X jited:1 116 PASS
[  109.397706] test_bpf: #309 DIV default A jited:1 227 PASS
[  109.406156] test_bpf: #310 MOD default X jited:1 98 PASS
[  109.407645] test_bpf: #311 MOD default A jited:1 265 PASS
[  109.410774] test_bpf: #312 JMP EQ default A jited:1 134 PASS
[  109.412679] test_bpf: #313 JMP EQ default X jited:1 108 PASS
[  109.414506] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
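
To put the three runs side by side, the per-test numbers can be pulled out of the dmesg output with a short script. What follows is only a minimal sketch and not part of the original mail: the names `join_wrapped`, `parse_test_bpf` and `first_timing_ratio` and the regular expression are assumptions made for illustration. It treats any line without a leading "[timestamp]" as a continuation wrapped by the mail client, and it does not handle entries that the kernel itself split across several timestamped lines (such as #21 LD_CPU above).

```python
import re

# Matches a re-joined line such as
# "[   56.044720] test_bpf: #0 TAX jited:1 239 218 229 PASS".
# Lines without a "jited:" field (the "check:" tests) are skipped.
LINE_RE = re.compile(
    r"test_bpf: #(?P<num>\d+) (?P<name>.*?) jited:(?P<jited>[01])"
    r"(?P<times>(?: \d+)*) (?P<result>PASS|FAIL)"
)


def join_wrapped(lines):
    """Re-join log lines that were wrapped in the mail.

    A non-empty line that does not start with "[" is appended to the
    previous line.
    """
    joined = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line
    return joined


def parse_test_bpf(text):
    """Return {test number: (name, jited, [timing values as printed])}."""
    results = {}
    for line in join_wrapped(text.splitlines()):
        m = LINE_RE.search(line)
        if m:
            times = [int(t) for t in m.group("times").split()]
            results[int(m.group("num"))] = (
                m.group("name"), m.group("jited") == "1", times)
    return results


def first_timing_ratio(interp, jit):
    """Interpreter/JIT ratio of the first timing, for tests present in both."""
    ratios = {}
    for num, (_, _, t_int) in interp.items():
        if num in jit and t_int and jit[num][2]:
            ratios[num] = t_int[0] / jit[num][2][0]
    return ratios


# Usage with two lines taken from the runs in this thread.
interp = parse_test_bpf(
    "[   37.283507] test_bpf: #2 ADD_SUB_MUL_K jited:0 543 PASS")
blind = parse_test_bpf(
    "[   56.064598] test_bpf: #2 ADD_SUB_MUL_K jited:1 213 PASS")
print(first_timing_ratio(interp, blind))
```

Only the first timing of each test is compared here; tests that print several values would need a per-column comparison.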


All of these benchmarks are for ARMv7.
Best,
Shubham Bansal


On Mon, May 22, 2017 at 6:31 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/20/2017 10:01 PM, Shubham Bansal wrote:
> [...]
>>
>> Before I send the patch, I have tested the JIT compiler on ARMv7 but
>> not on ARMv5 or ARMv6. So can you tell me which arch versions I should
>> test it for?
>> Also for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
>> are both disabled. But I need to test JIT with these flags as well.
>> Whenever I put these flags in the .config file, the ARM kernel is not
>> getting compiled with these flags. Can you tell me why? If you need
>> more information regarding this, please let me know.
>
>
> Maybe Mircea, Kees or someone from linux-arm-kernel can help you out
> on that.
>
> With regards to the below benchmark, I was mentioning how it compares
> to the interpreter. With only the numbers for jit it's hard to compare.
> So would be great to see the output for the following three cases:
>
> 1) Interpreter:
>
> echo 0 > /proc/sys/net/core/bpf_jit_enable
>
> 2) JIT enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
>
> 3) JIT + blinding enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> echo 2 > /proc/sys/net/core/bpf_jit_harden
>
>> With the current config for ARMv7, the benchmarks are:
>>
>> [root@vexpress modules]# insmod test_bpf.ko
>> [   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
>> [   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
>> [   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
>> [   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
>> [   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
>> [   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
>> [   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
>
> [...]
>
> Thanks,
> Daniel
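
The three configurations requested above differ only in two sysctl files. As a rough sketch (not something posted in the thread), they can be written down as a table and applied by a small helper. The names `MODES` and `apply_mode` are hypothetical; the file names and values are the ones quoted above. The helper writes only the values listed for each case, so whether `bpf_jit_harden` needs to be reset between runs is left to the person running it.

```python
from pathlib import Path

# Sysctl values for the three cases listed in the quoted message.
MODES = {
    "interpreter": {"bpf_jit_enable": "0"},
    "jit": {"bpf_jit_enable": "1"},
    "jit_blinding": {"bpf_jit_enable": "1", "bpf_jit_harden": "2"},
}


def apply_mode(mode, base="/proc/sys/net/core"):
    """Write the values for `mode` under `base`.

    Writing to the real /proc path needs root; `base` can point at a
    scratch directory for a dry run.
    """
    for name, value in MODES[mode].items():
        (Path(base) / name).write_text(value + "\n")
```

A run would then call, for example, `apply_mode("jit_blinding")` as root before `insmod test_bpf.ko`.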

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-22 17:04                       ` Shubham Bansal
From: Shubham Bansal @ 2017-05-22 17:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel,

Here are the benchmarks.

1) Interpreter:

[root@vexpress modules]# insmod test_bpf.ko
[   37.244999] test_bpf: #0 TAX jited:0 757 645 650 PASS
[   37.272577] test_bpf: #1 TXA jited:0 366 334 336 PASS
[   37.283507] test_bpf: #2 ADD_SUB_MUL_K jited:0 543 PASS
[   37.289542] test_bpf: #3 DIV_MOD_KX jited:0 1509 PASS
[   37.305374] test_bpf: #4 AND_OR_LSH_K jited:0 539 559 PASS
[   37.317209] test_bpf: #5 LD_IMM_0 jited:0 412 PASS
[   37.321820] test_bpf: #6 LD_IND jited:0 428 376 389 PASS
[   37.334327] test_bpf: #7 LD_ABS jited:0 509 405 358 PASS
[   37.350596] test_bpf: #8 LD_ABS_LL jited:0 542 783 PASS
[   37.364340] test_bpf: #9 LD_IND_LL jited:0 524 496 723 PASS
[   37.382352] test_bpf: #10 LD_ABS_NET jited:0 527 545 PASS
[   37.393642] test_bpf: #11 LD_IND_NET jited:0 650 495 647 PASS
[   37.412228] test_bpf: #12 LD_PKTTYPE jited:0 686 901 PASS
[   37.428818] test_bpf: #13 LD_MARK jited:0 305 291 PASS
[   37.435349] test_bpf: #14 LD_RXHASH jited:0 257 259 PASS
[   37.440850] test_bpf: #15 LD_QUEUE jited:0 255 254 PASS
[   37.446254] test_bpf: #16 LD_PROTOCOL jited:0 593 603 PASS
[   37.458570] test_bpf: #17 LD_VLAN_TAG jited:0 288 292 PASS
[   37.464821] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:0 335 421 PASS
[   37.472817] test_bpf: #19 LD_IFINDEX jited:0 8568 606 PASS
[   37.565163] test_bpf: #20 LD_HATYPE jited:0 618 695 PASS
[   37.579457] test_bpf: #21 LD_CPU jited:0 1200 1172 PASS
[   37.604424] test_bpf: #22 LD_NLATTR jited:0 979 1124 PASS
[   37.626345] test_bpf: #23 LD_NLATTR_NEST jited:0 12232 3593 PASS
[   37.785251] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3697 4834 PASS
[   37.871224] test_bpf: #25 LD_ANC_XOR jited:0 314 344 PASS
[   37.878210] test_bpf: #26 SPILL_FILL jited:0 757 850 903 PASS
[   37.903954] test_bpf: #27 JEQ jited:0 380 420 426 PASS
[   37.916756] test_bpf: #28 JGT jited:0 376 467 448 PASS
[   37.930276] test_bpf: #29 JGE jited:0 446 590 498 PASS
[   37.946729] test_bpf: #30 JSET jited:0 571 787 1003 PASS
[   37.970896] test_bpf: #31 tcpdump port 22 jited:0 358 1079 1190 PASS
[   37.997814] test_bpf: #32 tcpdump complex jited:0 319 1061 2324 PASS
[   38.035596] test_bpf: #33 RET_A jited:0 253 249 PASS
[   38.041262] test_bpf: #34 INT: ADD trivial jited:0 414 PASS
[   38.045777] test_bpf: #35 INT: MUL_X jited:0 336 PASS
[   38.049402] test_bpf: #36 INT: MUL_X2 jited:0 431 PASS
[   38.054178] test_bpf: #37 INT: MUL32_X jited:0 523 PASS
[   38.059902] test_bpf: #38 INT: ADD 64-bit jited:0 5263 PASS
[   38.113069] test_bpf: #39 INT: ADD 32-bit jited:0 4127 PASS
[   38.154754] test_bpf: #40 INT: SUB jited:0 4218 PASS
[   38.197294] test_bpf: #41 INT: XOR jited:0 2252 PASS
[   38.220159] test_bpf: #42 INT: MUL jited:0 1986 PASS
[   38.240410] test_bpf: #43 MOV REG64 jited:0 1103 PASS
[   38.251796] test_bpf: #44 MOV REG32 jited:0 1140 PASS
[   38.263614] test_bpf: #45 LD IMM64 jited:0 1182 PASS
[   38.276031] test_bpf: #46 INT: ALU MIX jited:0 1068 PASS
[   38.287319] test_bpf: #47 INT: shifts by register jited:0 1125 PASS
[   38.298913] test_bpf: #48 INT: DIV + ABS jited:0 570 850 PASS
[   38.313745] test_bpf: #49 INT: DIV by zero jited:0 350 305 PASS
[   38.320829] test_bpf: #50 check: missing ret PASS
[   38.321186] test_bpf: #51 check: div_k_0 PASS
[   38.321350] test_bpf: #52 check: unknown insn PASS
[   38.321492] test_bpf: #53 check: out of range spill/fill PASS
[   38.321665] test_bpf: #54 JUMPS + HOLES jited:0 863 PASS
[   38.330763] test_bpf: #55 check: RET X PASS
[   38.331060] test_bpf: #56 check: LDX + RET X PASS
[   38.331292] test_bpf: #57 M[]: alt STX + LDX jited:0 3990 PASS
[   38.373667] test_bpf: #58 M[]: full STX + full LDX jited:0 2819 PASS
[   38.410225] test_bpf: #59 check: SKF_AD_MAX PASS
[   38.410461] test_bpf: #60 LD [SKF_AD_OFF-1] jited:0 313 PASS
[   38.413785] test_bpf: #61 load 64-bit immediate jited:0 579 PASS
[   38.419764] test_bpf: #62 nmap reduced jited:0 1860 PASS
[   38.439016] test_bpf: #63 ALU_MOV_X: dst = 2 jited:0 249 PASS
[   38.441990] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:0 264 PASS
[   38.445000] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:0 229 PASS
[   38.447602] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:0 213 PASS
[   38.450011] test_bpf: #67 ALU_MOV_K: dst = 2 jited:0 167 PASS
[   38.451963] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:0 149 PASS
[   38.453694] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:0 358 PASS
[   38.457572] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:0 158 PASS
[   38.459546] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:0 156 PASS
[   38.461364] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:0 306 PASS
[   38.464652] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:0 327 PASS
[   38.468154] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:0 212 PASS
[   38.470551] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:0 231 PASS
[   38.473187] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:0 309 PASS
[   38.476618] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:0 280 PASS
[   38.479675] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:0 286 PASS
[   38.482755] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:0 460 PASS
[   38.487670] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:0 210 PASS
[   38.490042] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:0 208 PASS
[   38.492331] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:0 205 PASS
[   38.494604] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:0 323 PASS
[   38.498071] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:0 338 PASS
[   38.501674] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:0 347 PASS
[   38.505355] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 360 PASS
[   38.509197] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:0 345 PASS
[   38.512873] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:0 377 PASS
[   38.516924] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 184 PASS
[   38.519053] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 185 PASS
[   38.521246] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:0 186 PASS
[   38.523414] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:0 353 PASS
[   38.527276] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:0 182 PASS
[   38.529353] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:0 311 PASS
[   38.532680] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:0 339 PASS
[   38.536308] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:0 310 PASS
[   38.539652] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 313 PASS
[   38.543022] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:0 340 PASS
[   38.546651] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:0 311 PASS
[   38.549994] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:0 213 PASS
[   38.552326] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:0 212 PASS
[   38.554661] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:0 237 PASS
[   38.557278] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:0 221 PASS
[   38.559713] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:0 177 PASS
[   38.561682] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:0 179 PASS
[   38.563692] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:0 195 PASS
[   38.565891] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:0 183 PASS
[   38.567926] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:0 177 PASS
[   38.569901] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:0 181 PASS
[   38.571925] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:0 177 PASS
[   38.573910] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:0 241 PASS
[   38.576535] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 220 PASS
[   38.578948] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:0 224 PASS
[   38.581387] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:0 213 PASS
[   38.583715] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:0 230 PASS
[   38.586253] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:0 191 PASS
[   38.588392] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:0 189 PASS
[   38.590487] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 192 PASS
[   38.592616] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:0 333 PASS
[   38.596172] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:0 185 PASS
[   38.598224] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:0 185 PASS
[   38.600287] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:0 184 PASS
[   38.602369] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:0 183 PASS
[   38.604421] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:0 336 PASS
[   38.608002] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:0 316 PASS
[   38.611394] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:0 315 PASS
[   38.614753] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 439 PASS
[   38.619370] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 420 PASS
[   38.623844] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 604 PASS
[   38.630156] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:0 249 PASS
[   38.632858] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:0 240 PASS
[   38.635647] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:0 254 PASS
[   38.638408] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:0 379 PASS
[   38.642450] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 346 PASS
[   38.646123] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 323 PASS
[   38.649558] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 329 PASS
[   38.653061] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 477 PASS
[   38.658065] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:0 421 PASS
[   38.662580] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:0 453 PASS
[   38.667414] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 553 PASS
[   38.673235] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 583 PASS
[   38.679343] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:0 380 PASS
[   38.683374] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.683586] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:0 467 PASS
[   38.688672] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 492 PASS
[   38.694058] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.694359] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 571 PASS
[   38.700389] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:0 225 PASS
[   38.702952] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 261 PASS
[   38.705982] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:0 273 PASS
[   38.709194] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 251 PASS
[   38.712213] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:0 201 PASS
[   38.714638] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 240 PASS
[   38.717477] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:0 209 PASS
[   38.720125] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 319 PASS
[   38.724356] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:0 384 PASS
[   38.729293] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:0 367 PASS
[   38.733598] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:0 375 PASS
[   38.737966] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:0 271 PASS
[   38.741274] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:0 280 PASS
[   38.744653] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:0 253 PASS
[   38.747717] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:0 263 PASS
[   38.750830] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:0 216 PASS
[   38.753357] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 187 PASS
[   38.755553] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:0 183 PASS
[   38.757693] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 195 PASS
[   38.759975] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:0 338 PASS
[   38.763728] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:0 324 PASS
[   38.767311] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:0 309 PASS
[   38.770633] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:0 216 PASS
[   38.776135] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:0 414 PASS
[   38.780950] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:0 320 PASS
[   38.784540] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:0 223 PASS
[   38.787037] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:0 203 PASS
[   38.789359] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:0 205 PASS
[   38.791707] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:0 205 PASS
[   38.794045] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:0 186 PASS
[   38.796180] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:0 352 PASS
[   38.800050] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:0 353 PASS
[   38.803970] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:0 362 PASS
[   38.808102] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:0 211 PASS
[   38.810517] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:0 216 PASS
[   38.812957] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:0 224 PASS
[   38.815480] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:0 223 PASS
[   38.818057] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:0 208 PASS
[   38.820559] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:0 210 PASS
[   38.823011] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:0 211 PASS
[   38.825737] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:0 182 PASS
[   38.828021] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:0 226 PASS
[   38.830655] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:0 225 PASS
[   38.833287] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:0 289 PASS
[   38.836535] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:0 253 PASS
[   38.839501] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:0 207 PASS
[   38.842025] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:0 210 PASS
[   38.844570] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:0 232 PASS
[   38.847341] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:0 208 PASS
[   38.849849] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 252 PASS
[   38.852728] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 197 PASS
[   38.855165] test_bpf: #199 ALU_NEG: -(3) = -3 jited:0 189 PASS
[   38.857410] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:0 171 PASS
[   38.859380] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:0 179 PASS
[   38.861411] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:0 180 PASS
[   38.863491] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:0 202 PASS
[   38.865978] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:0 368 PASS
[   38.869957] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:0 244 PASS
[   38.872708] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:0 274 PASS
[   38.875930] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:0 319 PASS
[   38.879417] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:0 193 PASS
[   38.881653] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:0 219 PASS
[   38.884143] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:0 227 PASS
[   38.886902] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:0 251 PASS
[   38.889691] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:0 218 PASS
[   38.892132] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:0 208 PASS
[   38.894448] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:0 259 PASS
[   38.897504] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:0 253 PASS
[   38.900355] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:0 244 PASS
[   38.903051] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:0 297 PASS
[   38.906372] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:0 257 PASS
[   38.909268] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:0 392 PASS
[   38.913520] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:0 292 PASS
[   38.916792] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:0 259 PASS
[   38.919654] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 262 PASS
[   38.922517] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   38.922764] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 221 PASS
[   38.925373] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142719 PASS
[   40.352892] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 345 PASS
[   40.356940] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   40.357188] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 254 PASS
[   40.359954] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
349891 PASS
[   43.859287] test_bpf: #230 JMP_EXIT jited:0 127 PASS
[   43.861346] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:0 194 PASS
[   43.863538] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:0 262 PASS
[   43.866400] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:0 249 PASS
[   43.869132] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:0 262 PASS
[   43.872046] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:0 260 PASS
[   43.874890] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:0 260 PASS
[   43.877701] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:0 278 PASS
[   43.880801] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:0 255 PASS
[   43.883637] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:0 321 PASS
[   43.887202] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:0 340 PASS
[   43.891306] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:0 310 PASS
[   43.895036] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:0 310 PASS
[   43.898963] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:0 276 PASS
[   43.902034] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:0 312 PASS
[   43.905679] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:0 346 PASS
[   43.909500] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:0 292 PASS
[   43.912696] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:0 318 PASS
[   43.916115] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:0 287 PASS
[   43.919236] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:0 316 PASS
[   43.922749] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:0 400 PASS
[   43.927178] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:0 287 PASS
[   43.930323] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:0 287 PASS
[   43.933432] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:0 323 PASS
[   43.936912] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:0 298 PASS
[   43.940168] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:0 263 PASS
[   43.943062] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:0 313 PASS
[   43.946483] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:0 308 PASS
[   43.949817] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:0 359 PASS
[   43.953715] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:0 421 PASS
[   43.958350] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:0 309 PASS
[   43.961783] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:0 251 PASS
[   43.969019] test_bpf: #262 BPF_MAXINSNS: Single literal jited:0 286 PASS
[   43.976250] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:0
254969 PASS
[   46.530754] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   46.531227] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:0 284 PASS
[   46.538925] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:0 548311 560800 PASS
[   57.635685] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 949505 881276 PASS
[   75.951893] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 480796 PASS
[   80.765143] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:0 193 PASS
[   80.767750] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:0 114304 PASS
[   81.911103] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:0 1884 PASS
[   81.935374] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 546269 PASS
[   87.405760] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 594906 PASS
[   93.356075] test_bpf: #274 LD_IND byte frag jited:0 695 PASS
[   93.364087] test_bpf: #275 LD_IND halfword frag jited:0 818 PASS
[   93.372861] test_bpf: #276 LD_IND word frag jited:0 837 PASS
[   93.381738] test_bpf: #277 LD_IND halfword mixed head/frag jited:0 1170 PASS
[   93.394096] test_bpf: #278 LD_IND word mixed head/frag jited:0 950 PASS
[   93.404149] test_bpf: #279 LD_ABS byte frag jited:0 953 PASS
[   93.414270] test_bpf: #280 LD_ABS halfword frag jited:0 754 PASS
[   93.422281] test_bpf: #281 LD_ABS word frag jited:0 1133 PASS
[   93.434166] test_bpf: #282 LD_ABS halfword mixed head/frag jited:0 1079 PASS
[   93.445353] test_bpf: #283 LD_ABS word mixed head/frag jited:0 718 PASS
[   93.452901] test_bpf: #284 LD_IND byte default X jited:0 297 PASS
[   93.456118] test_bpf: #285 LD_IND byte positive offset jited:0 300 PASS
[   93.459342] test_bpf: #286 LD_IND byte negative offset jited:0 296 PASS
[   93.462553] test_bpf: #287 LD_IND halfword positive offset jited:0 333 PASS
[   93.466116] test_bpf: #288 LD_IND halfword negative offset jited:0 306 PASS
[   93.469402] test_bpf: #289 LD_IND halfword unaligned jited:0 307 PASS
[   93.472711] test_bpf: #290 LD_IND word positive offset jited:0 337 PASS
[   93.476296] test_bpf: #291 LD_IND word negative offset jited:0 312 PASS
[   93.479676] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:0 309 PASS
[   93.482987] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:0 335 PASS
[   93.486601] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:0 305 PASS
[   93.489878] test_bpf: #295 LD_ABS byte jited:0 269 PASS
[   93.492784] test_bpf: #296 LD_ABS halfword jited:0 294 PASS
[   93.495950] test_bpf: #297 LD_ABS halfword unaligned jited:0 271 PASS
[   93.498895] test_bpf: #298 LD_ABS word jited:0 265 PASS
[   93.501756] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:0 267 PASS
[   93.504667] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:0 269 PASS
[   93.507584] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:0 281 PASS
[   93.510665] test_bpf: #302 ADD default X jited:0 296 PASS
[   93.513830] test_bpf: #303 ADD default A jited:0 309 PASS
[   93.517144] test_bpf: #304 SUB default X jited:0 290 PASS
[   93.520249] test_bpf: #305 SUB default A jited:0 252 PASS
[   93.522974] test_bpf: #306 MUL default X jited:0 322 PASS
[   93.526403] test_bpf: #307 MUL default A jited:0 267 PASS
[   93.529277] test_bpf: #308 DIV default X jited:0 293 PASS
[   93.532414] test_bpf: #309 DIV default A jited:0 336 PASS
[   93.535988] test_bpf: #310 MOD default X jited:0 284 PASS
[   93.539032] test_bpf: #311 MOD default A jited:0 435 PASS
[   93.543608] test_bpf: #312 JMP EQ default A jited:0 352 PASS
[   93.547355] test_bpf: #313 JMP EQ default X jited:0 357 PASS
[   93.551176] test_bpf: Summary: 314 PASSED, 0 FAILED, [0/306 JIT'ed]

2) JIT enabled

[root@vexpress modules]# insmod test_bpf.ko
[   53.785470] test_bpf: #0 TAX jited:1 234 171 195 PASS
[   53.794856] test_bpf: #1 TXA jited:1 81 79 77 PASS
[   53.803927] test_bpf: #2 ADD_SUB_MUL_K jited:1 89 PASS
[   53.805542] test_bpf: #3 DIV_MOD_KX jited:1 939 PASS
[   53.816227] test_bpf: #4 AND_OR_LSH_K jited:1 116 114 PASS
[   53.821088] test_bpf: #5 LD_IMM_0 jited:1 93 PASS
[   53.822900] test_bpf: #6 LD_IND jited:1 371 279 274 PASS
[   53.833030] test_bpf: #7 LD_ABS jited:1 408 402 272 PASS
[   53.844767] test_bpf: #8 LD_ABS_LL jited:1 387 346 PASS
[   53.852730] test_bpf: #9 LD_IND_LL jited:1 239 248 217 PASS
[   53.860410] test_bpf: #10 LD_ABS_NET jited:1 356 332 PASS
[   53.867897] test_bpf: #11 LD_IND_NET jited:1 223 212 320 PASS
[   53.876076] test_bpf: #12 LD_PKTTYPE jited:1 102 90 PASS
[   53.878660] test_bpf: #13 LD_MARK jited:1 80 80 PASS
[   53.880695] test_bpf: #14 LD_RXHASH jited:1 73 71 PASS
[   53.882488] test_bpf: #15 LD_QUEUE jited:1 120 121 PASS
[   53.885266] test_bpf: #16 LD_PROTOCOL jited:1 256 247 PASS
[   53.890918] test_bpf: #17 LD_VLAN_TAG jited:1 82 84 PASS
[   53.893002] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 80 77 PASS
[   53.894946] test_bpf: #19 LD_IFINDEX jited:1 87 98 PASS
[   53.897261] test_bpf: #20 LD_HATYPE jited:1 95 90 PASS
[   53.899466] test_bpf: #21 LD_CPU
[   53.899663] bpf_jit: *** NOT YET: opcode 85 ***
[   53.899796] jited:0 722 837 PASS
[   53.915645] test_bpf: #22 LD_NLATTR jited:0 593 659 PASS
[   53.928662] test_bpf: #23 LD_NLATTR_NEST jited:0 2186 2964 PASS
[   53.980966] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3891 5637 PASS
[   54.076878] test_bpf: #25 LD_ANC_XOR jited:1 86 100 PASS
[   54.079241] test_bpf: #26 SPILL_FILL jited:1 131 137 123 PASS
[   54.084092] test_bpf: #27 JEQ jited:1 266 189 216 PASS
[   54.091500] test_bpf: #28 JGT jited:1 301 211 192 PASS
[   54.099467] test_bpf: #29 JGE jited:1 191 200 223 PASS
[   54.106275] test_bpf: #30 JSET jited:1 211 210 214 PASS
[   54.113660] test_bpf: #31 tcpdump port 22 jited:1 314 722 711 PASS
[   54.131943] test_bpf: #32 tcpdump complex jited:1 291 707 1068 PASS
[   54.153409] test_bpf: #33 RET_A jited:1 83 88 PASS
[   54.155617] test_bpf: #34 INT: ADD trivial jited:1 162 PASS
[   54.158387] test_bpf: #35 INT: MUL_X jited:1 176 PASS
[   54.161075] test_bpf: #36 INT: MUL_X2 jited:1 84 PASS
[   54.162483] test_bpf: #37 INT: MUL32_X jited:1 99 PASS
[   54.163849] test_bpf: #38 INT: ADD 64-bit jited:1 1066 PASS
[   54.175468] test_bpf: #39 INT: ADD 32-bit jited:1 666 PASS
[   54.182860] test_bpf: #40 INT: SUB jited:1 3236 PASS
[   54.215932] test_bpf: #41 INT: XOR jited:1 308 PASS
[   54.219704] test_bpf: #42 INT: MUL jited:1 376 PASS
[   54.224452] test_bpf: #43 MOV REG64 jited:1 227 PASS
[   54.227383] test_bpf: #44 MOV REG32 jited:1 171 PASS
[   54.229618] test_bpf: #45 LD IMM64 jited:1 163 PASS
[   54.231875] test_bpf: #46 INT: ALU MIX jited:0 1277 PASS
[   54.245188] test_bpf: #47 INT: shifts by register jited:1 208 PASS
[   54.248151] test_bpf: #48 INT: DIV + ABS jited:1 659 601 PASS
[   54.261395] test_bpf: #49 INT: DIV by zero jited:1 317 169 PASS
[   54.266949] test_bpf: #50 check: missing ret PASS
[   54.267418] test_bpf: #51 check: div_k_0 PASS
[   54.267631] test_bpf: #52 check: unknown insn PASS
[   54.267804] test_bpf: #53 check: out of range spill/fill PASS
[   54.268008] test_bpf: #54 JUMPS + HOLES jited:1 358 PASS
[   54.272201] test_bpf: #55 check: RET X PASS
[   54.273054] test_bpf: #56 check: LDX + RET X PASS
[   54.273226] test_bpf: #57 M[]: alt STX + LDX jited:1 456 PASS
[   54.278359] test_bpf: #58 M[]: full STX + full LDX jited:1 438 PASS
[   54.283300] test_bpf: #59 check: SKF_AD_MAX PASS
[   54.283576] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 198 PASS
[   54.285812] test_bpf: #61 load 64-bit immediate jited:1 125 PASS
[   54.287556] test_bpf: #62 nmap reduced jited:1 1054 PASS
[   54.298630] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 81 PASS
[   54.300079] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 85 PASS
[   54.301462] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 96 PASS
[   54.303048] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 71 PASS
[   54.304115] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 70 PASS
[   54.305148] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 71 PASS
[   54.306222] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 97 PASS
[   54.307659] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 75 PASS
[   54.308750] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 66 PASS
[   54.309773] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 92 PASS
[   54.311093] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 94 PASS
[   54.312383] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 66 PASS
[   54.313388] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 66 PASS
[   54.314430] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 87 PASS
[   54.315756] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 77 PASS
[   54.316892] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 72 PASS
[   54.318015] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 79 PASS
[   54.319181] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 75 PASS
[   54.320261] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 71 PASS
[   54.321307] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 67 PASS
[   54.322342] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 82 PASS
[   54.323600] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 86 PASS
[   54.325898] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 99 PASS
[   54.327242] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 113 PASS
[   54.328684] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 123 PASS
[   54.330224] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 85 PASS
[   54.331395] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 66 PASS
[   54.332375] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 66 PASS
[   54.333381] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 69 PASS
[   54.334397] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 109 PASS
[   54.335818] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 72 PASS
[   54.336873] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 126 PASS
[   54.338484] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 107 PASS
[   54.340100] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 98 PASS
[   54.341569] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 87 PASS
[   54.342794] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 98 PASS
[   54.344142] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 92 PASS
[   54.345399] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 77 PASS
[   54.346726] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 72 PASS
[   54.347794] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 72 PASS
[   54.348826] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 71 PASS
[   54.349843] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 120 PASS
[   54.351486] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 82 PASS
[   54.352814] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 103 PASS
[   54.354550] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 140 PASS
[   54.356822] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 117 PASS
[   54.359156] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 83 PASS
[   54.360401] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 77 PASS
[   54.361515] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 68 PASS
[   54.362528] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 70 PASS
[   54.363572] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 73 PASS
[   54.364644] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 70 PASS
[   54.365655] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 75 PASS
[   54.366719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 67 PASS
[   54.367707] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 71 PASS
[   54.368726] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 70 PASS
[   54.369733] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 153 PASS
[   54.371617] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 101 PASS
[   54.373505] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 108 PASS
[   54.375362] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 106 PASS
[   54.377242] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 92 PASS
[   54.379044] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 122 PASS
[   54.380863] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 220 PASS
[   54.383591] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 208 PASS
[   54.386292] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 736 PASS
[   54.394242] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 464 PASS
[   54.399433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 743 PASS
[   54.407799] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 246 PASS
[   54.410964] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   54.413410] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 192 PASS
[   54.415782] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 215 PASS
[   54.418414] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 364 PASS
[   54.422379] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 369 PASS
[   54.426692] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 380 PASS
[   54.430875] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 623 PASS
[   54.437429] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 235 PASS
[   54.440177] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 262 PASS
[   54.443183] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 1524 PASS
[   54.458988] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 720 PASS
[   54.466677] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 231 PASS
[   54.469383] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   54.469685] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 257 PASS
[   54.472650] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 481 PASS
[   54.477765] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   54.478042] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 513 PASS
[   54.483455] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 100 PASS
[   54.484786] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 106 PASS
[   54.486335] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 86 PASS
[   54.487738] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 118 PASS
[   54.489623] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 117 PASS
[   54.491645] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 72 PASS
[   54.493119] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 72 PASS
[   54.494195] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 70 PASS
[   54.495330] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 99 PASS
[   54.496721] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 97 PASS
[   54.498106] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 86 PASS
[   54.499343] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 73 PASS
[   54.500447] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 72 PASS
[   54.501546] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 89 PASS
[   54.502779] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 91 PASS
[   54.504154] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   54.505223] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 116 PASS
[   54.506916] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 77 PASS
[   54.508328] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 80 PASS
[   54.509666] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 86 PASS
[   54.511012] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 99 PASS
[   54.512432] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 147 PASS
[   54.514401] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   54.515668] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 73 PASS
[   54.516794] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 71 PASS
[   54.517879] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 72 PASS
[   54.518998] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 71 PASS
[   54.520120] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 67 PASS
[   54.521181] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 70 PASS
[   54.522292] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 104 PASS
[   54.523741] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 96 PASS
[   54.525269] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 119 PASS
[   54.526875] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 116 PASS
[   54.528421] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   54.529848] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 73 PASS
[   54.530965] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 119 PASS
[   54.532667] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 110 PASS
[   54.534257] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 147 PASS
[   54.536290] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 116 PASS
[   54.538165] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 154 PASS
[   54.540668] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 92 PASS
[   54.542464] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 86 PASS
[   54.543937] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 148 PASS
[   54.545995] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 108 PASS
[   54.547759] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 96 PASS
[   54.549178] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 68 PASS
[   54.550175] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 74 PASS
[   54.551208] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 66 PASS
[   54.552193] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   54.553449] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 74 PASS
[   54.554566] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 96 PASS
[   54.555984] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 84 PASS
[   54.557335] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 72 PASS
[   54.558442] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 74 PASS
[   54.559596] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 68 PASS
[   54.560664] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 74 PASS
[   54.561814] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 101 PASS
[   54.563242] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 93 PASS
[   54.564578] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 73 PASS
[   54.565750] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 76 PASS
[   54.566879] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 78 PASS
[   54.568009] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 72 PASS
[   54.569258] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 79 PASS
[   54.570402] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 79 PASS
[   54.571541] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 81 PASS
[   54.572896] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 100 PASS
[   54.574521] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 110 PASS
[   54.576159] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 75 PASS
[   54.577570] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 89 PASS
[   54.579195] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 122 PASS
[   54.581267] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   54.582954] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 123 PASS
[   54.584677] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 78 PASS
[   54.585879] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 85 PASS
[   54.587106] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 328 PASS
[   54.590869] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   54.591178] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 285 PASS
[   54.594489] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
158746 PASS
[   56.182499] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 343 PASS
[   56.186642] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   56.186926] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 272 PASS
[   56.190021] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
194997 PASS
[   58.140569] test_bpf: #230 JMP_EXIT jited:1 82 PASS
[   58.142427] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 86 PASS
[   58.155637] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 86 PASS
[   58.157334] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 82 PASS
[   58.158533] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 72 PASS
[   58.159560] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 73 PASS
[   58.160538] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 71 PASS
[   58.161457] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 72 PASS
[   58.162407] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 77 PASS
[   58.163411] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 76 PASS
[   58.164416] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 74 PASS
[   58.165391] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 74 PASS
[   58.166375] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 78 PASS
[   58.167382] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 109 PASS
[   58.168822] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 71 PASS
[   58.170396] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 75 PASS
[   58.171568] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 78 PASS
[   58.172804] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 134 PASS
[   58.175486] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 102 PASS
[   58.177403] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 83 PASS
[   58.178806] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 80 PASS
[   58.180104] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 78 PASS
[   58.181230] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 116 PASS
[   58.182751] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 81 PASS
[   58.183951] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 79 PASS
[   58.185334] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 78 PASS
[   58.186505] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 108 PASS
[   58.187991] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 102 PASS
[   58.189496] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 133 PASS
[   58.191644] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 128 PASS
[   58.193631] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 108 PASS
[   58.195981] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 111 PASS
[   58.211020] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 115 PASS
[   58.226185] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 8481 PASS
[   58.322910] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   58.323076] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 123 PASS
[   58.339381] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 28166 29032 PASS
[   58.931050] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 903498 894192 PASS
[   76.916296] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 132663 PASS
[   78.260490] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 148 PASS
[   78.269590] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 277097 PASS
[   81.046383] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1041 PASS
[   81.076916] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 566894 PASS
[   86.754024] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 602040 PASS
[   92.775504] test_bpf: #274 LD_IND byte frag jited:1 574 PASS
[   92.782876] test_bpf: #275 LD_IND halfword frag jited:1 641 PASS
[   92.790062] test_bpf: #276 LD_IND word frag jited:1 731 PASS
[   92.798321] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 741 PASS
[   92.806601] test_bpf: #278 LD_IND word mixed head/frag jited:1 972 PASS
[   92.817542] test_bpf: #279 LD_ABS byte frag jited:1 601 PASS
[   92.824156] test_bpf: #280 LD_ABS halfword frag jited:1 603 PASS
[   92.830806] test_bpf: #281 LD_ABS word frag jited:1 688 PASS
[   92.838273] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 657 PASS
[   92.845562] test_bpf: #283 LD_ABS word mixed head/frag jited:1 748 PASS
[   92.853678] test_bpf: #284 LD_IND byte default X jited:1 178 PASS
[   92.856290] test_bpf: #285 LD_IND byte positive offset jited:1 187 PASS
[   92.858954] test_bpf: #286 LD_IND byte negative offset jited:1 178 PASS
[   92.861592] test_bpf: #287 LD_IND halfword positive offset jited:1 161 PASS
[   92.863726] test_bpf: #288 LD_IND halfword negative offset jited:1 195 PASS
[   92.866372] test_bpf: #289 LD_IND halfword unaligned jited:1 183 PASS
[   92.868821] test_bpf: #290 LD_IND word positive offset jited:1 170 PASS
[   92.871096] test_bpf: #291 LD_IND word negative offset jited:1 198 PASS
[   92.873832] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 281 PASS
[   92.877321] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 172 PASS
[   92.879493] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 171 PASS
[   92.881590] test_bpf: #295 LD_ABS byte jited:1 162 PASS
[   92.883535] test_bpf: #296 LD_ABS halfword jited:1 160 PASS
[   92.885486] test_bpf: #297 LD_ABS halfword unaligned jited:1 180 PASS
[   92.887650] test_bpf: #298 LD_ABS word jited:1 166 PASS
[   92.889661] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 157 PASS
[   92.891595] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 170 PASS
[   92.893662] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 163 PASS
[   92.895660] test_bpf: #302 ADD default X jited:1 84 PASS
[   92.896895] test_bpf: #303 ADD default A jited:1 79 PASS
[   92.898143] test_bpf: #304 SUB default X jited:1 82 PASS
[   92.899284] test_bpf: #305 SUB default A jited:1 85 PASS
[   92.900529] test_bpf: #306 MUL default X jited:1 76 PASS
[   92.901642] test_bpf: #307 MUL default A jited:1 83 PASS
[   92.903045] test_bpf: #308 DIV default X jited:1 93 PASS
[   92.904524] test_bpf: #309 DIV default A jited:1 203 PASS
[   92.906955] test_bpf: #310 MOD default X jited:1 100 PASS
[   92.908398] test_bpf: #311 MOD default A jited:1 249 PASS
[   92.911232] test_bpf: #312 JMP EQ default A jited:1 83 PASS
[   92.912593] test_bpf: #313 JMP EQ default X jited:1 95 PASS
[   92.913931] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]
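
For anyone who wants to compare the runs without reading every line, here is a minimal Python sketch (not part of the patch or of test_bpf itself) that rejoins the lines wrapped by the mail client and pulls out the jited flag and the timings per test. The helper names `unwrap`, `parse` and `jit_ratio` are hypothetical, and the sample is just a few lines copied from these logs. It assumes every test record ends in "PASS" and that tests rejected before running carry no "jited:" field.

```python
import re

# Kernel timestamp prefix such as "[   56.044720] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# One unwrapped test record; tests without a "jited:" field do not match.
RECORD = re.compile(r"^test_bpf: #(\d+) .*?jited:([01])((?: \d+)*) PASS$")


def unwrap(log_text):
    """Drop timestamps and rejoin continuation lines into one record per test."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = TIMESTAMP.match(line)
        body = line[m.end():] if m else line
        if body.startswith("test_bpf: ") or not records:
            records.append(body)
        else:
            # Wrapped text or an interleaved message such as the bpf_jit notice.
            records[-1] += " " + body
    return records


def parse(log_text):
    """Map test number -> (was_jited, list of reported timings)."""
    results = {}
    for record in unwrap(log_text):
        m = RECORD.match(record)
        if m:
            number, jited, times = m.groups()
            results[int(number)] = (jited == "1", [int(t) for t in times.split()])
    return results


def jit_ratio(results):
    """Return (number of JIT'ed tests, number of tests that reported a jited flag)."""
    jited = sum(1 for was_jited, _ in results.values() if was_jited)
    return jited, len(results)


SAMPLE = """\
[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS
[   56.626894] test_bpf: #50 check: missing ret PASS
[   56.688839] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 112 PASS
[   56.867145] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
"""

results = parse(SAMPLE)
print(jit_ratio(results))  # prints (2, 3)
```

Feeding one complete run to `parse` and calling `jit_ratio` on the result gives a pair that can be compared by hand with the "JIT'ed" figure in the Summary line of that run.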

3) JIT + blinding enabled:

[root@vexpress modules]# insmod test_bpf.ko
[   56.044720] test_bpf: #0 TAX jited:1 239 218 229 PASS
[   56.054736] test_bpf: #1 TXA jited:1 89 119 85 PASS
[   56.064598] test_bpf: #2 ADD_SUB_MUL_K jited:1 213 PASS
[   56.067415] test_bpf: #3 DIV_MOD_KX jited:1 1190 PASS
[   56.080569] test_bpf: #4 AND_OR_LSH_K jited:1 200 149 PASS
[   56.084764] test_bpf: #5 LD_IMM_0 jited:1 101 PASS
[   56.086832] test_bpf: #6 LD_IND jited:1 314 310 283 PASS
[   56.096521] test_bpf: #7 LD_ABS jited:1 376 460 397 PASS
[   56.109604] test_bpf: #8 LD_ABS_LL jited:1 608 415 PASS
[   56.120753] test_bpf: #9 LD_IND_LL jited:1 248 256 268 PASS
[   56.129296] test_bpf: #10 LD_ABS_NET jited:1 435 420 PASS
[   56.138666] test_bpf: #11 LD_IND_NET jited:1 240 228 215 PASS
[   56.146039] test_bpf: #12 LD_PKTTYPE jited:1 211 274 PASS
[   56.151632] test_bpf: #13 LD_MARK jited:1 119 76 PASS
[   56.154522] test_bpf: #14 LD_RXHASH jited:1 78 70 PASS
[   56.156535] test_bpf: #15 LD_QUEUE jited:1 77 73 PASS
[   56.158482] test_bpf: #16 LD_PROTOCOL jited:1 326 320 PASS
[   56.165778] test_bpf: #17 LD_VLAN_TAG jited:1 129 86 PASS
[   56.168783] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 87 88 PASS
[   56.170990] test_bpf: #19 LD_IFINDEX jited:1 97 95 PASS
[   56.173444] test_bpf: #20 LD_HATYPE jited:1 94 118 PASS
[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS
[   56.210242] test_bpf: #22 LD_NLATTR jited:0 685 2101 PASS
[   56.238881] test_bpf: #23 LD_NLATTR_NEST jited:0 2323 3752 PASS
[   56.300600] test_bpf: #24 LD_PAYLOAD_OFF jited:0 4543 6842 PASS
[   56.415022] test_bpf: #25 LD_ANC_XOR jited:1 168 156 PASS
[   56.419429] test_bpf: #26 SPILL_FILL jited:1 232 212 219 PASS
[   56.427785] test_bpf: #27 JEQ jited:1 362 352 230 PASS
[   56.438180] test_bpf: #28 JGT jited:1 334 236 197 PASS
[   56.446672] test_bpf: #29 JGE jited:1 260 318 307 PASS
[   56.456301] test_bpf: #30 JSET jited:1 274 339 410 PASS
[   56.467681] test_bpf: #31 tcpdump port 22 jited:1 355 951 968 PASS
[   56.492091] test_bpf: #32 tcpdump complex jited:1 318 798 1308 PASS
[   56.517843] test_bpf: #33 RET_A jited:1 83 76 PASS
[   56.520000] test_bpf: #34 INT: ADD trivial jited:1 152 PASS
[   56.522183] test_bpf: #35 INT: MUL_X jited:1 192 PASS
[   56.524626] test_bpf: #36 INT: MUL_X2 jited:1 165 PASS
[   56.526762] test_bpf: #37 INT: MUL32_X jited:1 163 PASS
[   56.528828] test_bpf: #38 INT: ADD 64-bit jited:1 1507 PASS
[   56.544862] test_bpf: #39 INT: ADD 32-bit jited:1 954 PASS
[   56.555409] test_bpf: #40 INT: SUB jited:1 1159 PASS
[   56.567960] test_bpf: #41 INT: XOR jited:1 480 PASS
[   56.573431] test_bpf: #42 INT: MUL jited:1 486 PASS
[   56.579305] test_bpf: #43 MOV REG64 jited:1 274 PASS
[   56.583045] test_bpf: #44 MOV REG32 jited:1 253 PASS
[   56.586138] test_bpf: #45 LD IMM64 jited:1 578 PASS
[   56.592580] test_bpf: #46 INT: ALU MIX jited:0 1199 PASS
[   56.605346] test_bpf: #47 INT: shifts by register jited:1 381 PASS
[   56.610159] test_bpf: #48 INT: DIV + ABS jited:1 588 482 PASS
[   56.621545] test_bpf: #49 INT: DIV by zero jited:1 276 199 PASS
[   56.626894] test_bpf: #50 check: missing ret PASS
[   56.627249] test_bpf: #51 check: div_k_0 PASS
[   56.627403] test_bpf: #52 check: unknown insn PASS
[   56.627518] test_bpf: #53 check: out of range spill/fill PASS
[   56.627639] test_bpf: #54 JUMPS + HOLES jited:1 371 PASS
[   56.632295] test_bpf: #55 check: RET X PASS
[   56.632615] test_bpf: #56 check: LDX + RET X PASS
[   56.632748] test_bpf: #57 M[]: alt STX + LDX jited:1 621 PASS
[   56.639774] test_bpf: #58 M[]: full STX + full LDX jited:1 586 PASS
[   56.646535] test_bpf: #59 check: SKF_AD_MAX PASS
[   56.646837] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 195 PASS
[   56.649245] test_bpf: #61 load 64-bit immediate jited:1 220 PASS
[   56.652259] test_bpf: #62 nmap reduced jited:1 816 PASS
[   56.661508] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 76 PASS
[   56.662760] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.663905] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 80 PASS
[   56.665158] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.666297] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 75 PASS
[   56.667389] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 73 PASS
[   56.668504] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:1 195 PASS
[   56.670934] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 77 PASS
[   56.672115] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 104 PASS
[   56.673550] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 215 PASS
[   56.676139] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 173 PASS
[   56.687141] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 114 PASS
[   56.688839] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:1 112 PASS
[   56.690248] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 186 PASS
[   56.692428] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 159 PASS
[   56.694388] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:1 109 PASS
[   56.696115] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:1 218 PASS
[   56.698754] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 120 PASS
[   56.700479] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 118 PASS
[   56.702378] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:1 121 PASS
[   56.704284] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 139 PASS
[   56.706363] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:1 176 PASS
[   56.708715] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 190 PASS
[   56.711155] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 228 PASS
[   56.713878] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:1 198 PASS
[   56.716318] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:1 189 PASS
[   56.718657] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 112 PASS
[   56.720152] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 111 PASS
[   56.721639] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:1 138 PASS
[   56.723403] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:1 151 PASS
[   56.725349] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:1 115 PASS
[   56.726923] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 206 PASS
[   56.729436] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:1 211 PASS
[   56.731988] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 250 PASS
[   56.735291] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:1 199 PASS
[   56.737871] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:1 177 PASS
[   56.740193] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:1 243 PASS
[   56.743126] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 108 PASS
[   56.744676] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:1 133 PASS
[   56.746386] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 110 PASS
[   56.747835] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:1 111 PASS
[   56.749292] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 110 PASS
[   56.750766] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 123 PASS
[   56.752371] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:1 124 PASS
[   56.754095] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 116 PASS
[   56.755687] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 133 PASS
[   56.757418] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:1 148 PASS
[   56.759295] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:1 145 PASS
[   56.761137] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 172 PASS
[   56.763380] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 117 PASS
[   56.764943] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 109 PASS
[   56.766424] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 115 PASS
[   56.767999] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:1 119 PASS
[   56.769584] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 111 PASS
[   56.771124] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 118 PASS
[   56.772961] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:1 109 PASS
[   56.774431] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:1 201 PASS
[   56.776888] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 116 PASS
[   56.778460] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 115 PASS
[   56.779993] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:1 278 PASS
[   56.783229] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:1 125 PASS
[   56.785228] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:1 208 PASS
[   56.787912] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 246 PASS
[   56.790983] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:1 291 PASS
[   56.794583] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 449 PASS
[   56.799521] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 462 PASS
[   56.804433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 602 PASS
[   56.810815] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 234 PASS
[   56.813585] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 240 PASS
[   56.816466] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:1 276 PASS
[   56.819790] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:1 373 PASS
[   56.824311] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 367 PASS
[   56.828509] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 354 PASS
[   56.832439] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 358 PASS
[   56.836360] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 563 PASS
[   56.842408] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 293 PASS
[   56.845744] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:1 289 PASS
[   56.849070] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 660 PASS
[   56.856100] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 692 PASS
[   56.863515] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 311 PASS
[   56.867145] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   56.867640] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:1 319 PASS
[   56.871208] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 539 PASS
[   56.876982] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   56.877292] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 499 PASS
[   56.882591] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 109 PASS
[   56.884070] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 130 PASS
[   56.885807] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 106 PASS
[   56.887288] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 102 PASS
[   56.888746] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 114 PASS
[   56.890232] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 138 PASS
[   56.891967] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 110 PASS
[   56.893502] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 148 PASS
[   56.895413] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 206 PASS
[   56.897993] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 176 PASS
[   56.900294] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 271 PASS
[   56.903712] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 108 PASS
[   56.905547] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 118 PASS
[   56.907467] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 103 PASS
[   56.909247] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 143 PASS
[   56.911219] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 123 PASS
[   56.913042] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 110 PASS
[   56.914579] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 120 PASS
[   56.916390] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 119 PASS
[   56.918118] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 212 PASS
[   56.920808] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 221 PASS
[   56.923458] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 198 PASS
[   56.925881] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 138 PASS
[   56.927678] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 130 PASS
[   56.929353] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 114 PASS
[   56.930850] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 106 PASS
[   56.932277] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 112 PASS
[   56.933790] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 116 PASS
[   56.935371] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 114 PASS
[   56.936942] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 112 PASS
[   56.938503] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 201 PASS
[   56.940978] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 242 PASS
[   56.943908] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 208 PASS
[   56.946575] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 112 PASS
[   56.948252] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 137 PASS
[   56.950466] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 163 PASS
[   56.953176] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 145 PASS
[   56.955105] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 92 PASS
[   56.956400] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 94 PASS
[   56.957700] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 94 PASS
[   56.959086] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 127 PASS
[   56.960779] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 135 PASS
[   56.962532] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 109 PASS
[   56.964027] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 123 PASS
[   56.965961] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 117 PASS
[   56.967517] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 95 PASS
[   56.968874] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 103 PASS
[   56.970261] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 124 PASS
[   56.971879] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 107 PASS
[   56.973346] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 125 PASS
[   56.975022] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 105 PASS
[   56.976479] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 76 PASS
[   56.977591] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 106 PASS
[   56.979068] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 104 PASS
[   56.980508] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 135 PASS
[   56.982223] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 115 PASS
[   56.984458] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 101 PASS
[   56.985991] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 103 PASS
[   56.987477] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 107 PASS
[   56.988937] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 93 PASS
[   56.990256] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 108 PASS
[   56.991728] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 168 PASS
[   56.993878] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 105 PASS
[   56.995386] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 140 PASS
[   56.997188] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 98 PASS
[   56.998563] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 109 PASS
[   57.000045] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 134 PASS
[   57.001803] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 148 PASS
[   57.003666] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 136 PASS
[   57.006376] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 205 PASS
[   57.009004] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 124 PASS
[   57.011164] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 222 PASS
[   57.014281] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 110 PASS
[   57.016138] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 194 PASS
[   57.018614] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 292 PASS
[   57.022064] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   57.022356] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 241 PASS
[   57.025099] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142752 PASS
[   58.454867] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 370 PASS
[   58.459675] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   58.460082] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 268 PASS
[   58.463093] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
224885 PASS
[   60.713635] test_bpf: #230 JMP_EXIT jited:1 77 PASS
[   60.715476] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 84 PASS
[   60.716748] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 128 PASS
[   60.718617] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 126 PASS
[   60.720303] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 179 PASS
[   60.722889] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 125 PASS
[   60.724674] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 142 PASS
[   60.726577] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 161 PASS
[   60.728695] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 163 PASS
[   60.730807] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 143 PASS
[   60.733042] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 179 PASS
[   60.735513] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 144 PASS
[   60.737586] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 144 PASS
[   60.739896] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 149 PASS
[   60.741813] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 153 PASS
[   60.743773] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 162 PASS
[   60.745798] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 162 PASS
[   60.747921] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 178 PASS
[   60.750577] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 192 PASS
[   60.753315] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 205 PASS
[   60.756115] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 154 PASS
[   60.758287] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 177 PASS
[   60.760611] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 160 PASS
[   60.762901] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 204 PASS
[   60.765394] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 201 PASS
[   60.767884] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 184 PASS
[   60.770228] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 168 PASS
[   60.772331] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 197 PASS
[   60.774754] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 192 PASS
[   60.777384] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 181 PASS
[   60.779641] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 97 PASS
[   60.781022] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 125 PASS
[   61.242879] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 105 PASS
[   61.835125] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1
121315 PASS
[   63.362129] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   63.362231] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 131 PASS
[   63.879679] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 217030 181848 PASS
[   68.492725] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 1018683 930359 PASS
[   88.007480] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 440621 PASS
[   93.074379] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 154 PASS
[   93.358458] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 302835 PASS
[   96.392483] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1008 PASS
[   96.501153] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 597855 PASS
[  102.759854] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 626616 PASS
[  109.247312] test_bpf: #274 LD_IND byte frag jited:1 1453 PASS
[  109.263829] test_bpf: #275 LD_IND halfword frag jited:1 600 PASS
[  109.270433] test_bpf: #276 LD_IND word frag jited:1 719 PASS
[  109.278159] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 705 PASS
[  109.285898] test_bpf: #278 LD_IND word mixed head/frag jited:1 732 PASS
[  109.293879] test_bpf: #279 LD_ABS byte frag jited:1 683 PASS
[  109.301360] test_bpf: #280 LD_ABS halfword frag jited:1 595 PASS
[  109.307841] test_bpf: #281 LD_ABS word frag jited:1 672 PASS
[  109.315579] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 775 PASS
[  109.323890] test_bpf: #283 LD_ABS word mixed head/frag jited:1 725 PASS
[  109.331927] test_bpf: #284 LD_IND byte default X jited:1 274 PASS
[  109.335451] test_bpf: #285 LD_IND byte positive offset jited:1 302 PASS
[  109.339511] test_bpf: #286 LD_IND byte negative offset jited:1 311 PASS
[  109.343448] test_bpf: #287 LD_IND halfword positive offset jited:1 218 PASS
[  109.346282] test_bpf: #288 LD_IND halfword negative offset jited:1 193 PASS
[  109.348832] test_bpf: #289 LD_IND halfword unaligned jited:1 190 PASS
[  109.351330] test_bpf: #290 LD_IND word positive offset jited:1 200 PASS
[  109.353993] test_bpf: #291 LD_IND word negative offset jited:1 216 PASS
[  109.356739] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 195 PASS
[  109.359225] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 196 PASS
[  109.361713] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 221 PASS
[  109.364417] test_bpf: #295 LD_ABS byte jited:1 195 PASS
[  109.366896] test_bpf: #296 LD_ABS halfword jited:1 170 PASS
[  109.369093] test_bpf: #297 LD_ABS halfword unaligned jited:1 167 PASS
[  109.371399] test_bpf: #298 LD_ABS word jited:1 182 PASS
[  109.373724] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 185 PASS
[  109.376064] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 162 PASS
[  109.381701] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 231 PASS
[  109.384839] test_bpf: #302 ADD default X jited:1 105 PASS
[  109.386839] test_bpf: #303 ADD default A jited:1 101 PASS
[  109.388677] test_bpf: #304 SUB default X jited:1 106 PASS
[  109.390267] test_bpf: #305 SUB default A jited:1 119 PASS
[  109.391992] test_bpf: #306 MUL default X jited:1 131 PASS
[  109.394020] test_bpf: #307 MUL default A jited:1 116 PASS
[  109.395766] test_bpf: #308 DIV default X jited:1 116 PASS
[  109.397706] test_bpf: #309 DIV default A jited:1 227 PASS
[  109.406156] test_bpf: #310 MOD default X jited:1 98 PASS
[  109.407645] test_bpf: #311 MOD default A jited:1 265 PASS
[  109.410774] test_bpf: #312 JMP EQ default A jited:1 134 PASS
[  109.412679] test_bpf: #313 JMP EQ default X jited:1 108 PASS
[  109.414506] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]


All of these benchmarks are for ARMv7.
Best,
Shubham Bansal


On Mon, May 22, 2017 at 6:31 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/20/2017 10:01 PM, Shubham Bansal wrote:
> [...]
>>
>> Before I send the patch, I have tested the JIT compiler on ARMv7 but
>> not on ARMv5 or ARMv6. So can you tell me which arch versions I should
>> test it for?
>> Also for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
>> are both disabled. But I need to test JIT with these flags as well.
>> Whenever I put these flags in the .config file, the ARM kernel is not
>> getting compiled with these flags. Can you tell me why? If you need
>> more information regarding this, please let me know.
>
>
> Maybe Mircea, Kees or someone from linux-arm-kernel can help you out
> on that.
>
> With regard to the benchmark below, I meant how it compares
> to the interpreter. With only the JIT numbers it's hard to compare,
> so it would be great to see the output for the following three cases:
>
> 1) Interpreter:
>
> echo 0 > /proc/sys/net/core/bpf_jit_enable
>
> 2) JIT enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
>
> 3) JIT + blinding enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> echo 2 > /proc/sys/net/core/bpf_jit_harden
>
>> With the current config for ARMv7, the benchmarks are:
>>
>> [root@vexpress modules]# insmod test_bpf.ko
>> [   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
>> [   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
>> [   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
>> [   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
>> [   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
>> [   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
>> [   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
>
> [...]
>
> Thanks,
> Daniel
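
The three configurations listed in the quoted reply (interpreter, JIT, JIT + blinding) could be switched from a small helper before each `insmod test_bpf.ko` run. This is only a sketch: the function name `set_bpf_mode`, the `MODES` table and the `sysctl_dir` parameter are made up for illustration, and resetting `bpf_jit_harden` to 0 for the first two cases is an assumption that the thread does not state. Writing to the real sysctl files needs root.

```python
import os

# Knob values for the three cases from the quoted reply.
# ASSUMPTION: bpf_jit_harden is reset to "0" for the first two cases;
# the thread only gives a harden value for the blinding case.
MODES = {
    "interpreter": {"bpf_jit_enable": "0", "bpf_jit_harden": "0"},
    "jit": {"bpf_jit_enable": "1", "bpf_jit_harden": "0"},
    "jit+blinding": {"bpf_jit_enable": "1", "bpf_jit_harden": "2"},
}


def set_bpf_mode(mode, sysctl_dir="/proc/sys/net/core"):
    """Write the sysctl values for one benchmark configuration.

    sysctl_dir is a parameter so the helper can be pointed at a scratch
    directory when trying it out without root.
    """
    for knob, value in MODES[mode].items():
        with open(os.path.join(sysctl_dir, knob), "w") as f:
            f.write(value + "\n")
```

For example, `set_bpf_mode("jit+blinding")` before loading the module would correspond to case 3.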


* [kernel-hardening] Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-22 17:04                       ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-22 17:04 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Kees Cook, David Miller, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel,

Here are the benchmarks.

1) Interpreter:

[root@vexpress modules]# insmod test_bpf.ko
[   37.244999] test_bpf: #0 TAX jited:0 757 645 650 PASS
[   37.272577] test_bpf: #1 TXA jited:0 366 334 336 PASS
[   37.283507] test_bpf: #2 ADD_SUB_MUL_K jited:0 543 PASS
[   37.289542] test_bpf: #3 DIV_MOD_KX jited:0 1509 PASS
[   37.305374] test_bpf: #4 AND_OR_LSH_K jited:0 539 559 PASS
[   37.317209] test_bpf: #5 LD_IMM_0 jited:0 412 PASS
[   37.321820] test_bpf: #6 LD_IND jited:0 428 376 389 PASS
[   37.334327] test_bpf: #7 LD_ABS jited:0 509 405 358 PASS
[   37.350596] test_bpf: #8 LD_ABS_LL jited:0 542 783 PASS
[   37.364340] test_bpf: #9 LD_IND_LL jited:0 524 496 723 PASS
[   37.382352] test_bpf: #10 LD_ABS_NET jited:0 527 545 PASS
[   37.393642] test_bpf: #11 LD_IND_NET jited:0 650 495 647 PASS
[   37.412228] test_bpf: #12 LD_PKTTYPE jited:0 686 901 PASS
[   37.428818] test_bpf: #13 LD_MARK jited:0 305 291 PASS
[   37.435349] test_bpf: #14 LD_RXHASH jited:0 257 259 PASS
[   37.440850] test_bpf: #15 LD_QUEUE jited:0 255 254 PASS
[   37.446254] test_bpf: #16 LD_PROTOCOL jited:0 593 603 PASS
[   37.458570] test_bpf: #17 LD_VLAN_TAG jited:0 288 292 PASS
[   37.464821] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:0 335 421 PASS
[   37.472817] test_bpf: #19 LD_IFINDEX jited:0 8568 606 PASS
[   37.565163] test_bpf: #20 LD_HATYPE jited:0 618 695 PASS
[   37.579457] test_bpf: #21 LD_CPU jited:0 1200 1172 PASS
[   37.604424] test_bpf: #22 LD_NLATTR jited:0 979 1124 PASS
[   37.626345] test_bpf: #23 LD_NLATTR_NEST jited:0 12232 3593 PASS
[   37.785251] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3697 4834 PASS
[   37.871224] test_bpf: #25 LD_ANC_XOR jited:0 314 344 PASS
[   37.878210] test_bpf: #26 SPILL_FILL jited:0 757 850 903 PASS
[   37.903954] test_bpf: #27 JEQ jited:0 380 420 426 PASS
[   37.916756] test_bpf: #28 JGT jited:0 376 467 448 PASS
[   37.930276] test_bpf: #29 JGE jited:0 446 590 498 PASS
[   37.946729] test_bpf: #30 JSET jited:0 571 787 1003 PASS
[   37.970896] test_bpf: #31 tcpdump port 22 jited:0 358 1079 1190 PASS
[   37.997814] test_bpf: #32 tcpdump complex jited:0 319 1061 2324 PASS
[   38.035596] test_bpf: #33 RET_A jited:0 253 249 PASS
[   38.041262] test_bpf: #34 INT: ADD trivial jited:0 414 PASS
[   38.045777] test_bpf: #35 INT: MUL_X jited:0 336 PASS
[   38.049402] test_bpf: #36 INT: MUL_X2 jited:0 431 PASS
[   38.054178] test_bpf: #37 INT: MUL32_X jited:0 523 PASS
[   38.059902] test_bpf: #38 INT: ADD 64-bit jited:0 5263 PASS
[   38.113069] test_bpf: #39 INT: ADD 32-bit jited:0 4127 PASS
[   38.154754] test_bpf: #40 INT: SUB jited:0 4218 PASS
[   38.197294] test_bpf: #41 INT: XOR jited:0 2252 PASS
[   38.220159] test_bpf: #42 INT: MUL jited:0 1986 PASS
[   38.240410] test_bpf: #43 MOV REG64 jited:0 1103 PASS
[   38.251796] test_bpf: #44 MOV REG32 jited:0 1140 PASS
[   38.263614] test_bpf: #45 LD IMM64 jited:0 1182 PASS
[   38.276031] test_bpf: #46 INT: ALU MIX jited:0 1068 PASS
[   38.287319] test_bpf: #47 INT: shifts by register jited:0 1125 PASS
[   38.298913] test_bpf: #48 INT: DIV + ABS jited:0 570 850 PASS
[   38.313745] test_bpf: #49 INT: DIV by zero jited:0 350 305 PASS
[   38.320829] test_bpf: #50 check: missing ret PASS
[   38.321186] test_bpf: #51 check: div_k_0 PASS
[   38.321350] test_bpf: #52 check: unknown insn PASS
[   38.321492] test_bpf: #53 check: out of range spill/fill PASS
[   38.321665] test_bpf: #54 JUMPS + HOLES jited:0 863 PASS
[   38.330763] test_bpf: #55 check: RET X PASS
[   38.331060] test_bpf: #56 check: LDX + RET X PASS
[   38.331292] test_bpf: #57 M[]: alt STX + LDX jited:0 3990 PASS
[   38.373667] test_bpf: #58 M[]: full STX + full LDX jited:0 2819 PASS
[   38.410225] test_bpf: #59 check: SKF_AD_MAX PASS
[   38.410461] test_bpf: #60 LD [SKF_AD_OFF-1] jited:0 313 PASS
[   38.413785] test_bpf: #61 load 64-bit immediate jited:0 579 PASS
[   38.419764] test_bpf: #62 nmap reduced jited:0 1860 PASS
[   38.439016] test_bpf: #63 ALU_MOV_X: dst = 2 jited:0 249 PASS
[   38.441990] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:0 264 PASS
[   38.445000] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:0 229 PASS
[   38.447602] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:0 213 PASS
[   38.450011] test_bpf: #67 ALU_MOV_K: dst = 2 jited:0 167 PASS
[   38.451963] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:0 149 PASS
[   38.453694] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 =
0x00000000ffffffff jited:0 358 PASS
[   38.457572] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:0 158 PASS
[   38.459546] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:0 156 PASS
[   38.461364] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:0 306 PASS
[   38.464652] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:0 327 PASS
[   38.468154] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:0 212 PASS
[   38.470551] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295
jited:0 231 PASS
[   38.473187] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:0 309 PASS
[   38.476618] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:0 280 PASS
[   38.479675] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
jited:0 286 PASS
[   38.482755] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
jited:0 460 PASS
[   38.487670] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:0 210 PASS
[   38.490042] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:0 208 PASS
[   38.492331] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295
jited:0 205 PASS
[   38.494604] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:0 323 PASS
[   38.498071] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
jited:0 338 PASS
[   38.501674] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:0 347 PASS
[   38.505355] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 360 PASS
[   38.509197] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
jited:0 345 PASS
[   38.512873] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
jited:0 377 PASS
[   38.516924] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:0 184 PASS
[   38.519053] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:0 185 PASS
[   38.521246] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
jited:0 186 PASS
[   38.523414] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
jited:0 353 PASS
[   38.527276] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 =
-1 jited:0 182 PASS
[   38.529353] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:0 311 PASS
[   38.532680] test_bpf: #95 ALU64_ADD_K: 0 + (-1) =
0xffffffffffffffff jited:0 339 PASS
[   38.536308] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:0 310 PASS
[   38.539652] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
jited:0 313 PASS
[   38.543022] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 =
0xffffffff80000000 jited:0 340 PASS
[   38.546651] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 =
0xffffffff80008000 jited:0 311 PASS
[   38.549994] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:0 213 PASS
[   38.552326] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1
jited:0 212 PASS
[   38.554661] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:0 237 PASS
[   38.557278] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
jited:0 221 PASS
[   38.559713] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:0 177 PASS
[   38.561682] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:0 179 PASS
[   38.563692] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1
jited:0 195 PASS
[   38.565891] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:0 183 PASS
[   38.567926] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:0 177 PASS
[   38.569901] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 =
-1 jited:0 181 PASS
[   38.571925] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 =
-1 jited:0 177 PASS
[   38.573910] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:0 241 PASS
[   38.576535] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 220 PASS
[   38.578948] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:0 224 PASS
[   38.581387] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:0 213 PASS
[   38.583715] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
jited:0 230 PASS
[   38.586253] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:0 191 PASS
[   38.588392] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:0 189 PASS
[   38.590487] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
jited:0 192 PASS
[   38.592616] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
jited:0 333 PASS
[   38.596172] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:0 185 PASS
[   38.598224] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:0 185 PASS
[   38.600287] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
jited:0 184 PASS
[   38.602369] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 =
-2147483647 jited:0 183 PASS
[   38.604421] test_bpf: #124 ALU64_MUL_K: 1 * (-1) =
0xffffffffffffffff jited:0 336 PASS
[   38.608002] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:0 316 PASS
[   38.611394] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1
jited:0 315 PASS
[   38.614753] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 439 PASS
[   38.619370] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1
jited:0 420 PASS
[   38.623844] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 604 PASS
[   38.630156] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:0 249 PASS
[   38.632858] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:0 240 PASS
[   38.635647] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1
jited:0 254 PASS
[   38.638408] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) =
0x1 jited:0 379 PASS
[   38.642450] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 346 PASS
[   38.646123] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 323 PASS
[   38.649558] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1
jited:0 329 PASS
[   38.653061] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) =
0x0000000000000001 jited:0 477 PASS
[   38.658065] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:0 421 PASS
[   38.662580] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2
jited:0 453 PASS
[   38.667414] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 553 PASS
[   38.673235] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2
jited:0 583 PASS
[   38.679343] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:0 380 PASS
[   38.683374] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.683586] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2
jited:0 467 PASS
[   38.688672] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 492 PASS
[   38.694058] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   38.694359] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2
jited:0 571 PASS
[   38.700389] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:0 225 PASS
[   38.702952] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 261 PASS
[   38.705982] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:0 273 PASS
[   38.709194] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:0 251 PASS
[   38.712213] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:0 201 PASS
[   38.714638] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 240 PASS
[   38.717477] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:0 209 PASS
[   38.720125] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:0 319 PASS
[   38.724356] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:0 384 PASS
[   38.729293] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:0 367 PASS
[   38.733598] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:0 375 PASS
[   38.737966] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:0 271 PASS
[   38.741274] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:0 280 PASS
[   38.744653] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:0 253 PASS
[   38.747717] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:0 263 PASS
[   38.750830] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:0 216 PASS
[   38.753357] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 187 PASS
[   38.755553] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:0 183 PASS
[   38.757693] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:0 195 PASS
[   38.759975] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:0 338 PASS
[   38.763728] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:0 324 PASS
[   38.767311] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:0 309 PASS
[   38.770633] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:0 216 PASS
[   38.776135] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:0 414 PASS
[   38.780950] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:0 320 PASS
[   38.784540] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:0 223 PASS
[   38.787037] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:0 203 PASS
[   38.789359] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:0 205 PASS
[   38.791707] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:0 205 PASS
[   38.794045] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:0 186 PASS
[   38.796180] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:0 352 PASS
[   38.800050] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:0 353 PASS
[   38.803970] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:0 362 PASS
[   38.808102] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:0 211 PASS
[   38.810517] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:0 216 PASS
[   38.812957] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:0 224 PASS
[   38.815480] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:0 223 PASS
[   38.818057] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:0 208 PASS
[   38.820559] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:0 210 PASS
[   38.823011] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:0 211 PASS
[   38.825737] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:0 182 PASS
[   38.828021] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:0 226 PASS
[   38.830655] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:0 225 PASS
[   38.833287] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:0 289 PASS
[   38.836535] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:0 253 PASS
[   38.839501] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:0 207 PASS
[   38.842025] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:0 210 PASS
[   38.844570] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:0 232 PASS
[   38.847341] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:0 208 PASS
[   38.849849] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 252 PASS
[   38.852728] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:0 197 PASS
[   38.855165] test_bpf: #199 ALU_NEG: -(3) = -3 jited:0 189 PASS
[   38.857410] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:0 171 PASS
[   38.859380] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:0 179 PASS
[   38.861411] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:0 180 PASS
[   38.863491] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:0 202 PASS
[   38.865978] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:0 368 PASS
[   38.869957] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:0 244 PASS
[   38.872708] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:0 274 PASS
[   38.875930] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:0 319 PASS
[   38.879417] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:0 193 PASS
[   38.881653] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:0 219 PASS
[   38.884143] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:0 227 PASS
[   38.886902] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:0 251 PASS
[   38.889691] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:0 218 PASS
[   38.892132] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:0 208 PASS
[   38.894448] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:0 259 PASS
[   38.897504] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:0 253 PASS
[   38.900355] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:0 244 PASS
[   38.903051] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:0 297 PASS
[   38.906372] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:0 257 PASS
[   38.909268] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:0 392 PASS
[   38.913520] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:0 292 PASS
[   38.916792] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:0 259 PASS
[   38.919654] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 262 PASS
[   38.922517] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   38.922764] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 221 PASS
[   38.925373] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142719 PASS
[   40.352892] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 345 PASS
[   40.356940] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   40.357188] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 254 PASS
[   40.359954] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
349891 PASS
[   43.859287] test_bpf: #230 JMP_EXIT jited:0 127 PASS
[   43.861346] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:0 194 PASS
[   43.863538] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:0 262 PASS
[   43.866400] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:0 249 PASS
[   43.869132] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:0 262 PASS
[   43.872046] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:0 260 PASS
[   43.874890] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:0 260 PASS
[   43.877701] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:0 278 PASS
[   43.880801] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:0 255 PASS
[   43.883637] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:0 321 PASS
[   43.887202] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:0 340 PASS
[   43.891306] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:0 310 PASS
[   43.895036] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:0 310 PASS
[   43.898963] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:0 276 PASS
[   43.902034] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:0 312 PASS
[   43.905679] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:0 346 PASS
[   43.909500] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:0 292 PASS
[   43.912696] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:0 318 PASS
[   43.916115] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:0 287 PASS
[   43.919236] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:0 316 PASS
[   43.922749] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:0 400 PASS
[   43.927178] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:0 287 PASS
[   43.930323] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:0 287 PASS
[   43.933432] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:0 323 PASS
[   43.936912] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:0 298 PASS
[   43.940168] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:0 263 PASS
[   43.943062] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:0 313 PASS
[   43.946483] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:0 308 PASS
[   43.949817] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:0 359 PASS
[   43.953715] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:0 421 PASS
[   43.958350] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:0 309 PASS
[   43.961783] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:0 251 PASS
[   43.969019] test_bpf: #262 BPF_MAXINSNS: Single literal jited:0 286 PASS
[   43.976250] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:0
254969 PASS
[   46.530754] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   46.531227] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:0 284 PASS
[   46.538925] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:0 548311 560800 PASS
[   57.635685] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 949505 881276 PASS
[   75.951893] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:0 480796 PASS
[   80.765143] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:0 193 PASS
[   80.767750] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:0 114304 PASS
[   81.911103] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:0 1884 PASS
[   81.935374] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 546269 PASS
[   87.405760] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 594906 PASS
[   93.356075] test_bpf: #274 LD_IND byte frag jited:0 695 PASS
[   93.364087] test_bpf: #275 LD_IND halfword frag jited:0 818 PASS
[   93.372861] test_bpf: #276 LD_IND word frag jited:0 837 PASS
[   93.381738] test_bpf: #277 LD_IND halfword mixed head/frag jited:0 1170 PASS
[   93.394096] test_bpf: #278 LD_IND word mixed head/frag jited:0 950 PASS
[   93.404149] test_bpf: #279 LD_ABS byte frag jited:0 953 PASS
[   93.414270] test_bpf: #280 LD_ABS halfword frag jited:0 754 PASS
[   93.422281] test_bpf: #281 LD_ABS word frag jited:0 1133 PASS
[   93.434166] test_bpf: #282 LD_ABS halfword mixed head/frag jited:0 1079 PASS
[   93.445353] test_bpf: #283 LD_ABS word mixed head/frag jited:0 718 PASS
[   93.452901] test_bpf: #284 LD_IND byte default X jited:0 297 PASS
[   93.456118] test_bpf: #285 LD_IND byte positive offset jited:0 300 PASS
[   93.459342] test_bpf: #286 LD_IND byte negative offset jited:0 296 PASS
[   93.462553] test_bpf: #287 LD_IND halfword positive offset jited:0 333 PASS
[   93.466116] test_bpf: #288 LD_IND halfword negative offset jited:0 306 PASS
[   93.469402] test_bpf: #289 LD_IND halfword unaligned jited:0 307 PASS
[   93.472711] test_bpf: #290 LD_IND word positive offset jited:0 337 PASS
[   93.476296] test_bpf: #291 LD_IND word negative offset jited:0 312 PASS
[   93.479676] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:0 309 PASS
[   93.482987] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:0 335 PASS
[   93.486601] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:0 305 PASS
[   93.489878] test_bpf: #295 LD_ABS byte jited:0 269 PASS
[   93.492784] test_bpf: #296 LD_ABS halfword jited:0 294 PASS
[   93.495950] test_bpf: #297 LD_ABS halfword unaligned jited:0 271 PASS
[   93.498895] test_bpf: #298 LD_ABS word jited:0 265 PASS
[   93.501756] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:0 267 PASS
[   93.504667] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:0 269 PASS
[   93.507584] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:0 281 PASS
[   93.510665] test_bpf: #302 ADD default X jited:0 296 PASS
[   93.513830] test_bpf: #303 ADD default A jited:0 309 PASS
[   93.517144] test_bpf: #304 SUB default X jited:0 290 PASS
[   93.520249] test_bpf: #305 SUB default A jited:0 252 PASS
[   93.522974] test_bpf: #306 MUL default X jited:0 322 PASS
[   93.526403] test_bpf: #307 MUL default A jited:0 267 PASS
[   93.529277] test_bpf: #308 DIV default X jited:0 293 PASS
[   93.532414] test_bpf: #309 DIV default A jited:0 336 PASS
[   93.535988] test_bpf: #310 MOD default X jited:0 284 PASS
[   93.539032] test_bpf: #311 MOD default A jited:0 435 PASS
[   93.543608] test_bpf: #312 JMP EQ default A jited:0 352 PASS
[   93.547355] test_bpf: #313 JMP EQ default X jited:0 357 PASS
[   93.551176] test_bpf: Summary: 314 PASSED, 0 FAILED, [0/306 JIT'ed]
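
Since the point of these runs is to compare the interpreter against the JIT, the per-test timings from two logs can be paired up by test number. A minimal sketch, assuming the log format shown in this mail (including mail-wrapped lines and interleaved `bpf_jit:` messages); the names `parse_test_bpf` and `speedups` are hypothetical helpers, not part of test_bpf.

```python
import re

TIMESTAMP = re.compile(r"\[\s*\d+\.\d+\]")
# "<num> <name> jited:<0|1> <t1> <t2> ... PASS"; the timing list may be empty.
RECORD = re.compile(r"^(\d+) (.*?) jited:(\d)((?: \d+)*) PASS")


def parse_test_bpf(log):
    """Return {test number: (name, jited, [timings])} for timed tests."""
    results = {}
    for chunk in log.split("test_bpf: #")[1:]:
        # Drop interleaved JIT messages such as "bpf_jit: *** NOT YET ...".
        lines = [ln for ln in chunk.splitlines() if "bpf_jit:" not in ln]
        # Remove timestamps and rejoin lines that the mailer wrapped.
        text = " ".join(TIMESTAMP.sub(" ", " ".join(lines)).split())
        m = RECORD.match(text)
        if m is None:
            continue  # "check:" tests print no jited flag or timings
        times = [int(t) for t in m.group(4).split()]
        results[int(m.group(1))] = (m.group(2), m.group(3) == "1", times)
    return results


def speedups(interp, jit):
    """Interpreter/JIT timing ratio for tests that were actually JITed."""
    out = {}
    for num, (name, jited, times) in jit.items():
        base = interp.get(num)
        if not jited or base is None or not base[2] or sum(times) == 0:
            continue
        out[num] = (name, sum(base[2]) / sum(times))
    return out
```

Tests that fall back to the interpreter in the JIT run (reported as `jited:0`) are skipped, so they do not distort the ratio.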

2) JIT enabled

[root@vexpress modules]# insmod test_bpf.ko
[   53.785470] test_bpf: #0 TAX jited:1 234 171 195 PASS
[   53.794856] test_bpf: #1 TXA jited:1 81 79 77 PASS
[   53.803927] test_bpf: #2 ADD_SUB_MUL_K jited:1 89 PASS
[   53.805542] test_bpf: #3 DIV_MOD_KX jited:1 939 PASS
[   53.816227] test_bpf: #4 AND_OR_LSH_K jited:1 116 114 PASS
[   53.821088] test_bpf: #5 LD_IMM_0 jited:1 93 PASS
[   53.822900] test_bpf: #6 LD_IND jited:1 371 279 274 PASS
[   53.833030] test_bpf: #7 LD_ABS jited:1 408 402 272 PASS
[   53.844767] test_bpf: #8 LD_ABS_LL jited:1 387 346 PASS
[   53.852730] test_bpf: #9 LD_IND_LL jited:1 239 248 217 PASS
[   53.860410] test_bpf: #10 LD_ABS_NET jited:1 356 332 PASS
[   53.867897] test_bpf: #11 LD_IND_NET jited:1 223 212 320 PASS
[   53.876076] test_bpf: #12 LD_PKTTYPE jited:1 102 90 PASS
[   53.878660] test_bpf: #13 LD_MARK jited:1 80 80 PASS
[   53.880695] test_bpf: #14 LD_RXHASH jited:1 73 71 PASS
[   53.882488] test_bpf: #15 LD_QUEUE jited:1 120 121 PASS
[   53.885266] test_bpf: #16 LD_PROTOCOL jited:1 256 247 PASS
[   53.890918] test_bpf: #17 LD_VLAN_TAG jited:1 82 84 PASS
[   53.893002] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 80 77 PASS
[   53.894946] test_bpf: #19 LD_IFINDEX jited:1 87 98 PASS
[   53.897261] test_bpf: #20 LD_HATYPE jited:1 95 90 PASS
[   53.899466] test_bpf: #21 LD_CPU
[   53.899663] bpf_jit: *** NOT YET: opcode 85 ***
[   53.899796] jited:0 722 837 PASS
[   53.915645] test_bpf: #22 LD_NLATTR jited:0 593 659 PASS
[   53.928662] test_bpf: #23 LD_NLATTR_NEST jited:0 2186 2964 PASS
[   53.980966] test_bpf: #24 LD_PAYLOAD_OFF jited:0 3891 5637 PASS
[   54.076878] test_bpf: #25 LD_ANC_XOR jited:1 86 100 PASS
[   54.079241] test_bpf: #26 SPILL_FILL jited:1 131 137 123 PASS
[   54.084092] test_bpf: #27 JEQ jited:1 266 189 216 PASS
[   54.091500] test_bpf: #28 JGT jited:1 301 211 192 PASS
[   54.099467] test_bpf: #29 JGE jited:1 191 200 223 PASS
[   54.106275] test_bpf: #30 JSET jited:1 211 210 214 PASS
[   54.113660] test_bpf: #31 tcpdump port 22 jited:1 314 722 711 PASS
[   54.131943] test_bpf: #32 tcpdump complex jited:1 291 707 1068 PASS
[   54.153409] test_bpf: #33 RET_A jited:1 83 88 PASS
[   54.155617] test_bpf: #34 INT: ADD trivial jited:1 162 PASS
[   54.158387] test_bpf: #35 INT: MUL_X jited:1 176 PASS
[   54.161075] test_bpf: #36 INT: MUL_X2 jited:1 84 PASS
[   54.162483] test_bpf: #37 INT: MUL32_X jited:1 99 PASS
[   54.163849] test_bpf: #38 INT: ADD 64-bit jited:1 1066 PASS
[   54.175468] test_bpf: #39 INT: ADD 32-bit jited:1 666 PASS
[   54.182860] test_bpf: #40 INT: SUB jited:1 3236 PASS
[   54.215932] test_bpf: #41 INT: XOR jited:1 308 PASS
[   54.219704] test_bpf: #42 INT: MUL jited:1 376 PASS
[   54.224452] test_bpf: #43 MOV REG64 jited:1 227 PASS
[   54.227383] test_bpf: #44 MOV REG32 jited:1 171 PASS
[   54.229618] test_bpf: #45 LD IMM64 jited:1 163 PASS
[   54.231875] test_bpf: #46 INT: ALU MIX jited:0 1277 PASS
[   54.245188] test_bpf: #47 INT: shifts by register jited:1 208 PASS
[   54.248151] test_bpf: #48 INT: DIV + ABS jited:1 659 601 PASS
[   54.261395] test_bpf: #49 INT: DIV by zero jited:1 317 169 PASS
[   54.266949] test_bpf: #50 check: missing ret PASS
[   54.267418] test_bpf: #51 check: div_k_0 PASS
[   54.267631] test_bpf: #52 check: unknown insn PASS
[   54.267804] test_bpf: #53 check: out of range spill/fill PASS
[   54.268008] test_bpf: #54 JUMPS + HOLES jited:1 358 PASS
[   54.272201] test_bpf: #55 check: RET X PASS
[   54.273054] test_bpf: #56 check: LDX + RET X PASS
[   54.273226] test_bpf: #57 M[]: alt STX + LDX jited:1 456 PASS
[   54.278359] test_bpf: #58 M[]: full STX + full LDX jited:1 438 PASS
[   54.283300] test_bpf: #59 check: SKF_AD_MAX PASS
[   54.283576] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 198 PASS
[   54.285812] test_bpf: #61 load 64-bit immediate jited:1 125 PASS
[   54.287556] test_bpf: #62 nmap reduced jited:1 1054 PASS
[   54.298630] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 81 PASS
[   54.300079] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 85 PASS
[   54.301462] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 96 PASS
[   54.303048] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 71 PASS
[   54.304115] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 70 PASS
[   54.305148] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 71 PASS
[   54.306222] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 = 0x00000000ffffffff jited:1 97 PASS
[   54.307659] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 75 PASS
[   54.308750] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 66 PASS
[   54.309773] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 92 PASS
[   54.311093] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 94 PASS
[   54.312383] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 66 PASS
[   54.313388] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295 jited:1 66 PASS
[   54.314430] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 87 PASS
[   54.315756] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 77 PASS
[   54.316892] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295 jited:1 72 PASS
[   54.318015] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296 jited:1 79 PASS
[   54.319181] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 75 PASS
[   54.320261] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 71 PASS
[   54.321307] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295 jited:1 67 PASS
[   54.322342] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 82 PASS
[   54.323600] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff jited:1 86 PASS
[   54.325898] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 99 PASS
[   54.327242] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff jited:1 113 PASS
[   54.328684] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000 jited:1 123 PASS
[   54.330224] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000 jited:1 85 PASS
[   54.331395] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 66 PASS
[   54.332375] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 66 PASS
[   54.333381] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:1 69 PASS
[   54.334397] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:1 109 PASS
[   54.335818] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:1 72 PASS
[   54.336873] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 126 PASS
[   54.338484] test_bpf: #95 ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff jited:1 107 PASS
[   54.340100] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 98 PASS
[   54.341569] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff jited:1 87 PASS
[   54.342794] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 = 0xffffffff80000000 jited:1 98 PASS
[   54.344142] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 = 0xffffffff80008000 jited:1 92 PASS
[   54.345399] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 77 PASS
[   54.346726] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1 jited:1 72 PASS
[   54.347794] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 72 PASS
[   54.348826] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1 jited:1 71 PASS
[   54.349843] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 120 PASS
[   54.351486] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 82 PASS
[   54.352814] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1 jited:1 103 PASS
[   54.354550] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 140 PASS
[   54.356822] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 117 PASS
[   54.359156] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 = -1 jited:1 83 PASS
[   54.360401] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 = -1 jited:1 77 PASS
[   54.361515] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 68 PASS
[   54.362528] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0 jited:1 70 PASS
[   54.363572] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 73 PASS
[   54.364644] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 70 PASS
[   54.365655] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647 jited:1 75 PASS
[   54.366719] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 67 PASS
[   54.367707] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 71 PASS
[   54.368726] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0 jited:1 70 PASS
[   54.369733] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff jited:1 153 PASS
[   54.371617] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 101 PASS
[   54.373505] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 108 PASS
[   54.375362] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647 jited:1 106 PASS
[   54.377242] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 = -2147483647 jited:1 92 PASS
[   54.379044] test_bpf: #124 ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff jited:1 122 PASS
[   54.380863] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 220 PASS
[   54.383591] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 208 PASS
[   54.386292] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 736 PASS
[   54.394242] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1 jited:0 464 PASS
[   54.399433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) = 0x0000000000000001 jited:0 743 PASS
[   54.407799] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 246 PASS
[   54.410964] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 199 PASS
[   54.413410] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1 jited:1 192 PASS
[   54.415782] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) = 0x1 jited:1 215 PASS
[   54.418414] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 364 PASS
[   54.422379] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 369 PASS
[   54.426692] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1 jited:0 380 PASS
[   54.430875] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) = 0x0000000000000001 jited:0 623 PASS
[   54.437429] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 235 PASS
[   54.440177] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2 jited:1 262 PASS
[   54.443183] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 1524 PASS
[   54.458988] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2 jited:0 720 PASS
[   54.466677] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 231 PASS
[   54.469383] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   54.469685] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2 jited:1 257 PASS
[   54.472650] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 481 PASS
[   54.477765] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   54.478042] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2 jited:0 513 PASS
[   54.483455] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 100 PASS
[   54.484786] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff = 0xffffffff jited:1 106 PASS
[   54.486335] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 86 PASS
[   54.487738] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff = 0xffffffff jited:1 118 PASS
[   54.489623] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 117 PASS
[   54.491645] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff = 0xffffffff jited:1 72 PASS
[   54.493119] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 72 PASS
[   54.494195] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff = 0xffffffff jited:1 70 PASS
[   54.495330] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 = 0x0000ffff00000000 jited:1 99 PASS
[   54.496721] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffffffff jited:1 97 PASS
[   54.498106] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff jited:1 86 PASS
[   54.499343] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 73 PASS
[   54.500447] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff jited:1 72 PASS
[   54.501546] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 89 PASS
[   54.502779] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff jited:1 91 PASS
[   54.504154] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 71 PASS
[   54.505223] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff jited:1 116 PASS
[   54.506916] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 77 PASS
[   54.508328] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff jited:1 80 PASS
[   54.509666] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 = 0x0000ffff00000000 jited:1 86 PASS
[   54.511012] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 = 0xffffffffffffffff jited:1 99 PASS
[   54.512432] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff jited:1 147 PASS
[   54.514401] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 80 PASS
[   54.515668] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe jited:1 73 PASS
[   54.516794] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 71 PASS
[   54.517879] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe jited:1 72 PASS
[   54.518998] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 71 PASS
[   54.520120] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe jited:1 67 PASS
[   54.521181] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 70 PASS
[   54.522292] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe jited:1 104 PASS
[   54.523741] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000 jited:1 96 PASS
[   54.525269] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 = 0xffff00000000ffff jited:1 119 PASS
[   54.526875] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 = 0xffffffffffffffff jited:1 116 PASS
[   54.528421] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 100 PASS
[   54.529848] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 73 PASS
[   54.530965] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 119 PASS
[   54.532667] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 110 PASS
[   54.534257] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 147 PASS
[   54.536290] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 116 PASS
[   54.538165] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 154 PASS
[   54.540668] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 92 PASS
[   54.542464] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 86 PASS
[   54.543937] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 148 PASS
[   54.545995] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 108 PASS
[   54.547759] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 96 PASS
[   54.549178] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 68 PASS
[   54.550175] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 74 PASS
[   54.551208] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 66 PASS
[   54.552193] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 95 PASS
[   54.553449] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff jited:1 74 PASS
[   54.554566] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff jited:1 96 PASS
[   54.555984] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 84 PASS
[   54.557335] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 72 PASS
[   54.558442] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 74 PASS
[   54.559596] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 68 PASS
[   54.560664] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef -> 0xcdef jited:1 74 PASS
[   54.561814] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef -> 0x89abcdef jited:1 101 PASS
[   54.563242] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef -> 0x89abcdef jited:1 93 PASS
[   54.564578] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef -> 0xefcd jited:1 73 PASS
[   54.565750] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 76 PASS
[   54.566879] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301 jited:1 78 PASS
[   54.568009] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative jited:1 72 PASS
[   54.569258] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive jited:1 79 PASS
[   54.570402] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative jited:1 79 PASS
[   54.571541] test_bpf: #212 ST_MEM_H: Store/Load half word: max negative jited:1 81 PASS
[   54.572896] test_bpf: #213 ST_MEM_H: Store/Load half word: max positive jited:1 100 PASS
[   54.574521] test_bpf: #214 STX_MEM_H: Store/Load half word: max negative jited:1 110 PASS
[   54.576159] test_bpf: #215 ST_MEM_W: Store/Load word: max negative jited:1 75 PASS
[   54.577570] test_bpf: #216 ST_MEM_W: Store/Load word: max positive jited:1 89 PASS
[   54.579195] test_bpf: #217 STX_MEM_W: Store/Load word: max negative jited:1 122 PASS
[   54.581267] test_bpf: #218 ST_MEM_DW: Store/Load double word: max negative jited:1 85 PASS
[   54.582954] test_bpf: #219 ST_MEM_DW: Store/Load double word: max negative 2 jited:1 123 PASS
[   54.584677] test_bpf: #220 ST_MEM_DW: Store/Load double word: max positive jited:1 78 PASS
[   54.585879] test_bpf: #221 STX_MEM_DW: Store/Load double word: max negative jited:1 85 PASS
[   54.587106] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22 jited:0 328 PASS
[   54.590869] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12 + 0x10 = 0x22 jited:0 PASS
[   54.591178] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12 + 0x10 = 0x22 jited:0 285 PASS
[   54.594489] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0 158746 PASS
[   56.182499] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22 jited:0 343 PASS
[   56.186642] test_bpf: #227 STX_XADD_DW: Test side-effects, r10: 0x12 + 0x10 = 0x22 jited:0 PASS
[   56.186926] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12 + 0x10 = 0x22 jited:0 272 PASS
[   56.190021] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0 194997 PASS
[   58.140569] test_bpf: #230 JMP_EXIT jited:1 82 PASS
[   58.142427] test_bpf: #231 JMP_JA: Unconditional jump: if (true) return 1 jited:1 86 PASS
[   58.155637] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2) return 1 jited:1 86 PASS
[   58.157334] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1) return 0 jited:1 82 PASS
[   58.158533] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2) return 1 jited:1 72 PASS
[   58.159560] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1) return 1 jited:1 73 PASS
[   58.160538] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 71 PASS
[   58.161457] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1 jited:1 72 PASS
[   58.162407] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 77 PASS
[   58.163411] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump backwards) jited:1 76 PASS
[   58.164416] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 74 PASS
[   58.165391] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 74 PASS
[   58.166375] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 78 PASS
[   58.167382] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1 jited:1 109 PASS
[   58.168822] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return 1 jited:1 71 PASS
[   58.170396] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2) return 1 jited:1 75 PASS
[   58.171568] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1) return 0 jited:1 78 PASS
[   58.172804] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2) return 1 jited:1 134 PASS
[   58.175486] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1) return 1 jited:1 102 PASS
[   58.177403] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 83 PASS
[   58.178806] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1 jited:1 80 PASS
[   58.180104] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 78 PASS
[   58.181230] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 116 PASS
[   58.182751] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 81 PASS
[   58.183951] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 79 PASS
[   58.185334] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 78 PASS
[   58.186505] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 108 PASS
[   58.187991] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 102 PASS
[   58.189496] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:1 133 PASS
[   58.191644] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:1 128 PASS
[   58.193631] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 108 PASS
[   58.195981] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals jited:1 111 PASS
[   58.211020] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 115 PASS
[   58.226185] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1 8481 PASS
[   58.322910] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   58.323076] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 123 PASS
[   58.339381] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations jited:1 28166 29032 PASS
[   58.931050] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 903498 894192 PASS
[   76.916296] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 132663 PASS
[   78.260490] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards jited:1 148 PASS
[   78.269590] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse jited:1 277097 PASS
[   81.046383] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ... jited:1 1041 PASS
[   81.076916] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id jited:0 566894 PASS
[   86.754024] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop jited:0 602040 PASS
[   92.775504] test_bpf: #274 LD_IND byte frag jited:1 574 PASS
[   92.782876] test_bpf: #275 LD_IND halfword frag jited:1 641 PASS
[   92.790062] test_bpf: #276 LD_IND word frag jited:1 731 PASS
[   92.798321] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 741 PASS
[   92.806601] test_bpf: #278 LD_IND word mixed head/frag jited:1 972 PASS
[   92.817542] test_bpf: #279 LD_ABS byte frag jited:1 601 PASS
[   92.824156] test_bpf: #280 LD_ABS halfword frag jited:1 603 PASS
[   92.830806] test_bpf: #281 LD_ABS word frag jited:1 688 PASS
[   92.838273] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 657 PASS
[   92.845562] test_bpf: #283 LD_ABS word mixed head/frag jited:1 748 PASS
[   92.853678] test_bpf: #284 LD_IND byte default X jited:1 178 PASS
[   92.856290] test_bpf: #285 LD_IND byte positive offset jited:1 187 PASS
[   92.858954] test_bpf: #286 LD_IND byte negative offset jited:1 178 PASS
[   92.861592] test_bpf: #287 LD_IND halfword positive offset jited:1 161 PASS
[   92.863726] test_bpf: #288 LD_IND halfword negative offset jited:1 195 PASS
[   92.866372] test_bpf: #289 LD_IND halfword unaligned jited:1 183 PASS
[   92.868821] test_bpf: #290 LD_IND word positive offset jited:1 170 PASS
[   92.871096] test_bpf: #291 LD_IND word negative offset jited:1 198 PASS
[   92.873832] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2) jited:1 281 PASS
[   92.877321] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1) jited:1 172 PASS
[   92.879493] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3) jited:1 171 PASS
[   92.881590] test_bpf: #295 LD_ABS byte jited:1 162 PASS
[   92.883535] test_bpf: #296 LD_ABS halfword jited:1 160 PASS
[   92.885486] test_bpf: #297 LD_ABS halfword unaligned jited:1 180 PASS
[   92.887650] test_bpf: #298 LD_ABS word jited:1 166 PASS
[   92.889661] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2) jited:1 157 PASS
[   92.891595] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1) jited:1 170 PASS
[   92.893662] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3) jited:1 163 PASS
[   92.895660] test_bpf: #302 ADD default X jited:1 84 PASS
[   92.896895] test_bpf: #303 ADD default A jited:1 79 PASS
[   92.898143] test_bpf: #304 SUB default X jited:1 82 PASS
[   92.899284] test_bpf: #305 SUB default A jited:1 85 PASS
[   92.900529] test_bpf: #306 MUL default X jited:1 76 PASS
[   92.901642] test_bpf: #307 MUL default A jited:1 83 PASS
[   92.903045] test_bpf: #308 DIV default X jited:1 93 PASS
[   92.904524] test_bpf: #309 DIV default A jited:1 203 PASS
[   92.906955] test_bpf: #310 MOD default X jited:1 100 PASS
[   92.908398] test_bpf: #311 MOD default A jited:1 249 PASS
[   92.911232] test_bpf: #312 JMP EQ default A jited:1 83 PASS
[   92.912593] test_bpf: #313 JMP EQ default X jited:1 95 PASS
[   92.913931] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

3) JIT + blinding enabled:

[root@vexpress modules]# insmod test_bpf.ko
[   56.044720] test_bpf: #0 TAX jited:1 239 218 229 PASS
[   56.054736] test_bpf: #1 TXA jited:1 89 119 85 PASS
[   56.064598] test_bpf: #2 ADD_SUB_MUL_K jited:1 213 PASS
[   56.067415] test_bpf: #3 DIV_MOD_KX jited:1 1190 PASS
[   56.080569] test_bpf: #4 AND_OR_LSH_K jited:1 200 149 PASS
[   56.084764] test_bpf: #5 LD_IMM_0 jited:1 101 PASS
[   56.086832] test_bpf: #6 LD_IND jited:1 314 310 283 PASS
[   56.096521] test_bpf: #7 LD_ABS jited:1 376 460 397 PASS
[   56.109604] test_bpf: #8 LD_ABS_LL jited:1 608 415 PASS
[   56.120753] test_bpf: #9 LD_IND_LL jited:1 248 256 268 PASS
[   56.129296] test_bpf: #10 LD_ABS_NET jited:1 435 420 PASS
[   56.138666] test_bpf: #11 LD_IND_NET jited:1 240 228 215 PASS
[   56.146039] test_bpf: #12 LD_PKTTYPE jited:1 211 274 PASS
[   56.151632] test_bpf: #13 LD_MARK jited:1 119 76 PASS
[   56.154522] test_bpf: #14 LD_RXHASH jited:1 78 70 PASS
[   56.156535] test_bpf: #15 LD_QUEUE jited:1 77 73 PASS
[   56.158482] test_bpf: #16 LD_PROTOCOL jited:1 326 320 PASS
[   56.165778] test_bpf: #17 LD_VLAN_TAG jited:1 129 86 PASS
[   56.168783] test_bpf: #18 LD_VLAN_TAG_PRESENT jited:1 87 88 PASS
[   56.170990] test_bpf: #19 LD_IFINDEX jited:1 97 95 PASS
[   56.173444] test_bpf: #20 LD_HATYPE jited:1 94 118 PASS
[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS
[   56.210242] test_bpf: #22 LD_NLATTR jited:0 685 2101 PASS
[   56.238881] test_bpf: #23 LD_NLATTR_NEST jited:0 2323 3752 PASS
[   56.300600] test_bpf: #24 LD_PAYLOAD_OFF jited:0 4543 6842 PASS
[   56.415022] test_bpf: #25 LD_ANC_XOR jited:1 168 156 PASS
[   56.419429] test_bpf: #26 SPILL_FILL jited:1 232 212 219 PASS
[   56.427785] test_bpf: #27 JEQ jited:1 362 352 230 PASS
[   56.438180] test_bpf: #28 JGT jited:1 334 236 197 PASS
[   56.446672] test_bpf: #29 JGE jited:1 260 318 307 PASS
[   56.456301] test_bpf: #30 JSET jited:1 274 339 410 PASS
[   56.467681] test_bpf: #31 tcpdump port 22 jited:1 355 951 968 PASS
[   56.492091] test_bpf: #32 tcpdump complex jited:1 318 798 1308 PASS
[   56.517843] test_bpf: #33 RET_A jited:1 83 76 PASS
[   56.520000] test_bpf: #34 INT: ADD trivial jited:1 152 PASS
[   56.522183] test_bpf: #35 INT: MUL_X jited:1 192 PASS
[   56.524626] test_bpf: #36 INT: MUL_X2 jited:1 165 PASS
[   56.526762] test_bpf: #37 INT: MUL32_X jited:1 163 PASS
[   56.528828] test_bpf: #38 INT: ADD 64-bit jited:1 1507 PASS
[   56.544862] test_bpf: #39 INT: ADD 32-bit jited:1 954 PASS
[   56.555409] test_bpf: #40 INT: SUB jited:1 1159 PASS
[   56.567960] test_bpf: #41 INT: XOR jited:1 480 PASS
[   56.573431] test_bpf: #42 INT: MUL jited:1 486 PASS
[   56.579305] test_bpf: #43 MOV REG64 jited:1 274 PASS
[   56.583045] test_bpf: #44 MOV REG32 jited:1 253 PASS
[   56.586138] test_bpf: #45 LD IMM64 jited:1 578 PASS
[   56.592580] test_bpf: #46 INT: ALU MIX jited:0 1199 PASS
[   56.605346] test_bpf: #47 INT: shifts by register jited:1 381 PASS
[   56.610159] test_bpf: #48 INT: DIV + ABS jited:1 588 482 PASS
[   56.621545] test_bpf: #49 INT: DIV by zero jited:1 276 199 PASS
[   56.626894] test_bpf: #50 check: missing ret PASS
[   56.627249] test_bpf: #51 check: div_k_0 PASS
[   56.627403] test_bpf: #52 check: unknown insn PASS
[   56.627518] test_bpf: #53 check: out of range spill/fill PASS
[   56.627639] test_bpf: #54 JUMPS + HOLES jited:1 371 PASS
[   56.632295] test_bpf: #55 check: RET X PASS
[   56.632615] test_bpf: #56 check: LDX + RET X PASS
[   56.632748] test_bpf: #57 M[]: alt STX + LDX jited:1 621 PASS
[   56.639774] test_bpf: #58 M[]: full STX + full LDX jited:1 586 PASS
[   56.646535] test_bpf: #59 check: SKF_AD_MAX PASS
[   56.646837] test_bpf: #60 LD [SKF_AD_OFF-1] jited:1 195 PASS
[   56.649245] test_bpf: #61 load 64-bit immediate jited:1 220 PASS
[   56.652259] test_bpf: #62 nmap reduced jited:1 816 PASS
[   56.661508] test_bpf: #63 ALU_MOV_X: dst = 2 jited:1 76 PASS
[   56.662760] test_bpf: #64 ALU_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.663905] test_bpf: #65 ALU64_MOV_X: dst = 2 jited:1 80 PASS
[   56.665158] test_bpf: #66 ALU64_MOV_X: dst = 4294967295 jited:1 79 PASS
[   56.666297] test_bpf: #67 ALU_MOV_K: dst = 2 jited:1 75 PASS
[   56.667389] test_bpf: #68 ALU_MOV_K: dst = 4294967295 jited:1 73 PASS
[   56.668504] test_bpf: #69 ALU_MOV_K: 0x0000ffffffff0000 = 0x00000000ffffffff jited:1 195 PASS
[   56.670934] test_bpf: #70 ALU64_MOV_K: dst = 2 jited:1 77 PASS
[   56.672115] test_bpf: #71 ALU64_MOV_K: dst = 2147483647 jited:1 104 PASS
[   56.673550] test_bpf: #72 ALU64_OR_K: dst = 0x0 jited:1 215 PASS
[   56.676139] test_bpf: #73 ALU64_MOV_K: dst = -1 jited:1 173 PASS
[   56.687141] test_bpf: #74 ALU_ADD_X: 1 + 2 = 3 jited:1 114 PASS
[   56.688839] test_bpf: #75 ALU_ADD_X: 1 + 4294967294 = 4294967295 jited:1 112 PASS
[   56.690248] test_bpf: #76 ALU_ADD_X: 2 + 4294967294 = 0 jited:1 186 PASS
[   56.692428] test_bpf: #77 ALU64_ADD_X: 1 + 2 = 3 jited:1 159 PASS
[   56.694388] test_bpf: #78 ALU64_ADD_X: 1 + 4294967294 = 4294967295 jited:1 109 PASS
[   56.696115] test_bpf: #79 ALU64_ADD_X: 2 + 4294967294 = 4294967296 jited:1 218 PASS
[   56.698754] test_bpf: #80 ALU_ADD_K: 1 + 2 = 3 jited:1 120 PASS
[   56.700479] test_bpf: #81 ALU_ADD_K: 3 + 0 = 3 jited:1 118 PASS
[   56.702378] test_bpf: #82 ALU_ADD_K: 1 + 4294967294 = 4294967295 jited:1 121 PASS
[   56.704284] test_bpf: #83 ALU_ADD_K: 4294967294 + 2 = 0 jited:1 139 PASS
[   56.706363] test_bpf: #84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff jited:1 176 PASS
[   56.708715] test_bpf: #85 ALU_ADD_K: 0 + 0xffff = 0xffff jited:1 190 PASS
[   56.711155] test_bpf: #86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff jited:1 228 PASS
[   56.713878] test_bpf: #87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000 jited:1 198 PASS
[   56.716318] test_bpf: #88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000 jited:1 189 PASS
[   56.718657] test_bpf: #89 ALU64_ADD_K: 1 + 2 = 3 jited:1 112 PASS
[   56.720152] test_bpf: #90 ALU64_ADD_K: 3 + 0 = 3 jited:1 111 PASS
[   56.721639] test_bpf: #91 ALU64_ADD_K: 1 + 2147483646 = 2147483647 jited:1 138 PASS
[   56.723403] test_bpf: #92 ALU64_ADD_K: 4294967294 + 2 = 4294967296 jited:1 151 PASS
[   56.725349] test_bpf: #93 ALU64_ADD_K: 2147483646 + -2147483647 = -1 jited:1 115 PASS
[   56.726923] test_bpf: #94 ALU64_ADD_K: 1 + 0 = 1 jited:1 206 PASS
[   56.729436] test_bpf: #95 ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff jited:1 211 PASS
[   56.731988] test_bpf: #96 ALU64_ADD_K: 0 + 0xffff = 0xffff jited:1 250 PASS
[   56.735291] test_bpf: #97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff jited:1 199 PASS
[   56.737871] test_bpf: #98 ALU64_ADD_K: 0 + 0x80000000 = 0xffffffff80000000 jited:1 177 PASS
[   56.740193] test_bpf: #99 ALU_ADD_K: 0 + 0x80008000 = 0xffffffff80008000 jited:1 243 PASS
[   56.743126] test_bpf: #100 ALU_SUB_X: 3 - 1 = 2 jited:1 108 PASS
[   56.744676] test_bpf: #101 ALU_SUB_X: 4294967295 - 4294967294 = 1 jited:1 133 PASS
[   56.746386] test_bpf: #102 ALU64_SUB_X: 3 - 1 = 2 jited:1 110 PASS
[   56.747835] test_bpf: #103 ALU64_SUB_X: 4294967295 - 4294967294 = 1 jited:1 111 PASS
[   56.749292] test_bpf: #104 ALU_SUB_K: 3 - 1 = 2 jited:1 110 PASS
[   56.750766] test_bpf: #105 ALU_SUB_K: 3 - 0 = 3 jited:1 123 PASS
[   56.752371] test_bpf: #106 ALU_SUB_K: 4294967295 - 4294967294 = 1 jited:1 124 PASS
[   56.754095] test_bpf: #107 ALU64_SUB_K: 3 - 1 = 2 jited:1 116 PASS
[   56.755687] test_bpf: #108 ALU64_SUB_K: 3 - 0 = 3 jited:1 133 PASS
[   56.757418] test_bpf: #109 ALU64_SUB_K: 4294967294 - 4294967295 = -1 jited:1 148 PASS
[   56.759295] test_bpf: #110 ALU64_ADD_K: 2147483646 - 2147483647 = -1 jited:1 145 PASS
[   56.761137] test_bpf: #111 ALU_MUL_X: 2 * 3 = 6 jited:1 172 PASS
[   56.763380] test_bpf: #112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0 jited:1 117 PASS
[   56.764943] test_bpf: #113 ALU_MUL_X: -1 * -1 = 1 jited:1 109 PASS
[   56.766424] test_bpf: #114 ALU64_MUL_X: 2 * 3 = 6 jited:1 115 PASS
[   56.767999] test_bpf: #115 ALU64_MUL_X: 1 * 2147483647 = 2147483647 jited:1 119 PASS
[   56.769584] test_bpf: #116 ALU_MUL_K: 2 * 3 = 6 jited:1 111 PASS
[   56.771124] test_bpf: #117 ALU_MUL_K: 3 * 1 = 3 jited:1 118 PASS
[   56.772961] test_bpf: #118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0 jited:1 109 PASS
[   56.774431] test_bpf: #119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff jited:1 201 PASS
[   56.776888] test_bpf: #120 ALU64_MUL_K: 2 * 3 = 6 jited:1 116 PASS
[   56.778460] test_bpf: #121 ALU64_MUL_K: 3 * 1 = 3 jited:1 115 PASS
[   56.779993] test_bpf: #122 ALU64_MUL_K: 1 * 2147483647 = 2147483647 jited:1 278 PASS
[   56.783229] test_bpf: #123 ALU64_MUL_K: 1 * -2147483647 = -2147483647 jited:1 125 PASS
[   56.785228] test_bpf: #124 ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff jited:1 208 PASS
[   56.787912] test_bpf: #125 ALU_DIV_X: 6 / 2 = 3 jited:1 246 PASS
[   56.790983] test_bpf: #126 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 291 PASS
[   56.794583] test_bpf: #127 ALU64_DIV_X: 6 / 2 = 3 jited:0 449 PASS
[   56.799521] test_bpf: #128 ALU64_DIV_X: 2147483647 / 2147483647 = 1 jited:0 462 PASS
[   56.804433] test_bpf: #129 ALU64_DIV_X: 0xffffffffffffffff / (-1) = 0x0000000000000001 jited:0 602 PASS
[   56.810815] test_bpf: #130 ALU_DIV_K: 6 / 2 = 3 jited:1 234 PASS
[   56.813585] test_bpf: #131 ALU_DIV_K: 3 / 1 = 3 jited:1 240 PASS
[   56.816466] test_bpf: #132 ALU_DIV_K: 4294967295 / 4294967295 = 1 jited:1 276 PASS
[   56.819790] test_bpf: #133 ALU_DIV_K: 0xffffffffffffffff / (-1) = 0x1 jited:1 373 PASS
[   56.824311] test_bpf: #134 ALU64_DIV_K: 6 / 2 = 3 jited:0 367 PASS
[   56.828509] test_bpf: #135 ALU64_DIV_K: 3 / 1 = 3 jited:0 354 PASS
[   56.832439] test_bpf: #136 ALU64_DIV_K: 2147483647 / 2147483647 = 1 jited:0 358 PASS
[   56.836360] test_bpf: #137 ALU64_DIV_K: 0xffffffffffffffff / (-1) = 0x0000000000000001 jited:0 563 PASS
[   56.842408] test_bpf: #138 ALU_MOD_X: 3 % 2 = 1 jited:1 293 PASS
[   56.845744] test_bpf: #139 ALU_MOD_X: 4294967295 % 4294967293 = 2 jited:1 289 PASS
[   56.849070] test_bpf: #140 ALU64_MOD_X: 3 % 2 = 1 jited:0 660 PASS
[   56.856100] test_bpf: #141 ALU64_MOD_X: 2147483647 % 2147483645 = 2 jited:0 692 PASS
[   56.863515] test_bpf: #142 ALU_MOD_K: 3 % 2 = 1 jited:1 311 PASS
[   56.867145] test_bpf: #143 ALU_MOD_K: 3 % 1 = 0 jited:1 PASS
[   56.867640] test_bpf: #144 ALU_MOD_K: 4294967295 % 4294967293 = 2 jited:1 319 PASS
[   56.871208] test_bpf: #145 ALU64_MOD_K: 3 % 2 = 1 jited:0 539 PASS
[   56.876982] test_bpf: #146 ALU64_MOD_K: 3 % 1 = 0 jited:0 PASS
[   56.877292] test_bpf: #147 ALU64_MOD_K: 2147483647 % 2147483645 = 2 jited:0 499 PASS
[   56.882591] test_bpf: #148 ALU_AND_X: 3 & 2 = 2 jited:1 109 PASS
[   56.884070] test_bpf: #149 ALU_AND_X: 0xffffffff & 0xffffffff = 0xffffffff jited:1 130 PASS
[   56.885807] test_bpf: #150 ALU64_AND_X: 3 & 2 = 2 jited:1 106 PASS
[   56.887288] test_bpf: #151 ALU64_AND_X: 0xffffffff & 0xffffffff =
0xffffffff jited:1 102 PASS
[   56.888746] test_bpf: #152 ALU_AND_K: 3 & 2 = 2 jited:1 114 PASS
[   56.890232] test_bpf: #153 ALU_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 138 PASS
[   56.891967] test_bpf: #154 ALU64_AND_K: 3 & 2 = 2 jited:1 110 PASS
[   56.893502] test_bpf: #155 ALU64_AND_K: 0xffffffff & 0xffffffff =
0xffffffff jited:1 148 PASS
[   56.895413] test_bpf: #156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 =
0x0000ffff00000000 jited:1 206 PASS
[   56.897993] test_bpf: #157 ALU64_AND_K: 0x0000ffffffff0000 & -1 =
0x0000ffffffffffff jited:1 176 PASS
[   56.900294] test_bpf: #158 ALU64_AND_K: 0xffffffffffffffff & -1 =
0xffffffffffffffff jited:1 271 PASS
[   56.903712] test_bpf: #159 ALU_OR_X: 1 | 2 = 3 jited:1 108 PASS
[   56.905547] test_bpf: #160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
jited:1 118 PASS
[   56.907467] test_bpf: #161 ALU64_OR_X: 1 | 2 = 3 jited:1 103 PASS
[   56.909247] test_bpf: #162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
jited:1 143 PASS
[   56.911219] test_bpf: #163 ALU_OR_K: 1 | 2 = 3 jited:1 123 PASS
[   56.913042] test_bpf: #164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 110 PASS
[   56.914579] test_bpf: #165 ALU64_OR_K: 1 | 2 = 3 jited:1 120 PASS
[   56.916390] test_bpf: #166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
jited:1 119 PASS
[   56.918118] test_bpf: #167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 =
0x0000ffff00000000 jited:1 212 PASS
[   56.920808] test_bpf: #168 ALU64_OR_K: 0x0000ffffffff0000 | -1 =
0xffffffffffffffff jited:1 221 PASS
[   56.923458] test_bpf: #169 ALU64_OR_K: 0x000000000000000 | -1 =
0xffffffffffffffff jited:1 198 PASS
[   56.925881] test_bpf: #170 ALU_XOR_X: 5 ^ 6 = 3 jited:1 138 PASS
[   56.927678] test_bpf: #171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
jited:1 130 PASS
[   56.929353] test_bpf: #172 ALU64_XOR_X: 5 ^ 6 = 3 jited:1 114 PASS
[   56.930850] test_bpf: #173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
jited:1 106 PASS
[   56.932277] test_bpf: #174 ALU_XOR_K: 5 ^ 6 = 3 jited:1 112 PASS
[   56.933790] test_bpf: #175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
jited:1 116 PASS
[   56.935371] test_bpf: #176 ALU64_XOR_K: 5 ^ 6 = 3 jited:1 114 PASS
[   56.936942] test_bpf: #177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
jited:1 112 PASS
[   56.938503] test_bpf: #178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 =
0x0000ffffffff0000 jited:1 201 PASS
[   56.940978] test_bpf: #179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 =
0xffff00000000ffff jited:1 242 PASS
[   56.943908] test_bpf: #180 ALU64_XOR_K: 0x000000000000000 ^ -1 =
0xffffffffffffffff jited:1 208 PASS
[   56.946575] test_bpf: #181 ALU_LSH_X: 1 << 1 = 2 jited:1 112 PASS
[   56.948252] test_bpf: #182 ALU_LSH_X: 1 << 31 = 0x80000000 jited:1 137 PASS
[   56.950466] test_bpf: #183 ALU64_LSH_X: 1 << 1 = 2 jited:1 163 PASS
[   56.953176] test_bpf: #184 ALU64_LSH_X: 1 << 31 = 0x80000000 jited:1 145 PASS
[   56.955105] test_bpf: #185 ALU_LSH_K: 1 << 1 = 2 jited:1 92 PASS
[   56.956400] test_bpf: #186 ALU_LSH_K: 1 << 31 = 0x80000000 jited:1 94 PASS
[   56.957700] test_bpf: #187 ALU64_LSH_K: 1 << 1 = 2 jited:1 94 PASS
[   56.959086] test_bpf: #188 ALU64_LSH_K: 1 << 31 = 0x80000000 jited:1 127 PASS
[   56.960779] test_bpf: #189 ALU_RSH_X: 2 >> 1 = 1 jited:1 135 PASS
[   56.962532] test_bpf: #190 ALU_RSH_X: 0x80000000 >> 31 = 1 jited:1 109 PASS
[   56.964027] test_bpf: #191 ALU64_RSH_X: 2 >> 1 = 1 jited:1 123 PASS
[   56.965961] test_bpf: #192 ALU64_RSH_X: 0x80000000 >> 31 = 1 jited:1 117 PASS
[   56.967517] test_bpf: #193 ALU_RSH_K: 2 >> 1 = 1 jited:1 95 PASS
[   56.968874] test_bpf: #194 ALU_RSH_K: 0x80000000 >> 31 = 1 jited:1 103 PASS
[   56.970261] test_bpf: #195 ALU64_RSH_K: 2 >> 1 = 1 jited:1 124 PASS
[   56.971879] test_bpf: #196 ALU64_RSH_K: 0x80000000 >> 31 = 1 jited:1 107 PASS
[   56.973346] test_bpf: #197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 125 PASS
[   56.975022] test_bpf: #198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 =
0xffffffffffff00ff jited:1 105 PASS
[   56.976479] test_bpf: #199 ALU_NEG: -(3) = -3 jited:1 76 PASS
[   56.977591] test_bpf: #200 ALU_NEG: -(-3) = 3 jited:1 106 PASS
[   56.979068] test_bpf: #201 ALU64_NEG: -(3) = -3 jited:1 104 PASS
[   56.980508] test_bpf: #202 ALU64_NEG: -(-3) = 3 jited:1 135 PASS
[   56.982223] test_bpf: #203 ALU_END_FROM_BE 16: 0x0123456789abcdef
-> 0xcdef jited:1 115 PASS
[   56.984458] test_bpf: #204 ALU_END_FROM_BE 32: 0x0123456789abcdef
-> 0x89abcdef jited:1 101 PASS
[   56.985991] test_bpf: #205 ALU_END_FROM_BE 64: 0x0123456789abcdef
-> 0x89abcdef jited:1 103 PASS
[   56.987477] test_bpf: #206 ALU_END_FROM_LE 16: 0x0123456789abcdef
-> 0xefcd jited:1 107 PASS
[   56.988937] test_bpf: #207 ALU_END_FROM_LE 32: 0x0123456789abcdef
-> 0xefcdab89 jited:1 93 PASS
[   56.990256] test_bpf: #208 ALU_END_FROM_LE 64: 0x0123456789abcdef
-> 0x67452301 jited:1 108 PASS
[   56.991728] test_bpf: #209 ST_MEM_B: Store/Load byte: max negative
jited:1 168 PASS
[   56.993878] test_bpf: #210 ST_MEM_B: Store/Load byte: max positive
jited:1 105 PASS
[   56.995386] test_bpf: #211 STX_MEM_B: Store/Load byte: max negative
jited:1 140 PASS
[   56.997188] test_bpf: #212 ST_MEM_H: Store/Load half word: max
negative jited:1 98 PASS
[   56.998563] test_bpf: #213 ST_MEM_H: Store/Load half word: max
positive jited:1 109 PASS
[   57.000045] test_bpf: #214 STX_MEM_H: Store/Load half word: max
negative jited:1 134 PASS
[   57.001803] test_bpf: #215 ST_MEM_W: Store/Load word: max negative
jited:1 148 PASS
[   57.003666] test_bpf: #216 ST_MEM_W: Store/Load word: max positive
jited:1 136 PASS
[   57.006376] test_bpf: #217 STX_MEM_W: Store/Load word: max negative
jited:1 205 PASS
[   57.009004] test_bpf: #218 ST_MEM_DW: Store/Load double word: max
negative jited:1 124 PASS
[   57.011164] test_bpf: #219 ST_MEM_DW: Store/Load double word: max
negative 2 jited:1 222 PASS
[   57.014281] test_bpf: #220 ST_MEM_DW: Store/Load double word: max
positive jited:1 110 PASS
[   57.016138] test_bpf: #221 STX_MEM_DW: Store/Load double word: max
negative jited:1 194 PASS
[   57.018614] test_bpf: #222 STX_XADD_W: Test: 0x12 + 0x10 = 0x22
jited:0 292 PASS
[   57.022064] test_bpf: #223 STX_XADD_W: Test side-effects, r10: 0x12
+ 0x10 = 0x22 jited:0 PASS
[   57.022356] test_bpf: #224 STX_XADD_W: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 241 PASS
[   57.025099] test_bpf: #225 STX_XADD_W: X + 1 + 1 + 1 + ... jited:0
142752 PASS
[   58.454867] test_bpf: #226 STX_XADD_DW: Test: 0x12 + 0x10 = 0x22
jited:0 370 PASS
[   58.459675] test_bpf: #227 STX_XADD_DW: Test side-effects, r10:
0x12 + 0x10 = 0x22 jited:0 PASS
[   58.460082] test_bpf: #228 STX_XADD_DW: Test side-effects, r0: 0x12
+ 0x10 = 0x22 jited:0 268 PASS
[   58.463093] test_bpf: #229 STX_XADD_DW: X + 1 + 1 + 1 + ... jited:0
224885 PASS
[   60.713635] test_bpf: #230 JMP_EXIT jited:1 77 PASS
[   60.715476] test_bpf: #231 JMP_JA: Unconditional jump: if (true)
return 1 jited:1 84 PASS
[   60.716748] test_bpf: #232 JMP_JSGT_K: Signed jump: if (-1 > -2)
return 1 jited:1 128 PASS
[   60.718617] test_bpf: #233 JMP_JSGT_K: Signed jump: if (-1 > -1)
return 0 jited:1 126 PASS
[   60.720303] test_bpf: #234 JMP_JSGE_K: Signed jump: if (-1 >= -2)
return 1 jited:1 179 PASS
[   60.722889] test_bpf: #235 JMP_JSGE_K: Signed jump: if (-1 >= -1)
return 1 jited:1 125 PASS
[   60.724674] test_bpf: #236 JMP_JGT_K: if (3 > 2) return 1 jited:1 142 PASS
[   60.726577] test_bpf: #237 JMP_JGT_K: Unsigned jump: if (-1 > 1)
return 1 jited:1 161 PASS
[   60.728695] test_bpf: #238 JMP_JGE_K: if (3 >= 2) return 1 jited:1 163 PASS
[   60.730807] test_bpf: #239 JMP_JGT_K: if (3 > 2) return 1 (jump
backwards) jited:1 143 PASS
[   60.733042] test_bpf: #240 JMP_JGE_K: if (3 >= 3) return 1 jited:1 179 PASS
[   60.735513] test_bpf: #241 JMP_JNE_K: if (3 != 2) return 1 jited:1 144 PASS
[   60.737586] test_bpf: #242 JMP_JEQ_K: if (3 == 3) return 1 jited:1 144 PASS
[   60.739896] test_bpf: #243 JMP_JSET_K: if (0x3 & 0x2) return 1
jited:1 149 PASS
[   60.741813] test_bpf: #244 JMP_JSET_K: if (0x3 & 0xffffffff) return
1 jited:1 153 PASS
[   60.743773] test_bpf: #245 JMP_JSGT_X: Signed jump: if (-1 > -2)
return 1 jited:1 162 PASS
[   60.745798] test_bpf: #246 JMP_JSGT_X: Signed jump: if (-1 > -1)
return 0 jited:1 162 PASS
[   60.747921] test_bpf: #247 JMP_JSGE_X: Signed jump: if (-1 >= -2)
return 1 jited:1 178 PASS
[   60.750577] test_bpf: #248 JMP_JSGE_X: Signed jump: if (-1 >= -1)
return 1 jited:1 192 PASS
[   60.753315] test_bpf: #249 JMP_JGT_X: if (3 > 2) return 1 jited:1 205 PASS
[   60.756115] test_bpf: #250 JMP_JGT_X: Unsigned jump: if (-1 > 1)
return 1 jited:1 154 PASS
[   60.758287] test_bpf: #251 JMP_JGE_X: if (3 >= 2) return 1 jited:1 177 PASS
[   60.760611] test_bpf: #252 JMP_JGE_X: if (3 >= 3) return 1 jited:1 160 PASS
[   60.762901] test_bpf: #253 JMP_JGE_X: ldimm64 test 1 jited:1 204 PASS
[   60.765394] test_bpf: #254 JMP_JGE_X: ldimm64 test 2 jited:1 201 PASS
[   60.767884] test_bpf: #255 JMP_JGE_X: ldimm64 test 3 jited:1 184 PASS
[   60.770228] test_bpf: #256 JMP_JNE_X: if (3 != 2) return 1 jited:1 168 PASS
[   60.772331] test_bpf: #257 JMP_JEQ_X: if (3 == 3) return 1 jited:1 197 PASS
[   60.774754] test_bpf: #258 JMP_JSET_X: if (0x3 & 0x2) return 1
jited:1 192 PASS
[   60.777384] test_bpf: #259 JMP_JSET_X: if (0x3 & 0xffffffff) return
1 jited:1 181 PASS
[   60.779641] test_bpf: #260 JMP_JA: Jump, gap, jump, ... jited:1 97 PASS
[   60.781022] test_bpf: #261 BPF_MAXINSNS: Maximum possible literals
jited:1 125 PASS
[   61.242879] test_bpf: #262 BPF_MAXINSNS: Single literal jited:1 105 PASS
[   61.835125] test_bpf: #263 BPF_MAXINSNS: Run/add until end jited:1
121315 PASS
[   63.362129] test_bpf: #264 BPF_MAXINSNS: Too many instructions PASS
[   63.362231] test_bpf: #265 BPF_MAXINSNS: Very long jump jited:1 131 PASS
[   63.879679] test_bpf: #266 BPF_MAXINSNS: Ctx heavy transformations
jited:1 217030 181848 PASS
[   68.492725] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 1018683 930359 PASS
[   88.007480] test_bpf: #268 BPF_MAXINSNS: Jump heavy test jited:1 440621 PASS
[   93.074379] test_bpf: #269 BPF_MAXINSNS: Very long jump backwards
jited:1 154 PASS
[   93.358458] test_bpf: #270 BPF_MAXINSNS: Edge hopping nuthouse
jited:1 302835 PASS
[   96.392483] test_bpf: #271 BPF_MAXINSNS: Jump, gap, jump, ...
jited:1 1008 PASS
[   96.501153] test_bpf: #272 BPF_MAXINSNS: ld_abs+get_processor_id
jited:0 597855 PASS
[  102.759854] test_bpf: #273 BPF_MAXINSNS: ld_abs+vlan_push/pop
jited:0 626616 PASS
[  109.247312] test_bpf: #274 LD_IND byte frag jited:1 1453 PASS
[  109.263829] test_bpf: #275 LD_IND halfword frag jited:1 600 PASS
[  109.270433] test_bpf: #276 LD_IND word frag jited:1 719 PASS
[  109.278159] test_bpf: #277 LD_IND halfword mixed head/frag jited:1 705 PASS
[  109.285898] test_bpf: #278 LD_IND word mixed head/frag jited:1 732 PASS
[  109.293879] test_bpf: #279 LD_ABS byte frag jited:1 683 PASS
[  109.301360] test_bpf: #280 LD_ABS halfword frag jited:1 595 PASS
[  109.307841] test_bpf: #281 LD_ABS word frag jited:1 672 PASS
[  109.315579] test_bpf: #282 LD_ABS halfword mixed head/frag jited:1 775 PASS
[  109.323890] test_bpf: #283 LD_ABS word mixed head/frag jited:1 725 PASS
[  109.331927] test_bpf: #284 LD_IND byte default X jited:1 274 PASS
[  109.335451] test_bpf: #285 LD_IND byte positive offset jited:1 302 PASS
[  109.339511] test_bpf: #286 LD_IND byte negative offset jited:1 311 PASS
[  109.343448] test_bpf: #287 LD_IND halfword positive offset jited:1 218 PASS
[  109.346282] test_bpf: #288 LD_IND halfword negative offset jited:1 193 PASS
[  109.348832] test_bpf: #289 LD_IND halfword unaligned jited:1 190 PASS
[  109.351330] test_bpf: #290 LD_IND word positive offset jited:1 200 PASS
[  109.353993] test_bpf: #291 LD_IND word negative offset jited:1 216 PASS
[  109.356739] test_bpf: #292 LD_IND word unaligned (addr & 3 == 2)
jited:1 195 PASS
[  109.359225] test_bpf: #293 LD_IND word unaligned (addr & 3 == 1)
jited:1 196 PASS
[  109.361713] test_bpf: #294 LD_IND word unaligned (addr & 3 == 3)
jited:1 221 PASS
[  109.364417] test_bpf: #295 LD_ABS byte jited:1 195 PASS
[  109.366896] test_bpf: #296 LD_ABS halfword jited:1 170 PASS
[  109.369093] test_bpf: #297 LD_ABS halfword unaligned jited:1 167 PASS
[  109.371399] test_bpf: #298 LD_ABS word jited:1 182 PASS
[  109.373724] test_bpf: #299 LD_ABS word unaligned (addr & 3 == 2)
jited:1 185 PASS
[  109.376064] test_bpf: #300 LD_ABS word unaligned (addr & 3 == 1)
jited:1 162 PASS
[  109.381701] test_bpf: #301 LD_ABS word unaligned (addr & 3 == 3)
jited:1 231 PASS
[  109.384839] test_bpf: #302 ADD default X jited:1 105 PASS
[  109.386839] test_bpf: #303 ADD default A jited:1 101 PASS
[  109.388677] test_bpf: #304 SUB default X jited:1 106 PASS
[  109.390267] test_bpf: #305 SUB default A jited:1 119 PASS
[  109.391992] test_bpf: #306 MUL default X jited:1 131 PASS
[  109.394020] test_bpf: #307 MUL default A jited:1 116 PASS
[  109.395766] test_bpf: #308 DIV default X jited:1 116 PASS
[  109.397706] test_bpf: #309 DIV default A jited:1 227 PASS
[  109.406156] test_bpf: #310 MOD default X jited:1 98 PASS
[  109.407645] test_bpf: #311 MOD default A jited:1 265 PASS
[  109.410774] test_bpf: #312 JMP EQ default A jited:1 134 PASS
[  109.412679] test_bpf: #313 JMP EQ default X jited:1 108 PASS
[  109.414506] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]


All of these benchmarks are for ARMv7.
Best,
Shubham Bansal


On Mon, May 22, 2017 at 6:31 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/20/2017 10:01 PM, Shubham Bansal wrote:
> [...]
>>
>> Before I send the patch, I have tested the JIT compiler on ARMv7 but
>> not on ARMv5 or ARMv6. So can you tell me which arch versions I should
>> test it for?
>> Also for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
>> are both disabled. But I need to test JIT with these flags as well.
>> Whenever I put these flags in the .config file, the ARM kernel is not
>> getting compiled with these flags. Can you tell me why? If you need
>> more information regarding this, please let me know.
>
>
> Maybe Mircea, Kees or someone from linux-arm-kernel can help you out
> on that.
>
> With regards to the below benchmark, I was mentioning how it compares
> to the interpreter. With only the numbers for jit it's hard to compare.
> So would be great to see the output for the following three cases:
>
> 1) Interpreter:
>
> echo 0 > /proc/sys/net/core/bpf_jit_enable
>
> 2) JIT enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
>
> 3) JIT + blinding enabled:
>
> echo 1 > /proc/sys/net/core/bpf_jit_enable
> echo 2 > /proc/sys/net/core/bpf_jit_harden
>
>> With current config for ARMv7, benchmarks are :
>>
>> [root@vexpress modules]# insmod test_bpf.ko
>> [   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS
>> [   25.811395] test_bpf: #1 TXA jited:1 93 89 111 PASS
>> [   25.815073] test_bpf: #2 ADD_SUB_MUL_K jited:1 94 PASS
>> [   25.816779] test_bpf: #3 DIV_MOD_KX jited:1 983 PASS
>> [   25.827310] test_bpf: #4 AND_OR_LSH_K jited:1 94 93 PASS
>> [   25.829843] test_bpf: #5 LD_IMM_0 jited:1 83 PASS
>> [   25.831260] test_bpf: #6 LD_IND jited:1 338 266 305 PASS
>
> [...]
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-20 20:01                   ` Shubham Bansal
  (?)
@ 2017-05-22 18:58                     ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-22 18:58 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

On Sat, May 20, 2017 at 1:01 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Hi Daniel and Kees,
>
> Before I send the patch, I have tested the JIT compiler on ARMv7 but
> not on ARMv5 or ARMv6. So can you tell me which arch versions I should
> test it for?
> Also for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
> are both disabled. But I need to test JIT with these flags as well.
> Whenever I put these flags in the .config file, the ARM kernel is not
> getting compiled with these flags. Can you tell me why? If you need
> more information regarding this, please let me know.

I think it is fine to only target ARMv7. It is harder and harder to
find devices on v5 or v6 CPUs that would want to be using BPF JIT,
IMO.

When they "disappear", it's because there isn't a prerequisite met. I
either read the Kconfig files or use "make menuconfig" and "search" to
tell me where a config is defined and what is needed to meet the
prerequisites.

In the case of CPU_BIG_ENDIAN, you need ARCH_SUPPORTS_BIG_ENDIAN,
which appears to be only ARCH_IXP4XX. I don't think you're going to
find an emulator that will handle this, so I'd suggest ignoring this
config for now unless you can find someone with that hardware that you
can work with to test it.

In the case of CONFIG_FRAME_POINTER, I assume you built a
THUMB2_KERNEL? I'd read the notes in arch/arm/Kconfig.debug for
'config FRAME_POINTER'.
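
Reading the prerequisites out of a Kconfig file can also be scripted. This is a minimal sketch, not a real Kconfig parser: it only collects the "depends on" lines that follow a "config <symbol>" entry, and it ignores help text, "if" blocks and "select". The function name and the sample text (EXAMPLE_FEATURE and friends) are made up for illustration and are not copied from the kernel tree.

```python
import re


def kconfig_depends(text, symbol):
    """Return the 'depends on' expressions listed under `config <symbol>`."""
    deps = []
    in_block = False
    for line in text.splitlines():
        head = re.match(r"\s*(?:menu)?config\s+(\w+)\s*$", line)
        if head:
            # A new entry starts; remember whether it is the one we want.
            in_block = head.group(1) == symbol
            continue
        if in_block:
            m = re.match(r"\s*depends on\s+(.+)$", line)
            if m:
                deps.append(m.group(1).strip())
    return deps


# Hypothetical sample, only to show the shape of the input.
SAMPLE = """\
config EXAMPLE_FEATURE
    bool "Example feature"
    depends on EXAMPLE_ARCH
    depends on !EXAMPLE_MODE

config UNRELATED
    bool "Unrelated"
    depends on SOMETHING_ELSE
"""

print(kconfig_depends(SAMPLE, "EXAMPLE_FEATURE"))
# prints ['EXAMPLE_ARCH', '!EXAMPLE_MODE']
```

On a real tree one would pass the contents of arch/arm/Kconfig.debug and the symbol "FRAME_POINTER" instead of the sample.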

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-22 18:58                     ` Kees Cook
  (?)
@ 2017-05-22 19:08                       ` Florian Fainelli
  -1 siblings, 0 replies; 99+ messages in thread
From: Florian Fainelli @ 2017-05-22 19:08 UTC (permalink / raw)
  To: Kees Cook, Shubham Bansal
  Cc: Daniel Borkmann, kernel-hardening, Network Development, ast,
	Mircea Gherzan, David Miller, linux-arm-kernel, nschichan,
	andrew

On 05/22/2017 11:58 AM, Kees Cook wrote:
> On Sat, May 20, 2017 at 1:01 PM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> Hi Daniel and Kees,
>>
>> Before I send the patch, I have tested the JIT compiler on ARMv7 but
>> not on ARMv5 or ARMv6. So can you tell me which arch versions I should
>> test it for?
>> Also for my testing, CONFIG_FRAME_POINTER and CONFIG_CPU_BIG_ENDIAN
>> are both disabled. But I need to test JIT with these flags as well.
>> Whenever I put these flags in the .config file, the ARM kernel is not
>> getting compiled with these flags. Can you tell me why? If you need
>> more information regarding this, please let me know.
> 
> I think it is fine to only target ARMv7. It is harder and harder to
> find devices on v5 or v6 CPUs that would want to be using BPF JIT,
> IMO.

There are still a ton of Marvell-based routers out there (e.g. Kirkwood,
Orion5x) that are ARMv5, and that prompted Nicholas (hey there) to fix
the cBPF JIT a while ago. I don't think you can just ignore those. It's
fine not to target them initially, but QEMU arguably has decent support
for some ARMv5 platforms that could be used for testing as well
(realview-eb, versatileab/pb).

These devices are actually perfect candidates for running eBPF and to
some extent XDP because they are slightly underpowered compared to
their newer ARMv7/ARMv8 counterparts.

> 
> When they "disappear", it's because there isn't a prerequisite met. I
> either read the Kconfig files or use "make menuconfig" and "search" to
> tell me where a config is defined and what is needed to meet the
> prerequisites.
> 
> In the case of CPU_BIG_ENDIAN, you need ARCH_SUPPORTS_BIG_ENDIAN,
> which appears to be only ARCH_IXP4XX. I don't think you're going to
> find an emulator that will handle this, so I'd suggest ignoring this
> config for now unless you can find someone with that hardware that you
> can work with to test it.
> 
> In the case of CONFIG_FRAME_POINTER, I assume you built a
> THUMB2_KERNEL? I'd read the notes in arch/arm/Kconfig.debug for
> 'config FRAME_POINTER'.

It sounds like we are at the point where Shubham's patches should be
posted so people could test/fix on earlier ARM devices for instance.

Thanks
-- 
Florian

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-22 17:04                       ` Shubham Bansal
  (?)
@ 2017-05-22 20:05                         ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-22 20:05 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

[-- Attachment #1: Type: text/plain, Size: 725 bytes --]

On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> All of these benchmarks are for ARMv7.

Thanks! In the future, try to avoid the white-space damage
(line-wrapping). And it looks like you've still got debugging turned
on in your jit code:

[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS

That breaks the test report line. After I cleaned these up and parsed
the results, they look great. Most things take half the time of the
interpreter, if not less. Only the LD_ABS suffered, and that's
mainly the const blinding, I assume.
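
For anyone wanting to redo this comparison, here is a rough sketch of the parsing step. It is not the script that produced the attachment; the function names are hypothetical. It assumes one unwrapped result line per test, in the format shown earlier in the thread, and reports each run's time as a percentage of the interpreter's time (for example 234 against 757 gives 30.9%).

```python
import re

# Matches one unwrapped test_bpf result line, e.g.
# "[   25.797766] test_bpf: #0 TAX jited:1 180 170 169 PASS"
LINE_RE = re.compile(
    r"test_bpf: #(?P<num>\d+) (?P<name>.+?) jited:(?P<jited>[01])"
    r"(?P<times>(?: \d+)*) PASS"
)


def parse_log(text):
    """Return {test number: (name, [time, ...])} for every PASS line."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            times = [int(t) for t in m.group("times").split()]
            results[int(m.group("num"))] = (m.group("name"), times)
    return results


def compare(interp, other):
    """Per-run time of `other` as a percentage of the interpreter time."""
    report = {}
    for num, (name, base) in interp.items():
        if num not in other:
            continue
        times = other[num][1]
        # Skip tests with no timings, mismatched run counts, or zero baselines.
        if base and all(base) and len(times) == len(base):
            report[num] = (name, [round(100.0 * t / b, 1)
                                  for t, b in zip(times, base)])
    return report
```

Lines that were wrapped by the mail client, or broken up by extra debug output, have to be joined back together first, otherwise the pattern will not match them.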

Please post your current patch. Thanks for this!

-Kees

-- 
Kees Cook
Pixel Security

[-- Attachment #2: jitted.txt --]
[-- Type: text/plain, Size: 29420 bytes --]

#0 TAX
	interp: 757	645	650
	jitted: 234	171	195
	        30.9%	26.5%	30.0%
	harden: 239	218	229
	        31.6%	33.8%	35.2%
#1 TXA
	interp: 366	334	336
	jitted: 81	79	77
	        22.1%	23.7%	22.9%
	harden: 89	119	85
	        24.3%	35.6%	25.3%
#2 ADD_SUB_MUL_K
	interp: 543
	jitted: 89
	        16.4%
	harden: 213
	        39.2%
#3 DIV_MOD_KX
	interp: 1509
	jitted: 939
	        62.2%
	harden: 1190
	        78.9%
#4 AND_OR_LSH_K
	interp: 539	559
	jitted: 116	114
	        21.5%	20.4%
	harden: 200	149
	        37.1%	26.7%
#5 LD_IMM_0
	interp: 412
	jitted: 93
	        22.6%
	harden: 101
	        24.5%
#6 LD_IND
	interp: 428	376	389
	jitted: 371	279	274
	        86.7%	74.2%	70.4%
	harden: 314	310	283
	        73.4%	82.4%	72.8%
#7 LD_ABS
	interp: 509	405	358
	jitted: 408	402	272
	        80.2%	99.3%	76.0%
	harden: 376	460	397
	        73.9%	113.6%	110.9%
#8 LD_ABS_LL
	interp: 542	783
	jitted: 387	346
	        71.4%	44.2%
	harden: 608	415
	        112.2%	53.0%
#9 LD_IND_LL
	interp: 524	496	723
	jitted: 239	248	217
	        45.6%	50.0%	30.0%
	harden: 248	256	268
	        47.3%	51.6%	37.1%
#10 LD_ABS_NET
	interp: 527	545
	jitted: 356	332
	        67.6%	60.9%
	harden: 435	420
	        82.5%	77.1%
#11 LD_IND_NET
	interp: 650	495	647
	jitted: 223	212	320
	        34.3%	42.8%	49.5%
	harden: 240	228	215
	        36.9%	46.1%	33.2%
#12 LD_PKTTYPE
	interp: 686	901
	jitted: 102	90
	        14.9%	10.0%
	harden: 211	274
	        30.8%	30.4%
#13 LD_MARK
	interp: 305	291
	jitted: 80	80
	        26.2%	27.5%
	harden: 119	76
	        39.0%	26.1%
#14 LD_RXHASH
	interp: 257	259
	jitted: 73	71
	        28.4%	27.4%
	harden: 78	70
	        30.4%	27.0%
#15 LD_QUEUE
	interp: 255	254
	jitted: 120	121
	        47.1%	47.6%
	harden: 77	73
	        30.2%	28.7%
#16 LD_PROTOCOL
	interp: 593	603
	jitted: 256	247
	        43.2%	41.0%
	harden: 326	320
	        55.0%	53.1%
#17 LD_VLAN_TAG
	interp: 288	292
	jitted: 82	84
	        28.5%	28.8%
	harden: 129	86
	        44.8%	29.5%
#18 LD_VLAN_TAG_PRESENT
	interp: 335	421
	jitted: 80	77
	        23.9%	18.3%
	harden: 87	88
	        26.0%	20.9%
#19 LD_IFINDEX
	interp: 8568	606
	jitted: 87	98
	        1.0%	16.2%
	harden: 97	95
	        1.1%	15.7%
#20 LD_HATYPE
	interp: 618	695
	jitted: 95	90
	        15.4%	12.9%
	harden: 94	118
	        15.2%	17.0%
#25 LD_ANC_XOR
	interp: 314	344
	jitted: 86	100
	        27.4%	29.1%
	harden: 168	156
	        53.5%	45.3%
#26 SPILL_FILL
	interp: 757	850	903
	jitted: 131	137	123
	        17.3%	16.1%	13.6%
	harden: 232	212	219
	        30.6%	24.9%	24.3%
#27 JEQ
	interp: 380	420	426
	jitted: 266	189	216
	        70.0%	45.0%	50.7%
	harden: 362	352	230
	        95.3%	83.8%	54.0%
#28 JGT
	interp: 376	467	448
	jitted: 301	211	192
	        80.1%	45.2%	42.9%
	harden: 334	236	197
	        88.8%	50.5%	44.0%
#29 JGE
	interp: 446	590	498
	jitted: 191	200	223
	        42.8%	33.9%	44.8%
	harden: 260	318	307
	        58.3%	53.9%	61.6%
#30 JSET
	interp: 571	787	1003
	jitted: 211	210	214
	        37.0%	26.7%	21.3%
	harden: 274	339	410
	        48.0%	43.1%	40.9%
#31 tcpdump port 22
	interp: 358	1079	1190
	jitted: 314	722	711
	        87.7%	66.9%	59.7%
	harden: 355	951	968
	        99.2%	88.1%	81.3%
#32 tcpdump complex
	interp: 319	1061	2324
	jitted: 291	707	1068
	        91.2%	66.6%	46.0%
	harden: 318	798	1308
	        99.7%	75.2%	56.3%
#33 RET_A
	interp: 253	249
	jitted: 83	88
	        32.8%	35.3%
	harden: 83	76
	        32.8%	30.5%
#34 INT: ADD trivial
	interp: 414
	jitted: 162
	        39.1%
	harden: 152
	        36.7%
#35 INT: MUL_X
	interp: 336
	jitted: 176
	        52.4%
	harden: 192
	        57.1%
#36 INT: MUL_X2
	interp: 431
	jitted: 84
	        19.5%
	harden: 165
	        38.3%
#37 INT: MUL32_X
	interp: 523
	jitted: 99
	        18.9%
	harden: 163
	        31.2%
#38 INT: ADD 64-bit
	interp: 5263
	jitted: 1066
	        20.3%
	harden: 1507
	        28.6%
#39 INT: ADD 32-bit
	interp: 4127
	jitted: 666
	        16.1%
	harden: 954
	        23.1%
#40 INT: SUB
	interp: 4218
	jitted: 3236
	        76.7%
	harden: 1159
	        27.5%
#41 INT: XOR
	interp: 2252
	jitted: 308
	        13.7%
	harden: 480
	        21.3%
#42 INT: MUL
	interp: 1986
	jitted: 376
	        18.9%
	harden: 486
	        24.5%
#43 MOV REG64
	interp: 1103
	jitted: 227
	        20.6%
	harden: 274
	        24.8%
#44 MOV REG32
	interp: 1140
	jitted: 171
	        15.0%
	harden: 253
	        22.2%
#45 LD IMM64
	interp: 1182
	jitted: 163
	        13.8%
	harden: 578
	        48.9%
#47 INT: shifts by register
	interp: 1125
	jitted: 208
	        18.5%
	harden: 381
	        33.9%
#48 INT: DIV + ABS
	interp: 570	850
	jitted: 659	601
	        115.6%	70.7%
	harden: 588	482
	        103.2%	56.7%
#49 INT: DIV by zero
	interp: 350	305
	jitted: 317	169
	        90.6%	55.4%
	harden: 276	199
	        78.9%	65.2%
#54 JUMPS + HOLES
	interp: 863
	jitted: 358
	        41.5%
	harden: 371
	        43.0%
#57 M[]: alt STX + LDX
	interp: 3990
	jitted: 456
	        11.4%
	harden: 621
	        15.6%
#58 M[]: full STX + full LDX
	interp: 2819
	jitted: 438
	        15.5%
	harden: 586
	        20.8%
#60 LD [SKF_AD_OFF-1]
	interp: 313
	jitted: 198
	        63.3%
	harden: 195
	        62.3%
#61 load 64-bit immediate
	interp: 579
	jitted: 125
	        21.6%
	harden: 220
	        38.0%
#62 nmap reduced
	interp: 1860
	jitted: 1054
	        56.7%
	harden: 816
	        43.9%
#63 ALU_MOV_X: dst = 2
	interp: 249
	jitted: 81
	        32.5%
	harden: 76
	        30.5%
#64 ALU_MOV_X: dst = 4294967295
	interp: 264
	jitted: 85
	        32.2%
	harden: 79
	        29.9%
#65 ALU64_MOV_X: dst = 2
	interp: 229
	jitted: 96
	        41.9%
	harden: 80
	        34.9%
#66 ALU64_MOV_X: dst = 4294967295
	interp: 213
	jitted: 71
	        33.3%
	harden: 79
	        37.1%
#67 ALU_MOV_K: dst = 2
	interp: 167
	jitted: 70
	        41.9%
	harden: 75
	        44.9%
#68 ALU_MOV_K: dst = 4294967295
	interp: 149
	jitted: 71
	        47.7%
	harden: 73
	        49.0%
#69 ALU_MOV_K: 0x0000ffffffff0000 = 0x00000000ffffffff
	interp: 358
	jitted: 97
	        27.1%
	harden: 195
	        54.5%
#70 ALU64_MOV_K: dst = 2
	interp: 158
	jitted: 75
	        47.5%
	harden: 77
	        48.7%
#71 ALU64_MOV_K: dst = 2147483647
	interp: 156
	jitted: 66
	        42.3%
	harden: 104
	        66.7%
#72 ALU64_OR_K: dst = 0x0
	interp: 306
	jitted: 92
	        30.1%
	harden: 215
	        70.3%
#73 ALU64_MOV_K: dst = -1
	interp: 327
	jitted: 94
	        28.7%
	harden: 173
	        52.9%
#74 ALU_ADD_X: 1 + 2 = 3
	interp: 212
	jitted: 66
	        31.1%
	harden: 114
	        53.8%
#75 ALU_ADD_X: 1 + 4294967294 = 4294967295
	interp: 231
	jitted: 66
	        28.6%
	harden: 112
	        48.5%
#76 ALU_ADD_X: 2 + 4294967294 = 0
	interp: 309
	jitted: 87
	        28.2%
	harden: 186
	        60.2%
#77 ALU64_ADD_X: 1 + 2 = 3
	interp: 280
	jitted: 77
	        27.5%
	harden: 159
	        56.8%
#78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
	interp: 286
	jitted: 72
	        25.2%
	harden: 109
	        38.1%
#79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
	interp: 460
	jitted: 79
	        17.2%
	harden: 218
	        47.4%
#80 ALU_ADD_K: 1 + 2 = 3
	interp: 210
	jitted: 75
	        35.7%
	harden: 120
	        57.1%
#81 ALU_ADD_K: 3 + 0 = 3
	interp: 208
	jitted: 71
	        34.1%
	harden: 118
	        56.7%
#82 ALU_ADD_K: 1 + 4294967294 = 4294967295
	interp: 205
	jitted: 67
	        32.7%
	harden: 121
	        59.0%
#83 ALU_ADD_K: 4294967294 + 2 = 0
	interp: 323
	jitted: 82
	        25.4%
	harden: 139
	        43.0%
#84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
	interp: 338
	jitted: 86
	        25.4%
	harden: 176
	        52.1%
#85 ALU_ADD_K: 0 + 0xffff = 0xffff
	interp: 347
	jitted: 99
	        28.5%
	harden: 190
	        54.8%
#86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 360
	jitted: 113
	        31.4%
	harden: 228
	        63.3%
#87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
	interp: 345
	jitted: 123
	        35.7%
	harden: 198
	        57.4%
#88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
	interp: 377
	jitted: 85
	        22.5%
	harden: 189
	        50.1%
#89 ALU64_ADD_K: 1 + 2 = 3
	interp: 184
	jitted: 66
	        35.9%
	harden: 112
	        60.9%
#90 ALU64_ADD_K: 3 + 0 = 3
	interp: 185
	jitted: 66
	        35.7%
	harden: 111
	        60.0%
#91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
	interp: 186
	jitted: 69
	        37.1%
	harden: 138
	        74.2%
#92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
	interp: 353
	jitted: 109
	        30.9%
	harden: 151
	        42.8%
#93 ALU64_ADD_K: 2147483646 + -2147483647 = -1
	interp: 182
	jitted: 72
	        39.6%
	harden: 115
	        63.2%
#94 ALU64_ADD_K: 1 + 0 = 1
	interp: 311
	jitted: 126
	        40.5%
	harden: 206
	        66.2%
#95 ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff
	interp: 339
	jitted: 107
	        31.6%
	harden: 211
	        62.2%
#96 ALU64_ADD_K: 0 + 0xffff = 0xffff
	interp: 310
	jitted: 98
	        31.6%
	harden: 250
	        80.6%
#97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 313
	jitted: 87
	        27.8%
	harden: 199
	        63.6%
#98 ALU64_ADD_K: 0 + 0x80000000 = 0xffffffff80000000
	interp: 340
	jitted: 98
	        28.8%
	harden: 177
	        52.1%
#99 ALU_ADD_K: 0 + 0x80008000 = 0xffffffff80008000
	interp: 311
	jitted: 92
	        29.6%
	harden: 243
	        78.1%
#100 ALU_SUB_X: 3 - 1 = 2
	interp: 213
	jitted: 77
	        36.2%
	harden: 108
	        50.7%
#101 ALU_SUB_X: 4294967295 - 4294967294 = 1
	interp: 212
	jitted: 72
	        34.0%
	harden: 133
	        62.7%
#102 ALU64_SUB_X: 3 - 1 = 2
	interp: 237
	jitted: 72
	        30.4%
	harden: 110
	        46.4%
#103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
	interp: 221
	jitted: 71
	        32.1%
	harden: 111
	        50.2%
#104 ALU_SUB_K: 3 - 1 = 2
	interp: 177
	jitted: 120
	        67.8%
	harden: 110
	        62.1%
#105 ALU_SUB_K: 3 - 0 = 3
	interp: 179
	jitted: 82
	        45.8%
	harden: 123
	        68.7%
#106 ALU_SUB_K: 4294967295 - 4294967294 = 1
	interp: 195
	jitted: 103
	        52.8%
	harden: 124
	        63.6%
#107 ALU64_SUB_K: 3 - 1 = 2
	interp: 183
	jitted: 140
	        76.5%
	harden: 116
	        63.4%
#108 ALU64_SUB_K: 3 - 0 = 3
	interp: 177
	jitted: 117
	        66.1%
	harden: 133
	        75.1%
#109 ALU64_SUB_K: 4294967294 - 4294967295 = -1
	interp: 181
	jitted: 83
	        45.9%
	harden: 148
	        81.8%
#110 ALU64_ADD_K: 2147483646 - 2147483647 = -1
	interp: 177
	jitted: 77
	        43.5%
	harden: 145
	        81.9%
#111 ALU_MUL_X: 2 * 3 = 6
	interp: 241
	jitted: 68
	        28.2%
	harden: 172
	        71.4%
#112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 220
	jitted: 70
	        31.8%
	harden: 117
	        53.2%
#113 ALU_MUL_X: -1 * -1 = 1
	interp: 224
	jitted: 73
	        32.6%
	harden: 109
	        48.7%
#114 ALU64_MUL_X: 2 * 3 = 6
	interp: 213
	jitted: 70
	        32.9%
	harden: 115
	        54.0%
#115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
	interp: 230
	jitted: 75
	        32.6%
	harden: 119
	        51.7%
#116 ALU_MUL_K: 2 * 3 = 6
	interp: 191
	jitted: 67
	        35.1%
	harden: 111
	        58.1%
#117 ALU_MUL_K: 3 * 1 = 3
	interp: 189
	jitted: 71
	        37.6%
	harden: 118
	        62.4%
#118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 192
	jitted: 70
	        36.5%
	harden: 109
	        56.8%
#119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
	interp: 333
	jitted: 153
	        45.9%
	harden: 201
	        60.4%
#120 ALU64_MUL_K: 2 * 3 = 6
	interp: 185
	jitted: 101
	        54.6%
	harden: 116
	        62.7%
#121 ALU64_MUL_K: 3 * 1 = 3
	interp: 185
	jitted: 108
	        58.4%
	harden: 115
	        62.2%
#122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
	interp: 184
	jitted: 106
	        57.6%
	harden: 278
	        151.1%
#123 ALU64_MUL_K: 1 * -2147483647 = -2147483647
	interp: 183
	jitted: 92
	        50.3%
	harden: 125
	        68.3%
#124 ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff
	interp: 336
	jitted: 122
	        36.3%
	harden: 208
	        61.9%
#125 ALU_DIV_X: 6 / 2 = 3
	interp: 316
	jitted: 220
	        69.6%
	harden: 246
	        77.8%
#126 ALU_DIV_X: 4294967295 / 4294967295 = 1
	interp: 315
	jitted: 208
	        66.0%
	harden: 291
	        92.4%
#130 ALU_DIV_K: 6 / 2 = 3
	interp: 249
	jitted: 246
	        98.8%
	harden: 234
	        94.0%
#131 ALU_DIV_K: 3 / 1 = 3
	interp: 240
	jitted: 199
	        82.9%
	harden: 240
	        100.0%
#132 ALU_DIV_K: 4294967295 / 4294967295 = 1
	interp: 254
	jitted: 192
	        75.6%
	harden: 276
	        108.7%
#133 ALU_DIV_K: 0xffffffffffffffff / (-1) = 0x1
	interp: 379
	jitted: 215
	        56.7%
	harden: 373
	        98.4%
#138 ALU_MOD_X: 3 % 2 = 1
	interp: 421
	jitted: 235
	        55.8%
	harden: 293
	        69.6%
#139 ALU_MOD_X: 4294967295 % 4294967293 = 2
	interp: 453
	jitted: 262
	        57.8%
	harden: 289
	        63.8%
#142 ALU_MOD_K: 3 % 2 = 1
	interp: 380
	jitted: 231
	        60.8%
	harden: 311
	        81.8%
#144 ALU_MOD_K: 4294967295 % 4294967293 = 2
	interp: 467
	jitted: 257
	        55.0%
	harden: 319
	        68.3%
#148 ALU_AND_X: 3 & 2 = 2
	interp: 225
	jitted: 100
	        44.4%
	harden: 109
	        48.4%
#149 ALU_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 261
	jitted: 106
	        40.6%
	harden: 130
	        49.8%
#150 ALU64_AND_X: 3 & 2 = 2
	interp: 273
	jitted: 86
	        31.5%
	harden: 106
	        38.8%
#151 ALU64_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 251
	jitted: 118
	        47.0%
	harden: 102
	        40.6%
#152 ALU_AND_K: 3 & 2 = 2
	interp: 201
	jitted: 117
	        58.2%
	harden: 114
	        56.7%
#153 ALU_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 240
	jitted: 72
	        30.0%
	harden: 138
	        57.5%
#154 ALU64_AND_K: 3 & 2 = 2
	interp: 209
	jitted: 72
	        34.4%
	harden: 110
	        52.6%
#155 ALU64_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 319
	jitted: 70
	        21.9%
	harden: 148
	        46.4%
#156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 = 0x0000ffff00000000
	interp: 384
	jitted: 99
	        25.8%
	harden: 206
	        53.6%
#157 ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffffffff
	interp: 367
	jitted: 97
	        26.4%
	harden: 176
	        48.0%
#158 ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff
	interp: 375
	jitted: 86
	        22.9%
	harden: 271
	        72.3%
#159 ALU_OR_X: 1 | 2 = 3
	interp: 271
	jitted: 73
	        26.9%
	harden: 108
	        39.9%
#160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
	interp: 280
	jitted: 72
	        25.7%
	harden: 118
	        42.1%
#161 ALU64_OR_X: 1 | 2 = 3
	interp: 253
	jitted: 89
	        35.2%
	harden: 103
	        40.7%
#162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
	interp: 263
	jitted: 91
	        34.6%
	harden: 143
	        54.4%
#163 ALU_OR_K: 1 | 2 = 3
	interp: 216
	jitted: 71
	        32.9%
	harden: 123
	        56.9%
#164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 187
	jitted: 116
	        62.0%
	harden: 110
	        58.8%
#165 ALU64_OR_K: 1 | 2 = 3
	interp: 183
	jitted: 77
	        42.1%
	harden: 120
	        65.6%
#166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 195
	jitted: 80
	        41.0%
	harden: 119
	        61.0%
#167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 = 0x0000ffff00000000
	interp: 338
	jitted: 86
	        25.4%
	harden: 212
	        62.7%
#168 ALU64_OR_K: 0x0000ffffffff0000 | -1 = 0xffffffffffffffff
	interp: 324
	jitted: 99
	        30.6%
	harden: 221
	        68.2%
#169 ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff
	interp: 309
	jitted: 147
	        47.6%
	harden: 198
	        64.1%
#170 ALU_XOR_X: 5 ^ 6 = 3
	interp: 216
	jitted: 80
	        37.0%
	harden: 138
	        63.9%
#171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
	interp: 414
	jitted: 73
	        17.6%
	harden: 130
	        31.4%
#172 ALU64_XOR_X: 5 ^ 6 = 3
	interp: 320
	jitted: 71
	        22.2%
	harden: 114
	        35.6%
#173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
	interp: 223
	jitted: 72
	        32.3%
	harden: 106
	        47.5%
#174 ALU_XOR_K: 5 ^ 6 = 3
	interp: 203
	jitted: 71
	        35.0%
	harden: 112
	        55.2%
#175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
	interp: 205
	jitted: 67
	        32.7%
	harden: 116
	        56.6%
#176 ALU64_XOR_K: 5 ^ 6 = 3
	interp: 205
	jitted: 70
	        34.1%
	harden: 114
	        55.6%
#177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
	interp: 186
	jitted: 104
	        55.9%
	harden: 112
	        60.2%
#178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000
	interp: 352
	jitted: 96
	        27.3%
	harden: 201
	        57.1%
#179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 = 0xffff00000000ffff
	interp: 353
	jitted: 119
	        33.7%
	harden: 242
	        68.6%
#180 ALU64_XOR_K: 0x000000000000000 ^ -1 = 0xffffffffffffffff
	interp: 362
	jitted: 116
	        32.0%
	harden: 208
	        57.5%
#181 ALU_LSH_X: 1 << 1 = 2
	interp: 211
	jitted: 100
	        47.4%
	harden: 112
	        53.1%
#182 ALU_LSH_X: 1 << 31 = 0x80000000
	interp: 216
	jitted: 73
	        33.8%
	harden: 137
	        63.4%
#183 ALU64_LSH_X: 1 << 1 = 2
	interp: 224
	jitted: 119
	        53.1%
	harden: 163
	        72.8%
#184 ALU64_LSH_X: 1 << 31 = 0x80000000
	interp: 223
	jitted: 110
	        49.3%
	harden: 145
	        65.0%
#185 ALU_LSH_K: 1 << 1 = 2
	interp: 208
	jitted: 147
	        70.7%
	harden: 92
	        44.2%
#186 ALU_LSH_K: 1 << 31 = 0x80000000
	interp: 210
	jitted: 116
	        55.2%
	harden: 94
	        44.8%
#187 ALU64_LSH_K: 1 << 1 = 2
	interp: 211
	jitted: 154
	        73.0%
	harden: 94
	        44.5%
#188 ALU64_LSH_K: 1 << 31 = 0x80000000
	interp: 182
	jitted: 92
	        50.5%
	harden: 127
	        69.8%
#189 ALU_RSH_X: 2 >> 1 = 1
	interp: 226
	jitted: 86
	        38.1%
	harden: 135
	        59.7%
#190 ALU_RSH_X: 0x80000000 >> 31 = 1
	interp: 225
	jitted: 148
	        65.8%
	harden: 109
	        48.4%
#191 ALU64_RSH_X: 2 >> 1 = 1
	interp: 289
	jitted: 108
	        37.4%
	harden: 123
	        42.6%
#192 ALU64_RSH_X: 0x80000000 >> 31 = 1
	interp: 253
	jitted: 96
	        37.9%
	harden: 117
	        46.2%
#193 ALU_RSH_K: 2 >> 1 = 1
	interp: 207
	jitted: 68
	        32.9%
	harden: 95
	        45.9%
#194 ALU_RSH_K: 0x80000000 >> 31 = 1
	interp: 210
	jitted: 74
	        35.2%
	harden: 103
	        49.0%
#195 ALU64_RSH_K: 2 >> 1 = 1
	interp: 232
	jitted: 66
	        28.4%
	harden: 124
	        53.4%
#196 ALU64_RSH_K: 0x80000000 >> 31 = 1
	interp: 208
	jitted: 95
	        45.7%
	harden: 107
	        51.4%
#197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 252
	jitted: 74
	        29.4%
	harden: 125
	        49.6%
#198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 197
	jitted: 96
	        48.7%
	harden: 105
	        53.3%
#199 ALU_NEG: -(3) = -3
	interp: 189
	jitted: 84
	        44.4%
	harden: 76
	        40.2%
#200 ALU_NEG: -(-3) = 3
	interp: 171
	jitted: 72
	        42.1%
	harden: 106
	        62.0%
#201 ALU64_NEG: -(3) = -3
	interp: 179
	jitted: 74
	        41.3%
	harden: 104
	        58.1%
#202 ALU64_NEG: -(-3) = 3
	interp: 180
	jitted: 68
	        37.8%
	harden: 135
	        75.0%
#203 ALU_END_FROM_BE 16: 0x0123456789abcdef -> 0xcdef
	interp: 202
	jitted: 74
	        36.6%
	harden: 115
	        56.9%
#204 ALU_END_FROM_BE 32: 0x0123456789abcdef -> 0x89abcdef
	interp: 368
	jitted: 101
	        27.4%
	harden: 101
	        27.4%
#205 ALU_END_FROM_BE 64: 0x0123456789abcdef -> 0x89abcdef
	interp: 244
	jitted: 93
	        38.1%
	harden: 103
	        42.2%
#206 ALU_END_FROM_LE 16: 0x0123456789abcdef -> 0xefcd
	interp: 274
	jitted: 73
	        26.6%
	harden: 107
	        39.1%
#207 ALU_END_FROM_LE 32: 0x0123456789abcdef -> 0xefcdab89
	interp: 319
	jitted: 76
	        23.8%
	harden: 93
	        29.2%
#208 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301
	interp: 193
	jitted: 78
	        40.4%
	harden: 108
	        56.0%
#209 ST_MEM_B: Store/Load byte: max negative
	interp: 219
	jitted: 72
	        32.9%
	harden: 168
	        76.7%
#210 ST_MEM_B: Store/Load byte: max positive
	interp: 227
	jitted: 79
	        34.8%
	harden: 105
	        46.3%
#211 STX_MEM_B: Store/Load byte: max negative
	interp: 251
	jitted: 79
	        31.5%
	harden: 140
	        55.8%
#212 ST_MEM_H: Store/Load half word: max negative
	interp: 218
	jitted: 81
	        37.2%
	harden: 98
	        45.0%
#213 ST_MEM_H: Store/Load half word: max positive
	interp: 208
	jitted: 100
	        48.1%
	harden: 109
	        52.4%
#214 STX_MEM_H: Store/Load half word: max negative
	interp: 259
	jitted: 110
	        42.5%
	harden: 134
	        51.7%
#215 ST_MEM_W: Store/Load word: max negative
	interp: 253
	jitted: 75
	        29.6%
	harden: 148
	        58.5%
#216 ST_MEM_W: Store/Load word: max positive
	interp: 244
	jitted: 89
	        36.5%
	harden: 136
	        55.7%
#217 STX_MEM_W: Store/Load word: max negative
	interp: 297
	jitted: 122
	        41.1%
	harden: 205
	        69.0%
#218 ST_MEM_DW: Store/Load double word: max negative
	interp: 257
	jitted: 85
	        33.1%
	harden: 124
	        48.2%
#219 ST_MEM_DW: Store/Load double word: max negative 2
	interp: 392
	jitted: 123
	        31.4%
	harden: 222
	        56.6%
#220 ST_MEM_DW: Store/Load double word: max positive
	interp: 292
	jitted: 78
	        26.7%
	harden: 110
	        37.7%
#221 STX_MEM_DW: Store/Load double word: max negative
	interp: 259
	jitted: 85
	        32.8%
	harden: 194
	        74.9%
#230 JMP_EXIT
	interp: 127
	jitted: 82
	        64.6%
	harden: 77
	        60.6%
#231 JMP_JA: Unconditional jump: if (true) return 1
	interp: 194
	jitted: 86
	        44.3%
	harden: 84
	        43.3%
#232 JMP_JSGT_K: Signed jump: if (-1 > -2) return 1
	interp: 262
	jitted: 86
	        32.8%
	harden: 128
	        48.9%
#233 JMP_JSGT_K: Signed jump: if (-1 > -1) return 0
	interp: 249
	jitted: 82
	        32.9%
	harden: 126
	        50.6%
#234 JMP_JSGE_K: Signed jump: if (-1 >= -2) return 1
	interp: 262
	jitted: 72
	        27.5%
	harden: 179
	        68.3%
#235 JMP_JSGE_K: Signed jump: if (-1 >= -1) return 1
	interp: 260
	jitted: 73
	        28.1%
	harden: 125
	        48.1%
#236 JMP_JGT_K: if (3 > 2) return 1
	interp: 260
	jitted: 71
	        27.3%
	harden: 142
	        54.6%
#237 JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1
	interp: 278
	jitted: 72
	        25.9%
	harden: 161
	        57.9%
#238 JMP_JGE_K: if (3 >= 2) return 1
	interp: 255
	jitted: 77
	        30.2%
	harden: 163
	        63.9%
#239 JMP_JGT_K: if (3 > 2) return 1 (jump backwards)
	interp: 321
	jitted: 76
	        23.7%
	harden: 143
	        44.5%
#240 JMP_JGE_K: if (3 >= 3) return 1
	interp: 340
	jitted: 74
	        21.8%
	harden: 179
	        52.6%
#241 JMP_JNE_K: if (3 != 2) return 1
	interp: 310
	jitted: 74
	        23.9%
	harden: 144
	        46.5%
#242 JMP_JEQ_K: if (3 == 3) return 1
	interp: 310
	jitted: 78
	        25.2%
	harden: 144
	        46.5%
#243 JMP_JSET_K: if (0x3 & 0x2) return 1
	interp: 276
	jitted: 109
	        39.5%
	harden: 149
	        54.0%
#244 JMP_JSET_K: if (0x3 & 0xffffffff) return 1
	interp: 312
	jitted: 71
	        22.8%
	harden: 153
	        49.0%
#245 JMP_JSGT_X: Signed jump: if (-1 > -2) return 1
	interp: 346
	jitted: 75
	        21.7%
	harden: 162
	        46.8%
#246 JMP_JSGT_X: Signed jump: if (-1 > -1) return 0
	interp: 292
	jitted: 78
	        26.7%
	harden: 162
	        55.5%
#247 JMP_JSGE_X: Signed jump: if (-1 >= -2) return 1
	interp: 318
	jitted: 134
	        42.1%
	harden: 178
	        56.0%
#248 JMP_JSGE_X: Signed jump: if (-1 >= -1) return 1
	interp: 287
	jitted: 102
	        35.5%
	harden: 192
	        66.9%
#249 JMP_JGT_X: if (3 > 2) return 1
	interp: 316
	jitted: 83
	        26.3%
	harden: 205
	        64.9%
#250 JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1
	interp: 400
	jitted: 80
	        20.0%
	harden: 154
	        38.5%
#251 JMP_JGE_X: if (3 >= 2) return 1
	interp: 287
	jitted: 78
	        27.2%
	harden: 177
	        61.7%
#252 JMP_JGE_X: if (3 >= 3) return 1
	interp: 287
	jitted: 116
	        40.4%
	harden: 160
	        55.7%
#253 JMP_JGE_X: ldimm64 test 1
	interp: 323
	jitted: 81
	        25.1%
	harden: 204
	        63.2%
#254 JMP_JGE_X: ldimm64 test 2
	interp: 298
	jitted: 79
	        26.5%
	harden: 201
	        67.4%
#255 JMP_JGE_X: ldimm64 test 3
	interp: 263
	jitted: 78
	        29.7%
	harden: 184
	        70.0%
#256 JMP_JNE_X: if (3 != 2) return 1
	interp: 313
	jitted: 108
	        34.5%
	harden: 168
	        53.7%
#257 JMP_JEQ_X: if (3 == 3) return 1
	interp: 308
	jitted: 102
	        33.1%
	harden: 197
	        64.0%
#258 JMP_JSET_X: if (0x3 & 0x2) return 1
	interp: 359
	jitted: 133
	        37.0%
	harden: 192
	        53.5%
#259 JMP_JSET_X: if (0x3 & 0xffffffff) return 1
	interp: 421
	jitted: 128
	        30.4%
	harden: 181
	        43.0%
#260 JMP_JA: Jump, gap, jump, ...
	interp: 309
	jitted: 108
	        35.0%
	harden: 97
	        31.4%
#261 BPF_MAXINSNS: Maximum possible literals
	interp: 251
	jitted: 111
	        44.2%
	harden: 125
	        49.8%
#262 BPF_MAXINSNS: Single literal
	interp: 286
	jitted: 115
	        40.2%
	harden: 105
	        36.7%
#263 BPF_MAXINSNS: Run/add until end
	interp: 254969
	jitted: 8481
	        3.3%
	harden: 121315
	        47.6%
#265 BPF_MAXINSNS: Very long jump
	interp: 284
	jitted: 123
	        43.3%
	harden: 131
	        46.1%
#266 BPF_MAXINSNS: Ctx heavy transformations
	interp: 548311	560800
	jitted: 28166	29032
	        5.1%	5.2%
	harden: 217030	181848
	        39.6%	32.4%
#268 BPF_MAXINSNS: Jump heavy test
	interp: 480796
	jitted: 132663
	        27.6%
	harden: 440621
	        91.6%
#269 BPF_MAXINSNS: Very long jump backwards
	interp: 193
	jitted: 148
	        76.7%
	harden: 154
	        79.8%
#270 BPF_MAXINSNS: Edge hopping nuthouse
	interp: 114304
	jitted: 277097
	        242.4%
	harden: 302835
	        264.9%
#271 BPF_MAXINSNS: Jump, gap, jump, ...
	interp: 1884
	jitted: 1041
	        55.3%
	harden: 1008
	        53.5%
#274 LD_IND byte frag
	interp: 695
	jitted: 574
	        82.6%
	harden: 1453
	        209.1%
#275 LD_IND halfword frag
	interp: 818
	jitted: 641
	        78.4%
	harden: 600
	        73.3%
#276 LD_IND word frag
	interp: 837
	jitted: 731
	        87.3%
	harden: 719
	        85.9%
#277 LD_IND halfword mixed head/frag
	interp: 1170
	jitted: 741
	        63.3%
	harden: 705
	        60.3%
#278 LD_IND word mixed head/frag
	interp: 950
	jitted: 972
	        102.3%
	harden: 732
	        77.1%
#279 LD_ABS byte frag
	interp: 953
	jitted: 601
	        63.1%
	harden: 683
	        71.7%
#280 LD_ABS halfword frag
	interp: 754
	jitted: 603
	        80.0%
	harden: 595
	        78.9%
#281 LD_ABS word frag
	interp: 1133
	jitted: 688
	        60.7%
	harden: 672
	        59.3%
#282 LD_ABS halfword mixed head/frag
	interp: 1079
	jitted: 657
	        60.9%
	harden: 775
	        71.8%
#283 LD_ABS word mixed head/frag
	interp: 718
	jitted: 748
	        104.2%
	harden: 725
	        101.0%
#284 LD_IND byte default X
	interp: 297
	jitted: 178
	        59.9%
	harden: 274
	        92.3%
#285 LD_IND byte positive offset
	interp: 300
	jitted: 187
	        62.3%
	harden: 302
	        100.7%
#286 LD_IND byte negative offset
	interp: 296
	jitted: 178
	        60.1%
	harden: 311
	        105.1%
#287 LD_IND halfword positive offset
	interp: 333
	jitted: 161
	        48.3%
	harden: 218
	        65.5%
#288 LD_IND halfword negative offset
	interp: 306
	jitted: 195
	        63.7%
	harden: 193
	        63.1%
#289 LD_IND halfword unaligned
	interp: 307
	jitted: 183
	        59.6%
	harden: 190
	        61.9%
#290 LD_IND word positive offset
	interp: 337
	jitted: 170
	        50.4%
	harden: 200
	        59.3%
#291 LD_IND word negative offset
	interp: 312
	jitted: 198
	        63.5%
	harden: 216
	        69.2%
#292 LD_IND word unaligned (addr & 3 == 2)
	interp: 309
	jitted: 281
	        90.9%
	harden: 195
	        63.1%
#293 LD_IND word unaligned (addr & 3 == 1)
	interp: 335
	jitted: 172
	        51.3%
	harden: 196
	        58.5%
#294 LD_IND word unaligned (addr & 3 == 3)
	interp: 305
	jitted: 171
	        56.1%
	harden: 221
	        72.5%
#295 LD_ABS byte
	interp: 269
	jitted: 162
	        60.2%
	harden: 195
	        72.5%
#296 LD_ABS halfword
	interp: 294
	jitted: 160
	        54.4%
	harden: 170
	        57.8%
#297 LD_ABS halfword unaligned
	interp: 271
	jitted: 180
	        66.4%
	harden: 167
	        61.6%
#298 LD_ABS word
	interp: 265
	jitted: 166
	        62.6%
	harden: 182
	        68.7%
#299 LD_ABS word unaligned (addr & 3 == 2)
	interp: 267
	jitted: 157
	        58.8%
	harden: 185
	        69.3%
#300 LD_ABS word unaligned (addr & 3 == 1)
	interp: 269
	jitted: 170
	        63.2%
	harden: 162
	        60.2%
#301 LD_ABS word unaligned (addr & 3 == 3)
	interp: 281
	jitted: 163
	        58.0%
	harden: 231
	        82.2%
#302 ADD default X
	interp: 296
	jitted: 84
	        28.4%
	harden: 105
	        35.5%
#303 ADD default A
	interp: 309
	jitted: 79
	        25.6%
	harden: 101
	        32.7%
#304 SUB default X
	interp: 290
	jitted: 82
	        28.3%
	harden: 106
	        36.6%
#305 SUB default A
	interp: 252
	jitted: 85
	        33.7%
	harden: 119
	        47.2%
#306 MUL default X
	interp: 322
	jitted: 76
	        23.6%
	harden: 131
	        40.7%
#307 MUL default A
	interp: 267
	jitted: 83
	        31.1%
	harden: 116
	        43.4%
#308 DIV default X
	interp: 293
	jitted: 93
	        31.7%
	harden: 116
	        39.6%
#309 DIV default A
	interp: 336
	jitted: 203
	        60.4%
	harden: 227
	        67.6%
#310 MOD default X
	interp: 284
	jitted: 100
	        35.2%
	harden: 98
	        34.5%
#311 MOD default A
	interp: 435
	jitted: 249
	        57.2%
	harden: 265
	        60.9%
#312 JMP EQ default A
	interp: 352
	jitted: 83
	        23.6%
	harden: 134
	        38.1%
#313 JMP EQ default X
	interp: 357
	jitted: 95
	        26.6%
	harden: 108
	        30.3%

^ permalink raw reply	[flat|nested] 99+ messages in thread

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-22 20:05                         ` Kees Cook
  0 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-22 20:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> All of these benchmarks are for ARMv7.

Thanks! In the future, try to avoid the white-space damage
(line-wrapping). And it looks like you've still got debugging turned
on in your JIT code:

[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS

That breaks the test report line. After I cleaned these up and parsed
the results, they look great. Most tests take half the interpreter's
run time, if not less. Only the LD_ABS tests suffered, and that's
mainly from the const blinding, I assume.

Please post your current patch. Thanks for this!

-Kees

-- 
Kees Cook
Pixel Security
-------------- next part --------------
#0 TAX
	interp: 757	645	650
	jitted: 234	171	195
	        30.9%	26.5%	30.0%
	harden: 239	218	229
	        31.6%	33.8%	35.2%
#1 TXA
	interp: 366	334	336
	jitted: 81	79	77
	        22.1%	23.7%	22.9%
	harden: 89	119	85
	        24.3%	35.6%	25.3%
#2 ADD_SUB_MUL_K
	interp: 543
	jitted: 89
	        16.4%
	harden: 213
	        39.2%
#3 DIV_MOD_KX
	interp: 1509
	jitted: 939
	        62.2%
	harden: 1190
	        78.9%
#4 AND_OR_LSH_K
	interp: 539	559
	jitted: 116	114
	        21.5%	20.4%
	harden: 200	149
	        37.1%	26.7%
#5 LD_IMM_0
	interp: 412
	jitted: 93
	        22.6%
	harden: 101
	        24.5%
#6 LD_IND
	interp: 428	376	389
	jitted: 371	279	274
	        86.7%	74.2%	70.4%
	harden: 314	310	283
	        73.4%	82.4%	72.8%
#7 LD_ABS
	interp: 509	405	358
	jitted: 408	402	272
	        80.2%	99.3%	76.0%
	harden: 376	460	397
	        73.9%	113.6%	110.9%
#8 LD_ABS_LL
	interp: 542	783
	jitted: 387	346
	        71.4%	44.2%
	harden: 608	415
	        112.2%	53.0%
#9 LD_IND_LL
	interp: 524	496	723
	jitted: 239	248	217
	        45.6%	50.0%	30.0%
	harden: 248	256	268
	        47.3%	51.6%	37.1%
#10 LD_ABS_NET
	interp: 527	545
	jitted: 356	332
	        67.6%	60.9%
	harden: 435	420
	        82.5%	77.1%
#11 LD_IND_NET
	interp: 650	495	647
	jitted: 223	212	320
	        34.3%	42.8%	49.5%
	harden: 240	228	215
	        36.9%	46.1%	33.2%
#12 LD_PKTTYPE
	interp: 686	901
	jitted: 102	90
	        14.9%	10.0%
	harden: 211	274
	        30.8%	30.4%
#13 LD_MARK
	interp: 305	291
	jitted: 80	80
	        26.2%	27.5%
	harden: 119	76
	        39.0%	26.1%
#14 LD_RXHASH
	interp: 257	259
	jitted: 73	71
	        28.4%	27.4%
	harden: 78	70
	        30.4%	27.0%
#15 LD_QUEUE
	interp: 255	254
	jitted: 120	121
	        47.1%	47.6%
	harden: 77	73
	        30.2%	28.7%
#16 LD_PROTOCOL
	interp: 593	603
	jitted: 256	247
	        43.2%	41.0%
	harden: 326	320
	        55.0%	53.1%
#17 LD_VLAN_TAG
	interp: 288	292
	jitted: 82	84
	        28.5%	28.8%
	harden: 129	86
	        44.8%	29.5%
#18 LD_VLAN_TAG_PRESENT
	interp: 335	421
	jitted: 80	77
	        23.9%	18.3%
	harden: 87	88
	        26.0%	20.9%
#19 LD_IFINDEX
	interp: 8568	606
	jitted: 87	98
	        1.0%	16.2%
	harden: 97	95
	        1.1%	15.7%
#20 LD_HATYPE
	interp: 618	695
	jitted: 95	90
	        15.4%	12.9%
	harden: 94	118
	        15.2%	17.0%
#25 LD_ANC_XOR
	interp: 314	344
	jitted: 86	100
	        27.4%	29.1%
	harden: 168	156
	        53.5%	45.3%
#26 SPILL_FILL
	interp: 757	850	903
	jitted: 131	137	123
	        17.3%	16.1%	13.6%
	harden: 232	212	219
	        30.6%	24.9%	24.3%
#27 JEQ
	interp: 380	420	426
	jitted: 266	189	216
	        70.0%	45.0%	50.7%
	harden: 362	352	230
	        95.3%	83.8%	54.0%
#28 JGT
	interp: 376	467	448
	jitted: 301	211	192
	        80.1%	45.2%	42.9%
	harden: 334	236	197
	        88.8%	50.5%	44.0%
#29 JGE
	interp: 446	590	498
	jitted: 191	200	223
	        42.8%	33.9%	44.8%
	harden: 260	318	307
	        58.3%	53.9%	61.6%
#30 JSET
	interp: 571	787	1003
	jitted: 211	210	214
	        37.0%	26.7%	21.3%
	harden: 274	339	410
	        48.0%	43.1%	40.9%
#31 tcpdump port 22
	interp: 358	1079	1190
	jitted: 314	722	711
	        87.7%	66.9%	59.7%
	harden: 355	951	968
	        99.2%	88.1%	81.3%
#32 tcpdump complex
	interp: 319	1061	2324
	jitted: 291	707	1068
	        91.2%	66.6%	46.0%
	harden: 318	798	1308
	        99.7%	75.2%	56.3%
#33 RET_A
	interp: 253	249
	jitted: 83	88
	        32.8%	35.3%
	harden: 83	76
	        32.8%	30.5%
#34 INT: ADD trivial
	interp: 414
	jitted: 162
	        39.1%
	harden: 152
	        36.7%
#35 INT: MUL_X
	interp: 336
	jitted: 176
	        52.4%
	harden: 192
	        57.1%
#36 INT: MUL_X2
	interp: 431
	jitted: 84
	        19.5%
	harden: 165
	        38.3%
#37 INT: MUL32_X
	interp: 523
	jitted: 99
	        18.9%
	harden: 163
	        31.2%
#38 INT: ADD 64-bit
	interp: 5263
	jitted: 1066
	        20.3%
	harden: 1507
	        28.6%
#39 INT: ADD 32-bit
	interp: 4127
	jitted: 666
	        16.1%
	harden: 954
	        23.1%
#40 INT: SUB
	interp: 4218
	jitted: 3236
	        76.7%
	harden: 1159
	        27.5%
#41 INT: XOR
	interp: 2252
	jitted: 308
	        13.7%
	harden: 480
	        21.3%
#42 INT: MUL
	interp: 1986
	jitted: 376
	        18.9%
	harden: 486
	        24.5%
#43 MOV REG64
	interp: 1103
	jitted: 227
	        20.6%
	harden: 274
	        24.8%
#44 MOV REG32
	interp: 1140
	jitted: 171
	        15.0%
	harden: 253
	        22.2%
#45 LD IMM64
	interp: 1182
	jitted: 163
	        13.8%
	harden: 578
	        48.9%
#47 INT: shifts by register
	interp: 1125
	jitted: 208
	        18.5%
	harden: 381
	        33.9%
#48 INT: DIV + ABS
	interp: 570	850
	jitted: 659	601
	        115.6%	70.7%
	harden: 588	482
	        103.2%	56.7%
#49 INT: DIV by zero
	interp: 350	305
	jitted: 317	169
	        90.6%	55.4%
	harden: 276	199
	        78.9%	65.2%
#54 JUMPS + HOLES
	interp: 863
	jitted: 358
	        41.5%
	harden: 371
	        43.0%
#57 M[]: alt STX + LDX
	interp: 3990
	jitted: 456
	        11.4%
	harden: 621
	        15.6%
#58 M[]: full STX + full LDX
	interp: 2819
	jitted: 438
	        15.5%
	harden: 586
	        20.8%
#60 LD [SKF_AD_OFF-1]
	interp: 313
	jitted: 198
	        63.3%
	harden: 195
	        62.3%
#61 load 64-bit immediate
	interp: 579
	jitted: 125
	        21.6%
	harden: 220
	        38.0%
#62 nmap reduced
	interp: 1860
	jitted: 1054
	        56.7%
	harden: 816
	        43.9%
#63 ALU_MOV_X: dst = 2
	interp: 249
	jitted: 81
	        32.5%
	harden: 76
	        30.5%
#64 ALU_MOV_X: dst = 4294967295
	interp: 264
	jitted: 85
	        32.2%
	harden: 79
	        29.9%
#65 ALU64_MOV_X: dst = 2
	interp: 229
	jitted: 96
	        41.9%
	harden: 80
	        34.9%
#66 ALU64_MOV_X: dst = 4294967295
	interp: 213
	jitted: 71
	        33.3%
	harden: 79
	        37.1%
#67 ALU_MOV_K: dst = 2
	interp: 167
	jitted: 70
	        41.9%
	harden: 75
	        44.9%
#68 ALU_MOV_K: dst = 4294967295
	interp: 149
	jitted: 71
	        47.7%
	harden: 73
	        49.0%
#69 ALU_MOV_K: 0x0000ffffffff0000 = 0x00000000ffffffff
	interp: 358
	jitted: 97
	        27.1%
	harden: 195
	        54.5%
#70 ALU64_MOV_K: dst = 2
	interp: 158
	jitted: 75
	        47.5%
	harden: 77
	        48.7%
#71 ALU64_MOV_K: dst = 2147483647
	interp: 156
	jitted: 66
	        42.3%
	harden: 104
	        66.7%
#72 ALU64_OR_K: dst = 0x0
	interp: 306
	jitted: 92
	        30.1%
	harden: 215
	        70.3%
#73 ALU64_MOV_K: dst = -1
	interp: 327
	jitted: 94
	        28.7%
	harden: 173
	        52.9%
#74 ALU_ADD_X: 1 + 2 = 3
	interp: 212
	jitted: 66
	        31.1%
	harden: 114
	        53.8%
#75 ALU_ADD_X: 1 + 4294967294 = 4294967295
	interp: 231
	jitted: 66
	        28.6%
	harden: 112
	        48.5%
#76 ALU_ADD_X: 2 + 4294967294 = 0
	interp: 309
	jitted: 87
	        28.2%
	harden: 186
	        60.2%
#77 ALU64_ADD_X: 1 + 2 = 3
	interp: 280
	jitted: 77
	        27.5%
	harden: 159
	        56.8%
#78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
	interp: 286
	jitted: 72
	        25.2%
	harden: 109
	        38.1%
#79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
	interp: 460
	jitted: 79
	        17.2%
	harden: 218
	        47.4%
#80 ALU_ADD_K: 1 + 2 = 3
	interp: 210
	jitted: 75
	        35.7%
	harden: 120
	        57.1%
#81 ALU_ADD_K: 3 + 0 = 3
	interp: 208
	jitted: 71
	        34.1%
	harden: 118
	        56.7%
#82 ALU_ADD_K: 1 + 4294967294 = 4294967295
	interp: 205
	jitted: 67
	        32.7%
	harden: 121
	        59.0%
#83 ALU_ADD_K: 4294967294 + 2 = 0
	interp: 323
	jitted: 82
	        25.4%
	harden: 139
	        43.0%
#84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
	interp: 338
	jitted: 86
	        25.4%
	harden: 176
	        52.1%
#85 ALU_ADD_K: 0 + 0xffff = 0xffff
	interp: 347
	jitted: 99
	        28.5%
	harden: 190
	        54.8%
#86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 360
	jitted: 113
	        31.4%
	harden: 228
	        63.3%
#87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
	interp: 345
	jitted: 123
	        35.7%
	harden: 198
	        57.4%
#88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
	interp: 377
	jitted: 85
	        22.5%
	harden: 189
	        50.1%
#89 ALU64_ADD_K: 1 + 2 = 3
	interp: 184
	jitted: 66
	        35.9%
	harden: 112
	        60.9%
#90 ALU64_ADD_K: 3 + 0 = 3
	interp: 185
	jitted: 66
	        35.7%
	harden: 111
	        60.0%
#91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
	interp: 186
	jitted: 69
	        37.1%
	harden: 138
	        74.2%
#92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
	interp: 353
	jitted: 109
	        30.9%
	harden: 151
	        42.8%
#93 ALU64_ADD_K: 2147483646 + -2147483647 = -1
	interp: 182
	jitted: 72
	        39.6%
	harden: 115
	        63.2%
#94 ALU64_ADD_K: 1 + 0 = 1
	interp: 311
	jitted: 126
	        40.5%
	harden: 206
	        66.2%
#95 ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff
	interp: 339
	jitted: 107
	        31.6%
	harden: 211
	        62.2%
#96 ALU64_ADD_K: 0 + 0xffff = 0xffff
	interp: 310
	jitted: 98
	        31.6%
	harden: 250
	        80.6%
#97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 313
	jitted: 87
	        27.8%
	harden: 199
	        63.6%
#98 ALU64_ADD_K: 0 + 0x80000000 = 0xffffffff80000000
	interp: 340
	jitted: 98
	        28.8%
	harden: 177
	        52.1%
#99 ALU_ADD_K: 0 + 0x80008000 = 0xffffffff80008000
	interp: 311
	jitted: 92
	        29.6%
	harden: 243
	        78.1%
#100 ALU_SUB_X: 3 - 1 = 2
	interp: 213
	jitted: 77
	        36.2%
	harden: 108
	        50.7%
#101 ALU_SUB_X: 4294967295 - 4294967294 = 1
	interp: 212
	jitted: 72
	        34.0%
	harden: 133
	        62.7%
#102 ALU64_SUB_X: 3 - 1 = 2
	interp: 237
	jitted: 72
	        30.4%
	harden: 110
	        46.4%
#103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
	interp: 221
	jitted: 71
	        32.1%
	harden: 111
	        50.2%
#104 ALU_SUB_K: 3 - 1 = 2
	interp: 177
	jitted: 120
	        67.8%
	harden: 110
	        62.1%
#105 ALU_SUB_K: 3 - 0 = 3
	interp: 179
	jitted: 82
	        45.8%
	harden: 123
	        68.7%
#106 ALU_SUB_K: 4294967295 - 4294967294 = 1
	interp: 195
	jitted: 103
	        52.8%
	harden: 124
	        63.6%
#107 ALU64_SUB_K: 3 - 1 = 2
	interp: 183
	jitted: 140
	        76.5%
	harden: 116
	        63.4%
#108 ALU64_SUB_K: 3 - 0 = 3
	interp: 177
	jitted: 117
	        66.1%
	harden: 133
	        75.1%
#109 ALU64_SUB_K: 4294967294 - 4294967295 = -1
	interp: 181
	jitted: 83
	        45.9%
	harden: 148
	        81.8%
#110 ALU64_ADD_K: 2147483646 - 2147483647 = -1
	interp: 177
	jitted: 77
	        43.5%
	harden: 145
	        81.9%
#111 ALU_MUL_X: 2 * 3 = 6
	interp: 241
	jitted: 68
	        28.2%
	harden: 172
	        71.4%
#112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 220
	jitted: 70
	        31.8%
	harden: 117
	        53.2%
#113 ALU_MUL_X: -1 * -1 = 1
	interp: 224
	jitted: 73
	        32.6%
	harden: 109
	        48.7%
#114 ALU64_MUL_X: 2 * 3 = 6
	interp: 213
	jitted: 70
	        32.9%
	harden: 115
	        54.0%
#115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
	interp: 230
	jitted: 75
	        32.6%
	harden: 119
	        51.7%
#116 ALU_MUL_K: 2 * 3 = 6
	interp: 191
	jitted: 67
	        35.1%
	harden: 111
	        58.1%
#117 ALU_MUL_K: 3 * 1 = 3
	interp: 189
	jitted: 71
	        37.6%
	harden: 118
	        62.4%
#118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 192
	jitted: 70
	        36.5%
	harden: 109
	        56.8%
#119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
	interp: 333
	jitted: 153
	        45.9%
	harden: 201
	        60.4%
#120 ALU64_MUL_K: 2 * 3 = 6
	interp: 185
	jitted: 101
	        54.6%
	harden: 116
	        62.7%
#121 ALU64_MUL_K: 3 * 1 = 3
	interp: 185
	jitted: 108
	        58.4%
	harden: 115
	        62.2%
#122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
	interp: 184
	jitted: 106
	        57.6%
	harden: 278
	        151.1%
#123 ALU64_MUL_K: 1 * -2147483647 = -2147483647
	interp: 183
	jitted: 92
	        50.3%
	harden: 125
	        68.3%
#124 ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff
	interp: 336
	jitted: 122
	        36.3%
	harden: 208
	        61.9%
#125 ALU_DIV_X: 6 / 2 = 3
	interp: 316
	jitted: 220
	        69.6%
	harden: 246
	        77.8%
#126 ALU_DIV_X: 4294967295 / 4294967295 = 1
	interp: 315
	jitted: 208
	        66.0%
	harden: 291
	        92.4%
#130 ALU_DIV_K: 6 / 2 = 3
	interp: 249
	jitted: 246
	        98.8%
	harden: 234
	        94.0%
#131 ALU_DIV_K: 3 / 1 = 3
	interp: 240
	jitted: 199
	        82.9%
	harden: 240
	        100.0%
#132 ALU_DIV_K: 4294967295 / 4294967295 = 1
	interp: 254
	jitted: 192
	        75.6%
	harden: 276
	        108.7%
#133 ALU_DIV_K: 0xffffffffffffffff / (-1) = 0x1
	interp: 379
	jitted: 215
	        56.7%
	harden: 373
	        98.4%
#138 ALU_MOD_X: 3 % 2 = 1
	interp: 421
	jitted: 235
	        55.8%
	harden: 293
	        69.6%
#139 ALU_MOD_X: 4294967295 % 4294967293 = 2
	interp: 453
	jitted: 262
	        57.8%
	harden: 289
	        63.8%
#142 ALU_MOD_K: 3 % 2 = 1
	interp: 380
	jitted: 231
	        60.8%
	harden: 311
	        81.8%
#144 ALU_MOD_K: 4294967295 % 4294967293 = 2
	interp: 467
	jitted: 257
	        55.0%
	harden: 319
	        68.3%
#148 ALU_AND_X: 3 & 2 = 2
	interp: 225
	jitted: 100
	        44.4%
	harden: 109
	        48.4%
#149 ALU_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 261
	jitted: 106
	        40.6%
	harden: 130
	        49.8%
#150 ALU64_AND_X: 3 & 2 = 2
	interp: 273
	jitted: 86
	        31.5%
	harden: 106
	        38.8%
#151 ALU64_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 251
	jitted: 118
	        47.0%
	harden: 102
	        40.6%
#152 ALU_AND_K: 3 & 2 = 2
	interp: 201
	jitted: 117
	        58.2%
	harden: 114
	        56.7%
#153 ALU_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 240
	jitted: 72
	        30.0%
	harden: 138
	        57.5%
#154 ALU64_AND_K: 3 & 2 = 2
	interp: 209
	jitted: 72
	        34.4%
	harden: 110
	        52.6%
#155 ALU64_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 319
	jitted: 70
	        21.9%
	harden: 148
	        46.4%
#156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 = 0x0000ffff00000000
	interp: 384
	jitted: 99
	        25.8%
	harden: 206
	        53.6%
#157 ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffffffff
	interp: 367
	jitted: 97
	        26.4%
	harden: 176
	        48.0%
#158 ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff
	interp: 375
	jitted: 86
	        22.9%
	harden: 271
	        72.3%
#159 ALU_OR_X: 1 | 2 = 3
	interp: 271
	jitted: 73
	        26.9%
	harden: 108
	        39.9%
#160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
	interp: 280
	jitted: 72
	        25.7%
	harden: 118
	        42.1%
#161 ALU64_OR_X: 1 | 2 = 3
	interp: 253
	jitted: 89
	        35.2%
	harden: 103
	        40.7%
#162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
	interp: 263
	jitted: 91
	        34.6%
	harden: 143
	        54.4%
#163 ALU_OR_K: 1 | 2 = 3
	interp: 216
	jitted: 71
	        32.9%
	harden: 123
	        56.9%
#164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 187
	jitted: 116
	        62.0%
	harden: 110
	        58.8%
#165 ALU64_OR_K: 1 | 2 = 3
	interp: 183
	jitted: 77
	        42.1%
	harden: 120
	        65.6%
#166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 195
	jitted: 80
	        41.0%
	harden: 119
	        61.0%
#167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 = 0x0000ffff00000000
	interp: 338
	jitted: 86
	        25.4%
	harden: 212
	        62.7%
#168 ALU64_OR_K: 0x0000ffffffff0000 | -1 = 0xffffffffffffffff
	interp: 324
	jitted: 99
	        30.6%
	harden: 221
	        68.2%
#169 ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff
	interp: 309
	jitted: 147
	        47.6%
	harden: 198
	        64.1%
#170 ALU_XOR_X: 5 ^ 6 = 3
	interp: 216
	jitted: 80
	        37.0%
	harden: 138
	        63.9%
#171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
	interp: 414
	jitted: 73
	        17.6%
	harden: 130
	        31.4%
#172 ALU64_XOR_X: 5 ^ 6 = 3
	interp: 320
	jitted: 71
	        22.2%
	harden: 114
	        35.6%
#173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
	interp: 223
	jitted: 72
	        32.3%
	harden: 106
	        47.5%
#174 ALU_XOR_K: 5 ^ 6 = 3
	interp: 203
	jitted: 71
	        35.0%
	harden: 112
	        55.2%
#175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
	interp: 205
	jitted: 67
	        32.7%
	harden: 116
	        56.6%
#176 ALU64_XOR_K: 5 ^ 6 = 3
	interp: 205
	jitted: 70
	        34.1%
	harden: 114
	        55.6%
#177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
	interp: 186
	jitted: 104
	        55.9%
	harden: 112
	        60.2%
#178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000
	interp: 352
	jitted: 96
	        27.3%
	harden: 201
	        57.1%
#179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 = 0xffff00000000ffff
	interp: 353
	jitted: 119
	        33.7%
	harden: 242
	        68.6%
#180 ALU64_XOR_K: 0x000000000000000 ^ -1 = 0xffffffffffffffff
	interp: 362
	jitted: 116
	        32.0%
	harden: 208
	        57.5%
#181 ALU_LSH_X: 1 << 1 = 2
	interp: 211
	jitted: 100
	        47.4%
	harden: 112
	        53.1%
#182 ALU_LSH_X: 1 << 31 = 0x80000000
	interp: 216
	jitted: 73
	        33.8%
	harden: 137
	        63.4%
#183 ALU64_LSH_X: 1 << 1 = 2
	interp: 224
	jitted: 119
	        53.1%
	harden: 163
	        72.8%
#184 ALU64_LSH_X: 1 << 31 = 0x80000000
	interp: 223
	jitted: 110
	        49.3%
	harden: 145
	        65.0%
#185 ALU_LSH_K: 1 << 1 = 2
	interp: 208
	jitted: 147
	        70.7%
	harden: 92
	        44.2%
#186 ALU_LSH_K: 1 << 31 = 0x80000000
	interp: 210
	jitted: 116
	        55.2%
	harden: 94
	        44.8%
#187 ALU64_LSH_K: 1 << 1 = 2
	interp: 211
	jitted: 154
	        73.0%
	harden: 94
	        44.5%
#188 ALU64_LSH_K: 1 << 31 = 0x80000000
	interp: 182
	jitted: 92
	        50.5%
	harden: 127
	        69.8%
#189 ALU_RSH_X: 2 >> 1 = 1
	interp: 226
	jitted: 86
	        38.1%
	harden: 135
	        59.7%
#190 ALU_RSH_X: 0x80000000 >> 31 = 1
	interp: 225
	jitted: 148
	        65.8%
	harden: 109
	        48.4%
#191 ALU64_RSH_X: 2 >> 1 = 1
	interp: 289
	jitted: 108
	        37.4%
	harden: 123
	        42.6%
#192 ALU64_RSH_X: 0x80000000 >> 31 = 1
	interp: 253
	jitted: 96
	        37.9%
	harden: 117
	        46.2%
#193 ALU_RSH_K: 2 >> 1 = 1
	interp: 207
	jitted: 68
	        32.9%
	harden: 95
	        45.9%
#194 ALU_RSH_K: 0x80000000 >> 31 = 1
	interp: 210
	jitted: 74
	        35.2%
	harden: 103
	        49.0%
#195 ALU64_RSH_K: 2 >> 1 = 1
	interp: 232
	jitted: 66
	        28.4%
	harden: 124
	        53.4%
#196 ALU64_RSH_K: 0x80000000 >> 31 = 1
	interp: 208
	jitted: 95
	        45.7%
	harden: 107
	        51.4%
#197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 252
	jitted: 74
	        29.4%
	harden: 125
	        49.6%
#198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 197
	jitted: 96
	        48.7%
	harden: 105
	        53.3%
#199 ALU_NEG: -(3) = -3
	interp: 189
	jitted: 84
	        44.4%
	harden: 76
	        40.2%
#200 ALU_NEG: -(-3) = 3
	interp: 171
	jitted: 72
	        42.1%
	harden: 106
	        62.0%
#201 ALU64_NEG: -(3) = -3
	interp: 179
	jitted: 74
	        41.3%
	harden: 104
	        58.1%
#202 ALU64_NEG: -(-3) = 3
	interp: 180
	jitted: 68
	        37.8%
	harden: 135
	        75.0%
#203 ALU_END_FROM_BE 16: 0x0123456789abcdef -> 0xcdef
	interp: 202
	jitted: 74
	        36.6%
	harden: 115
	        56.9%
#204 ALU_END_FROM_BE 32: 0x0123456789abcdef -> 0x89abcdef
	interp: 368
	jitted: 101
	        27.4%
	harden: 101
	        27.4%
#205 ALU_END_FROM_BE 64: 0x0123456789abcdef -> 0x89abcdef
	interp: 244
	jitted: 93
	        38.1%
	harden: 103
	        42.2%
#206 ALU_END_FROM_LE 16: 0x0123456789abcdef -> 0xefcd
	interp: 274
	jitted: 73
	        26.6%
	harden: 107
	        39.1%
#207 ALU_END_FROM_LE 32: 0x0123456789abcdef -> 0xefcdab89
	interp: 319
	jitted: 76
	        23.8%
	harden: 93
	        29.2%
#208 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301
	interp: 193
	jitted: 78
	        40.4%
	harden: 108
	        56.0%
#209 ST_MEM_B: Store/Load byte: max negative
	interp: 219
	jitted: 72
	        32.9%
	harden: 168
	        76.7%
#210 ST_MEM_B: Store/Load byte: max positive
	interp: 227
	jitted: 79
	        34.8%
	harden: 105
	        46.3%
#211 STX_MEM_B: Store/Load byte: max negative
	interp: 251
	jitted: 79
	        31.5%
	harden: 140
	        55.8%
#212 ST_MEM_H: Store/Load half word: max negative
	interp: 218
	jitted: 81
	        37.2%
	harden: 98
	        45.0%
#213 ST_MEM_H: Store/Load half word: max positive
	interp: 208
	jitted: 100
	        48.1%
	harden: 109
	        52.4%
#214 STX_MEM_H: Store/Load half word: max negative
	interp: 259
	jitted: 110
	        42.5%
	harden: 134
	        51.7%
#215 ST_MEM_W: Store/Load word: max negative
	interp: 253
	jitted: 75
	        29.6%
	harden: 148
	        58.5%
#216 ST_MEM_W: Store/Load word: max positive
	interp: 244
	jitted: 89
	        36.5%
	harden: 136
	        55.7%
#217 STX_MEM_W: Store/Load word: max negative
	interp: 297
	jitted: 122
	        41.1%
	harden: 205
	        69.0%
#218 ST_MEM_DW: Store/Load double word: max negative
	interp: 257
	jitted: 85
	        33.1%
	harden: 124
	        48.2%
#219 ST_MEM_DW: Store/Load double word: max negative 2
	interp: 392
	jitted: 123
	        31.4%
	harden: 222
	        56.6%
#220 ST_MEM_DW: Store/Load double word: max positive
	interp: 292
	jitted: 78
	        26.7%
	harden: 110
	        37.7%
#221 STX_MEM_DW: Store/Load double word: max negative
	interp: 259
	jitted: 85
	        32.8%
	harden: 194
	        74.9%
#230 JMP_EXIT
	interp: 127
	jitted: 82
	        64.6%
	harden: 77
	        60.6%
#231 JMP_JA: Unconditional jump: if (true) return 1
	interp: 194
	jitted: 86
	        44.3%
	harden: 84
	        43.3%
#232 JMP_JSGT_K: Signed jump: if (-1 > -2) return 1
	interp: 262
	jitted: 86
	        32.8%
	harden: 128
	        48.9%
#233 JMP_JSGT_K: Signed jump: if (-1 > -1) return 0
	interp: 249
	jitted: 82
	        32.9%
	harden: 126
	        50.6%
#234 JMP_JSGE_K: Signed jump: if (-1 >= -2) return 1
	interp: 262
	jitted: 72
	        27.5%
	harden: 179
	        68.3%
#235 JMP_JSGE_K: Signed jump: if (-1 >= -1) return 1
	interp: 260
	jitted: 73
	        28.1%
	harden: 125
	        48.1%
#236 JMP_JGT_K: if (3 > 2) return 1
	interp: 260
	jitted: 71
	        27.3%
	harden: 142
	        54.6%
#237 JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1
	interp: 278
	jitted: 72
	        25.9%
	harden: 161
	        57.9%
#238 JMP_JGE_K: if (3 >= 2) return 1
	interp: 255
	jitted: 77
	        30.2%
	harden: 163
	        63.9%
#239 JMP_JGT_K: if (3 > 2) return 1 (jump backwards)
	interp: 321
	jitted: 76
	        23.7%
	harden: 143
	        44.5%
#240 JMP_JGE_K: if (3 >= 3) return 1
	interp: 340
	jitted: 74
	        21.8%
	harden: 179
	        52.6%
#241 JMP_JNE_K: if (3 != 2) return 1
	interp: 310
	jitted: 74
	        23.9%
	harden: 144
	        46.5%
#242 JMP_JEQ_K: if (3 == 3) return 1
	interp: 310
	jitted: 78
	        25.2%
	harden: 144
	        46.5%
#243 JMP_JSET_K: if (0x3 & 0x2) return 1
	interp: 276
	jitted: 109
	        39.5%
	harden: 149
	        54.0%
#244 JMP_JSET_K: if (0x3 & 0xffffffff) return 1
	interp: 312
	jitted: 71
	        22.8%
	harden: 153
	        49.0%
#245 JMP_JSGT_X: Signed jump: if (-1 > -2) return 1
	interp: 346
	jitted: 75
	        21.7%
	harden: 162
	        46.8%
#246 JMP_JSGT_X: Signed jump: if (-1 > -1) return 0
	interp: 292
	jitted: 78
	        26.7%
	harden: 162
	        55.5%
#247 JMP_JSGE_X: Signed jump: if (-1 >= -2) return 1
	interp: 318
	jitted: 134
	        42.1%
	harden: 178
	        56.0%
#248 JMP_JSGE_X: Signed jump: if (-1 >= -1) return 1
	interp: 287
	jitted: 102
	        35.5%
	harden: 192
	        66.9%
#249 JMP_JGT_X: if (3 > 2) return 1
	interp: 316
	jitted: 83
	        26.3%
	harden: 205
	        64.9%
#250 JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1
	interp: 400
	jitted: 80
	        20.0%
	harden: 154
	        38.5%
#251 JMP_JGE_X: if (3 >= 2) return 1
	interp: 287
	jitted: 78
	        27.2%
	harden: 177
	        61.7%
#252 JMP_JGE_X: if (3 >= 3) return 1
	interp: 287
	jitted: 116
	        40.4%
	harden: 160
	        55.7%
#253 JMP_JGE_X: ldimm64 test 1
	interp: 323
	jitted: 81
	        25.1%
	harden: 204
	        63.2%
#254 JMP_JGE_X: ldimm64 test 2
	interp: 298
	jitted: 79
	        26.5%
	harden: 201
	        67.4%
#255 JMP_JGE_X: ldimm64 test 3
	interp: 263
	jitted: 78
	        29.7%
	harden: 184
	        70.0%
#256 JMP_JNE_X: if (3 != 2) return 1
	interp: 313
	jitted: 108
	        34.5%
	harden: 168
	        53.7%
#257 JMP_JEQ_X: if (3 == 3) return 1
	interp: 308
	jitted: 102
	        33.1%
	harden: 197
	        64.0%
#258 JMP_JSET_X: if (0x3 & 0x2) return 1
	interp: 359
	jitted: 133
	        37.0%
	harden: 192
	        53.5%
#259 JMP_JSET_X: if (0x3 & 0xffffffff) return 1
	interp: 421
	jitted: 128
	        30.4%
	harden: 181
	        43.0%
#260 JMP_JA: Jump, gap, jump, ...
	interp: 309
	jitted: 108
	        35.0%
	harden: 97
	        31.4%
#261 BPF_MAXINSNS: Maximum possible literals
	interp: 251
	jitted: 111
	        44.2%
	harden: 125
	        49.8%
#262 BPF_MAXINSNS: Single literal
	interp: 286
	jitted: 115
	        40.2%
	harden: 105
	        36.7%
#263 BPF_MAXINSNS: Run/add until end
	interp: 254969
	jitted: 8481
	        3.3%
	harden: 121315
	        47.6%
#265 BPF_MAXINSNS: Very long jump
	interp: 284
	jitted: 123
	        43.3%
	harden: 131
	        46.1%
#266 BPF_MAXINSNS: Ctx heavy transformations
	interp: 548311	560800
	jitted: 28166	29032
	        5.1%	5.2%
	harden: 217030	181848
	        39.6%	32.4%
#268 BPF_MAXINSNS: Jump heavy test
	interp: 480796
	jitted: 132663
	        27.6%
	harden: 440621
	        91.6%
#269 BPF_MAXINSNS: Very long jump backwards
	interp: 193
	jitted: 148
	        76.7%
	harden: 154
	        79.8%
#270 BPF_MAXINSNS: Edge hopping nuthouse
	interp: 114304
	jitted: 277097
	        242.4%
	harden: 302835
	        264.9%
#271 BPF_MAXINSNS: Jump, gap, jump, ...
	interp: 1884
	jitted: 1041
	        55.3%
	harden: 1008
	        53.5%
#274 LD_IND byte frag
	interp: 695
	jitted: 574
	        82.6%
	harden: 1453
	        209.1%
#275 LD_IND halfword frag
	interp: 818
	jitted: 641
	        78.4%
	harden: 600
	        73.3%
#276 LD_IND word frag
	interp: 837
	jitted: 731
	        87.3%
	harden: 719
	        85.9%
#277 LD_IND halfword mixed head/frag
	interp: 1170
	jitted: 741
	        63.3%
	harden: 705
	        60.3%
#278 LD_IND word mixed head/frag
	interp: 950
	jitted: 972
	        102.3%
	harden: 732
	        77.1%
#279 LD_ABS byte frag
	interp: 953
	jitted: 601
	        63.1%
	harden: 683
	        71.7%
#280 LD_ABS halfword frag
	interp: 754
	jitted: 603
	        80.0%
	harden: 595
	        78.9%
#281 LD_ABS word frag
	interp: 1133
	jitted: 688
	        60.7%
	harden: 672
	        59.3%
#282 LD_ABS halfword mixed head/frag
	interp: 1079
	jitted: 657
	        60.9%
	harden: 775
	        71.8%
#283 LD_ABS word mixed head/frag
	interp: 718
	jitted: 748
	        104.2%
	harden: 725
	        101.0%
#284 LD_IND byte default X
	interp: 297
	jitted: 178
	        59.9%
	harden: 274
	        92.3%
#285 LD_IND byte positive offset
	interp: 300
	jitted: 187
	        62.3%
	harden: 302
	        100.7%
#286 LD_IND byte negative offset
	interp: 296
	jitted: 178
	        60.1%
	harden: 311
	        105.1%
#287 LD_IND halfword positive offset
	interp: 333
	jitted: 161
	        48.3%
	harden: 218
	        65.5%
#288 LD_IND halfword negative offset
	interp: 306
	jitted: 195
	        63.7%
	harden: 193
	        63.1%
#289 LD_IND halfword unaligned
	interp: 307
	jitted: 183
	        59.6%
	harden: 190
	        61.9%
#290 LD_IND word positive offset
	interp: 337
	jitted: 170
	        50.4%
	harden: 200
	        59.3%
#291 LD_IND word negative offset
	interp: 312
	jitted: 198
	        63.5%
	harden: 216
	        69.2%
#292 LD_IND word unaligned (addr & 3 == 2)
	interp: 309
	jitted: 281
	        90.9%
	harden: 195
	        63.1%
#293 LD_IND word unaligned (addr & 3 == 1)
	interp: 335
	jitted: 172
	        51.3%
	harden: 196
	        58.5%
#294 LD_IND word unaligned (addr & 3 == 3)
	interp: 305
	jitted: 171
	        56.1%
	harden: 221
	        72.5%
#295 LD_ABS byte
	interp: 269
	jitted: 162
	        60.2%
	harden: 195
	        72.5%
#296 LD_ABS halfword
	interp: 294
	jitted: 160
	        54.4%
	harden: 170
	        57.8%
#297 LD_ABS halfword unaligned
	interp: 271
	jitted: 180
	        66.4%
	harden: 167
	        61.6%
#298 LD_ABS word
	interp: 265
	jitted: 166
	        62.6%
	harden: 182
	        68.7%
#299 LD_ABS word unaligned (addr & 3 == 2)
	interp: 267
	jitted: 157
	        58.8%
	harden: 185
	        69.3%
#300 LD_ABS word unaligned (addr & 3 == 1)
	interp: 269
	jitted: 170
	        63.2%
	harden: 162
	        60.2%
#301 LD_ABS word unaligned (addr & 3 == 3)
	interp: 281
	jitted: 163
	        58.0%
	harden: 231
	        82.2%
#302 ADD default X
	interp: 296
	jitted: 84
	        28.4%
	harden: 105
	        35.5%
#303 ADD default A
	interp: 309
	jitted: 79
	        25.6%
	harden: 101
	        32.7%
#304 SUB default X
	interp: 290
	jitted: 82
	        28.3%
	harden: 106
	        36.6%
#305 SUB default A
	interp: 252
	jitted: 85
	        33.7%
	harden: 119
	        47.2%
#306 MUL default X
	interp: 322
	jitted: 76
	        23.6%
	harden: 131
	        40.7%
#307 MUL default A
	interp: 267
	jitted: 83
	        31.1%
	harden: 116
	        43.4%
#308 DIV default X
	interp: 293
	jitted: 93
	        31.7%
	harden: 116
	        39.6%
#309 DIV default A
	interp: 336
	jitted: 203
	        60.4%
	harden: 227
	        67.6%
#310 MOD default X
	interp: 284
	jitted: 100
	        35.2%
	harden: 98
	        34.5%
#311 MOD default A
	interp: 435
	jitted: 249
	        57.2%
	harden: 265
	        60.9%
#312 JMP EQ default A
	interp: 352
	jitted: 83
	        23.6%
	harden: 134
	        38.1%
#313 JMP EQ default X
	interp: 357
	jitted: 95
	        26.6%
	harden: 108
	        30.3%


* [kernel-hardening] Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-22 20:05                         ` Kees Cook
From: Kees Cook @ 2017-05-22 20:05 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

[-- Attachment #1: Type: text/plain, Size: 725 bytes --]

On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> All of these benchmarks are for ARMv7.

Thanks! In the future, try to avoid the white-space damage
(line-wrapping). And it looks like you've still got debugging turned
on in your jit code:

[   56.176033] test_bpf: #21 LD_CPU
[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
[   56.176565] jited:0 2639 702 PASS

That breaks the test report line. After I cleaned these up and parsed
the results, they look great. Most tests take half the time of the
interpreter, if not less. Only the LD_ABS tests suffered, and I assume
that's mainly due to the const blinding.
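
For readers who want to reproduce the cleanup, here is a minimal sketch of
one way it could be done. It is not the script used for the attached
results. It assumes the dmesg layout shown in the excerpt above (a
timestamp, then the test_bpf report, interrupted by bpf_jit debug lines),
and it assumes each percentage in the attachment is the jitted or hardened
time divided by the interpreter time. The function names are hypothetical.

```python
import re

# Leading dmesg timestamp such as "[   56.176033] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def rejoin_report_lines(dmesg_lines):
    """Drop bpf_jit debug lines and glue split test_bpf report lines together."""
    joined = []
    for raw in dmesg_lines:
        line = TIMESTAMP.sub("", raw.rstrip("\n"))
        if line.startswith("bpf_jit:"):
            # Debug output that interrupted the report line.
            continue
        if line.startswith("jited:") and joined:
            # Continuation of the previous "test_bpf: #N NAME" line.
            joined[-1] = joined[-1] + " " + line
        else:
            joined.append(line)
    return joined


def percent_of_interp(interp, other):
    """Express each timing in `other` as a percentage of the interpreter timing."""
    return [f"{100.0 * o / i:.1f}%" for i, o in zip(interp, other)]


if __name__ == "__main__":
    sample = [
        "[   56.176033] test_bpf: #21 LD_CPU",
        "[   56.176329] bpf_jit: *** NOT YET: opcode 85 ***",
        "[   56.176565] jited:0 2639 702 PASS",
    ]
    print(rejoin_report_lines(sample))
    # Timings taken from the "#12 LD_PKTTYPE" entry in the attachment.
    print(percent_of_interp([686, 901], [102, 90]))
```

A value under 100% means the run took less time than the interpreter, so
lower is better in the attached table.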

Please post your current patch. Thanks for this!

-Kees

-- 
Kees Cook
Pixel Security

[-- Attachment #2: jitted.txt --]
[-- Type: text/plain, Size: 29420 bytes --]

#0 TAX
	interp: 757	645	650
	jitted: 234	171	195
	        30.9%	26.5%	30.0%
	harden: 239	218	229
	        31.6%	33.8%	35.2%
#1 TXA
	interp: 366	334	336
	jitted: 81	79	77
	        22.1%	23.7%	22.9%
	harden: 89	119	85
	        24.3%	35.6%	25.3%
#2 ADD_SUB_MUL_K
	interp: 543
	jitted: 89
	        16.4%
	harden: 213
	        39.2%
#3 DIV_MOD_KX
	interp: 1509
	jitted: 939
	        62.2%
	harden: 1190
	        78.9%
#4 AND_OR_LSH_K
	interp: 539	559
	jitted: 116	114
	        21.5%	20.4%
	harden: 200	149
	        37.1%	26.7%
#5 LD_IMM_0
	interp: 412
	jitted: 93
	        22.6%
	harden: 101
	        24.5%
#6 LD_IND
	interp: 428	376	389
	jitted: 371	279	274
	        86.7%	74.2%	70.4%
	harden: 314	310	283
	        73.4%	82.4%	72.8%
#7 LD_ABS
	interp: 509	405	358
	jitted: 408	402	272
	        80.2%	99.3%	76.0%
	harden: 376	460	397
	        73.9%	113.6%	110.9%
#8 LD_ABS_LL
	interp: 542	783
	jitted: 387	346
	        71.4%	44.2%
	harden: 608	415
	        112.2%	53.0%
#9 LD_IND_LL
	interp: 524	496	723
	jitted: 239	248	217
	        45.6%	50.0%	30.0%
	harden: 248	256	268
	        47.3%	51.6%	37.1%
#10 LD_ABS_NET
	interp: 527	545
	jitted: 356	332
	        67.6%	60.9%
	harden: 435	420
	        82.5%	77.1%
#11 LD_IND_NET
	interp: 650	495	647
	jitted: 223	212	320
	        34.3%	42.8%	49.5%
	harden: 240	228	215
	        36.9%	46.1%	33.2%
#12 LD_PKTTYPE
	interp: 686	901
	jitted: 102	90
	        14.9%	10.0%
	harden: 211	274
	        30.8%	30.4%
#13 LD_MARK
	interp: 305	291
	jitted: 80	80
	        26.2%	27.5%
	harden: 119	76
	        39.0%	26.1%
#14 LD_RXHASH
	interp: 257	259
	jitted: 73	71
	        28.4%	27.4%
	harden: 78	70
	        30.4%	27.0%
#15 LD_QUEUE
	interp: 255	254
	jitted: 120	121
	        47.1%	47.6%
	harden: 77	73
	        30.2%	28.7%
#16 LD_PROTOCOL
	interp: 593	603
	jitted: 256	247
	        43.2%	41.0%
	harden: 326	320
	        55.0%	53.1%
#17 LD_VLAN_TAG
	interp: 288	292
	jitted: 82	84
	        28.5%	28.8%
	harden: 129	86
	        44.8%	29.5%
#18 LD_VLAN_TAG_PRESENT
	interp: 335	421
	jitted: 80	77
	        23.9%	18.3%
	harden: 87	88
	        26.0%	20.9%
#19 LD_IFINDEX
	interp: 8568	606
	jitted: 87	98
	        1.0%	16.2%
	harden: 97	95
	        1.1%	15.7%
#20 LD_HATYPE
	interp: 618	695
	jitted: 95	90
	        15.4%	12.9%
	harden: 94	118
	        15.2%	17.0%
#25 LD_ANC_XOR
	interp: 314	344
	jitted: 86	100
	        27.4%	29.1%
	harden: 168	156
	        53.5%	45.3%
#26 SPILL_FILL
	interp: 757	850	903
	jitted: 131	137	123
	        17.3%	16.1%	13.6%
	harden: 232	212	219
	        30.6%	24.9%	24.3%
#27 JEQ
	interp: 380	420	426
	jitted: 266	189	216
	        70.0%	45.0%	50.7%
	harden: 362	352	230
	        95.3%	83.8%	54.0%
#28 JGT
	interp: 376	467	448
	jitted: 301	211	192
	        80.1%	45.2%	42.9%
	harden: 334	236	197
	        88.8%	50.5%	44.0%
#29 JGE
	interp: 446	590	498
	jitted: 191	200	223
	        42.8%	33.9%	44.8%
	harden: 260	318	307
	        58.3%	53.9%	61.6%
#30 JSET
	interp: 571	787	1003
	jitted: 211	210	214
	        37.0%	26.7%	21.3%
	harden: 274	339	410
	        48.0%	43.1%	40.9%
#31 tcpdump port 22
	interp: 358	1079	1190
	jitted: 314	722	711
	        87.7%	66.9%	59.7%
	harden: 355	951	968
	        99.2%	88.1%	81.3%
#32 tcpdump complex
	interp: 319	1061	2324
	jitted: 291	707	1068
	        91.2%	66.6%	46.0%
	harden: 318	798	1308
	        99.7%	75.2%	56.3%
#33 RET_A
	interp: 253	249
	jitted: 83	88
	        32.8%	35.3%
	harden: 83	76
	        32.8%	30.5%
#34 INT: ADD trivial
	interp: 414
	jitted: 162
	        39.1%
	harden: 152
	        36.7%
#35 INT: MUL_X
	interp: 336
	jitted: 176
	        52.4%
	harden: 192
	        57.1%
#36 INT: MUL_X2
	interp: 431
	jitted: 84
	        19.5%
	harden: 165
	        38.3%
#37 INT: MUL32_X
	interp: 523
	jitted: 99
	        18.9%
	harden: 163
	        31.2%
#38 INT: ADD 64-bit
	interp: 5263
	jitted: 1066
	        20.3%
	harden: 1507
	        28.6%
#39 INT: ADD 32-bit
	interp: 4127
	jitted: 666
	        16.1%
	harden: 954
	        23.1%
#40 INT: SUB
	interp: 4218
	jitted: 3236
	        76.7%
	harden: 1159
	        27.5%
#41 INT: XOR
	interp: 2252
	jitted: 308
	        13.7%
	harden: 480
	        21.3%
#42 INT: MUL
	interp: 1986
	jitted: 376
	        18.9%
	harden: 486
	        24.5%
#43 MOV REG64
	interp: 1103
	jitted: 227
	        20.6%
	harden: 274
	        24.8%
#44 MOV REG32
	interp: 1140
	jitted: 171
	        15.0%
	harden: 253
	        22.2%
#45 LD IMM64
	interp: 1182
	jitted: 163
	        13.8%
	harden: 578
	        48.9%
#47 INT: shifts by register
	interp: 1125
	jitted: 208
	        18.5%
	harden: 381
	        33.9%
#48 INT: DIV + ABS
	interp: 570	850
	jitted: 659	601
	        115.6%	70.7%
	harden: 588	482
	        103.2%	56.7%
#49 INT: DIV by zero
	interp: 350	305
	jitted: 317	169
	        90.6%	55.4%
	harden: 276	199
	        78.9%	65.2%
#54 JUMPS + HOLES
	interp: 863
	jitted: 358
	        41.5%
	harden: 371
	        43.0%
#57 M[]: alt STX + LDX
	interp: 3990
	jitted: 456
	        11.4%
	harden: 621
	        15.6%
#58 M[]: full STX + full LDX
	interp: 2819
	jitted: 438
	        15.5%
	harden: 586
	        20.8%
#60 LD [SKF_AD_OFF-1]
	interp: 313
	jitted: 198
	        63.3%
	harden: 195
	        62.3%
#61 load 64-bit immediate
	interp: 579
	jitted: 125
	        21.6%
	harden: 220
	        38.0%
#62 nmap reduced
	interp: 1860
	jitted: 1054
	        56.7%
	harden: 816
	        43.9%
#63 ALU_MOV_X: dst = 2
	interp: 249
	jitted: 81
	        32.5%
	harden: 76
	        30.5%
#64 ALU_MOV_X: dst = 4294967295
	interp: 264
	jitted: 85
	        32.2%
	harden: 79
	        29.9%
#65 ALU64_MOV_X: dst = 2
	interp: 229
	jitted: 96
	        41.9%
	harden: 80
	        34.9%
#66 ALU64_MOV_X: dst = 4294967295
	interp: 213
	jitted: 71
	        33.3%
	harden: 79
	        37.1%
#67 ALU_MOV_K: dst = 2
	interp: 167
	jitted: 70
	        41.9%
	harden: 75
	        44.9%
#68 ALU_MOV_K: dst = 4294967295
	interp: 149
	jitted: 71
	        47.7%
	harden: 73
	        49.0%
#69 ALU_MOV_K: 0x0000ffffffff0000 = 0x00000000ffffffff
	interp: 358
	jitted: 97
	        27.1%
	harden: 195
	        54.5%
#70 ALU64_MOV_K: dst = 2
	interp: 158
	jitted: 75
	        47.5%
	harden: 77
	        48.7%
#71 ALU64_MOV_K: dst = 2147483647
	interp: 156
	jitted: 66
	        42.3%
	harden: 104
	        66.7%
#72 ALU64_OR_K: dst = 0x0
	interp: 306
	jitted: 92
	        30.1%
	harden: 215
	        70.3%
#73 ALU64_MOV_K: dst = -1
	interp: 327
	jitted: 94
	        28.7%
	harden: 173
	        52.9%
#74 ALU_ADD_X: 1 + 2 = 3
	interp: 212
	jitted: 66
	        31.1%
	harden: 114
	        53.8%
#75 ALU_ADD_X: 1 + 4294967294 = 4294967295
	interp: 231
	jitted: 66
	        28.6%
	harden: 112
	        48.5%
#76 ALU_ADD_X: 2 + 4294967294 = 0
	interp: 309
	jitted: 87
	        28.2%
	harden: 186
	        60.2%
#77 ALU64_ADD_X: 1 + 2 = 3
	interp: 280
	jitted: 77
	        27.5%
	harden: 159
	        56.8%
#78 ALU64_ADD_X: 1 + 4294967294 = 4294967295
	interp: 286
	jitted: 72
	        25.2%
	harden: 109
	        38.1%
#79 ALU64_ADD_X: 2 + 4294967294 = 4294967296
	interp: 460
	jitted: 79
	        17.2%
	harden: 218
	        47.4%
#80 ALU_ADD_K: 1 + 2 = 3
	interp: 210
	jitted: 75
	        35.7%
	harden: 120
	        57.1%
#81 ALU_ADD_K: 3 + 0 = 3
	interp: 208
	jitted: 71
	        34.1%
	harden: 118
	        56.7%
#82 ALU_ADD_K: 1 + 4294967294 = 4294967295
	interp: 205
	jitted: 67
	        32.7%
	harden: 121
	        59.0%
#83 ALU_ADD_K: 4294967294 + 2 = 0
	interp: 323
	jitted: 82
	        25.4%
	harden: 139
	        43.0%
#84 ALU_ADD_K: 0 + (-1) = 0x00000000ffffffff
	interp: 338
	jitted: 86
	        25.4%
	harden: 176
	        52.1%
#85 ALU_ADD_K: 0 + 0xffff = 0xffff
	interp: 347
	jitted: 99
	        28.5%
	harden: 190
	        54.8%
#86 ALU_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 360
	jitted: 113
	        31.4%
	harden: 228
	        63.3%
#87 ALU_ADD_K: 0 + 0x80000000 = 0x80000000
	interp: 345
	jitted: 123
	        35.7%
	harden: 198
	        57.4%
#88 ALU_ADD_K: 0 + 0x80008000 = 0x80008000
	interp: 377
	jitted: 85
	        22.5%
	harden: 189
	        50.1%
#89 ALU64_ADD_K: 1 + 2 = 3
	interp: 184
	jitted: 66
	        35.9%
	harden: 112
	        60.9%
#90 ALU64_ADD_K: 3 + 0 = 3
	interp: 185
	jitted: 66
	        35.7%
	harden: 111
	        60.0%
#91 ALU64_ADD_K: 1 + 2147483646 = 2147483647
	interp: 186
	jitted: 69
	        37.1%
	harden: 138
	        74.2%
#92 ALU64_ADD_K: 4294967294 + 2 = 4294967296
	interp: 353
	jitted: 109
	        30.9%
	harden: 151
	        42.8%
#93 ALU64_ADD_K: 2147483646 + -2147483647 = -1
	interp: 182
	jitted: 72
	        39.6%
	harden: 115
	        63.2%
#94 ALU64_ADD_K: 1 + 0 = 1
	interp: 311
	jitted: 126
	        40.5%
	harden: 206
	        66.2%
#95 ALU64_ADD_K: 0 + (-1) = 0xffffffffffffffff
	interp: 339
	jitted: 107
	        31.6%
	harden: 211
	        62.2%
#96 ALU64_ADD_K: 0 + 0xffff = 0xffff
	interp: 310
	jitted: 98
	        31.6%
	harden: 250
	        80.6%
#97 ALU64_ADD_K: 0 + 0x7fffffff = 0x7fffffff
	interp: 313
	jitted: 87
	        27.8%
	harden: 199
	        63.6%
#98 ALU64_ADD_K: 0 + 0x80000000 = 0xffffffff80000000
	interp: 340
	jitted: 98
	        28.8%
	harden: 177
	        52.1%
#99 ALU_ADD_K: 0 + 0x80008000 = 0xffffffff80008000
	interp: 311
	jitted: 92
	        29.6%
	harden: 243
	        78.1%
#100 ALU_SUB_X: 3 - 1 = 2
	interp: 213
	jitted: 77
	        36.2%
	harden: 108
	        50.7%
#101 ALU_SUB_X: 4294967295 - 4294967294 = 1
	interp: 212
	jitted: 72
	        34.0%
	harden: 133
	        62.7%
#102 ALU64_SUB_X: 3 - 1 = 2
	interp: 237
	jitted: 72
	        30.4%
	harden: 110
	        46.4%
#103 ALU64_SUB_X: 4294967295 - 4294967294 = 1
	interp: 221
	jitted: 71
	        32.1%
	harden: 111
	        50.2%
#104 ALU_SUB_K: 3 - 1 = 2
	interp: 177
	jitted: 120
	        67.8%
	harden: 110
	        62.1%
#105 ALU_SUB_K: 3 - 0 = 3
	interp: 179
	jitted: 82
	        45.8%
	harden: 123
	        68.7%
#106 ALU_SUB_K: 4294967295 - 4294967294 = 1
	interp: 195
	jitted: 103
	        52.8%
	harden: 124
	        63.6%
#107 ALU64_SUB_K: 3 - 1 = 2
	interp: 183
	jitted: 140
	        76.5%
	harden: 116
	        63.4%
#108 ALU64_SUB_K: 3 - 0 = 3
	interp: 177
	jitted: 117
	        66.1%
	harden: 133
	        75.1%
#109 ALU64_SUB_K: 4294967294 - 4294967295 = -1
	interp: 181
	jitted: 83
	        45.9%
	harden: 148
	        81.8%
#110 ALU64_ADD_K: 2147483646 - 2147483647 = -1
	interp: 177
	jitted: 77
	        43.5%
	harden: 145
	        81.9%
#111 ALU_MUL_X: 2 * 3 = 6
	interp: 241
	jitted: 68
	        28.2%
	harden: 172
	        71.4%
#112 ALU_MUL_X: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 220
	jitted: 70
	        31.8%
	harden: 117
	        53.2%
#113 ALU_MUL_X: -1 * -1 = 1
	interp: 224
	jitted: 73
	        32.6%
	harden: 109
	        48.7%
#114 ALU64_MUL_X: 2 * 3 = 6
	interp: 213
	jitted: 70
	        32.9%
	harden: 115
	        54.0%
#115 ALU64_MUL_X: 1 * 2147483647 = 2147483647
	interp: 230
	jitted: 75
	        32.6%
	harden: 119
	        51.7%
#116 ALU_MUL_K: 2 * 3 = 6
	interp: 191
	jitted: 67
	        35.1%
	harden: 111
	        58.1%
#117 ALU_MUL_K: 3 * 1 = 3
	interp: 189
	jitted: 71
	        37.6%
	harden: 118
	        62.4%
#118 ALU_MUL_K: 2 * 0x7FFFFFF8 = 0xFFFFFFF0
	interp: 192
	jitted: 70
	        36.5%
	harden: 109
	        56.8%
#119 ALU_MUL_K: 1 * (-1) = 0x00000000ffffffff
	interp: 333
	jitted: 153
	        45.9%
	harden: 201
	        60.4%
#120 ALU64_MUL_K: 2 * 3 = 6
	interp: 185
	jitted: 101
	        54.6%
	harden: 116
	        62.7%
#121 ALU64_MUL_K: 3 * 1 = 3
	interp: 185
	jitted: 108
	        58.4%
	harden: 115
	        62.2%
#122 ALU64_MUL_K: 1 * 2147483647 = 2147483647
	interp: 184
	jitted: 106
	        57.6%
	harden: 278
	        151.1%
#123 ALU64_MUL_K: 1 * -2147483647 = -2147483647
	interp: 183
	jitted: 92
	        50.3%
	harden: 125
	        68.3%
#124 ALU64_MUL_K: 1 * (-1) = 0xffffffffffffffff
	interp: 336
	jitted: 122
	        36.3%
	harden: 208
	        61.9%
#125 ALU_DIV_X: 6 / 2 = 3
	interp: 316
	jitted: 220
	        69.6%
	harden: 246
	        77.8%
#126 ALU_DIV_X: 4294967295 / 4294967295 = 1
	interp: 315
	jitted: 208
	        66.0%
	harden: 291
	        92.4%
#130 ALU_DIV_K: 6 / 2 = 3
	interp: 249
	jitted: 246
	        98.8%
	harden: 234
	        94.0%
#131 ALU_DIV_K: 3 / 1 = 3
	interp: 240
	jitted: 199
	        82.9%
	harden: 240
	        100.0%
#132 ALU_DIV_K: 4294967295 / 4294967295 = 1
	interp: 254
	jitted: 192
	        75.6%
	harden: 276
	        108.7%
#133 ALU_DIV_K: 0xffffffffffffffff / (-1) = 0x1
	interp: 379
	jitted: 215
	        56.7%
	harden: 373
	        98.4%
#138 ALU_MOD_X: 3 % 2 = 1
	interp: 421
	jitted: 235
	        55.8%
	harden: 293
	        69.6%
#139 ALU_MOD_X: 4294967295 % 4294967293 = 2
	interp: 453
	jitted: 262
	        57.8%
	harden: 289
	        63.8%
#142 ALU_MOD_K: 3 % 2 = 1
	interp: 380
	jitted: 231
	        60.8%
	harden: 311
	        81.8%
#144 ALU_MOD_K: 4294967295 % 4294967293 = 2
	interp: 467
	jitted: 257
	        55.0%
	harden: 319
	        68.3%
#148 ALU_AND_X: 3 & 2 = 2
	interp: 225
	jitted: 100
	        44.4%
	harden: 109
	        48.4%
#149 ALU_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 261
	jitted: 106
	        40.6%
	harden: 130
	        49.8%
#150 ALU64_AND_X: 3 & 2 = 2
	interp: 273
	jitted: 86
	        31.5%
	harden: 106
	        38.8%
#151 ALU64_AND_X: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 251
	jitted: 118
	        47.0%
	harden: 102
	        40.6%
#152 ALU_AND_K: 3 & 2 = 2
	interp: 201
	jitted: 117
	        58.2%
	harden: 114
	        56.7%
#153 ALU_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 240
	jitted: 72
	        30.0%
	harden: 138
	        57.5%
#154 ALU64_AND_K: 3 & 2 = 2
	interp: 209
	jitted: 72
	        34.4%
	harden: 110
	        52.6%
#155 ALU64_AND_K: 0xffffffff & 0xffffffff = 0xffffffff
	interp: 319
	jitted: 70
	        21.9%
	harden: 148
	        46.4%
#156 ALU64_AND_K: 0x0000ffffffff0000 & 0x0 = 0x0000ffff00000000
	interp: 384
	jitted: 99
	        25.8%
	harden: 206
	        53.6%
#157 ALU64_AND_K: 0x0000ffffffff0000 & -1 = 0x0000ffffffffffff
	interp: 367
	jitted: 97
	        26.4%
	harden: 176
	        48.0%
#158 ALU64_AND_K: 0xffffffffffffffff & -1 = 0xffffffffffffffff
	interp: 375
	jitted: 86
	        22.9%
	harden: 271
	        72.3%
#159 ALU_OR_X: 1 | 2 = 3
	interp: 271
	jitted: 73
	        26.9%
	harden: 108
	        39.9%
#160 ALU_OR_X: 0x0 | 0xffffffff = 0xffffffff
	interp: 280
	jitted: 72
	        25.7%
	harden: 118
	        42.1%
#161 ALU64_OR_X: 1 | 2 = 3
	interp: 253
	jitted: 89
	        35.2%
	harden: 103
	        40.7%
#162 ALU64_OR_X: 0 | 0xffffffff = 0xffffffff
	interp: 263
	jitted: 91
	        34.6%
	harden: 143
	        54.4%
#163 ALU_OR_K: 1 | 2 = 3
	interp: 216
	jitted: 71
	        32.9%
	harden: 123
	        56.9%
#164 ALU_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 187
	jitted: 116
	        62.0%
	harden: 110
	        58.8%
#165 ALU64_OR_K: 1 | 2 = 3
	interp: 183
	jitted: 77
	        42.1%
	harden: 120
	        65.6%
#166 ALU64_OR_K: 0 & 0xffffffff = 0xffffffff
	interp: 195
	jitted: 80
	        41.0%
	harden: 119
	        61.0%
#167 ALU64_OR_K: 0x0000ffffffff0000 | 0x0 = 0x0000ffff00000000
	interp: 338
	jitted: 86
	        25.4%
	harden: 212
	        62.7%
#168 ALU64_OR_K: 0x0000ffffffff0000 | -1 = 0xffffffffffffffff
	interp: 324
	jitted: 99
	        30.6%
	harden: 221
	        68.2%
#169 ALU64_OR_K: 0x000000000000000 | -1 = 0xffffffffffffffff
	interp: 309
	jitted: 147
	        47.6%
	harden: 198
	        64.1%
#170 ALU_XOR_X: 5 ^ 6 = 3
	interp: 216
	jitted: 80
	        37.0%
	harden: 138
	        63.9%
#171 ALU_XOR_X: 0x1 ^ 0xffffffff = 0xfffffffe
	interp: 414
	jitted: 73
	        17.6%
	harden: 130
	        31.4%
#172 ALU64_XOR_X: 5 ^ 6 = 3
	interp: 320
	jitted: 71
	        22.2%
	harden: 114
	        35.6%
#173 ALU64_XOR_X: 1 ^ 0xffffffff = 0xfffffffe
	interp: 223
	jitted: 72
	        32.3%
	harden: 106
	        47.5%
#174 ALU_XOR_K: 5 ^ 6 = 3
	interp: 203
	jitted: 71
	        35.0%
	harden: 112
	        55.2%
#175 ALU_XOR_K: 1 ^ 0xffffffff = 0xfffffffe
	interp: 205
	jitted: 67
	        32.7%
	harden: 116
	        56.6%
#176 ALU64_XOR_K: 5 ^ 6 = 3
	interp: 205
	jitted: 70
	        34.1%
	harden: 114
	        55.6%
#177 ALU64_XOR_K: 1 & 0xffffffff = 0xfffffffe
	interp: 186
	jitted: 104
	        55.9%
	harden: 112
	        60.2%
#178 ALU64_XOR_K: 0x0000ffffffff0000 ^ 0x0 = 0x0000ffffffff0000
	interp: 352
	jitted: 96
	        27.3%
	harden: 201
	        57.1%
#179 ALU64_XOR_K: 0x0000ffffffff0000 ^ -1 = 0xffff00000000ffff
	interp: 353
	jitted: 119
	        33.7%
	harden: 242
	        68.6%
#180 ALU64_XOR_K: 0x000000000000000 ^ -1 = 0xffffffffffffffff
	interp: 362
	jitted: 116
	        32.0%
	harden: 208
	        57.5%
#181 ALU_LSH_X: 1 << 1 = 2
	interp: 211
	jitted: 100
	        47.4%
	harden: 112
	        53.1%
#182 ALU_LSH_X: 1 << 31 = 0x80000000
	interp: 216
	jitted: 73
	        33.8%
	harden: 137
	        63.4%
#183 ALU64_LSH_X: 1 << 1 = 2
	interp: 224
	jitted: 119
	        53.1%
	harden: 163
	        72.8%
#184 ALU64_LSH_X: 1 << 31 = 0x80000000
	interp: 223
	jitted: 110
	        49.3%
	harden: 145
	        65.0%
#185 ALU_LSH_K: 1 << 1 = 2
	interp: 208
	jitted: 147
	        70.7%
	harden: 92
	        44.2%
#186 ALU_LSH_K: 1 << 31 = 0x80000000
	interp: 210
	jitted: 116
	        55.2%
	harden: 94
	        44.8%
#187 ALU64_LSH_K: 1 << 1 = 2
	interp: 211
	jitted: 154
	        73.0%
	harden: 94
	        44.5%
#188 ALU64_LSH_K: 1 << 31 = 0x80000000
	interp: 182
	jitted: 92
	        50.5%
	harden: 127
	        69.8%
#189 ALU_RSH_X: 2 >> 1 = 1
	interp: 226
	jitted: 86
	        38.1%
	harden: 135
	        59.7%
#190 ALU_RSH_X: 0x80000000 >> 31 = 1
	interp: 225
	jitted: 148
	        65.8%
	harden: 109
	        48.4%
#191 ALU64_RSH_X: 2 >> 1 = 1
	interp: 289
	jitted: 108
	        37.4%
	harden: 123
	        42.6%
#192 ALU64_RSH_X: 0x80000000 >> 31 = 1
	interp: 253
	jitted: 96
	        37.9%
	harden: 117
	        46.2%
#193 ALU_RSH_K: 2 >> 1 = 1
	interp: 207
	jitted: 68
	        32.9%
	harden: 95
	        45.9%
#194 ALU_RSH_K: 0x80000000 >> 31 = 1
	interp: 210
	jitted: 74
	        35.2%
	harden: 103
	        49.0%
#195 ALU64_RSH_K: 2 >> 1 = 1
	interp: 232
	jitted: 66
	        28.4%
	harden: 124
	        53.4%
#196 ALU64_RSH_K: 0x80000000 >> 31 = 1
	interp: 208
	jitted: 95
	        45.7%
	harden: 107
	        51.4%
#197 ALU_ARSH_X: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 252
	jitted: 74
	        29.4%
	harden: 125
	        49.6%
#198 ALU_ARSH_K: 0xff00ff0000000000 >> 40 = 0xffffffffffff00ff
	interp: 197
	jitted: 96
	        48.7%
	harden: 105
	        53.3%
#199 ALU_NEG: -(3) = -3
	interp: 189
	jitted: 84
	        44.4%
	harden: 76
	        40.2%
#200 ALU_NEG: -(-3) = 3
	interp: 171
	jitted: 72
	        42.1%
	harden: 106
	        62.0%
#201 ALU64_NEG: -(3) = -3
	interp: 179
	jitted: 74
	        41.3%
	harden: 104
	        58.1%
#202 ALU64_NEG: -(-3) = 3
	interp: 180
	jitted: 68
	        37.8%
	harden: 135
	        75.0%
#203 ALU_END_FROM_BE 16: 0x0123456789abcdef -> 0xcdef
	interp: 202
	jitted: 74
	        36.6%
	harden: 115
	        56.9%
#204 ALU_END_FROM_BE 32: 0x0123456789abcdef -> 0x89abcdef
	interp: 368
	jitted: 101
	        27.4%
	harden: 101
	        27.4%
#205 ALU_END_FROM_BE 64: 0x0123456789abcdef -> 0x89abcdef
	interp: 244
	jitted: 93
	        38.1%
	harden: 103
	        42.2%
#206 ALU_END_FROM_LE 16: 0x0123456789abcdef -> 0xefcd
	interp: 274
	jitted: 73
	        26.6%
	harden: 107
	        39.1%
#207 ALU_END_FROM_LE 32: 0x0123456789abcdef -> 0xefcdab89
	interp: 319
	jitted: 76
	        23.8%
	harden: 93
	        29.2%
#208 ALU_END_FROM_LE 64: 0x0123456789abcdef -> 0x67452301
	interp: 193
	jitted: 78
	        40.4%
	harden: 108
	        56.0%
#209 ST_MEM_B: Store/Load byte: max negative
	interp: 219
	jitted: 72
	        32.9%
	harden: 168
	        76.7%
#210 ST_MEM_B: Store/Load byte: max positive
	interp: 227
	jitted: 79
	        34.8%
	harden: 105
	        46.3%
#211 STX_MEM_B: Store/Load byte: max negative
	interp: 251
	jitted: 79
	        31.5%
	harden: 140
	        55.8%
#212 ST_MEM_H: Store/Load half word: max negative
	interp: 218
	jitted: 81
	        37.2%
	harden: 98
	        45.0%
#213 ST_MEM_H: Store/Load half word: max positive
	interp: 208
	jitted: 100
	        48.1%
	harden: 109
	        52.4%
#214 STX_MEM_H: Store/Load half word: max negative
	interp: 259
	jitted: 110
	        42.5%
	harden: 134
	        51.7%
#215 ST_MEM_W: Store/Load word: max negative
	interp: 253
	jitted: 75
	        29.6%
	harden: 148
	        58.5%
#216 ST_MEM_W: Store/Load word: max positive
	interp: 244
	jitted: 89
	        36.5%
	harden: 136
	        55.7%
#217 STX_MEM_W: Store/Load word: max negative
	interp: 297
	jitted: 122
	        41.1%
	harden: 205
	        69.0%
#218 ST_MEM_DW: Store/Load double word: max negative
	interp: 257
	jitted: 85
	        33.1%
	harden: 124
	        48.2%
#219 ST_MEM_DW: Store/Load double word: max negative 2
	interp: 392
	jitted: 123
	        31.4%
	harden: 222
	        56.6%
#220 ST_MEM_DW: Store/Load double word: max positive
	interp: 292
	jitted: 78
	        26.7%
	harden: 110
	        37.7%
#221 STX_MEM_DW: Store/Load double word: max negative
	interp: 259
	jitted: 85
	        32.8%
	harden: 194
	        74.9%
#230 JMP_EXIT
	interp: 127
	jitted: 82
	        64.6%
	harden: 77
	        60.6%
#231 JMP_JA: Unconditional jump: if (true) return 1
	interp: 194
	jitted: 86
	        44.3%
	harden: 84
	        43.3%
#232 JMP_JSGT_K: Signed jump: if (-1 > -2) return 1
	interp: 262
	jitted: 86
	        32.8%
	harden: 128
	        48.9%
#233 JMP_JSGT_K: Signed jump: if (-1 > -1) return 0
	interp: 249
	jitted: 82
	        32.9%
	harden: 126
	        50.6%
#234 JMP_JSGE_K: Signed jump: if (-1 >= -2) return 1
	interp: 262
	jitted: 72
	        27.5%
	harden: 179
	        68.3%
#235 JMP_JSGE_K: Signed jump: if (-1 >= -1) return 1
	interp: 260
	jitted: 73
	        28.1%
	harden: 125
	        48.1%
#236 JMP_JGT_K: if (3 > 2) return 1
	interp: 260
	jitted: 71
	        27.3%
	harden: 142
	        54.6%
#237 JMP_JGT_K: Unsigned jump: if (-1 > 1) return 1
	interp: 278
	jitted: 72
	        25.9%
	harden: 161
	        57.9%
#238 JMP_JGE_K: if (3 >= 2) return 1
	interp: 255
	jitted: 77
	        30.2%
	harden: 163
	        63.9%
#239 JMP_JGT_K: if (3 > 2) return 1 (jump backwards)
	interp: 321
	jitted: 76
	        23.7%
	harden: 143
	        44.5%
#240 JMP_JGE_K: if (3 >= 3) return 1
	interp: 340
	jitted: 74
	        21.8%
	harden: 179
	        52.6%
#241 JMP_JNE_K: if (3 != 2) return 1
	interp: 310
	jitted: 74
	        23.9%
	harden: 144
	        46.5%
#242 JMP_JEQ_K: if (3 == 3) return 1
	interp: 310
	jitted: 78
	        25.2%
	harden: 144
	        46.5%
#243 JMP_JSET_K: if (0x3 & 0x2) return 1
	interp: 276
	jitted: 109
	        39.5%
	harden: 149
	        54.0%
#244 JMP_JSET_K: if (0x3 & 0xffffffff) return 1
	interp: 312
	jitted: 71
	        22.8%
	harden: 153
	        49.0%
#245 JMP_JSGT_X: Signed jump: if (-1 > -2) return 1
	interp: 346
	jitted: 75
	        21.7%
	harden: 162
	        46.8%
#246 JMP_JSGT_X: Signed jump: if (-1 > -1) return 0
	interp: 292
	jitted: 78
	        26.7%
	harden: 162
	        55.5%
#247 JMP_JSGE_X: Signed jump: if (-1 >= -2) return 1
	interp: 318
	jitted: 134
	        42.1%
	harden: 178
	        56.0%
#248 JMP_JSGE_X: Signed jump: if (-1 >= -1) return 1
	interp: 287
	jitted: 102
	        35.5%
	harden: 192
	        66.9%
#249 JMP_JGT_X: if (3 > 2) return 1
	interp: 316
	jitted: 83
	        26.3%
	harden: 205
	        64.9%
#250 JMP_JGT_X: Unsigned jump: if (-1 > 1) return 1
	interp: 400
	jitted: 80
	        20.0%
	harden: 154
	        38.5%
#251 JMP_JGE_X: if (3 >= 2) return 1
	interp: 287
	jitted: 78
	        27.2%
	harden: 177
	        61.7%
#252 JMP_JGE_X: if (3 >= 3) return 1
	interp: 287
	jitted: 116
	        40.4%
	harden: 160
	        55.7%
#253 JMP_JGE_X: ldimm64 test 1
	interp: 323
	jitted: 81
	        25.1%
	harden: 204
	        63.2%
#254 JMP_JGE_X: ldimm64 test 2
	interp: 298
	jitted: 79
	        26.5%
	harden: 201
	        67.4%
#255 JMP_JGE_X: ldimm64 test 3
	interp: 263
	jitted: 78
	        29.7%
	harden: 184
	        70.0%
#256 JMP_JNE_X: if (3 != 2) return 1
	interp: 313
	jitted: 108
	        34.5%
	harden: 168
	        53.7%
#257 JMP_JEQ_X: if (3 == 3) return 1
	interp: 308
	jitted: 102
	        33.1%
	harden: 197
	        64.0%
#258 JMP_JSET_X: if (0x3 & 0x2) return 1
	interp: 359
	jitted: 133
	        37.0%
	harden: 192
	        53.5%
#259 JMP_JSET_X: if (0x3 & 0xffffffff) return 1
	interp: 421
	jitted: 128
	        30.4%
	harden: 181
	        43.0%
#260 JMP_JA: Jump, gap, jump, ...
	interp: 309
	jitted: 108
	        35.0%
	harden: 97
	        31.4%
#261 BPF_MAXINSNS: Maximum possible literals
	interp: 251
	jitted: 111
	        44.2%
	harden: 125
	        49.8%
#262 BPF_MAXINSNS: Single literal
	interp: 286
	jitted: 115
	        40.2%
	harden: 105
	        36.7%
#263 BPF_MAXINSNS: Run/add until end
	interp: 254969
	jitted: 8481
	        3.3%
	harden: 121315
	        47.6%
#265 BPF_MAXINSNS: Very long jump
	interp: 284
	jitted: 123
	        43.3%
	harden: 131
	        46.1%
#266 BPF_MAXINSNS: Ctx heavy transformations
	interp: 548311	560800
	jitted: 28166	29032
	        5.1%	5.2%
	harden: 217030	181848
	        39.6%	32.4%
#268 BPF_MAXINSNS: Jump heavy test
	interp: 480796
	jitted: 132663
	        27.6%
	harden: 440621
	        91.6%
#269 BPF_MAXINSNS: Very long jump backwards
	interp: 193
	jitted: 148
	        76.7%
	harden: 154
	        79.8%
#270 BPF_MAXINSNS: Edge hopping nuthouse
	interp: 114304
	jitted: 277097
	        242.4%
	harden: 302835
	        264.9%
#271 BPF_MAXINSNS: Jump, gap, jump, ...
	interp: 1884
	jitted: 1041
	        55.3%
	harden: 1008
	        53.5%
#274 LD_IND byte frag
	interp: 695
	jitted: 574
	        82.6%
	harden: 1453
	        209.1%
#275 LD_IND halfword frag
	interp: 818
	jitted: 641
	        78.4%
	harden: 600
	        73.3%
#276 LD_IND word frag
	interp: 837
	jitted: 731
	        87.3%
	harden: 719
	        85.9%
#277 LD_IND halfword mixed head/frag
	interp: 1170
	jitted: 741
	        63.3%
	harden: 705
	        60.3%
#278 LD_IND word mixed head/frag
	interp: 950
	jitted: 972
	        102.3%
	harden: 732
	        77.1%
#279 LD_ABS byte frag
	interp: 953
	jitted: 601
	        63.1%
	harden: 683
	        71.7%
#280 LD_ABS halfword frag
	interp: 754
	jitted: 603
	        80.0%
	harden: 595
	        78.9%
#281 LD_ABS word frag
	interp: 1133
	jitted: 688
	        60.7%
	harden: 672
	        59.3%
#282 LD_ABS halfword mixed head/frag
	interp: 1079
	jitted: 657
	        60.9%
	harden: 775
	        71.8%
#283 LD_ABS word mixed head/frag
	interp: 718
	jitted: 748
	        104.2%
	harden: 725
	        101.0%
#284 LD_IND byte default X
	interp: 297
	jitted: 178
	        59.9%
	harden: 274
	        92.3%
#285 LD_IND byte positive offset
	interp: 300
	jitted: 187
	        62.3%
	harden: 302
	        100.7%
#286 LD_IND byte negative offset
	interp: 296
	jitted: 178
	        60.1%
	harden: 311
	        105.1%
#287 LD_IND halfword positive offset
	interp: 333
	jitted: 161
	        48.3%
	harden: 218
	        65.5%
#288 LD_IND halfword negative offset
	interp: 306
	jitted: 195
	        63.7%
	harden: 193
	        63.1%
#289 LD_IND halfword unaligned
	interp: 307
	jitted: 183
	        59.6%
	harden: 190
	        61.9%
#290 LD_IND word positive offset
	interp: 337
	jitted: 170
	        50.4%
	harden: 200
	        59.3%
#291 LD_IND word negative offset
	interp: 312
	jitted: 198
	        63.5%
	harden: 216
	        69.2%
#292 LD_IND word unaligned (addr & 3 == 2)
	interp: 309
	jitted: 281
	        90.9%
	harden: 195
	        63.1%
#293 LD_IND word unaligned (addr & 3 == 1)
	interp: 335
	jitted: 172
	        51.3%
	harden: 196
	        58.5%
#294 LD_IND word unaligned (addr & 3 == 3)
	interp: 305
	jitted: 171
	        56.1%
	harden: 221
	        72.5%
#295 LD_ABS byte
	interp: 269
	jitted: 162
	        60.2%
	harden: 195
	        72.5%
#296 LD_ABS halfword
	interp: 294
	jitted: 160
	        54.4%
	harden: 170
	        57.8%
#297 LD_ABS halfword unaligned
	interp: 271
	jitted: 180
	        66.4%
	harden: 167
	        61.6%
#298 LD_ABS word
	interp: 265
	jitted: 166
	        62.6%
	harden: 182
	        68.7%
#299 LD_ABS word unaligned (addr & 3 == 2)
	interp: 267
	jitted: 157
	        58.8%
	harden: 185
	        69.3%
#300 LD_ABS word unaligned (addr & 3 == 1)
	interp: 269
	jitted: 170
	        63.2%
	harden: 162
	        60.2%
#301 LD_ABS word unaligned (addr & 3 == 3)
	interp: 281
	jitted: 163
	        58.0%
	harden: 231
	        82.2%
#302 ADD default X
	interp: 296
	jitted: 84
	        28.4%
	harden: 105
	        35.5%
#303 ADD default A
	interp: 309
	jitted: 79
	        25.6%
	harden: 101
	        32.7%
#304 SUB default X
	interp: 290
	jitted: 82
	        28.3%
	harden: 106
	        36.6%
#305 SUB default A
	interp: 252
	jitted: 85
	        33.7%
	harden: 119
	        47.2%
#306 MUL default X
	interp: 322
	jitted: 76
	        23.6%
	harden: 131
	        40.7%
#307 MUL default A
	interp: 267
	jitted: 83
	        31.1%
	harden: 116
	        43.4%
#308 DIV default X
	interp: 293
	jitted: 93
	        31.7%
	harden: 116
	        39.6%
#309 DIV default A
	interp: 336
	jitted: 203
	        60.4%
	harden: 227
	        67.6%
#310 MOD default X
	interp: 284
	jitted: 100
	        35.2%
	harden: 98
	        34.5%
#311 MOD default A
	interp: 435
	jitted: 249
	        57.2%
	harden: 265
	        60.9%
#312 JMP EQ default A
	interp: 352
	jitted: 83
	        23.6%
	harden: 134
	        38.1%
#313 JMP EQ default X
	interp: 357
	jitted: 95
	        26.6%
	harden: 108
	        30.3%
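
The percentage rows above appear to be the jitted (or hardened) time
divided by the interpreter time for the same column. Taking #28 JGT as
an example, 301 / 376 gives 80.1% and 334 / 376 gives 88.8%. A minimal
sketch of that calculation follows; the function name and the
tenths-of-a-percent fixed-point format are illustrative assumptions,
not part of test_bpf or of the script used to produce the table.

```c
#include <stdint.h>

/*
 * Ratio of a JIT (or hardened JIT) time to the interpreter time,
 * in tenths of a percent, rounded to nearest.
 * Hypothetical helper; returns 0 when interp_ns is 0.
 */
uint64_t pct_tenths_of_interp(uint64_t run_ns, uint64_t interp_ns)
{
	if (interp_ns == 0)
		return 0;
	/* Scale by 1000 first so integer division keeps one decimal. */
	return (run_ns * 1000 + interp_ns / 2) / interp_ns;
}
```

For instance, pct_tenths_of_interp(301, 376) returns 801, which is the
80.1% shown for #28. Values above 1000 (such as #270) mean the JIT
output was slower than the interpreter in that run.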

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-22 20:05                         ` Kees Cook
  (?)
@ 2017-05-23  2:58                           ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23  2:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

Hi,

While testing the eBPF JIT with CONFIG_FRAME_POINTER enabled, I got the
following soft lockup in a test case that was not JITed (jited:0).

[   72.032494] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 1112799
[   92.304815] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [insmod:104]
[   92.305050] Modules linked in: test_bpf(+)
[   92.305516] CPU: 0 PID: 104 Comm: insmod Not tainted 4.11.0-10603-g13e0988-dirty #21
[   92.305630] Hardware name: ARM-Versatile Express
[   92.305943] task: c75d5280 task.stack: c61b8000
[   92.306383] PC is at __bpf_prog_run+0x818/0x17a8
[   92.306449] LR is at __bpf_prog_run+0xab8/0x17a8
[   92.306510] pc : [<c0407c08>]    lr : [<c0407ea8>]    psr: 20000013
[   92.306510] sp : c61b9a88  ip : c61b9a88  fp : c61b9d4c
[   92.306629] r10: c0404104  r9 : 00000000  r8 : 00000000
[   92.306744] r7 : c0e0b500  r6 : c0c39bb0  r5 : c61b9ad0  r4 : ca314840
[   92.306882] r3 : c0e0b7fc  r2 : 00000000  r1 : c61b9ad8  r0 : 00000000
[   92.307070] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   92.307285] Control: 10c5387d  Table: 661e0059  DAC: 00000051
[   92.307503] CPU: 0 PID: 104 Comm: insmod Not tainted 4.11.0-10603-g13e0988-dirty #21
[   92.307575] Hardware name: ARM-Versatile Express
[   92.307651] Backtrace:
[   92.307868] [<c030caec>] (dump_backtrace) from [<c030cda8>] (show_stack+0x18/0x1c)
[   92.308003]  r7:c1503db8 r6:60000193 r5:00000000 r4:c1570f30
[   92.308085] [<c030cd90>] (show_stack) from [<c064b198>] (dump_stack+0x90/0xa4)
[   92.308195] [<c064b108>] (dump_stack) from [<c030900c>] (show_regs+0x14/0x18)
[   92.308281]  r7:c1503db8 r6:c14488b8 r5:c16aaea0 r4:c61b8000
[   92.308346] [<c0308ff8>] (show_regs) from [<c03df2a4>] (watchdog_timer_fn+0x24c/0x2c4)
[   92.308423] [<c03df058>] (watchdog_timer_fn) from [<c03b70d8>] (__hrtimer_run_queues+0x180/0x318)
[   92.308514]  r10:c03df058 r9:00000003 r8:c1503cbc r7:c7ead580 r6:c7ead5c0 r5:c61b8000
[   92.308578]  r4:c7ead8d8
[   92.308635] [<c03b6f58>] (__hrtimer_run_queues) from [<c03b74e8>] (hrtimer_interrupt+0xb4/0x204)
[   92.308728]  r10:7fffffff r9:00000003 r8:c7ead5f8 r7:c7ead618 r6:c7ead638 r5:c1448580
[   92.308789]  r4:c7ead580
[   92.308835] [<c03b7434>] (hrtimer_interrupt) from [<c03113fc>] (twd_handler+0x38/0x48)
[   92.308914]  r10:c0404104 r9:00000010 r8:c1504330 r7:00000001 r6:c701e900 r5:00000000
[   92.308974]  r4:00000001
[   92.309021] [<c03113c4>] (twd_handler) from [<c03a1238>] (handle_percpu_devid_irq+0x90/0x244)
[   92.309091]  r5:00000000 r4:c7020540
[   92.309165] [<c03a11a8>] (handle_percpu_devid_irq) from [<c039c148>] (generic_handle_irq+0x2c/0x3c)
[   92.309254]  r10:c0404104 r9:c8803100 r8:c7004a00 r7:00000001 r6:00000000 r5:00000000
[   92.309319]  r4:c1449ed0 r3:c03a11a8
[   92.309369] [<c039c11c>] (generic_handle_irq) from [<c039c6f0>] (__handle_domain_irq+0x64/0xbc)
[   92.309445] [<c039c68c>] (__handle_domain_irq) from [<c0301808>] (gic_handle_irq+0x5c/0xa0)
[   92.309525]  r9:c8803100 r8:c8802100 r7:c61b9a38 r6:c880210c r5:c1571848 r4:c1504330
[   92.309596] [<c03017ac>] (gic_handle_irq) from [<c030d98c>] (__irq_svc+0x6c/0x90)
[   92.309731] Exception stack(0xc61b9a38 to 0xc61b9a80)
[   92.309943] 9a20:                                                       00000000 c61b9ad8
[   92.310184] 9a40: 00000000 c0e0b7fc ca314840 c61b9ad0 c0c39bb0 c0e0b500 00000000 00000000
[   92.310377] 9a60: c0404104 c61b9d4c c61b9a88 c61b9a88 c0407ea8 c0407c08 20000013 ffffffff
[   92.310595]  r9:c61b8000 r8:00000000 r7:c61b9a6c r6:ffffffff r5:20000013 r4:c0407c08
[   92.311103] [<c04073f0>] (__bpf_prog_run) from [<bf15759c>] (test_bpf_init+0x59c/0x1000 [test_bpf])
[   92.311262]  r10:bf123094 r9:ca2fa020 r8:00000000 r7:bf123128 r6:53edefe8 r5:ca2fa000
[   92.311325]  r4:00000555
[   92.311382] [<bf157000>] (test_bpf_init [test_bpf]) from [<c0301f7c>] (do_one_initcall+0x4c/0x174)
[   92.311468]  r10:bf154640 r9:c61c2524 r8:39e3db1c r7:00000001 r6:00000000 r5:bf157000
[   92.311529]  r4:ffffe000
[   92.311575] [<c0301f30>] (do_one_initcall) from [<c042a5b0>] (do_init_module+0x6c/0x1fc)
[   92.311673]  r9:c61c2524 r8:39e3db1c r6:c61c2480 r5:00000001 r4:bf154640
[   92.311744] [<c042a544>] (do_init_module) from [<c03d393c>] (load_module+0x1f8c/0x2394)
[   92.311815]  r6:c61c2500 r5:00000001 r4:c61b9f34
[   92.311898] [<c03d19b0>] (load_module) from [<c03d3ea0>] (SyS_init_module+0x15c/0x174)
[   92.311979]  r10:00000051 r9:00000000 r8:00160fda r7:c61b8000 r6:c95a6a18 r5:b6fbca20
[   92.312040]  r4:00006a18
[   92.312087] [<c03d3d44>] (SyS_init_module) from [<c0308260>] (ret_fast_syscall+0x0/0x3c)
[   92.312196]  r10:00000000 r9:c61b8000 r8:c0308424 r7:00000080 r6:756e694c r5:00156a18
[   92.312277]  r4:00000000
[   93.835343] 1065840 PASS

Does this look like a bug? I will send a separate mail if it does.
Let me know.
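
As a rough cross-check of the timestamps in the log, the sketch below
estimates how long the test keeps the CPU busy. It assumes that
test_bpf executes each program 10000 times per measurement
(MAX_TESTRUNS in lib/test_bpf.c; an assumption about this tree) and
that the numbers printed after "jited:0" are per-run averages in
nanoseconds. The helper name is hypothetical.

```c
#include <stdint.h>

/* Assumption: runs per measurement, MAX_TESTRUNS in lib/test_bpf.c. */
#define ASSUMED_TESTRUNS 10000ULL

/*
 * Estimated total run time in ns for one test, given the per-run
 * averages that test_bpf printed for each of its measurements.
 */
uint64_t total_test_ns(const uint64_t *avg_ns, unsigned int n)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		total += avg_ns[i] * ASSUMED_TESTRUNS;
	return total;
}
```

With the two averages from the log (1112799 and 1065840), this gives
21786390000 ns, or about 21.8 s, which is close to the 21.8 s between
the 72.03 and 93.84 timestamps. If nothing in that loop lets the
scheduler run, and the soft lockup threshold is at its usual default
of about 20 s, a run of this length would be expected to trigger the
watchdog even though the test itself passes.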

Best,
Shubham Bansal


On Tue, May 23, 2017 at 1:35 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> All of these benchmarks are for ARMv7.
>
> Thanks! In the future, try to avoid the white-space damage
> (line-wrapping). And it looks like you've still got debugging turned
> on in your jit code:
>
> [   56.176033] test_bpf: #21 LD_CPU
> [   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
> [   56.176565] jited:0 2639 702 PASS
>
> That breaks the test report line. After I cleaned these up and parsed
> the results, they look great. Most things are half the speed of the
> interpreter, if not better. Only the LD_ABS suffered, and that's
> mainly the const blinding, I assume.
>
> Please post your current patch. Thanks for this!
>
> -Kees
>
> --
> Kees Cook
> Pixel Security
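
The point about the debug print breaking the report line can be
sketched with a small per-line parser. This is only an illustration of
the "test_bpf: #N name jited:J t1 t2 ... PASS" layout visible in the
quoted log; the function name is hypothetical and this is not the
script that was actually used to parse the results.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Read the timings that follow "jited:J" on one report line.
 * Stores up to max values in out[] and returns how many were read,
 * or -1 if the line carries no "jited:" field at all.
 */
int parse_report_line(const char *line, unsigned long *out, int max)
{
	const char *p = strstr(line, "jited:");
	char *end;
	int n = 0;

	if (!p)
		return -1;
	p += strlen("jited:");
	strtoul(p, &end, 10);		/* skip the jited flag itself */
	if (end == p)
		return -1;
	p = end;
	while (n < max) {
		unsigned long v = strtoul(p, &end, 10);

		if (end == p)		/* no more numbers: "PASS", "FAIL", ... */
			break;
		out[n++] = v;
		p = end;
	}
	return n;
}
```

On an intact line such as "test_bpf: #21 LD_CPU jited:0 2639 702 PASS"
this returns 2 with the values 2639 and 702. When a "bpf_jit: ***"
message lands in the middle, the test name and the timings end up on
different lines, so a per-line parser returns -1 for the name line and
sees the timings without the test they belong to.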

^ permalink raw reply	[flat|nested] 99+ messages in thread

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-23  2:58                           ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23  2:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

While testing the eBPF JIT with CONFIG_FRAME_POINTER enabled, I got the
following soft lockup in a test case that was not JITed (jited:0).

[   72.032494] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations jited:0 1112799
[   92.304815] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [insmod:104]
[   92.305050] Modules linked in: test_bpf(+)
[   92.305516] CPU: 0 PID: 104 Comm: insmod Not tainted 4.11.0-10603-g13e0988-dirty #21
[   92.305630] Hardware name: ARM-Versatile Express
[   92.305943] task: c75d5280 task.stack: c61b8000
[   92.306383] PC is at __bpf_prog_run+0x818/0x17a8
[   92.306449] LR is at __bpf_prog_run+0xab8/0x17a8
[   92.306510] pc : [<c0407c08>]    lr : [<c0407ea8>]    psr: 20000013
[   92.306510] sp : c61b9a88  ip : c61b9a88  fp : c61b9d4c
[   92.306629] r10: c0404104  r9 : 00000000  r8 : 00000000
[   92.306744] r7 : c0e0b500  r6 : c0c39bb0  r5 : c61b9ad0  r4 : ca314840
[   92.306882] r3 : c0e0b7fc  r2 : 00000000  r1 : c61b9ad8  r0 : 00000000
[   92.307070] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   92.307285] Control: 10c5387d  Table: 661e0059  DAC: 00000051
[   92.307503] CPU: 0 PID: 104 Comm: insmod Not tainted 4.11.0-10603-g13e0988-dirty #21
[   92.307575] Hardware name: ARM-Versatile Express
[   92.307651] Backtrace:
[   92.307868] [<c030caec>] (dump_backtrace) from [<c030cda8>] (show_stack+0x18/0x1c)
[   92.308003]  r7:c1503db8 r6:60000193 r5:00000000 r4:c1570f30
[   92.308085] [<c030cd90>] (show_stack) from [<c064b198>] (dump_stack+0x90/0xa4)
[   92.308195] [<c064b108>] (dump_stack) from [<c030900c>] (show_regs+0x14/0x18)
[   92.308281]  r7:c1503db8 r6:c14488b8 r5:c16aaea0 r4:c61b8000
[   92.308346] [<c0308ff8>] (show_regs) from [<c03df2a4>] (watchdog_timer_fn+0x24c/0x2c4)
[   92.308423] [<c03df058>] (watchdog_timer_fn) from [<c03b70d8>] (__hrtimer_run_queues+0x180/0x318)
[   92.308514]  r10:c03df058 r9:00000003 r8:c1503cbc r7:c7ead580 r6:c7ead5c0 r5:c61b8000
[   92.308578]  r4:c7ead8d8
[   92.308635] [<c03b6f58>] (__hrtimer_run_queues) from [<c03b74e8>] (hrtimer_interrupt+0xb4/0x204)
[   92.308728]  r10:7fffffff r9:00000003 r8:c7ead5f8 r7:c7ead618 r6:c7ead638 r5:c1448580
[   92.308789]  r4:c7ead580
[   92.308835] [<c03b7434>] (hrtimer_interrupt) from [<c03113fc>] (twd_handler+0x38/0x48)
[   92.308914]  r10:c0404104 r9:00000010 r8:c1504330 r7:00000001 r6:c701e900 r5:00000000
[   92.308974]  r4:00000001
[   92.309021] [<c03113c4>] (twd_handler) from [<c03a1238>] (handle_percpu_devid_irq+0x90/0x244)
[   92.309091]  r5:00000000 r4:c7020540
[   92.309165] [<c03a11a8>] (handle_percpu_devid_irq) from [<c039c148>] (generic_handle_irq+0x2c/0x3c)
[   92.309254]  r10:c0404104 r9:c8803100 r8:c7004a00 r7:00000001 r6:00000000 r5:00000000
[   92.309319]  r4:c1449ed0 r3:c03a11a8
[   92.309369] [<c039c11c>] (generic_handle_irq) from [<c039c6f0>]
(__handle_domain_irq+0x64/0xbc)
[   92.309445] [<c039c68c>] (__handle_domain_irq) from [<c0301808>]
(gic_handle_irq+0x5c/0xa0)
[   92.309525]  r9:c8803100 r8:c8802100 r7:c61b9a38 r6:c880210c
r5:c1571848 r4:c1504330
[   92.309596] [<c03017ac>] (gic_handle_irq) from [<c030d98c>]
(__irq_svc+0x6c/0x90)
[   92.309731] Exception stack(0xc61b9a38 to 0xc61b9a80)
[   92.309943] 9a20:
    00000000 c61b9ad8
[   92.310184] 9a40: 00000000 c0e0b7fc ca314840 c61b9ad0 c0c39bb0
c0e0b500 00000000 00000000
[   92.310377] 9a60: c0404104 c61b9d4c c61b9a88 c61b9a88 c0407ea8
c0407c08 20000013 ffffffff
[   92.310595]  r9:c61b8000 r8:00000000 r7:c61b9a6c r6:ffffffff
r5:20000013 r4:c0407c08
[   92.311103] [<c04073f0>] (__bpf_prog_run) from [<bf15759c>]
(test_bpf_init+0x59c/0x1000 [test_bpf])
[   92.311262]  r10:bf123094 r9:ca2fa020 r8:00000000 r7:bf123128
r6:53edefe8 r5:ca2fa000
[   92.311325]  r4:00000555
[   92.311382] [<bf157000>] (test_bpf_init [test_bpf]) from
[<c0301f7c>] (do_one_initcall+0x4c/0x174)
[   92.311468]  r10:bf154640 r9:c61c2524 r8:39e3db1c r7:00000001
r6:00000000 r5:bf157000
[   92.311529]  r4:ffffe000
[   92.311575] [<c0301f30>] (do_one_initcall) from [<c042a5b0>]
(do_init_module+0x6c/0x1fc)
[   92.311673]  r9:c61c2524 r8:39e3db1c r6:c61c2480 r5:00000001 r4:bf154640
[   92.311744] [<c042a544>] (do_init_module) from [<c03d393c>]
(load_module+0x1f8c/0x2394)
[   92.311815]  r6:c61c2500 r5:00000001 r4:c61b9f34
[   92.311898] [<c03d19b0>] (load_module) from [<c03d3ea0>]
(SyS_init_module+0x15c/0x174)
[   92.311979]  r10:00000051 r9:00000000 r8:00160fda r7:c61b8000
r6:c95a6a18 r5:b6fbca20
[   92.312040]  r4:00006a18
[   92.312087] [<c03d3d44>] (SyS_init_module) from [<c0308260>]
(ret_fast_syscall+0x0/0x3c)
[   92.312196]  r10:00000000 r9:c61b8000 r8:c0308424 r7:00000080
r6:756e694c r5:00156a18
[   92.312277]  r4:00000000
[   93.835343] 1065840 PASS

Does this look like a bug? I will send the separate mail if it does.
Let me know.

Best,
Shubham Bansal


On Tue, May 23, 2017 at 1:35 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> These all benchmarks are for ARMv7.
>
> Thanks! In the future, try to avoid the white-space damage
> (line-wrapping). And it looks like you've still got debugging turned
> on in your jit code:
>
> [   56.176033] test_bpf: #21 LD_CPU
> [   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
> [   56.176565] jited:0 2639 702 PASS
>
> That breaks the test report line. After I cleaned these up and parsed
> the results, they look great. Most things are half the speed of the
> interpreter, if not better. Only the LD_ABS suffered, and that's
> mainly the const blinding, I assume.
>
> Please post your current patch. Thanks for this!
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* [kernel-hardening] Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-23  2:58                           ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23  2:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

Hi,

On testing the eBPF JIT with CONFIG_FRAME_POINTER I got the following
crash for non jitted testcase.

[   72.032494] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
jited:0 1112799
[   92.304815] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[insmod:104]
[   92.305050] Modules linked in: test_bpf(+)
[   92.305516] CPU: 0 PID: 104 Comm: insmod Not tainted
4.11.0-10603-g13e0988-dirty #21
[   92.305630] Hardware name: ARM-Versatile Express
[   92.305943] task: c75d5280 task.stack: c61b8000
[   92.306383] PC is at __bpf_prog_run+0x818/0x17a8
[   92.306449] LR is at __bpf_prog_run+0xab8/0x17a8
[   92.306510] pc : [<c0407c08>]    lr : [<c0407ea8>]    psr: 20000013
[   92.306510] sp : c61b9a88  ip : c61b9a88  fp : c61b9d4c
[   92.306629] r10: c0404104  r9 : 00000000  r8 : 00000000
[   92.306744] r7 : c0e0b500  r6 : c0c39bb0  r5 : c61b9ad0  r4 : ca314840
[   92.306882] r3 : c0e0b7fc  r2 : 00000000  r1 : c61b9ad8  r0 : 00000000
[   92.307070] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   92.307285] Control: 10c5387d  Table: 661e0059  DAC: 00000051
[   92.307503] CPU: 0 PID: 104 Comm: insmod Not tainted
4.11.0-10603-g13e0988-dirty #21
[   92.307575] Hardware name: ARM-Versatile Express
[   92.307651] Backtrace:
[   92.307868] [<c030caec>] (dump_backtrace) from [<c030cda8>]
(show_stack+0x18/0x1c)
[   92.308003]  r7:c1503db8 r6:60000193 r5:00000000 r4:c1570f30
[   92.308085] [<c030cd90>] (show_stack) from [<c064b198>]
(dump_stack+0x90/0xa4)
[   92.308195] [<c064b108>] (dump_stack) from [<c030900c>] (show_regs+0x14/0x18)
[   92.308281]  r7:c1503db8 r6:c14488b8 r5:c16aaea0 r4:c61b8000
[   92.308346] [<c0308ff8>] (show_regs) from [<c03df2a4>]
(watchdog_timer_fn+0x24c/0x2c4)
[   92.308423] [<c03df058>] (watchdog_timer_fn) from [<c03b70d8>]
(__hrtimer_run_queues+0x180/0x318)
[   92.308514]  r10:c03df058 r9:00000003 r8:c1503cbc r7:c7ead580
r6:c7ead5c0 r5:c61b8000
[   92.308578]  r4:c7ead8d8
[   92.308635] [<c03b6f58>] (__hrtimer_run_queues) from [<c03b74e8>]
(hrtimer_interrupt+0xb4/0x204)
[   92.308728]  r10:7fffffff r9:00000003 r8:c7ead5f8 r7:c7ead618
r6:c7ead638 r5:c1448580
[   92.308789]  r4:c7ead580
[   92.308835] [<c03b7434>] (hrtimer_interrupt) from [<c03113fc>]
(twd_handler+0x38/0x48)
[   92.308914]  r10:c0404104 r9:00000010 r8:c1504330 r7:00000001
r6:c701e900 r5:00000000
[   92.308974]  r4:00000001
[   92.309021] [<c03113c4>] (twd_handler) from [<c03a1238>]
(handle_percpu_devid_irq+0x90/0x244)
[   92.309091]  r5:00000000 r4:c7020540
[   92.309165] [<c03a11a8>] (handle_percpu_devid_irq) from
[<c039c148>] (generic_handle_irq+0x2c/0x3c)
[   92.309254]  r10:c0404104 r9:c8803100 r8:c7004a00 r7:00000001
r6:00000000 r5:00000000
[   92.309319]  r4:c1449ed0 r3:c03a11a8
[   92.309369] [<c039c11c>] (generic_handle_irq) from [<c039c6f0>]
(__handle_domain_irq+0x64/0xbc)
[   92.309445] [<c039c68c>] (__handle_domain_irq) from [<c0301808>]
(gic_handle_irq+0x5c/0xa0)
[   92.309525]  r9:c8803100 r8:c8802100 r7:c61b9a38 r6:c880210c
r5:c1571848 r4:c1504330
[   92.309596] [<c03017ac>] (gic_handle_irq) from [<c030d98c>]
(__irq_svc+0x6c/0x90)
[   92.309731] Exception stack(0xc61b9a38 to 0xc61b9a80)
[   92.309943] 9a20:
    00000000 c61b9ad8
[   92.310184] 9a40: 00000000 c0e0b7fc ca314840 c61b9ad0 c0c39bb0
c0e0b500 00000000 00000000
[   92.310377] 9a60: c0404104 c61b9d4c c61b9a88 c61b9a88 c0407ea8
c0407c08 20000013 ffffffff
[   92.310595]  r9:c61b8000 r8:00000000 r7:c61b9a6c r6:ffffffff
r5:20000013 r4:c0407c08
[   92.311103] [<c04073f0>] (__bpf_prog_run) from [<bf15759c>]
(test_bpf_init+0x59c/0x1000 [test_bpf])
[   92.311262]  r10:bf123094 r9:ca2fa020 r8:00000000 r7:bf123128
r6:53edefe8 r5:ca2fa000
[   92.311325]  r4:00000555
[   92.311382] [<bf157000>] (test_bpf_init [test_bpf]) from
[<c0301f7c>] (do_one_initcall+0x4c/0x174)
[   92.311468]  r10:bf154640 r9:c61c2524 r8:39e3db1c r7:00000001
r6:00000000 r5:bf157000
[   92.311529]  r4:ffffe000
[   92.311575] [<c0301f30>] (do_one_initcall) from [<c042a5b0>]
(do_init_module+0x6c/0x1fc)
[   92.311673]  r9:c61c2524 r8:39e3db1c r6:c61c2480 r5:00000001 r4:bf154640
[   92.311744] [<c042a544>] (do_init_module) from [<c03d393c>]
(load_module+0x1f8c/0x2394)
[   92.311815]  r6:c61c2500 r5:00000001 r4:c61b9f34
[   92.311898] [<c03d19b0>] (load_module) from [<c03d3ea0>]
(SyS_init_module+0x15c/0x174)
[   92.311979]  r10:00000051 r9:00000000 r8:00160fda r7:c61b8000
r6:c95a6a18 r5:b6fbca20
[   92.312040]  r4:00006a18
[   92.312087] [<c03d3d44>] (SyS_init_module) from [<c0308260>]
(ret_fast_syscall+0x0/0x3c)
[   92.312196]  r10:00000000 r9:c61b8000 r8:c0308424 r7:00000080
r6:756e694c r5:00156a18
[   92.312277]  r4:00000000
[   93.835343] 1065840 PASS

Does this look like a bug? I will send the separate mail if it does.
Let me know.

Best,
Shubham Bansal


On Tue, May 23, 2017 at 1:35 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 10:04 AM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> These all benchmarks are for ARMv7.
>
> Thanks! In the future, try to avoid the white-space damage
> (line-wrapping). And it looks like you've still got debugging turned
> on in your jit code:
>
> [   56.176033] test_bpf: #21 LD_CPU
> [   56.176329] bpf_jit: *** NOT YET: opcode 85 ***
> [   56.176565] jited:0 2639 702 PASS
>
> That breaks the test report line. After I cleaned these up and parsed
> the results, they look great. Most things are half the speed of the
> interpreter, if not better. Only the LD_ABS suffered, and that's
> mainly the const blinding, I assume.
>
> Please post your current patch. Thanks for this!
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-22 19:08                       ` Florian Fainelli
  (?)
@ 2017-05-23  3:34                         ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23  3:34 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Kees Cook, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, nschichan, andrew

Hi Florian,

>> I think it is fine to only target ARMv7. It is harder and harder to
>> find devices on v5 or v6 CPUs that would want to be using BPF JIT,
>> IMO.
>
> There are still a ton of Marvell-based routers out there (e.g: Kirkwood,
> Orion5x) that are ARMv5 and that prompted Nicholas (hey there) to fix
> the cBPF JIT a while ago. I don't think you can just ignore those, it's
> fine not to target them initially, but arguably, QEMU has decent support
> for some ARMv5 platforms that could be used for testing as well
> (realview-eb, versatileab/pb).

I am using busybox to get the rootfs. Here is what I am doing:

1. ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make -j4
   (for the kernel build as well as the busybox build)
2. qemu-system-arm -M vexpress-a9 \
     -dtb ./linux/arch/arm/boot/dts/versatile-ab.dts \
     -kernel ./linux/arch/arm/boot/zImage \
     -append "root=/dev/mmcblk0 console=ttyAMA0" \
     -sd ./a9rootfs.ext3 --nographic

Can you help me with running qemu for ARMv5 and ARMv6?

>> When they "disappear", it's because there isn't a prerequisite met. I
>> either read the Kconfig files or use "make menuconfig" and "search" to
>> tell me where a config is defined and what is needed to meet the
>> prerequisites.
>>
>> In the case of CPU_BIG_ENDIAN, you need ARCH_SUPPORTS_BIG_ENDIAN,
>> which appears to be only ARCH_IXP4XX. I don't think you're going to
>> find an emulator that will handle this, so I'd suggest ignoring this
>> config for now unless you can find someone with that hardware that you
>> can work with to test it.
>>
>> In the case of CONFIG_FRAME_POINTER, I assume you built a
>> THUMB2_KERNEL? I'd read the notes in arch/arm/Kconfig.debug for
>> 'config FRAME_POINTER'.
>
> It sounds like we are at the point where Shubham's patches should be
> posted so people could test/fix on earlier ARM devices for instance.
>
I will post them as soon as I have tested them on ARMv5 and ARMv6. If
you can help me with that, please let me know.

> Thanks
> --
> Florian

-Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23  3:34                         ` Shubham Bansal
  (?)
@ 2017-05-23  4:22                           ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-23  4:22 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

On Mon, May 22, 2017 at 8:34 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> I would post them as soon as I test them on ARMv5 and ARMv6. If you
> can help me with that, please let me know.

Please post what you have: it would be better to see what you've got
now in case additional changes are needed so you don't have to do it
again on v5 and v6. Also, it means other people with real v5 and v6
hardware could test for you if they were so inclined, and you won't
need to be blocked on doing the tests in qemu.

You can send it as an "RFC" in the subject, just to make sure people
know it's not considered fully done. :)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23  2:58                           ` Shubham Bansal
  (?)
@ 2017-05-23  4:27                             ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-23  4:27 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, David Miller, Mircea Gherzan,
	Network Development, kernel-hardening, linux-arm-kernel, ast

On Mon, May 22, 2017 at 7:58 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> On testing the eBPF JIT with CONFIG_FRAME_POINTER I got the following
> crash for non jitted testcase.

It's just a softlockup WARN, not a crash, and I think it's to be
expected given the large runtime test_bpf reports:

> [   72.032494] test_bpf: #267 BPF_MAXINSNS: Call heavy transformations
> jited:0 1112799
> [   92.304815] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
> [insmod:104]
> ...
> [   93.835343] 1065840 PASS

https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt

You can raise the softlockup time-out by changing the number of
seconds in /proc/sys/kernel/watchdog_thresh (the soft-lockup threshold
is twice that value). I think the softlockup is counting the entire
runtime of the bpf_tests run, so if it takes 30 seconds to run, put at
least 15 into /proc/sys/kernel/watchdog_thresh.
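
The arithmetic behind that suggestion can be sketched as below. It
assumes the documented relation that the soft-lockup threshold is
2 * watchdog_thresh (default 10, hence about 20 seconds, which fits the
"stuck for 22s" report). The function names are hypothetical and only
for illustration.

```python
import math

DEFAULT_WATCHDOG_THRESH = 10  # seconds, the kernel default


def soft_lockup_threshold(watchdog_thresh):
    """Seconds a CPU may stay in kernel mode before the soft-lockup warning."""
    return 2 * watchdog_thresh


def min_watchdog_thresh(runtime_seconds):
    """Smallest watchdog_thresh whose soft-lockup threshold covers the runtime."""
    return math.ceil(runtime_seconds / 2)


print(soft_lockup_threshold(DEFAULT_WATCHDOG_THRESH))
print(min_watchdog_thresh(30))
```

The resulting value would then be written to
/proc/sys/kernel/watchdog_thresh as root before loading test_bpf; adding
a few seconds of margin on top of the minimum seems sensible.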

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23  4:22                           ` Kees Cook
  (?)
@ 2017-05-23  5:03                             ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23  5:03 UTC (permalink / raw)
  To: Kees Cook
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

Hi Kees,

I already have the ARMv5 and ARMv6 code written. I just haven't tested
it yet. Should I send the patch with those as well?

Best,
Shubham Bansal


On Tue, May 23, 2017 at 9:52 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 8:34 PM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> I would post them as soon as I test them on ARMv5 and ARMv6. If you
>> can help me with that, please let me know.
>
> Please post what you have: it would be better to see what you've got
> now in case additional changes are needed so you don't have to do it
> again on v5 and v6. Also, it means other people with real v5 and v6
> hardware could test for you if they were so inclined, and you won't
> need to be blocked on doing the tests in qemu.
>
> You can send it as an "RFC" in the subject, just to make sure people
> know it's not considered fully done. :)
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23  5:03                             ` Shubham Bansal
  (?)
@ 2017-05-23  5:35                               ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-23  5:35 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

On Mon, May 22, 2017 at 10:03 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> On Tue, May 23, 2017 at 9:52 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Mon, May 22, 2017 at 8:34 PM, Shubham Bansal
>> <illusionist.neo@gmail.com> wrote:
>>> I would post them as soon as I test them on ARMv5 and ARMv6. If you
>>> can help me with that, please let me know.
>>
>> Please post what you have: it would be better to see what you've got
>> now in case additional changes are needed so you don't have to do it
>> again on v5 and v6. Also, it means other people with real v5 and v6
>> hardware could test for you if they were so inclined, and you won't
>> need to be blocked on doing the tests in qemu.
>>
>> You can send it as an "RFC" in the subject, just to make sure people
>> know it's not considered fully done. :)
>
> I already have ARMv5 and ARMv6 code written. I just haven't tested it
> yet. Should i send the patch with those as well ?

Sure, just to have a version up for people to examine. If there are
bugs, that's fine, we'll iron them out.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23  5:35                               ` Kees Cook
@ 2017-05-23 18:39                                 ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23 18:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

Hi Kees, Daniel, Mircea and David,

Here is the patch I sent to the arm mailing list.
Any comments are welcome.

---------- Forwarded message ----------
From: Shubham Bansal <illusionist.neo@gmail.com>
Date: Wed, May 24, 2017 at 12:03 AM
Subject: [PATCH] RFC: arm: eBPF JIT compiler
To: linux@armlinux.org.uk
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Shubham Bansal
<illusionist.neo@gmail.com>


The JIT compiler emits 32-bit ARM instructions. It now handles eBPF
only; classic BPF is still supported because the BPF core converts it
to eBPF first.

JIT is enabled with

        echo 1 > /proc/sys/net/core/bpf_jit_enable

Constant blinding can be enabled along with the JIT using

        echo 1 > /proc/sys/net/core/bpf_jit_enable
        echo 2 > /proc/sys/net/core/bpf_jit_harden

See Documentation/networking/filter.txt for more information.
Tested on ARMv7 with CONFIG_FRAME_POINTER enabled.
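
For readers unfamiliar with constant blinding, here is a minimal sketch,
assuming the usual XOR-based scheme: an immediate is not emitted as-is but
as imm ^ rnd, and is recovered at run time by XOR-ing with rnd in a scratch
register (BPF_REG_AX on the eBPF side). The names blind_imm, unblind_imm and
struct blinded_imm are hypothetical and exist only for this illustration;
they are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical container for the two values a blinded immediate becomes. */
struct blinded_imm {
	uint32_t masked;	/* imm ^ rnd, emitted instead of imm */
	uint32_t rnd;		/* random key chosen at JIT time */
};

/* Split an immediate into its blinded form. */
struct blinded_imm blind_imm(uint32_t imm, uint32_t rnd)
{
	struct blinded_imm b = { .masked = imm ^ rnd, .rnd = rnd };

	return b;
}

/* Models the two run-time steps: "mov ax, masked" then "xor ax, rnd". */
uint32_t unblind_imm(struct blinded_imm b)
{
	uint32_t ax = b.masked;

	ax ^= b.rnd;
	return ax;
}
```

With a non-zero rnd, the original immediate never appears verbatim in the
emitted code, yet unblind_imm(blind_imm(imm, rnd)) always yields imm again.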

Results:

1. Interpreter:

        [   93.551176] test_bpf: Summary: 314 PASSED, 0 FAILED, [0/306 JIT'ed]

2. JIT enabled:

        [   92.913931] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

3. JIT + blinding enabled:

        [  109.414506] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

Currently, the following eBPF instructions are not JITed:

        BPF_ALU64 | BPF_DIV | BPF_K
        BPF_ALU64 | BPF_DIV | BPF_X
        BPF_ALU64 | BPF_MOD | BPF_K
        BPF_ALU64 | BPF_MOD | BPF_X
        BPF_STX | BPF_XADD | BPF_W
        BPF_STX | BPF_XADD | BPF_DW
        BPF_JMP | BPF_CALL
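
The list above can be expressed as a small opcode check. This is only a
sketch: jit_unsupported() is a hypothetical helper, not a function from the
patch, and the opcode field values are copied from the kernel UAPI BPF
headers so that the example builds in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode fields, repeated from the kernel UAPI BPF headers. */
#define BPF_STX		0x03
#define BPF_JMP		0x05
#define BPF_ALU64	0x07
#define BPF_W		0x00
#define BPF_DW		0x18
#define BPF_K		0x00
#define BPF_X		0x08
#define BPF_DIV		0x30
#define BPF_MOD		0x90
#define BPF_CALL	0x80
#define BPF_XADD	0xc0

/* Hypothetical helper: true if the opcode is on the "not JITed" list. */
bool jit_unsupported(uint8_t code)
{
	switch (code) {
	case BPF_ALU64 | BPF_DIV | BPF_K:
	case BPF_ALU64 | BPF_DIV | BPF_X:
	case BPF_ALU64 | BPF_MOD | BPF_K:
	case BPF_ALU64 | BPF_MOD | BPF_X:
	case BPF_STX | BPF_XADD | BPF_W:
	case BPF_STX | BPF_XADD | BPF_DW:
	case BPF_JMP | BPF_CALL:
		return true;
	default:
		return false;
	}
}
```

For example, jit_unsupported(BPF_ALU64 | BPF_DIV | BPF_K) is true, while the
32-bit BPF_ALU | BPF_DIV | BPF_K opcode (0x34) is not on the list.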

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
---
 arch/arm/net/bpf_jit_32.c | 2410 ++++++++++++++++++++++++++++++---------------
 arch/arm/net/bpf_jit_32.h |  108 +-
 2 files changed, 1716 insertions(+), 802 deletions(-)
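
Note for reviewers (not part of the patch): each 64-bit eBPF register is
kept as a {hi, lo} pair of 32-bit words, so 64-bit ALU operations are built
from two 32-bit operations linked by the carry flag (adds/adc, subs/sbc,
rsbs/rsc). A minimal userspace sketch of that arithmetic follows; struct
reg_pair and the pair_*() helpers are hypothetical names used only here.

```c
#include <stdint.h>

/* One 64-bit eBPF register as two 32-bit words, in {hi, lo} order. */
struct reg_pair {
	uint32_t hi;
	uint32_t lo;
};

struct reg_pair pair_from_u64(uint64_t v)
{
	struct reg_pair r = { .hi = (uint32_t)(v >> 32), .lo = (uint32_t)v };

	return r;
}

uint64_t pair_to_u64(struct reg_pair r)
{
	return ((uint64_t)r.hi << 32) | r.lo;
}

/* adds lo, lo, src_lo ; adc hi, hi, src_hi */
struct reg_pair pair_add64(struct reg_pair dst, struct reg_pair src)
{
	uint32_t lo = dst.lo + src.lo;
	uint32_t carry = lo < dst.lo;	/* carry out of the low word */

	dst.hi = dst.hi + src.hi + carry;
	dst.lo = lo;
	return dst;
}

/* subs lo, lo, src_lo ; sbc hi, hi, src_hi */
struct reg_pair pair_sub64(struct reg_pair dst, struct reg_pair src)
{
	uint32_t borrow = dst.lo < src.lo;	/* borrow from the high word */

	dst.lo = dst.lo - src.lo;
	dst.hi = dst.hi - src.hi - borrow;
	return dst;
}

/* rsbs lo, lo, #0 ; rsc hi, hi, #0 */
struct reg_pair pair_neg64(struct reg_pair dst)
{
	uint32_t borrow = dst.lo != 0;

	dst.lo = 0u - dst.lo;
	dst.hi = 0u - dst.hi - borrow;
	return dst;
}
```

The 32-bit variants need only the low-word operation, with the high word
cleared afterwards.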

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 93d0b6d..338d352 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1,13 +1,16 @@
 /*
- * Just-In-Time compiler for BPF filters on 32bit ARM
+ * Just-In-Time compiler for eBPF filters on 32bit ARM
  *
  * Copyright (c) 2011 Mircea Gherzan <mgherzan@gmail.com>
+ * Copyright (c) 2017 Shubham Bansal <illusionist.neo@gmail.com>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
  * Free Software Foundation; version 2 of the License.
  */
+#define pr_fmt(fmt) "bpf_jit: " fmt

+#include <linux/bpf.h>
 #include <linux/bitops.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
@@ -23,44 +26,91 @@

 #include "bpf_jit_32.h"

+int bpf_jit_enable __read_mostly;
+
+#define STACK_OFFSET(k)        (k)
+#define TMP_REG_1      (MAX_BPF_JIT_REG + 0)   /* TEMP Register 1 */
+#define TMP_REG_2      (MAX_BPF_JIT_REG + 1)   /* TEMP Register 2 */
+#define TCALL_CNT      (MAX_BPF_JIT_REG + 2)   /* Tail Call Count */
+
+/* Flags used for JIT optimization */
+#define SEEN_CALL      (1 << 0)
+
+#define FLAG_IMM_OVERFLOW      (1 << 0)
+
 /*
- * ABI:
+ * Map eBPF registers to ARM 32bit registers or stack scratch space.
+ *
+ * 1. The first argument is passed in ARM 32-bit registers; the rest of the
+ * arguments are passed in stack scratch space.
+ * 2. The first callee-saved register is mapped to ARM 32-bit registers; the
+ * remaining ones are mapped to scratch space on the stack.
+ * 3. We need two 64-bit temp registers to do complex operations on eBPF
+ * registers.
+ *
+ * As all eBPF registers are 64 bits wide and ARM has only 32-bit registers,
+ * each eBPF register is mapped to a pair of ARM 32-bit registers or to
+ * scratch memory, and the 64-bit eBPF value is built from that pair.
  *
- * r0  scratch register
- * r4  BPF register A
- * r5  BPF register X
- * r6  pointer to the skb
- * r7  skb->data
- * r8  skb_headlen(skb)
  */
+static const u8 bpf2a32[][2] = {
+       /* return value from in-kernel function, and exit value from eBPF */
+       [BPF_REG_0] = {ARM_R1, ARM_R0},
+       /* arguments from eBPF program to in-kernel function */
+       [BPF_REG_1] = {ARM_R3, ARM_R2},
+       /* Stored on stack scratch space */
+       [BPF_REG_2] = {STACK_OFFSET(0), STACK_OFFSET(4)},
+       [BPF_REG_3] = {STACK_OFFSET(8), STACK_OFFSET(12)},
+       [BPF_REG_4] = {STACK_OFFSET(16), STACK_OFFSET(20)},
+       [BPF_REG_5] = {STACK_OFFSET(24), STACK_OFFSET(28)},
+       /* callee saved registers that in-kernel function will preserve */
+       [BPF_REG_6] = {ARM_R5, ARM_R4},
+       /* Stored on stack scratch space */
+       [BPF_REG_7] = {STACK_OFFSET(32), STACK_OFFSET(36)},
+       [BPF_REG_8] = {STACK_OFFSET(40), STACK_OFFSET(44)},
+       [BPF_REG_9] = {STACK_OFFSET(48), STACK_OFFSET(52)},
+       /* Read only Frame Pointer to access Stack */
+       [BPF_REG_FP] = {STACK_OFFSET(56), STACK_OFFSET(60)},
+       /* Temporary registers for internal BPF JIT use; can be used
+        * for constant blinding and other purposes.
+        */
+       [TMP_REG_1] = {ARM_R7, ARM_R6},
+       [TMP_REG_2] = {ARM_R10, ARM_R8},
+       /* Tail call count. Stored on stack scratch space. */
+       [TCALL_CNT] = {STACK_OFFSET(64), STACK_OFFSET(68)},
+       /* temporary register for blinding constants.
+        * Stored on stack scratch space.
+        */
+       [BPF_REG_AX] = {STACK_OFFSET(72), STACK_OFFSET(76)},
+};

-#define r_scratch      ARM_R0
-/* r1-r3 are (also) used for the unaligned loads on the non-ARMv7 slowpath */
-#define r_off          ARM_R1
-#define r_A            ARM_R4
-#define r_X            ARM_R5
-#define r_skb          ARM_R6
-#define r_skb_data     ARM_R7
-#define r_skb_hl       ARM_R8
-
-#define SCRATCH_SP_OFFSET      0
-#define SCRATCH_OFF(k)         (SCRATCH_SP_OFFSET + 4 * (k))
-
-#define SEEN_MEM               ((1 << BPF_MEMWORDS) - 1)
-#define SEEN_MEM_WORD(k)       (1 << (k))
-#define SEEN_X                 (1 << BPF_MEMWORDS)
-#define SEEN_CALL              (1 << (BPF_MEMWORDS + 1))
-#define SEEN_SKB               (1 << (BPF_MEMWORDS + 2))
-#define SEEN_DATA              (1 << (BPF_MEMWORDS + 3))
+#define dst_lo dst[1]
+#define dst_hi dst[0]
+#define src_lo src[1]
+#define src_hi src[0]

-#define FLAG_NEED_X_RESET      (1 << 0)
-#define FLAG_IMM_OVERFLOW      (1 << 1)
+/*
+ * JIT Context:
+ *
+ * prog            : bpf_prog
+ * idx             : index of the last JITed instruction.
+ * prologue_bytes  : bytes used in prologue.
+ * epilogue_offset : offset at which the epilogue starts.
+ * seen            : bit mask used for JIT optimization.
+ * offsets         : array of eBPF instruction offsets in
+ *                   JITed code.
+ * target          : final JITed code.
+ * epilogue_bytes  : number of bytes used in epilogue.
+ * imm_count       : number of immediates used for global
+ *                   variables.
+ * imms            : array of global variable addresses.
+ */

 struct jit_ctx {
-       const struct bpf_prog *skf;
-       unsigned idx;
-       unsigned prologue_bytes;
-       int ret0_fp_idx;
+       const struct bpf_prog *prog;
+       unsigned int idx;
+       unsigned int prologue_bytes;
+       unsigned int epilogue_offset;
        u32 seen;
        u32 flags;
        u32 *offsets;
@@ -72,68 +122,16 @@ struct jit_ctx {
 #endif
 };

-int bpf_jit_enable __read_mostly;
-
-static inline int call_neg_helper(struct sk_buff *skb, int offset, void *ret,
-                     unsigned int size)
-{
-       void *ptr = bpf_internal_load_pointer_neg_helper(skb, offset, size);
-
-       if (!ptr)
-               return -EFAULT;
-       memcpy(ret, ptr, size);
-       return 0;
-}
-
-static u64 jit_get_skb_b(struct sk_buff *skb, int offset)
-{
-       u8 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 1);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 1);
-
-       return (u64)err << 32 | ret;
-}
-
-static u64 jit_get_skb_h(struct sk_buff *skb, int offset)
-{
-       u16 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 2);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 2);
-
-       return (u64)err << 32 | ntohs(ret);
-}
-
-static u64 jit_get_skb_w(struct sk_buff *skb, int offset)
-{
-       u32 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 4);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 4);
-
-       return (u64)err << 32 | ntohl(ret);
-}
-
 /*
  * Wrappers which handle both OABI and EABI and assures Thumb2 interworking
  * (where the assembly routines like __aeabi_uidiv could cause problems).
  */
-static u32 jit_udiv(u32 dividend, u32 divisor)
+static u32 jit_udiv32(u32 dividend, u32 divisor)
 {
        return dividend / divisor;
 }

-static u32 jit_mod(u32 dividend, u32 divisor)
+static u32 jit_mod32(u32 dividend, u32 divisor)
 {
        return dividend % divisor;
 }
@@ -157,36 +155,22 @@ static inline void emit(u32 inst, struct jit_ctx *ctx)
        _emit(ARM_COND_AL, inst, ctx);
 }

-static u16 saved_regs(struct jit_ctx *ctx)
+/*
+ * Checks if immediate value can be converted to imm12(12 bits) value.
+ */
+static int16_t imm8m(u32 x)
 {
-       u16 ret = 0;
-
-       if ((ctx->skf->len > 1) ||
-           (ctx->skf->insns[0].code == (BPF_RET | BPF_A)))
-               ret |= 1 << r_A;
-
-#ifdef CONFIG_FRAME_POINTER
-       ret |= (1 << ARM_FP) | (1 << ARM_IP) | (1 << ARM_LR) | (1 << ARM_PC);
-#else
-       if (ctx->seen & SEEN_CALL)
-               ret |= 1 << ARM_LR;
-#endif
-       if (ctx->seen & (SEEN_DATA | SEEN_SKB))
-               ret |= 1 << r_skb;
-       if (ctx->seen & SEEN_DATA)
-               ret |= (1 << r_skb_data) | (1 << r_skb_hl);
-       if (ctx->seen & SEEN_X)
-               ret |= 1 << r_X;
-
-       return ret;
-}
+       u32 rot;

-static inline int mem_words_used(struct jit_ctx *ctx)
-{
-       /* yes, we do waste some stack space IF there are "holes" in the set" */
-       return fls(ctx->seen & SEEN_MEM);
+       for (rot = 0; rot < 16; rot++)
+               if ((x & ~ror32(0xff, 2 * rot)) == 0)
+                       return rol32(x, 2 * rot) | (rot << 8);
+       return -1;
 }

+/*
+ * Initializes the JIT space with undefined instructions.
+ */
 static void jit_fill_hole(void *area, unsigned int size)
 {
        u32 *ptr;
@@ -195,88 +179,34 @@ static void jit_fill_hole(void *area, unsigned int size)
                *ptr++ = __opcode_to_mem_arm(ARM_INST_UDF);
 }

-static void build_prologue(struct jit_ctx *ctx)
-{
-       u16 reg_set = saved_regs(ctx);
-       u16 off;
-
-#ifdef CONFIG_FRAME_POINTER
-       emit(ARM_MOV_R(ARM_IP, ARM_SP), ctx);
-       emit(ARM_PUSH(reg_set), ctx);
-       emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
-#else
-       if (reg_set)
-               emit(ARM_PUSH(reg_set), ctx);
-#endif
+/* Stack size must be a multiple of 16 bytes */
+#define STACK_ALIGN(sz) (((sz) + 15) & ~15)

-       if (ctx->seen & (SEEN_DATA | SEEN_SKB))
-               emit(ARM_MOV_R(r_skb, ARM_R0), ctx);
-
-       if (ctx->seen & SEEN_DATA) {
-               off = offsetof(struct sk_buff, data);
-               emit(ARM_LDR_I(r_skb_data, r_skb, off), ctx);
-               /* headlen = len - data_len */
-               off = offsetof(struct sk_buff, len);
-               emit(ARM_LDR_I(r_skb_hl, r_skb, off), ctx);
-               off = offsetof(struct sk_buff, data_len);
-               emit(ARM_LDR_I(r_scratch, r_skb, off), ctx);
-               emit(ARM_SUB_R(r_skb_hl, r_skb_hl, r_scratch), ctx);
-       }
-
-       if (ctx->flags & FLAG_NEED_X_RESET)
-               emit(ARM_MOV_I(r_X, 0), ctx);
-
-       /* do not leak kernel data to userspace */
-       if (bpf_needs_clear_a(&ctx->skf->insns[0]))
-               emit(ARM_MOV_I(r_A, 0), ctx);
-
-       /* stack space for the BPF_MEM words */
-       if (ctx->seen & SEEN_MEM)
-               emit(ARM_SUB_I(ARM_SP, ARM_SP, mem_words_used(ctx) * 4), ctx);
-}
-
-static void build_epilogue(struct jit_ctx *ctx)
-{
-       u16 reg_set = saved_regs(ctx);
-
-       if (ctx->seen & SEEN_MEM)
-               emit(ARM_ADD_I(ARM_SP, ARM_SP, mem_words_used(ctx) * 4), ctx);
-
-       reg_set &= ~(1 << ARM_LR);
-
-#ifdef CONFIG_FRAME_POINTER
-       /* the first instruction of the prologue was: mov ip, sp */
-       reg_set &= ~(1 << ARM_IP);
-       reg_set |= (1 << ARM_SP);
-       emit(ARM_LDM(ARM_SP, reg_set), ctx);
-#else
-       if (reg_set) {
-               if (ctx->seen & SEEN_CALL)
-                       reg_set |= 1 << ARM_PC;
-               emit(ARM_POP(reg_set), ctx);
-       }
+/* Stack space for BPF_REG_2, BPF_REG_3, BPF_REG_4,
+ * BPF_REG_5, BPF_REG_7, BPF_REG_8, BPF_REG_9,
+ * BPF_REG_FP, BPF_REG_AX and the tail call count.
+ */
+#define SCRATCH_SIZE 80

-       if (!(ctx->seen & SEEN_CALL))
-               emit(ARM_BX(ARM_LR), ctx);
-#endif
-}
+/* total stack size used in JITed code */
+#define _STACK_SIZE \
+       (MAX_BPF_STACK + \
+        SCRATCH_SIZE + \
+        4 /* extra for skb_copy_bits buffer */)

-static int16_t imm8m(u32 x)
-{
-       u32 rot;
+#define STACK_SIZE STACK_ALIGN(_STACK_SIZE)

-       for (rot = 0; rot < 16; rot++)
-               if ((x & ~ror32(0xff, 2 * rot)) == 0)
-                       return rol32(x, 2 * rot) | (rot << 8);
+/* Get the offset of eBPF REGISTERs stored on scratch space. */
+#define STACK_VAR(off) (STACK_SIZE - (off) - 4)

-       return -1;
-}
+/* Offset of skb_copy_bits buffer */
+#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)

 #if __LINUX_ARM_ARCH__ < 7

 static u16 imm_offset(u32 k, struct jit_ctx *ctx)
 {
-       unsigned i = 0, offset;
+       unsigned int i = 0, offset;
        u16 imm;

        /* on the "fake" run we just count them (duplicates included) */
@@ -295,7 +225,7 @@ static u16 imm_offset(u32 k, struct jit_ctx *ctx)
                ctx->imms[i] = k;

        /* constants go just after the epilogue */
-       offset =  ctx->offsets[ctx->skf->len];
+       offset =  ctx->offsets[ctx->prog->len];
        offset += ctx->prologue_bytes;
        offset += ctx->epilogue_bytes;
        offset += i * 4;
@@ -319,10 +249,22 @@ static u16 imm_offset(u32 k, struct jit_ctx *ctx)

 #endif /* __LINUX_ARM_ARCH__ */

+static inline int bpf2a32_offset(int bpf_to, int bpf_from,
+                                const struct jit_ctx *ctx) {
+       int to, from;
+
+       if (ctx->target == NULL)
+               return 0;
+       to = ctx->offsets[bpf_to];
+       from = ctx->offsets[bpf_from];
+
+       return to - from - 1;
+}
+
 /*
  * Move an immediate that's not an imm8m to a core register.
  */
-static inline void emit_mov_i_no8m(int rd, u32 val, struct jit_ctx *ctx)
+static inline void emit_mov_i_no8m(const u8 rd, u32 val, struct jit_ctx *ctx)
 {
 #if __LINUX_ARM_ARCH__ < 7
        emit(ARM_LDR_I(rd, ARM_PC, imm_offset(val, ctx)), ctx);
@@ -333,7 +275,7 @@ static inline void emit_mov_i_no8m(int rd, u32 val, struct jit_ctx *ctx)
 #endif
 }

-static inline void emit_mov_i(int rd, u32 val, struct jit_ctx *ctx)
+static inline void emit_mov_i(const u8 rd, u32 val, struct jit_ctx *ctx)
 {
        int imm12 = imm8m(val);

@@ -343,676 +285,1559 @@ static inline void emit_mov_i(int rd, u32 val, struct jit_ctx *ctx)
                emit_mov_i_no8m(rd, val, ctx);
 }

-#if __LINUX_ARM_ARCH__ < 6
-
-static void emit_load_be32(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
 {
-       _emit(cond, ARM_LDRB_I(ARM_R3, r_addr, 1), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R1, r_addr, 0), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R2, r_addr, 3), ctx);
-       _emit(cond, ARM_LSL_I(ARM_R3, ARM_R3, 16), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R0, r_addr, 2), ctx);
-       _emit(cond, ARM_ORR_S(ARM_R3, ARM_R3, ARM_R1, SRTYPE_LSL, 24), ctx);
-       _emit(cond, ARM_ORR_R(ARM_R3, ARM_R3, ARM_R2), ctx);
-       _emit(cond, ARM_ORR_S(r_res, ARM_R3, ARM_R0, SRTYPE_LSL, 8), ctx);
+       ctx->seen |= SEEN_CALL;
+#if __LINUX_ARM_ARCH__ < 5
+       emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+
+       if (elf_hwcap & HWCAP_THUMB)
+               emit(ARM_BX(tgt_reg), ctx);
+       else
+               emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
+#else
+       emit(ARM_BLX_R(tgt_reg), ctx);
+#endif
 }

-static void emit_load_be16(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+static inline int epilogue_offset(const struct jit_ctx *ctx)
 {
-       _emit(cond, ARM_LDRB_I(ARM_R1, r_addr, 0), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R2, r_addr, 1), ctx);
-       _emit(cond, ARM_ORR_S(r_res, ARM_R2, ARM_R1, SRTYPE_LSL, 8), ctx);
+       int to, from;
+       /* No need for 1st dummy run */
+       if (ctx->target == NULL)
+               return 0;
+       to = ctx->epilogue_offset;
+       from = ctx->idx;
+
+       return to - from - 2;
 }

-static inline void emit_swap16(u8 r_dst, u8 r_src, struct jit_ctx *ctx)
+static inline void emit_udivmod(u8 rd, u8 rm, u8 rn, struct jit_ctx *ctx, u8 op)
 {
-       /* r_dst = (r_src << 8) | (r_src >> 8) */
-       emit(ARM_LSL_I(ARM_R1, r_src, 8), ctx);
-       emit(ARM_ORR_S(r_dst, ARM_R1, r_src, SRTYPE_LSR, 8), ctx);
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       s32 jmp_offset;
+
+       /* Check whether the divisor is zero. If it is, return 0
+        * by jumping straight to the epilogue.
+        */
+       emit(ARM_CMP_I(rn, 0), ctx);
+       _emit(ARM_COND_EQ, ARM_MOV_I(ARM_R0, 0), ctx);
+       jmp_offset = epilogue_offset(ctx);
+       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+#if __LINUX_ARM_ARCH__ == 7
+       if (elf_hwcap & HWCAP_IDIVA) {
+               if (op == BPF_DIV)
+                       emit(ARM_UDIV(rd, rm, rn), ctx);
+               else {
+                       emit(ARM_UDIV(ARM_IP, rm, rn), ctx);
+                       emit(ARM_MLS(rd, rn, ARM_IP, rm), ctx);
+               }
+               return;
+       }
+#endif

        /*
-        * we need to mask out the bits set in r_dst[23:16] due to
-        * the first shift instruction.
-        *
-        * note that 0x8ff is the encoded immediate 0x00ff0000.
+        * For BPF_ALU | BPF_DIV | BPF_K instructions:
+        * ARM_R1 and ARM_R0 contain the 1st argument of the bpf
+        * function, so we save them on the caller side to keep
+        * them from being clobbered by the callee.
+        * After the callee returns, we restore ARM_R0 and
+        * ARM_R1.
         */
-       emit(ARM_BIC_I(r_dst, r_dst, 0x8ff), ctx);
-}
+       if (rn != ARM_R1) {
+               emit(ARM_MOV_R(tmp[0], ARM_R1), ctx);
+               emit(ARM_MOV_R(ARM_R1, rn), ctx);
+       }
+       if (rm != ARM_R0) {
+               emit(ARM_MOV_R(tmp[1], ARM_R0), ctx);
+               emit(ARM_MOV_R(ARM_R0, rm), ctx);
+       }

-#else  /* ARMv6+ */
+       /* Call appropriate function */
+       ctx->seen |= SEEN_CALL;
+       emit_mov_i(ARM_IP, op == BPF_DIV ?
+                  (u32)jit_udiv32 : (u32)jit_mod32, ctx);
+       emit_blx_r(ARM_IP, ctx);

-static void emit_load_be32(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
-{
-       _emit(cond, ARM_LDR_I(r_res, r_addr, 0), ctx);
-#ifdef __LITTLE_ENDIAN
-       _emit(cond, ARM_REV(r_res, r_res), ctx);
-#endif
+       /* Save return value */
+       if (rd != ARM_R0)
+               emit(ARM_MOV_R(rd, ARM_R0), ctx);
+
+       /* Restore ARM_R0 and ARM_R1 */
+       if (rn != ARM_R1)
+               emit(ARM_MOV_R(ARM_R1, tmp[0]), ctx);
+       if (rm != ARM_R0)
+               emit(ARM_MOV_R(ARM_R0, tmp[1]), ctx);
 }

-static void emit_load_be16(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+/* Checks whether BPF register is on scratch stack space or not. */
+static inline bool is_on_stack(u8 bpf_reg)
 {
-       _emit(cond, ARM_LDRH_I(r_res, r_addr, 0), ctx);
-#ifdef __LITTLE_ENDIAN
-       _emit(cond, ARM_REV16(r_res, r_res), ctx);
-#endif
+       static u8 stack_regs[] = {BPF_REG_AX, BPF_REG_3, BPF_REG_4, BPF_REG_5,
+                               BPF_REG_7, BPF_REG_8, BPF_REG_9, TCALL_CNT,
+                               BPF_REG_2, BPF_REG_FP};
+       int i, reg_len = ARRAY_SIZE(stack_regs);
+
+       for (i = 0 ; i < reg_len ; i++) {
+               if (bpf_reg == stack_regs[i])
+                       return true;
+       }
+       return false;
 }

-static inline void emit_swap16(u8 r_dst __maybe_unused,
-                              u8 r_src __maybe_unused,
-                              struct jit_ctx *ctx __maybe_unused)
+static inline void emit_a32_mov_i(const u8 dst, const u32 val,
+                                 bool dstk, struct jit_ctx *ctx)
 {
-#ifdef __LITTLE_ENDIAN
-       emit(ARM_REV16(r_dst, r_src), ctx);
-#endif
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+
+       if (dstk) {
+               emit_mov_i(tmp[1], val, ctx);
+               emit(ARM_STR_I(tmp[1], ARM_SP, STACK_VAR(dst)), ctx);
+       } else {
+               emit_mov_i(dst, val, ctx);
+       }
 }

-#endif /* __LINUX_ARM_ARCH__ < 6 */
+/* Sign extended move */
+static inline void emit_a32_mov_i64(const bool is64, const u8 dst[],
+                                 const u32 val, bool dstk,
+                                 struct jit_ctx *ctx) {
+       u32 hi = 0;

+       if (is64 && (val & (1U << 31)))
+               hi = (u32)~0;
+       emit_a32_mov_i(dst_lo, val, dstk, ctx);
+       emit_a32_mov_i(dst_hi, hi, dstk, ctx);
+}

-/* Compute the immediate value for a PC-relative branch. */
-static inline u32 b_imm(unsigned tgt, struct jit_ctx *ctx)
-{
-       u32 imm;
+static inline void emit_a32_add_r(const u8 dst, const u8 src,
+                             const bool is64, const bool hi,
+                             struct jit_ctx *ctx) {
+       /* 64 bit :
+        *      adds dst_lo, dst_lo, src_lo
+        *      adc dst_hi, dst_hi, src_hi
+        * 32 bit :
+        *      add dst_lo, dst_lo, src_lo
+        */
+       if (!hi && is64)
+               emit(ARM_ADDS_R(dst, dst, src), ctx);
+       else if (hi && is64)
+               emit(ARM_ADC_R(dst, dst, src), ctx);
+       else
+               emit(ARM_ADD_R(dst, dst, src), ctx);
+}

-       if (ctx->target == NULL)
-               return 0;
-       /*
-        * BPF allows only forward jumps and the offset of the target is
-        * still the one computed during the first pass.
+static inline void emit_a32_sub_r(const u8 dst, const u8 src,
+                                 const bool is64, const bool hi,
+                                 struct jit_ctx *ctx) {
+       /* 64 bit :
+        *      subs dst_lo, dst_lo, src_lo
+        *      sbc dst_hi, dst_hi, src_hi
+        * 32 bit :
+        *      sub dst_lo, dst_lo, src_lo
         */
-       imm  = ctx->offsets[tgt] + ctx->prologue_bytes - (ctx->idx * 4 + 8);
+       if (!hi && is64)
+               emit(ARM_SUBS_R(dst, dst, src), ctx);
+       else if (hi && is64)
+               emit(ARM_SBC_R(dst, dst, src), ctx);
+       else
+               emit(ARM_SUB_R(dst, dst, src), ctx);
+}

-       return imm >> 2;
+static inline void emit_alu_r(const u8 dst, const u8 src, const bool is64,
+                             const bool hi, const u8 op, struct jit_ctx *ctx){
+       switch (BPF_OP(op)) {
+       /* dst = dst + src */
+       case BPF_ADD:
+               emit_a32_add_r(dst, src, is64, hi, ctx);
+               break;
+       /* dst = dst - src */
+       case BPF_SUB:
+               emit_a32_sub_r(dst, src, is64, hi, ctx);
+               break;
+       /* dst = dst | src */
+       case BPF_OR:
+               emit(ARM_ORR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst & src */
+       case BPF_AND:
+               emit(ARM_AND_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst ^ src */
+       case BPF_XOR:
+               emit(ARM_EOR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst * src */
+       case BPF_MUL:
+               emit(ARM_MUL(dst, dst, src), ctx);
+               break;
+       /* dst = dst << src */
+       case BPF_LSH:
+               emit(ARM_LSL_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst >> src */
+       case BPF_RSH:
+               emit(ARM_LSR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst >> src (signed)*/
+       case BPF_ARSH:
+               emit(ARM_MOV_SR(dst, dst, SRTYPE_ASR, src), ctx);
+               break;
+       }
 }

-#define OP_IMM3(op, r1, r2, imm_val, ctx)                              \
-       do {                                                            \
-               imm12 = imm8m(imm_val);                                 \
-               if (imm12 < 0) {                                        \
-                       emit_mov_i_no8m(r_scratch, imm_val, ctx);       \
-                       emit(op ## _R((r1), (r2), r_scratch), ctx);     \
-               } else {                                                \
-                       emit(op ## _I((r1), (r2), imm12), ctx);         \
-               }                                                       \
-       } while (0)
-
-static inline void emit_err_ret(u8 cond, struct jit_ctx *ctx)
-{
-       if (ctx->ret0_fp_idx >= 0) {
-               _emit(cond, ARM_B(b_imm(ctx->ret0_fp_idx, ctx)), ctx);
-               /* NOP to keep the size constant between passes */
-               emit(ARM_MOV_R(ARM_R0, ARM_R0), ctx);
+/* ALU operation (32 bit)
+ * dst = dst (op) src
+ */
+static inline void emit_a32_alu_r(const u8 dst, const u8 src,
+                                 bool dstk, bool sstk,
+                                 struct jit_ctx *ctx, const bool is64,
+                                 const bool hi, const u8 op) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rn = sstk ? tmp[1] : src;
+
+       if (sstk)
+               emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src)), ctx);
+
+       /* ALU operation */
+       if (dstk) {
+               emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(dst)), ctx);
+               emit_alu_r(tmp[0], rn, is64, hi, op, ctx);
+               emit(ARM_STR_I(tmp[0], ARM_SP, STACK_VAR(dst)), ctx);
        } else {
-               _emit(cond, ARM_MOV_I(ARM_R0, 0), ctx);
-               _emit(cond, ARM_B(b_imm(ctx->skf->len, ctx)), ctx);
+               emit_alu_r(dst, rn, is64, hi, op, ctx);
        }
 }

-static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
-{
-#if __LINUX_ARM_ARCH__ < 5
-       emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+/* ALU operation (64 bit) */
+static inline void emit_a32_alu_r64(const bool is64, const u8 dst[],
+                                 const u8 src[], bool dstk,
+                                 bool sstk, struct jit_ctx *ctx,
+                                 const u8 op) {
+       emit_a32_alu_r(dst_lo, src_lo, dstk, sstk, ctx, is64, false, op);
+       if (is64)
+               emit_a32_alu_r(dst_hi, src_hi, dstk, sstk, ctx, is64, true, op);
+       else
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+}

-       if (elf_hwcap & HWCAP_THUMB)
-               emit(ARM_BX(tgt_reg), ctx);
+/* dst = src (4 bytes) */
+static inline void emit_a32_mov_r(const u8 dst, const u8 src,
+                                 bool dstk, bool sstk,
+                                 struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rt = sstk ? tmp[0] : src;
+
+       if (sstk)
+               emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(src)), ctx);
+       if (dstk)
+               emit(ARM_STR_I(rt, ARM_SP, STACK_VAR(dst)), ctx);
        else
-               emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
-#else
-       emit(ARM_BLX_R(tgt_reg), ctx);
-#endif
+               emit(ARM_MOV_R(dst, rt), ctx);
 }

-static inline void emit_udivmod(u8 rd, u8 rm, u8 rn, struct jit_ctx *ctx,
-                               int bpf_op)
-{
-#if __LINUX_ARM_ARCH__ == 7
-       if (elf_hwcap & HWCAP_IDIVA) {
-               if (bpf_op == BPF_DIV)
-                       emit(ARM_UDIV(rd, rm, rn), ctx);
-               else {
-                       emit(ARM_UDIV(ARM_R3, rm, rn), ctx);
-                       emit(ARM_MLS(rd, rn, ARM_R3, rm), ctx);
-               }
-               return;
+/* dst = src */
+static inline void emit_a32_mov_r64(const bool is64, const u8 dst[],
+                                 const u8 src[], bool dstk,
+                                 bool sstk, struct jit_ctx *ctx) {
+       emit_a32_mov_r(dst_lo, src_lo, dstk, sstk, ctx);
+       if (is64) {
+               /* complete 8 byte move */
+               emit_a32_mov_r(dst_hi, src_hi, dstk, sstk, ctx);
+       } else {
+               /* Zero out high 4 bytes */
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
        }
-#endif
+}

-       /*
-        * For BPF_ALU | BPF_DIV | BPF_K instructions, rm is ARM_R4
-        * (r_A) and rn is ARM_R0 (r_scratch) so load rn first into
-        * ARM_R1 to avoid accidentally overwriting ARM_R0 with rm
-        * before using it as a source for ARM_R1.
-        *
-        * For BPF_ALU | BPF_DIV | BPF_X rm is ARM_R4 (r_A) and rn is
-        * ARM_R5 (r_X) so there is no particular register overlap
-        * issues.
-        */
-       if (rn != ARM_R1)
-               emit(ARM_MOV_R(ARM_R1, rn), ctx);
-       if (rm != ARM_R0)
-               emit(ARM_MOV_R(ARM_R0, rm), ctx);
+/* Shift and negate operations with an immediate operand */
+static inline void emit_a32_alu_i(const u8 dst, const u32 val, bool dstk,
+                               struct jit_ctx *ctx, const u8 op) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[0] : dst;
+
+       if (dstk)
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+
+       /* Do shift operation */
+       switch (op) {
+       case BPF_LSH:
+               emit(ARM_LSL_I(rd, rd, val), ctx);
+               break;
+       case BPF_RSH:
+               emit(ARM_LSR_I(rd, rd, val), ctx);
+               break;
+       case BPF_NEG:
+               emit(ARM_RSB_I(rd, rd, val), ctx);
+               break;
+       }
+
+       if (dstk)
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+}
+
+/* dst = -dst (64 bit) */
+static inline void emit_a32_neg64(const u8 dst[], bool dstk,
+                               struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst[1];
+       u8 rm = dstk ? tmp[0] : dst[0];
+
+       /* Setup Operand */
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do Negate Operation */
+       emit(ARM_RSBS_I(rd, rd, 0), ctx);
+       emit(ARM_RSC_I(rm, rm, 0), ctx);
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+/* dst = dst << src */
+static inline void emit_a32_lsh_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }

+       /* Do LSH operation */
+       emit(ARM_SUB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_RSB_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
        ctx->seen |= SEEN_CALL;
-       emit_mov_i(ARM_R3, bpf_op == BPF_DIV ? (u32)jit_udiv : (u32)jit_mod,
-                  ctx);
-       emit_blx_r(ARM_R3, ctx);
+       emit(ARM_MOV_SR(ARM_LR, rm, SRTYPE_ASL, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rd, SRTYPE_ASL, ARM_IP), ctx);
+       emit(ARM_ORR_SR(ARM_IP, ARM_LR, rd, SRTYPE_LSR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_ASL, rt), ctx);
+
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}

-       if (rd != ARM_R0)
-               emit(ARM_MOV_R(rd, ARM_R0), ctx);
+/* dst = dst >> src (signed) */
+static inline void emit_a32_arsh_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do the ARSH operation */
+       emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
+       _emit(ARM_COND_MI, ARM_B(0), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_IP, rm, SRTYPE_ASR, rt), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}
+
+/* dst = dst >> src */
+static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
+                                    bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do LSR operation */
+       emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_LSR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_IP, rm, SRTYPE_LSR, rt), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}
+
+/* dst = dst << val */
+static inline void emit_a32_lsh_i64(const u8 dst[], bool dstk,
+                                    const u32 val, struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do LSH operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[0], rm, SRTYPE_ASL, val), ctx);
+               emit(ARM_ORR_SI(rm, tmp2[0], rd, SRTYPE_LSR, 32 - val), ctx);
+               emit(ARM_MOV_SI(rd, rd, SRTYPE_ASL, val), ctx);
+       } else {
+               if (val == 32)
+                       emit(ARM_MOV_R(rm, rd), ctx);
+               else
+                       emit(ARM_MOV_SI(rm, rd, SRTYPE_ASL, val - 32), ctx);
+               emit(ARM_EOR_R(rd, rd, rd), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+/* dst = dst >> val */
+static inline void emit_a32_lsr_i64(const u8 dst[], bool dstk,
+                                   const u32 val, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do LSR operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[1], rd, SRTYPE_LSR, val), ctx);
+               emit(ARM_ORR_SI(rd, tmp2[1], rm, SRTYPE_ASL, 32 - val), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_LSR, val), ctx);
+       } else if (val == 32) {
+               emit(ARM_MOV_R(rd, rm), ctx);
+               emit(ARM_MOV_I(rm, 0), ctx);
+       } else {
+               emit(ARM_MOV_SI(rd, rm, SRTYPE_LSR, val - 32), ctx);
+               emit(ARM_MOV_I(rm, 0), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+/* dst = dst >> val (signed) */
+static inline void emit_a32_arsh_i64(const u8 dst[], bool dstk,
+                                    const u32 val, struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do ARSH operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[1], rd, SRTYPE_LSR, val), ctx);
+               emit(ARM_ORR_SI(rd, tmp2[1], rm, SRTYPE_ASL, 32 - val), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, val), ctx);
+       } else if (val == 32) {
+               emit(ARM_MOV_R(rd, rm), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, 31), ctx);
+       } else {
+               emit(ARM_MOV_SI(rd, rm, SRTYPE_ASR, val - 32), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, 31), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+static inline void emit_a32_mul_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands for multiplication */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rn = sstk ? tmp2[0] : src_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+       if (sstk) {
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+               emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_hi)), ctx);
+       }
+
+       /* Do Multiplication */
+       emit(ARM_MUL(ARM_IP, rd, rn), ctx);
+       emit(ARM_MUL(ARM_LR, rm, rt), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_ADD_R(ARM_LR, ARM_IP, ARM_LR), ctx);
+
+       emit(ARM_UMULL(ARM_IP, rm, rd, rt), ctx);
+       emit(ARM_ADD_R(rm, ARM_LR, rm), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_IP), ctx);
+       }
 }

-static inline void update_on_xread(struct jit_ctx *ctx)
+/* *(size *)(dst + off) = src */
+static inline void emit_str_r(const u8 dst, const u8 src, bool dstk,
+                             const s32 off, struct jit_ctx *ctx, const u8 sz){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst;
+
+       if (dstk)
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+       if (off) {
+               emit_a32_mov_i(tmp[0], off, false, ctx);
+               emit(ARM_ADD_R(tmp[0], rd, tmp[0]), ctx);
+               rd = tmp[0];
+       }
+       switch (sz) {
+       case BPF_W:
+               /* Store a Word */
+               emit(ARM_STR_I(src, rd, 0), ctx);
+               break;
+       case BPF_H:
+               /* Store a HalfWord */
+               emit(ARM_STRH_I(src, rd, 0), ctx);
+               break;
+       case BPF_B:
+               /* Store a Byte */
+               emit(ARM_STRB_I(src, rd, 0), ctx);
+               break;
+       }
+}
+
+/* dst = *(size*)(src + off) */
+static inline void emit_ldx_r(const u8 dst, const u8 src, bool dstk,
+                             const s32 off, struct jit_ctx *ctx, const u8 sz){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst;
+       u8 rm = src;
+
+       if (off) {
+               emit_a32_mov_i(tmp[0], off, false, ctx);
+               emit(ARM_ADD_R(tmp[0], tmp[0], src), ctx);
+               rm = tmp[0];
+       }
+       switch (sz) {
+       case BPF_W:
+               /* Load a Word */
+               emit(ARM_LDR_I(rd, rm, 0), ctx);
+               break;
+       case BPF_H:
+               /* Load a HalfWord */
+               emit(ARM_LDRH_I(rd, rm, 0), ctx);
+               break;
+       case BPF_B:
+               /* Load a Byte */
+               emit(ARM_LDRB_I(rd, rm, 0), ctx);
+               break;
+       }
+       if (dstk)
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+}
+
+/* Arithmetic Operation */
+static inline void emit_ar_r(const u8 rd, const u8 rt, const u8 rm,
+                            const u8 rn, struct jit_ctx *ctx, u8 op) {
+       switch (op) {
+       case BPF_JSET:
+               ctx->seen |= SEEN_CALL;
+               emit(ARM_AND_R(ARM_IP, rt, rn), ctx);
+               emit(ARM_AND_R(ARM_LR, rd, rm), ctx);
+               emit(ARM_ORRS_R(ARM_IP, ARM_LR, ARM_IP), ctx);
+               break;
+       case BPF_JEQ:
+       case BPF_JNE:
+       case BPF_JGT:
+       case BPF_JGE:
+               emit(ARM_CMP_R(rd, rm), ctx);
+               _emit(ARM_COND_EQ, ARM_CMP_R(rt, rn), ctx);
+               break;
+       case BPF_JSGT:
+               emit(ARM_CMP_R(rn, rt), ctx);
+               emit(ARM_SBCS_R(ARM_IP, rm, rd), ctx);
+               break;
+       case BPF_JSGE:
+               emit(ARM_CMP_R(rt, rn), ctx);
+               emit(ARM_SBCS_R(ARM_IP, rd, rm), ctx);
+               break;
+       }
+}
+
+static int out_offset = -1; /* initialized on the first pass of build_body() */
+static int emit_bpf_tail_call(struct jit_ctx *ctx)
 {
-       if (!(ctx->seen & SEEN_X))
-               ctx->flags |= FLAG_NEED_X_RESET;

-       ctx->seen |= SEEN_X;
+       /* bpf_tail_call(void *prog_ctx, struct bpf_array *array, u64 index) */
+       const u8 *r2 = bpf2a32[BPF_REG_2];
+       const u8 *r3 = bpf2a32[BPF_REG_3];
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       const u8 *tcc = bpf2a32[TCALL_CNT];
+       const int idx0 = ctx->idx;
+#define cur_offset (ctx->idx - idx0)
+#define jmp_offset (out_offset - (cur_offset))
+       u32 off, lo, hi;
+
+       /* if (index >= array->map.max_entries)
+        *      goto out;
+        */
+       off = offsetof(struct bpf_array, map.max_entries);
+       /* array->map.max_entries */
+       emit_a32_mov_i(tmp[1], off, false, ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
+       /* index (64 bit) */
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
+       /* index >= array->map.max_entries */
+       emit(ARM_CMP_R(tmp2[1], tmp[1]), ctx);
+       _emit(ARM_COND_CS, ARM_B(jmp_offset), ctx);
+
+       /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
+        *      goto out;
+        * tail_call_cnt++;
+        */
+       lo = (u32)MAX_TAIL_CALL_CNT;
+       hi = (u32)((u64)MAX_TAIL_CALL_CNT >> 32);
+       emit(ARM_LDR_I(tmp[1], ARM_SP, STACK_VAR(tcc[1])), ctx);
+       emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(tcc[0])), ctx);
+       emit(ARM_CMP_I(tmp[0], hi), ctx);
+       _emit(ARM_COND_EQ, ARM_CMP_I(tmp[1], lo), ctx);
+       _emit(ARM_COND_HI, ARM_B(jmp_offset), ctx);
+       emit(ARM_ADDS_I(tmp[1], tmp[1], 1), ctx);
+       emit(ARM_ADC_I(tmp[0], tmp[0], 0), ctx);
+       emit(ARM_STR_I(tmp[1], ARM_SP, STACK_VAR(tcc[1])), ctx);
+       emit(ARM_STR_I(tmp[0], ARM_SP, STACK_VAR(tcc[0])), ctx);
+
+       /* prog = array->ptrs[index]
+        * if (prog == NULL)
+        *      goto out;
+        */
+       off = offsetof(struct bpf_array, ptrs);
+       emit_a32_mov_i(tmp[1], off, false, ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
+       emit(ARM_MOV_SI(tmp[0], tmp2[1], SRTYPE_ASL, 2), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp[1], tmp[0]), ctx);
+       emit(ARM_CMP_I(tmp[1], 0), ctx);
+       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+
+       /* goto *(prog->bpf_func + prologue_size); */
+       off = offsetof(struct bpf_prog, bpf_func);
+       emit_a32_mov_i(tmp2[1], off, false, ctx);
+       emit(ARM_LDR_R(tmp[1], tmp[1], tmp2[1]), ctx);
+       emit(ARM_ADD_I(tmp[1], tmp[1], ctx->prologue_bytes), ctx);
+       emit(ARM_BX(tmp[1]), ctx);
+
+       /* out: */
+       if (out_offset == -1)
+               out_offset = cur_offset;
+       if (cur_offset != out_offset) {
+               pr_err_once("tail_call out_offset = %d, expected %d!\n",
+                           cur_offset, out_offset);
+               return -1;
+       }
+       return 0;
+#undef cur_offset
+#undef jmp_offset
 }

-static int build_body(struct jit_ctx *ctx)
+/* 0xabcd => 0xcdab */
+static inline void emit_rev16(const u8 rd, const u8 rn, struct jit_ctx *ctx)
 {
-       void *load_func[] = {jit_get_skb_b, jit_get_skb_h, jit_get_skb_w};
-       const struct bpf_prog *prog = ctx->skf;
-       const struct sock_filter *inst;
-       unsigned i, load_order, off, condt;
-       int imm12;
-       u32 k;
+#if __LINUX_ARM_ARCH__ < 6
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       emit(ARM_AND_I(tmp2[1], rn, 0xff), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 8), ctx);
+       emit(ARM_AND_I(tmp2[0], tmp2[0], 0xff), ctx);
+       emit(ARM_ORR_SI(rd, tmp2[0], tmp2[1], SRTYPE_LSL, 8), ctx);
+#else /* ARMv6+ */
+       emit(ARM_REV16(rd, rn), ctx);
+#endif
+}

-       for (i = 0; i < prog->len; i++) {
-               u16 code;
+/* 0xabcdefgh => 0xghefcdab */
+static inline void emit_rev32(const u8 rd, const u8 rn, struct jit_ctx *ctx)
+{
+#if __LINUX_ARM_ARCH__ < 6
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       emit(ARM_AND_I(tmp2[1], rn, 0xff), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 24), ctx);
+       emit(ARM_ORR_SI(ARM_IP, tmp2[0], tmp2[1], SRTYPE_LSL, 24), ctx);
+
+       emit(ARM_MOV_I(tmp2[1], rn, 0xff00), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 8), ctx);
+       emit(ARM_MOV_I(tmp2[0], tmp2[0], 0xff00), ctx);
+       emit(ARM_ORR_SI(tmp2[0], tmp2[0], tmp2[1], SRTYPE_LSL, 8), ctx);
+       emit(ARM_ORR_R(rd, ARM_IP, tmp2[0]), ctx);
+#else /* ARMv6+ */
+       emit(ARM_REV(rd, rn), ctx);
+#endif
+}

-               inst = &(prog->insns[i]);
-               /* K as an immediate value operand */
-               k = inst->k;
-               code = bpf_anc_helper(inst);
+static void build_prologue(struct jit_ctx *ctx)
+{
+       const u8 r0 = bpf2a32[BPF_REG_0][1];
+       const u8 r2 = bpf2a32[BPF_REG_1][1];
+       const u8 r3 = bpf2a32[BPF_REG_1][0];
+       const u8 r4 = bpf2a32[BPF_REG_6][1];
+       const u8 r5 = bpf2a32[BPF_REG_6][0];
+       const u8 r6 = bpf2a32[TMP_REG_1][1];
+       const u8 r7 = bpf2a32[TMP_REG_1][0];
+       const u8 r8 = bpf2a32[TMP_REG_2][1];
+       const u8 r10 = bpf2a32[TMP_REG_2][0];
+       const u8 fplo = bpf2a32[BPF_REG_FP][1];
+       const u8 fphi = bpf2a32[BPF_REG_FP][0];
+       const u8 sp = ARM_SP;
+       const u8 *tcc = bpf2a32[TCALL_CNT];
+
+       u16 reg_set = 0;

-               /* compute offsets only in the fake pass */
-               if (ctx->target == NULL)
-                       ctx->offsets[i] = ctx->idx * 4;
+       /*
+        * eBPF prog stack layout
+        *
+        *                         high
+        * original ARM_SP =>     +-----+ eBPF prologue
+        *                        |FP/LR|
+        * current ARM_FP =>      +-----+
+        *                        | ... | callee saved registers
+        * eBPF fp register =>    +-----+ <= (BPF_FP)
+        *                        | ... | eBPF JIT scratch space
+        *                        |     | eBPF prog stack
+        *                        +-----+
+        *                        |RSVD | JIT scratchpad
+        * current ARM_SP =>      +-----+ <= (BPF_FP - STACK_SIZE)
+        *                        |     |
+        *                        | ... | Function call stack
+        *                        |     |
+        *                        +-----+
+        *                          low
+        */

-               switch (code) {
-               case BPF_LD | BPF_IMM:
-                       emit_mov_i(r_A, k, ctx);
+       /* Save callee saved registers. */
+       reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
+#ifdef CONFIG_FRAME_POINTER
+       reg_set |= (1<<ARM_FP) | (1<<ARM_IP) | (1<<ARM_LR) | (1<<ARM_PC);
+       emit(ARM_MOV_R(ARM_IP, sp), ctx);
+       emit(ARM_PUSH(reg_set), ctx);
+       emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
+#else
+       /* Check if call instruction exists in BPF body */
+       if (ctx->seen & SEEN_CALL)
+               reg_set |= (1<<ARM_LR);
+       emit(ARM_PUSH(reg_set), ctx);
+#endif
+       /* Save frame pointer for later */
+       emit(ARM_SUB_I(ARM_IP, sp, SCRATCH_SIZE), ctx);
+
+       /* Set up function call stack */
+       emit(ARM_SUB_I(ARM_SP, ARM_SP, imm8m(STACK_SIZE)), ctx);
+
+       /* Set up BPF prog stack base register */
+       emit_a32_mov_r(fplo, ARM_IP, true, false, ctx);
+       emit_a32_mov_i(fphi, 0, true, ctx);
+
+       /* mov r4, 0 */
+       emit(ARM_MOV_I(r4, 0), ctx);
+       /* MOV bpf_ctx pointer to BPF_R1 */
+       emit(ARM_MOV_R(r3, r4), ctx);
+       emit(ARM_MOV_R(r2, r0), ctx);
+       /* Initialize Tail Count */
+       emit(ARM_STR_I(r4, ARM_SP, STACK_VAR(tcc[0])), ctx);
+       emit(ARM_STR_I(r4, ARM_SP, STACK_VAR(tcc[1])), ctx);
+       /* end of prologue */
+}
+
+static void build_epilogue(struct jit_ctx *ctx)
+{
+       const u8 r4 = bpf2a32[BPF_REG_6][1];
+       const u8 r5 = bpf2a32[BPF_REG_6][0];
+       const u8 r6 = bpf2a32[TMP_REG_1][1];
+       const u8 r7 = bpf2a32[TMP_REG_1][0];
+       const u8 r8 = bpf2a32[TMP_REG_2][1];
+       const u8 r10 = bpf2a32[TMP_REG_2][0];
+       u16 reg_set = 0;
+
+       /* unwind function call stack */
+       emit(ARM_ADD_I(ARM_SP, ARM_SP, imm8m(STACK_SIZE)), ctx);
+
+       /* restore callee saved registers. */
+       reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
+#ifdef CONFIG_FRAME_POINTER
+       /* the first instruction of the prologue was: mov ip, sp */
+       reg_set |= (1<<ARM_FP) | (1<<ARM_SP) | (1<<ARM_PC);
+       emit(ARM_LDM(ARM_SP, reg_set), ctx);
+#else
+       if (ctx->seen & SEEN_CALL)
+               reg_set |= (1<<ARM_PC);
+       /* Restore callee saved registers. */
+       emit(ARM_POP(reg_set), ctx);
+       /* Return to the caller */
+       if (!(ctx->seen & SEEN_CALL))
+               emit(ARM_BX(ARM_LR), ctx);
+#endif
+}
+
+/*
+ * Convert an eBPF instruction to a native instruction, i.e.
+ * JIT an eBPF instruction.
+ * Returns:
+ *     0  - Successfully JITed an 8-byte eBPF instruction
+ *     >0 - Successfully JITed a 16-byte eBPF instruction
+ *     <0 - Failed to JIT.
+ */
+static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
+{
+       const u8 code = insn->code;
+       const u8 *dst = bpf2a32[insn->dst_reg];
+       const u8 *src = bpf2a32[insn->src_reg];
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       const s16 off = insn->off;
+       const s32 imm = insn->imm;
+       const int i = insn - ctx->prog->insnsi;
+       const bool is64 = BPF_CLASS(code) == BPF_ALU64;
+       const bool dstk = is_on_stack(insn->dst_reg);
+       const bool sstk = is_on_stack(insn->src_reg);
+       u8 rd, rt, rm, rn;
+       s32 jmp_offset;
+
+#define check_imm(bits, imm) do {                              \
+       if ((((imm) > 0) && ((imm) >> (bits))) ||               \
+           (((imm) < 0) && (~(imm) >> (bits)))) {              \
+               pr_info("[%2d] imm=%d(0x%x) out of range\n",    \
+                       i, imm, imm);                           \
+               return -EINVAL;                                 \
+       }                                                       \
+} while (0)
+#define check_imm24(imm) check_imm(24, imm)
+
+       switch (code) {
+       /* ALU operations */
+
+       /* dst = src */
+       case BPF_ALU | BPF_MOV | BPF_K:
+       case BPF_ALU | BPF_MOV | BPF_X:
+       case BPF_ALU64 | BPF_MOV | BPF_K:
+       case BPF_ALU64 | BPF_MOV | BPF_X:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_mov_r64(is64, dst, src, dstk, sstk, ctx);
                        break;
-               case BPF_LD | BPF_W | BPF_LEN:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, len) != 4);
-                       emit(ARM_LDR_I(r_A, r_skb,
-                                      offsetof(struct sk_buff, len)), ctx);
+               case BPF_K:
+                       /* Sign-extend immediate value to destination reg */
+                       emit_a32_mov_i64(is64, dst, imm, dstk, ctx);
                        break;
-               case BPF_LD | BPF_MEM:
-                       /* A = scratch[k] */
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_LDR_I(r_A, ARM_SP, SCRATCH_OFF(k)), ctx);
+               }
+               break;
+       /* dst = dst + src/imm */
+       /* dst = dst - src/imm */
+       /* dst = dst | src/imm */
+       /* dst = dst & src/imm */
+       /* dst = dst ^ src/imm */
+       /* dst = dst * src/imm */
+       /* dst = dst << src */
+       /* dst = dst >> src */
+       case BPF_ALU | BPF_ADD | BPF_K:
+       case BPF_ALU | BPF_ADD | BPF_X:
+       case BPF_ALU | BPF_SUB | BPF_K:
+       case BPF_ALU | BPF_SUB | BPF_X:
+       case BPF_ALU | BPF_OR | BPF_K:
+       case BPF_ALU | BPF_OR | BPF_X:
+       case BPF_ALU | BPF_AND | BPF_K:
+       case BPF_ALU | BPF_AND | BPF_X:
+       case BPF_ALU | BPF_XOR | BPF_K:
+       case BPF_ALU | BPF_XOR | BPF_X:
+       case BPF_ALU | BPF_MUL | BPF_K:
+       case BPF_ALU | BPF_MUL | BPF_X:
+       case BPF_ALU | BPF_LSH | BPF_X:
+       case BPF_ALU | BPF_RSH | BPF_X:
+       case BPF_ALU | BPF_ARSH | BPF_K:
+       case BPF_ALU | BPF_ARSH | BPF_X:
+       case BPF_ALU64 | BPF_ADD | BPF_K:
+       case BPF_ALU64 | BPF_ADD | BPF_X:
+       case BPF_ALU64 | BPF_SUB | BPF_K:
+       case BPF_ALU64 | BPF_SUB | BPF_X:
+       case BPF_ALU64 | BPF_OR | BPF_K:
+       case BPF_ALU64 | BPF_OR | BPF_X:
+       case BPF_ALU64 | BPF_AND | BPF_K:
+       case BPF_ALU64 | BPF_AND | BPF_X:
+       case BPF_ALU64 | BPF_XOR | BPF_K:
+       case BPF_ALU64 | BPF_XOR | BPF_X:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_alu_r64(is64, dst, src, dstk, sstk,
+                                        ctx, BPF_OP(code));
                        break;
-               case BPF_LD | BPF_W | BPF_ABS:
-                       load_order = 2;
-                       goto load;
-               case BPF_LD | BPF_H | BPF_ABS:
-                       load_order = 1;
-                       goto load;
-               case BPF_LD | BPF_B | BPF_ABS:
-                       load_order = 0;
-load:
-                       emit_mov_i(r_off, k, ctx);
-load_common:
-                       ctx->seen |= SEEN_DATA | SEEN_CALL;
-
-                       if (load_order > 0) {
-                               emit(ARM_SUB_I(r_scratch, r_skb_hl,
-                                              1 << load_order), ctx);
-                               emit(ARM_CMP_R(r_scratch, r_off), ctx);
-                               condt = ARM_COND_GE;
-                       } else {
-                               emit(ARM_CMP_R(r_skb_hl, r_off), ctx);
-                               condt = ARM_COND_HI;
-                       }
-
-                       /*
-                        * test for negative offset, only if we are
-                        * currently scheduled to take the fast
-                        * path. this will update the flags so that
-                        * the slowpath instruction are ignored if the
-                        * offset is negative.
-                        *
-                        * for loard_order == 0 the HI condition will
-                        * make loads at offset 0 take the slow path too.
+               case BPF_K:
+                       /* Move the immediate value into a temporary
+                        * register first; the move sign-extends it.
+                        * The ALU operation is then done on the
+                        * temporary register, which makes it safe
+                        * to operate on the extended value.
                         */
-                       _emit(condt, ARM_CMP_I(r_off, 0), ctx);
-
-                       _emit(condt, ARM_ADD_R(r_scratch, r_off, r_skb_data),
-                             ctx);
-
-                       if (load_order == 0)
-                               _emit(condt, ARM_LDRB_I(r_A, r_scratch, 0),
-                                     ctx);
-                       else if (load_order == 1)
-                               emit_load_be16(condt, r_A, r_scratch, ctx);
-                       else if (load_order == 2)
-                               emit_load_be32(condt, r_A, r_scratch, ctx);
-
-                       _emit(condt, ARM_B(b_imm(i + 1, ctx)), ctx);
-
-                       /* the slowpath */
-                       emit_mov_i(ARM_R3, (u32)load_func[load_order], ctx);
-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       /* the offset is already in R1 */
-                       emit_blx_r(ARM_R3, ctx);
-                       /* check the result of skb_copy_bits */
-                       emit(ARM_CMP_I(ARM_R1, 0), ctx);
-                       emit_err_ret(ARM_COND_NE, ctx);
-                       emit(ARM_MOV_R(r_A, ARM_R0), ctx);
+                       emit_a32_mov_i64(is64, tmp2, imm, false, ctx);
+                       emit_a32_alu_r64(is64, dst, tmp2, dstk, false,
+                                        ctx, BPF_OP(code));
                        break;
-               case BPF_LD | BPF_W | BPF_IND:
-                       load_order = 2;
-                       goto load_ind;
-               case BPF_LD | BPF_H | BPF_IND:
-                       load_order = 1;
-                       goto load_ind;
-               case BPF_LD | BPF_B | BPF_IND:
-                       load_order = 0;
-load_ind:
-                       update_on_xread(ctx);
-                       OP_IMM3(ARM_ADD, r_off, r_X, k, ctx);
-                       goto load_common;
-               case BPF_LDX | BPF_IMM:
-                       ctx->seen |= SEEN_X;
-                       emit_mov_i(r_X, k, ctx);
+               }
+               break;
+       /* dst = dst / src(imm) */
+       /* dst = dst % src(imm) */
+       case BPF_ALU | BPF_DIV | BPF_K:
+       case BPF_ALU | BPF_DIV | BPF_X:
+       case BPF_ALU | BPF_MOD | BPF_K:
+       case BPF_ALU | BPF_MOD | BPF_X:
+               rt = src_lo;
+               rd = dstk ? tmp2[1] : dst_lo;
+               if (dstk)
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       rt = sstk ? tmp2[0] : rt;
+                       if (sstk)
+                               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)),
+                                    ctx);
                        break;
-               case BPF_LDX | BPF_W | BPF_LEN:
-                       ctx->seen |= SEEN_X | SEEN_SKB;
-                       emit(ARM_LDR_I(r_X, r_skb,
-                                      offsetof(struct sk_buff, len)), ctx);
+               case BPF_K:
+                       rt = tmp2[0];
+                       emit_a32_mov_i(rt, imm, false, ctx);
                        break;
-               case BPF_LDX | BPF_MEM:
-                       ctx->seen |= SEEN_X | SEEN_MEM_WORD(k);
-                       emit(ARM_LDR_I(r_X, ARM_SP, SCRATCH_OFF(k)), ctx);
+               }
+               emit_udivmod(rd, rd, rt, ctx, BPF_OP(code));
+               if (dstk)
+                       emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       case BPF_ALU64 | BPF_DIV | BPF_K:
+       case BPF_ALU64 | BPF_DIV | BPF_X:
+       case BPF_ALU64 | BPF_MOD | BPF_K:
+       case BPF_ALU64 | BPF_MOD | BPF_X:
+               goto notyet;
+       /* dst = dst >> imm */
+       /* dst = dst << imm */
+       case BPF_ALU | BPF_RSH | BPF_K:
+       case BPF_ALU | BPF_LSH | BPF_K:
+               if (unlikely(imm > 31))
+                       return -EINVAL;
+               if (imm)
+                       emit_a32_alu_i(dst_lo, imm, dstk, ctx, BPF_OP(code));
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       /* dst = dst << imm */
+       case BPF_ALU64 | BPF_LSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_lsh_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = dst >> imm */
+       case BPF_ALU64 | BPF_RSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_lsr_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = dst << src */
+       case BPF_ALU64 | BPF_LSH | BPF_X:
+               emit_a32_lsh_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> src */
+       case BPF_ALU64 | BPF_RSH | BPF_X:
+               emit_a32_lsr_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> src (signed) */
+       case BPF_ALU64 | BPF_ARSH | BPF_X:
+               emit_a32_arsh_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> imm (signed) */
+       case BPF_ALU64 | BPF_ARSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_arsh_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = -dst */
+       case BPF_ALU | BPF_NEG:
+               emit_a32_alu_i(dst_lo, 0, dstk, ctx, BPF_OP(code));
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       /* dst = -dst (64 bit) */
+       case BPF_ALU64 | BPF_NEG:
+               emit_a32_neg64(dst, dstk, ctx);
+               break;
+       /* dst = dst * src/imm */
+       case BPF_ALU64 | BPF_MUL | BPF_X:
+       case BPF_ALU64 | BPF_MUL | BPF_K:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_mul_r64(dst, src, dstk, sstk, ctx);
                        break;
-               case BPF_LDX | BPF_B | BPF_MSH:
-                       /* x = ((*(frame + k)) & 0xf) << 2; */
-                       ctx->seen |= SEEN_X | SEEN_DATA | SEEN_CALL;
-                       /* the interpreter should deal with the negative K */
-                       if ((int)k < 0)
-                               return -1;
-                       /* offset in r1: we might have to take the slow path */
-                       emit_mov_i(r_off, k, ctx);
-                       emit(ARM_CMP_R(r_skb_hl, r_off), ctx);
-
-                       /* load in r0: common with the slowpath */
-                       _emit(ARM_COND_HI, ARM_LDRB_R(ARM_R0, r_skb_data,
-                                                     ARM_R1), ctx);
-                       /*
-                        * emit_mov_i() might generate one or two instructions,
-                        * the same holds for emit_blx_r()
+               case BPF_K:
+                       /* Move the immediate value into the temporary
+                        * register pair first. This sign-extends the
+                        * immediate into the temp reg, so it is then
+                        * safe to do the multiplication on the
+                        * register pair.
                         */
-                       _emit(ARM_COND_HI, ARM_B(b_imm(i + 1, ctx) - 2), ctx);
-
-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       /* r_off is r1 */
-                       emit_mov_i(ARM_R3, (u32)jit_get_skb_b, ctx);
-                       emit_blx_r(ARM_R3, ctx);
-                       /* check the return value of skb_copy_bits */
-                       emit(ARM_CMP_I(ARM_R1, 0), ctx);
-                       emit_err_ret(ARM_COND_NE, ctx);
-
-                       emit(ARM_AND_I(r_X, ARM_R0, 0x00f), ctx);
-                       emit(ARM_LSL_I(r_X, r_X, 2), ctx);
-                       break;
-               case BPF_ST:
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_STR_I(r_A, ARM_SP, SCRATCH_OFF(k)), ctx);
-                       break;
-               case BPF_STX:
-                       update_on_xread(ctx);
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_STR_I(r_X, ARM_SP, SCRATCH_OFF(k)), ctx);
-                       break;
-               case BPF_ALU | BPF_ADD | BPF_K:
-                       /* A += K */
-                       OP_IMM3(ARM_ADD, r_A, r_A, k, ctx);
-                       break;
-               case BPF_ALU | BPF_ADD | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_ADD_R(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_SUB | BPF_K:
-                       /* A -= K */
-                       OP_IMM3(ARM_SUB, r_A, r_A, k, ctx);
-                       break;
-               case BPF_ALU | BPF_SUB | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_SUB_R(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_MUL | BPF_K:
-                       /* A *= K */
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit(ARM_MUL(r_A, r_A, r_scratch), ctx);
-                       break;
-               case BPF_ALU | BPF_MUL | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_MUL(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_DIV | BPF_K:
-                       if (k == 1)
-                               break;
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit_udivmod(r_A, r_A, r_scratch, ctx, BPF_DIV);
-                       break;
-               case BPF_ALU | BPF_DIV | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_I(r_X, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-                       emit_udivmod(r_A, r_A, r_X, ctx, BPF_DIV);
-                       break;
-               case BPF_ALU | BPF_MOD | BPF_K:
-                       if (k == 1) {
-                               emit_mov_i(r_A, 0, ctx);
-                               break;
-                       }
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit_udivmod(r_A, r_A, r_scratch, ctx, BPF_MOD);
+                       emit_a32_mov_i64(is64, tmp2, imm, false, ctx);
+                       emit_a32_mul_r64(dst, tmp2, dstk, false, ctx);
                        break;
-               case BPF_ALU | BPF_MOD | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_I(r_X, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-                       emit_udivmod(r_A, r_A, r_X, ctx, BPF_MOD);
-                       break;
-               case BPF_ALU | BPF_OR | BPF_K:
-                       /* A |= K */
-                       OP_IMM3(ARM_ORR, r_A, r_A, k, ctx);
+               }
+               break;
+       /* dst = htole(dst) */
+       /* dst = htobe(dst) */
+       case BPF_ALU | BPF_END | BPF_FROM_LE:
+       case BPF_ALU | BPF_END | BPF_FROM_BE:
+               rd = dstk ? tmp[0] : dst_hi;
+               rt = dstk ? tmp[1] : dst_lo;
+               if (dstk) {
+                       emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+#ifdef CONFIG_CPU_BIG_ENDIAN
+               if (BPF_SRC(code) == BPF_FROM_BE)
+                       goto emit_bswap_uxt;
+#else /* !CONFIG_CPU_BIG_ENDIAN */
+               if (BPF_SRC(code) == BPF_FROM_LE)
+                       goto emit_bswap_uxt;
+#endif
+               switch (imm) {
+               case 16:
+                       emit_rev16(rt, rt, ctx);
+                       goto emit_bswap_uxt;
+               case 32:
+                       emit_rev32(rt, rt, ctx);
+                       goto emit_bswap_uxt;
+               case 64:
+                       /* Because of the usage of ARM_LR */
+                       ctx->seen |= SEEN_CALL;
+                       emit_rev32(ARM_LR, rt, ctx);
+                       emit_rev32(rt, rd, ctx);
+                       emit(ARM_MOV_R(rd, ARM_LR), ctx);
                        break;
-               case BPF_ALU | BPF_OR | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_ORR_R(r_A, r_A, r_X), ctx);
+               }
+               goto exit;
+emit_bswap_uxt:
+               switch (imm) {
+               case 16:
+                       /* zero-extend 16 bits into 64 bits */
+#if __LINUX_ARM_ARCH__ < 6
+                       emit_a32_mov_i(tmp2[1], 0xffff, false, ctx);
+                       emit(ARM_AND_R(rt, tmp2[1]), ctx);
+#else /* ARMv6+ */
+                       emit(ARM_UXTH(rt, rt), ctx);
+#endif
+                       emit(ARM_EOR_R(rd, rd, rd), ctx);
                        break;
-               case BPF_ALU | BPF_XOR | BPF_K:
-                       /* A ^= K; */
-                       OP_IMM3(ARM_EOR, r_A, r_A, k, ctx);
+               case 32:
+                       /* zero-extend 32 bits into 64 bits */
+                       emit(ARM_EOR_R(rd, rd, rd), ctx);
                        break;
-               case BPF_ANC | SKF_AD_ALU_XOR_X:
-               case BPF_ALU | BPF_XOR | BPF_X:
-                       /* A ^= X */
-                       update_on_xread(ctx);
-                       emit(ARM_EOR_R(r_A, r_A, r_X), ctx);
+               case 64:
+                       /* nop */
                        break;
-               case BPF_ALU | BPF_AND | BPF_K:
-                       /* A &= K */
-                       OP_IMM3(ARM_AND, r_A, r_A, k, ctx);
+               }
+exit:
+               if (dstk) {
+                       emit(ARM_STR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+               break;
+       /* dst = imm64 */
+       case BPF_LD | BPF_IMM | BPF_DW:
+       {
+               const struct bpf_insn insn1 = insn[1];
+               u32 hi, lo = imm;
+
+               if (insn1.code != 0 || insn1.src_reg != 0 ||
+                   insn1.dst_reg != 0 || insn1.off != 0) {
+                       /* Note: verifier in BPF core must catch invalid
+                        * instruction.
+                        */
+                       pr_err_once("Invalid BPF_LD_IMM64 instruction\n");
+                       return -EINVAL;
+               }
+               hi = insn1.imm;
+               emit_a32_mov_i(dst_lo, lo, dstk, ctx);
+               emit_a32_mov_i(dst_hi, hi, dstk, ctx);
+
+               return 1;
+       }
+       /* LDX: dst = *(size *)(src + off) */
+       case BPF_LDX | BPF_MEM | BPF_W:
+       case BPF_LDX | BPF_MEM | BPF_H:
+       case BPF_LDX | BPF_MEM | BPF_B:
+       case BPF_LDX | BPF_MEM | BPF_DW:
+               rn = sstk ? tmp2[1] : src_lo;
+               if (sstk)
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       /* Load a Word */
+               case BPF_H:
+                       /* Load a Half-Word */
+               case BPF_B:
+                       /* Load a Byte */
+                       emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_SIZE(code));
+                       emit_a32_mov_i(dst_hi, 0, dstk, ctx);
                        break;
-               case BPF_ALU | BPF_AND | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_AND_R(r_A, r_A, r_X), ctx);
+               case BPF_DW:
+                       /* Load a double word */
+                       emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_W);
+                       emit_ldx_r(dst_hi, rn, dstk, off+4, ctx, BPF_W);
                        break;
-               case BPF_ALU | BPF_LSH | BPF_K:
-                       if (unlikely(k > 31))
-                               return -1;
-                       emit(ARM_LSL_I(r_A, r_A, k), ctx);
+               }
+               break;
+       /* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
+       case BPF_LD | BPF_ABS | BPF_W:
+       case BPF_LD | BPF_ABS | BPF_H:
+       case BPF_LD | BPF_ABS | BPF_B:
+       /* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
+       case BPF_LD | BPF_IND | BPF_W:
+       case BPF_LD | BPF_IND | BPF_H:
+       case BPF_LD | BPF_IND | BPF_B:
+       {
+               const u8 r4 = bpf2a32[BPF_REG_6][1]; /* r4 = ptr to sk_buff */
+               const u8 r0 = bpf2a32[BPF_REG_0][1]; /* r0: struct sk_buff *skb */
+                                                    /* and return value */
+               const u8 r1 = bpf2a32[BPF_REG_0][0]; /* r1: int k */
+               const u8 r2 = bpf2a32[BPF_REG_1][1]; /* r2: unsigned int size */
+               const u8 r3 = bpf2a32[BPF_REG_1][0]; /* r3: void *buffer */
+               const u8 r6 = bpf2a32[TMP_REG_1][1]; /* r6: void *(*func)(..) */
+               int size;
+
+               /* Setting up first argument */
+               emit(ARM_MOV_R(r0, r4), ctx);
+
+               /* Setting up second argument */
+               emit_a32_mov_i(r1, imm, false, ctx);
+               if (BPF_MODE(code) == BPF_IND)
+                       emit_a32_alu_r(r1, src_lo, false, sstk, ctx,
+                                      false, false, BPF_ADD);
+
+               /* Setting up third argument */
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       size = 4;
                        break;
-               case BPF_ALU | BPF_LSH | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_LSL_R(r_A, r_A, r_X), ctx);
+               case BPF_H:
+                       size = 2;
                        break;
-               case BPF_ALU | BPF_RSH | BPF_K:
-                       if (unlikely(k > 31))
-                               return -1;
-                       if (k)
-                               emit(ARM_LSR_I(r_A, r_A, k), ctx);
+               case BPF_B:
+                       size = 1;
                        break;
-               case BPF_ALU | BPF_RSH | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_LSR_R(r_A, r_A, r_X), ctx);
+               default:
+                       return -EINVAL;
+               }
+               emit_a32_mov_i(r2, size, false, ctx);
+
+               /* Setting up fourth argument */
+               emit(ARM_ADD_I(r3, ARM_SP, imm8m(SKB_BUFFER)), ctx);
+
+               /* Setting up function pointer to call */
+               emit_a32_mov_i(r6, (unsigned int)bpf_load_pointer, false, ctx);
+               emit_blx_r(r6, ctx);
+
+               emit(ARM_EOR_R(r1, r1, r1), ctx);
+               /* Check whether the returned pointer is NULL.
+                * If it is NULL, jump to the epilogue;
+                * otherwise load the value from the returned address.
+                */
+               emit(ARM_CMP_I(r0, 0), ctx);
+               jmp_offset = epilogue_offset(ctx);
+               check_imm24(jmp_offset);
+               _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+
+               /* Load value from the address */
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       emit(ARM_LDR_I(r0, r0, 0), ctx);
+#ifndef CONFIG_CPU_BIG_ENDIAN
+                       emit_rev32(r0, r0, ctx);
+#endif
                        break;
-               case BPF_ALU | BPF_NEG:
-                       /* A = -A */
-                       emit(ARM_RSB_I(r_A, r_A, 0), ctx);
+               case BPF_H:
+                       emit(ARM_LDRH_I(r0, r0, 0), ctx);
+#ifndef CONFIG_CPU_BIG_ENDIAN
+                       emit_rev16(r0, r0, ctx);
+#endif
                        break;
-               case BPF_JMP | BPF_JA:
-                       /* pc += K */
-                       emit(ARM_B(b_imm(i + k + 1, ctx)), ctx);
+               case BPF_B:
+                       emit(ARM_LDRB_I(r0, r0, 0), ctx);
+                       /* No need to reverse */
                        break;
-               case BPF_JMP | BPF_JEQ | BPF_K:
-                       /* pc += (A == K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_EQ;
-                       goto cmp_imm;
-               case BPF_JMP | BPF_JGT | BPF_K:
-                       /* pc += (A > K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_HI;
-                       goto cmp_imm;
-               case BPF_JMP | BPF_JGE | BPF_K:
-                       /* pc += (A >= K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_HS;
-cmp_imm:
-                       imm12 = imm8m(k);
-                       if (imm12 < 0) {
-                               emit_mov_i_no8m(r_scratch, k, ctx);
-                               emit(ARM_CMP_R(r_A, r_scratch), ctx);
-                       } else {
-                               emit(ARM_CMP_I(r_A, imm12), ctx);
-                       }
-cond_jump:
-                       if (inst->jt)
-                               _emit(condt, ARM_B(b_imm(i + inst->jt + 1,
-                                                  ctx)), ctx);
-                       if (inst->jf)
-                               _emit(condt ^ 1, ARM_B(b_imm(i + inst->jf + 1,
-                                                            ctx)), ctx);
+               }
+               break;
+       }
+       /* ST: *(size *)(dst + off) = imm */
+       case BPF_ST | BPF_MEM | BPF_W:
+       case BPF_ST | BPF_MEM | BPF_H:
+       case BPF_ST | BPF_MEM | BPF_B:
+       case BPF_ST | BPF_MEM | BPF_DW:
+               switch (BPF_SIZE(code)) {
+               case BPF_DW:
+                       /* Sign-extend immediate value into temp reg */
+                       emit_a32_mov_i64(true, tmp2, imm, false, ctx);
+                       emit_str_r(dst_lo, tmp2[1], dstk, off, ctx, BPF_W);
+                       emit_str_r(dst_lo, tmp2[0], dstk, off+4, ctx, BPF_W);
                        break;
-               case BPF_JMP | BPF_JEQ | BPF_X:
-                       /* pc += (A == X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_EQ;
-                       goto cmp_x;
-               case BPF_JMP | BPF_JGT | BPF_X:
-                       /* pc += (A > X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_HI;
-                       goto cmp_x;
-               case BPF_JMP | BPF_JGE | BPF_X:
-                       /* pc += (A >= X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_CS;
-cmp_x:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_R(r_A, r_X), ctx);
-                       goto cond_jump;
-               case BPF_JMP | BPF_JSET | BPF_K:
-                       /* pc += (A & K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_NE;
-                       /* not set iff all zeroes iff Z==1 iff EQ */
-
-                       imm12 = imm8m(k);
-                       if (imm12 < 0) {
-                               emit_mov_i_no8m(r_scratch, k, ctx);
-                               emit(ARM_TST_R(r_A, r_scratch), ctx);
-                       } else {
-                               emit(ARM_TST_I(r_A, imm12), ctx);
-                       }
-                       goto cond_jump;
-               case BPF_JMP | BPF_JSET | BPF_X:
-                       /* pc += (A & X) ? pc->jt : pc->jf */
-                       update_on_xread(ctx);
-                       condt  = ARM_COND_NE;
-                       emit(ARM_TST_R(r_A, r_X), ctx);
-                       goto cond_jump;
-               case BPF_RET | BPF_A:
-                       emit(ARM_MOV_R(ARM_R0, r_A), ctx);
-                       goto b_epilogue;
-               case BPF_RET | BPF_K:
-                       if ((k == 0) && (ctx->ret0_fp_idx < 0))
-                               ctx->ret0_fp_idx = i;
-                       emit_mov_i(ARM_R0, k, ctx);
-b_epilogue:
-                       if (i != ctx->skf->len - 1)
-                               emit(ARM_B(b_imm(prog->len, ctx)), ctx);
+               case BPF_W:
+               case BPF_H:
+               case BPF_B:
+                       emit_a32_mov_i(tmp2[1], imm, false, ctx);
+                       emit_str_r(dst_lo, tmp2[1], dstk, off, ctx,
+                                  BPF_SIZE(code));
                        break;
-               case BPF_MISC | BPF_TAX:
-                       /* X = A */
-                       ctx->seen |= SEEN_X;
-                       emit(ARM_MOV_R(r_X, r_A), ctx);
+               }
+               break;
+       /* STX XADD: lock *(u32 *)(dst + off) += src */
+       case BPF_STX | BPF_XADD | BPF_W:
+       /* STX XADD: lock *(u64 *)(dst + off) += src */
+       case BPF_STX | BPF_XADD | BPF_DW:
+               goto notyet;
+       /* STX: *(size *)(dst + off) = src */
+       case BPF_STX | BPF_MEM | BPF_W:
+       case BPF_STX | BPF_MEM | BPF_H:
+       case BPF_STX | BPF_MEM | BPF_B:
+       case BPF_STX | BPF_MEM | BPF_DW:
+       {
+               u8 sz = BPF_SIZE(code);
+
+               rn = sstk ? tmp2[1] : src_lo;
+               rm = sstk ? tmp2[0] : src_hi;
+               if (!sstk)
+                       goto do_store;
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+               case BPF_H:
+                       emit(ARM_LDRH_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+               case BPF_B:
+                       emit(ARM_LDRB_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+empty_hi:
+                       emit(ARM_EOR_R(rm, rm, rm), ctx);
+               case BPF_DW:
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(src_hi)), ctx);
+                       sz = BPF_W;
                        break;
-               case BPF_MISC | BPF_TXA:
-                       /* A = X */
-                       update_on_xread(ctx);
-                       emit(ARM_MOV_R(r_A, r_X), ctx);
+               }
+
+do_store:
+               /* Clear higher word except for BPF_DW */
+               if (BPF_SIZE(code) != BPF_DW)
+                       emit(ARM_EOR_R(rm, rm, rm), ctx);
+
+               /* Store the value */
+               emit_str_r(dst_lo, rn, dstk, off, ctx, sz);
+               emit_str_r(dst_lo, rm, dstk, off+4, ctx, BPF_W);
+               break;
+       }
+       /* PC += off if dst == src */
+       /* PC += off if dst > src */
+       /* PC += off if dst >= src */
+       /* PC += off if dst != src */
+       /* PC += off if dst > src (signed) */
+       /* PC += off if dst >= src (signed) */
+       /* PC += off if dst & src */
+       case BPF_JMP | BPF_JEQ | BPF_X:
+       case BPF_JMP | BPF_JGT | BPF_X:
+       case BPF_JMP | BPF_JGE | BPF_X:
+       case BPF_JMP | BPF_JNE | BPF_X:
+       case BPF_JMP | BPF_JSGT | BPF_X:
+       case BPF_JMP | BPF_JSGE | BPF_X:
+       case BPF_JMP | BPF_JSET | BPF_X:
+               /* Setup source registers */
+               rm = sstk ? tmp2[0] : src_hi;
+               rn = sstk ? tmp2[1] : src_lo;
+               if (sstk) {
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(src_hi)), ctx);
+               }
+               goto go_jmp;
+       /* PC += off if dst == imm */
+       /* PC += off if dst > imm */
+       /* PC += off if dst >= imm */
+       /* PC += off if dst != imm */
+       /* PC += off if dst > imm (signed) */
+       /* PC += off if dst >= imm (signed) */
+       /* PC += off if dst & imm */
+       case BPF_JMP | BPF_JEQ | BPF_K:
+       case BPF_JMP | BPF_JGT | BPF_K:
+       case BPF_JMP | BPF_JGE | BPF_K:
+       case BPF_JMP | BPF_JNE | BPF_K:
+       case BPF_JMP | BPF_JSGT | BPF_K:
+       case BPF_JMP | BPF_JSGE | BPF_K:
+       case BPF_JMP | BPF_JSET | BPF_K:
+               if (off == 0)
                        break;
-               case BPF_ANC | SKF_AD_PROTOCOL:
-                       /* A = ntohs(skb->protocol) */
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 protocol) != 2);
-                       off = offsetof(struct sk_buff, protocol);
-                       emit(ARM_LDRH_I(r_scratch, r_skb, off), ctx);
-                       emit_swap16(r_A, r_scratch, ctx);
+               rm = tmp2[0];
+               rn = tmp2[1];
+               /* Sign-extend immediate value */
+               emit_a32_mov_i64(true, tmp2, imm, false, ctx);
+go_jmp:
+               /* Setup destination register */
+               rd = dstk ? tmp[0] : dst_hi;
+               rt = dstk ? tmp[1] : dst_lo;
+               if (dstk) {
+                       emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+
+               /* Check for the condition */
+               emit_ar_r(rd, rt, rm, rn, ctx, BPF_OP(code));
+
+               /* Setup JUMP instruction */
+               jmp_offset = bpf2a32_offset(i+off, i, ctx);
+               switch (BPF_OP(code)) {
+               case BPF_JNE:
+               case BPF_JSET:
+                       _emit(ARM_COND_NE, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_CPU:
-                       /* r_scratch = current_thread_info() */
-                       OP_IMM3(ARM_BIC, r_scratch, ARM_SP, THREAD_SIZE - 1, ctx);
-                       /* A = current_thread_info()->cpu */
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct thread_info, cpu) != 4);
-                       off = offsetof(struct thread_info, cpu);
-                       emit(ARM_LDR_I(r_A, r_scratch, off), ctx);
+               case BPF_JEQ:
+                       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_IFINDEX:
-               case BPF_ANC | SKF_AD_HATYPE:
-                       /* A = skb->dev->ifindex */
-                       /* A = skb->dev->type */
-                       ctx->seen |= SEEN_SKB;
-                       off = offsetof(struct sk_buff, dev);
-                       emit(ARM_LDR_I(r_scratch, r_skb, off), ctx);
-
-                       emit(ARM_CMP_I(r_scratch, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct net_device,
-                                                 ifindex) != 4);
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct net_device,
-                                                 type) != 2);
-
-                       if (code == (BPF_ANC | SKF_AD_IFINDEX)) {
-                               off = offsetof(struct net_device, ifindex);
-                               emit(ARM_LDR_I(r_A, r_scratch, off), ctx);
-                       } else {
-                               /*
-                                * offset of field "type" in "struct
-                                * net_device" is above what can be
-                                * used in the ldrh rd, [rn, #imm]
-                                * instruction, so load the offset in
-                                * a register and use ldrh rd, [rn, rm]
-                                */
-                               off = offsetof(struct net_device, type);
-                               emit_mov_i(ARM_R3, off, ctx);
-                               emit(ARM_LDRH_R(r_A, r_scratch, ARM_R3), ctx);
-                       }
+               case BPF_JGT:
+                       _emit(ARM_COND_HI, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_MARK:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, mark) != 4);
-                       off = offsetof(struct sk_buff, mark);
-                       emit(ARM_LDR_I(r_A, r_skb, off), ctx);
+               case BPF_JGE:
+                       _emit(ARM_COND_CS, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_RXHASH:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, hash) != 4);
-                       off = offsetof(struct sk_buff, hash);
-                       emit(ARM_LDR_I(r_A, r_skb, off), ctx);
+               case BPF_JSGT:
+                       _emit(ARM_COND_LT, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_VLAN_TAG:
-               case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
-                       off = offsetof(struct sk_buff, vlan_tci);
-                       emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
-                       if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
-                               OP_IMM3(ARM_AND, r_A, r_A, ~VLAN_TAG_PRESENT, ctx);
-                       else {
-                               OP_IMM3(ARM_LSR, r_A, r_A, 12, ctx);
-                               OP_IMM3(ARM_AND, r_A, r_A, 0x1, ctx);
-                       }
+               case BPF_JSGE:
+                       _emit(ARM_COND_GE, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_PKTTYPE:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 __pkt_type_offset[0]) != 1);
-                       off = PKT_TYPE_OFFSET();
-                       emit(ARM_LDRB_I(r_A, r_skb, off), ctx);
-                       emit(ARM_AND_I(r_A, r_A, PKT_TYPE_MAX), ctx);
-#ifdef __BIG_ENDIAN_BITFIELD
-                       emit(ARM_LSR_I(r_A, r_A, 5), ctx);
-#endif
+               }
+               break;
+       /* JMP OFF */
+       case BPF_JMP | BPF_JA:
+       {
+               if (off == 0)
                        break;
-               case BPF_ANC | SKF_AD_QUEUE:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 queue_mapping) != 2);
-                       BUILD_BUG_ON(offsetof(struct sk_buff,
-                                             queue_mapping) > 0xff);
-                       off = offsetof(struct sk_buff, queue_mapping);
-                       emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
+               jmp_offset = bpf2a32_offset(i+off, i, ctx);
+               check_imm24(jmp_offset);
+               emit(ARM_B(jmp_offset), ctx);
+               break;
+       }
+       /* tail call */
+       case BPF_JMP | BPF_CALL | BPF_X:
+               if (emit_bpf_tail_call(ctx))
+                       return -EFAULT;
+               break;
+       /* function call */
+       case BPF_JMP | BPF_CALL:
+               goto notyet;
+       /* function return */
+       case BPF_JMP | BPF_EXIT:
+               /* Optimization: when last instruction is EXIT
+                * simply fallthrough to epilogue.
+                */
+               if (i == ctx->prog->len - 1)
                        break;
-               case BPF_ANC | SKF_AD_PAY_OFFSET:
-                       ctx->seen |= SEEN_SKB | SEEN_CALL;
+               jmp_offset = epilogue_offset(ctx);
+               check_imm24(jmp_offset);
+               emit(ARM_B(jmp_offset), ctx);
+               break;
+notyet:
+               pr_info_once("*** NOT YET: opcode %02x ***\n", code);
+               return -EFAULT;
+       default:
+               pr_err_once("unknown opcode %02x\n", code);
+               return -EINVAL;
+       }

-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       emit_mov_i(ARM_R3, (unsigned int)skb_get_poff, ctx);
-                       emit_blx_r(ARM_R3, ctx);
-                       emit(ARM_MOV_R(r_A, ARM_R0), ctx);
-                       break;
-               case BPF_LDX | BPF_W | BPF_ABS:
-                       /*
-                        * load a 32bit word from struct seccomp_data.
-                        * seccomp_check_filter() will already have checked
-                        * that k is 32bit aligned and lies within the
-                        * struct seccomp_data.
-                        */
-                       ctx->seen |= SEEN_SKB;
-                       emit(ARM_LDR_I(r_A, r_skb, k), ctx);
-                       break;
-               default:
-                       return -1;
+       if (ctx->flags & FLAG_IMM_OVERFLOW)
+               /*
+                * this instruction generated an overflow when
+                * trying to access the literal pool, so
+                * delegate this filter to the kernel interpreter.
+                */
+               return -1;
+       return 0;
+}
+
+static int build_body(struct jit_ctx *ctx)
+{
+       const struct bpf_prog *prog = ctx->prog;
+       unsigned int i;
+
+       for (i = 0; i < prog->len; i++) {
+               const struct bpf_insn *insn = &(prog->insnsi[i]);
+               int ret;
+
+               emit(ARM_MOV_R(ARM_IP, ARM_PC), ctx);
+               ret = build_insn(insn, ctx);
+
+               /* It's used with loading the 64 bit immediate value. */
+               if (ret > 0) {
+                       i++;
+                       if (ctx->target == NULL)
+                               ctx->offsets[i] = ctx->idx;
+                       continue;
                }

-               if (ctx->flags & FLAG_IMM_OVERFLOW)
-                       /*
-                        * this instruction generated an overflow when
-                        * trying to access the literal pool, so
-                        * delegate this filter to the kernel interpreter.
-                        */
-                       return -1;
+               if (ctx->target == NULL)
+                       ctx->offsets[i] = ctx->idx;
+
+               /* If unsuccessful, return with error code */
+               if (ret)
+                       return ret;
        }
+       return 0;
+}

-       /* compute offsets only during the first pass */
-       if (ctx->target == NULL)
-               ctx->offsets[i] = ctx->idx * 4;
+static int validate_code(struct jit_ctx *ctx)
+{
+       int i;
+
+       for (i = 0; i < ctx->idx; i++) {
+               u32 a32_insn = le32_to_cpu(ctx->target[i]);
+
+               if (a32_insn == ARM_INST_UDF)
+                       return -1;
+       }

        return 0;
 }

+void bpf_jit_compile(struct bpf_prog *prog)
+{
+       /* Nothing to do here. We support Internal BPF. */
+}

-void bpf_jit_compile(struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
+       struct bpf_prog *tmp, *orig_prog = prog;
        struct bpf_binary_header *header;
+       bool tmp_blinded = false;
        struct jit_ctx ctx;
-       unsigned tmp_idx;
-       unsigned alloc_size;
-       u8 *target_ptr;
+       unsigned int tmp_idx;
+       unsigned int image_size;
+       u8 *image_ptr;

+       /* If BPF JIT was not enabled then we must fall back to
+        * the interpreter.
+        */
        if (!bpf_jit_enable)
-               return;
+               return orig_prog;

-       memset(&ctx, 0, sizeof(ctx));
-       ctx.skf         = fp;
-       ctx.ret0_fp_idx = -1;
+       /* If constant blinding was enabled and we failed during blinding
+        * then we must fall back to the interpreter. Otherwise, we save
+        * the new JITed code.
+        */
+       tmp = bpf_jit_blind_constants(prog);

-       ctx.offsets = kzalloc(4 * (ctx.skf->len + 1), GFP_KERNEL);
-       if (ctx.offsets == NULL)
-               return;
+       if (IS_ERR(tmp))
+               return orig_prog;
+       if (tmp != prog) {
+               tmp_blinded = true;
+               prog = tmp;
+       }
+
+       memset(&ctx, 0, sizeof(ctx));
+       ctx.prog = prog;

-       /* fake pass to fill in the ctx->seen */
-       if (unlikely(build_body(&ctx)))
+       /* If we are not able to allocate memory for offsets[], then
+        * we must fall back to the interpreter.
+        */
+       ctx.offsets = kcalloc(prog->len, sizeof(int), GFP_KERNEL);
+       if (ctx.offsets == NULL) {
+               prog = orig_prog;
                goto out;
+       }
+
+       /* 1) fake pass to find the length of the JITed code,
+        * to compute ctx->offsets and other context variables
+        * needed to compute final JITed code.
+        * Also, calculate random starting pointer/start of JITed code
+        * which is prefixed by random number of fault instructions.
+        *
+        * If the first pass fails then there is no chance of it
+        * being successful in the second pass, so just fall back
+        * to the interpreter.
+        */
+       if (build_body(&ctx)) {
+               prog = orig_prog;
+               goto out_off;
+       }

        tmp_idx = ctx.idx;
        build_prologue(&ctx);
        ctx.prologue_bytes = (ctx.idx - tmp_idx) * 4;

+       ctx.epilogue_offset = ctx.idx;
+
 #if __LINUX_ARM_ARCH__ < 7
        tmp_idx = ctx.idx;
        build_epilogue(&ctx);
@@ -1020,64 +1845,95 @@ void bpf_jit_compile(struct bpf_prog *fp)

        ctx.idx += ctx.imm_count;
        if (ctx.imm_count) {
-               ctx.imms = kzalloc(4 * ctx.imm_count, GFP_KERNEL);
-               if (ctx.imms == NULL)
-                       goto out;
+               ctx.imms = kcalloc(ctx.imm_count, sizeof(u32), GFP_KERNEL);
+               if (ctx.imms == NULL) {
+                       prog = orig_prog;
+                       goto out_off;
+               }
        }
 #else
        /* there's nothing after the epilogue on ARMv7 */
        build_epilogue(&ctx);
 #endif
-       alloc_size = 4 * ctx.idx;
-       header = bpf_jit_binary_alloc(alloc_size, &target_ptr,
-                                     4, jit_fill_hole);
-       if (header == NULL)
-               goto out;
+       /* Now we can get the actual image size of the JITed arm code.
+        * Currently, we are not considering the THUMB-2 instructions
+        * for jit, although it can decrease the size of the image.
+        *
+        * As each arm instruction is of length 32bit, we are translating
+        * number of JITed instructions into the size required to store this
+        * JITed code.
+        */
+       image_size = sizeof(u32) * ctx.idx;

-       ctx.target = (u32 *) target_ptr;
+       /* Now we know the size of the structure to make */
+       header = bpf_jit_binary_alloc(image_size, &image_ptr,
+                                     sizeof(u32), jit_fill_hole);
+       /* Not able to allocate memory for the structure then
+        * we must fall back to the interpreter
+        */
+       if (header == NULL) {
+               prog = orig_prog;
+               goto out_imms;
+       }
+
+       /* 2.) Actual pass to generate final JIT code */
+       ctx.target = (u32 *) image_ptr;
        ctx.idx = 0;

        build_prologue(&ctx);
+
+       /* If building the body of the JITed code fails somehow,
+        * we fall back to the interpreter.
+        */
        if (build_body(&ctx) < 0) {
-#if __LINUX_ARM_ARCH__ < 7
-               if (ctx.imm_count)
-                       kfree(ctx.imms);
-#endif
+               image_ptr = NULL;
                bpf_jit_binary_free(header);
-               goto out;
+               prog = orig_prog;
+               goto out_imms;
        }
        build_epilogue(&ctx);

+       /* 3.) Extra pass to validate JITed Code */
+       if (validate_code(&ctx)) {
+               image_ptr = NULL;
+               bpf_jit_binary_free(header);
+               prog = orig_prog;
+               goto out_imms;
+       }
        flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));

-#if __LINUX_ARM_ARCH__ < 7
-       if (ctx.imm_count)
-               kfree(ctx.imms);
-#endif
-
        if (bpf_jit_enable > 1)
                /* there are 2 passes here */
-               bpf_jit_dump(fp->len, alloc_size, 2, ctx.target);
+               bpf_jit_dump(prog->len, image_size, 2, ctx.target);

        set_memory_ro((unsigned long)header, header->pages);
-       fp->bpf_func = (void *)ctx.target;
-       fp->jited = 1;
-out:
+       prog->bpf_func = (void *)ctx.target;
+       prog->jited = 1;
+out_imms:
+#if __LINUX_ARM_ARCH__ < 7
+       if (ctx.imm_count)
+               kfree(ctx.imms);
+#endif
+out_off:
        kfree(ctx.offsets);
-       return;
+out:
+       if (tmp_blinded)
+               bpf_jit_prog_release_other(prog, prog == orig_prog ?
+                                          tmp : orig_prog);
+       return prog;
 }

-void bpf_jit_free(struct bpf_prog *fp)
+void bpf_jit_free(struct bpf_prog *prog)
 {
-       unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
+       unsigned long addr = (unsigned long)prog->bpf_func & PAGE_MASK;
        struct bpf_binary_header *header = (void *)addr;

-       if (!fp->jited)
+       if (!prog->jited)
                goto free_filter;

        set_memory_rw(addr, header->pages);
        bpf_jit_binary_free(header);

 free_filter:
-       bpf_prog_unlock_free(fp);
+       bpf_prog_unlock_free(prog);
 }
diff --git a/arch/arm/net/bpf_jit_32.h b/arch/arm/net/bpf_jit_32.h
index c46fca2..d5cf5f6 100644
--- a/arch/arm/net/bpf_jit_32.h
+++ b/arch/arm/net/bpf_jit_32.h
@@ -11,6 +11,7 @@
 #ifndef PFILTER_OPCODES_ARM_H
 #define PFILTER_OPCODES_ARM_H

+/* ARM 32bit Registers */
 #define ARM_R0 0
 #define ARM_R1 1
 #define ARM_R2 2
@@ -22,38 +23,43 @@
 #define ARM_R8 8
 #define ARM_R9 9
 #define ARM_R10        10
-#define ARM_FP 11
-#define ARM_IP 12
-#define ARM_SP 13
-#define ARM_LR 14
-#define ARM_PC 15
-
-#define ARM_COND_EQ            0x0
-#define ARM_COND_NE            0x1
-#define ARM_COND_CS            0x2
+#define ARM_FP 11      /* Frame Pointer */
+#define ARM_IP 12      /* Intra-procedure scratch register */
+#define ARM_SP 13      /* Stack pointer: as load/store base reg */
+#define ARM_LR 14      /* Link Register */
+#define ARM_PC 15      /* Program counter */
+
+#define ARM_COND_EQ            0x0     /* == */
+#define ARM_COND_NE            0x1     /* != */
+#define ARM_COND_CS            0x2     /* unsigned >= */
 #define ARM_COND_HS            ARM_COND_CS
-#define ARM_COND_CC            0x3
+#define ARM_COND_CC            0x3     /* unsigned < */
 #define ARM_COND_LO            ARM_COND_CC
-#define ARM_COND_MI            0x4
-#define ARM_COND_PL            0x5
-#define ARM_COND_VS            0x6
-#define ARM_COND_VC            0x7
-#define ARM_COND_HI            0x8
-#define ARM_COND_LS            0x9
-#define ARM_COND_GE            0xa
-#define ARM_COND_LT            0xb
-#define ARM_COND_GT            0xc
-#define ARM_COND_LE            0xd
-#define ARM_COND_AL            0xe
+#define ARM_COND_MI            0x4     /* < 0 */
+#define ARM_COND_PL            0x5     /* >= 0 */
+#define ARM_COND_VS            0x6     /* Signed Overflow */
+#define ARM_COND_VC            0x7     /* No Signed Overflow */
+#define ARM_COND_HI            0x8     /* unsigned > */
+#define ARM_COND_LS            0x9     /* unsigned <= */
+#define ARM_COND_GE            0xa     /* Signed >= */
+#define ARM_COND_LT            0xb     /* Signed < */
+#define ARM_COND_GT            0xc     /* Signed > */
+#define ARM_COND_LE            0xd     /* Signed <= */
+#define ARM_COND_AL            0xe     /* None */

 /* register shift types */
 #define SRTYPE_LSL             0
 #define SRTYPE_LSR             1
 #define SRTYPE_ASR             2
 #define SRTYPE_ROR             3
+#define SRTYPE_ASL             (SRTYPE_LSL)

 #define ARM_INST_ADD_R         0x00800000
+#define ARM_INST_ADDS_R                0x00900000
+#define ARM_INST_ADC_R         0x00a00000
+#define ARM_INST_ADC_I         0x02a00000
 #define ARM_INST_ADD_I         0x02800000
+#define ARM_INST_ADDS_I                0x02900000

 #define ARM_INST_AND_R         0x00000000
 #define ARM_INST_AND_I         0x02000000
@@ -76,8 +82,10 @@
 #define ARM_INST_LDRH_I                0x01d000b0
 #define ARM_INST_LDRH_R                0x019000b0
 #define ARM_INST_LDR_I         0x05900000
+#define ARM_INST_LDR_R         0x07900000

 #define ARM_INST_LDM           0x08900000
+#define ARM_INST_LDM_IA                0x08b00000

 #define ARM_INST_LSL_I         0x01a00000
 #define ARM_INST_LSL_R         0x01a00010
@@ -86,6 +94,7 @@
 #define ARM_INST_LSR_R         0x01a00030

 #define ARM_INST_MOV_R         0x01a00000
+#define ARM_INST_MOVS_R                0x01b00000
 #define ARM_INST_MOV_I         0x03a00000
 #define ARM_INST_MOVW          0x03000000
 #define ARM_INST_MOVT          0x03400000
@@ -96,17 +105,28 @@
 #define ARM_INST_PUSH          0x092d0000

 #define ARM_INST_ORR_R         0x01800000
+#define ARM_INST_ORRS_R                0x01900000
 #define ARM_INST_ORR_I         0x03800000

 #define ARM_INST_REV           0x06bf0f30
 #define ARM_INST_REV16         0x06bf0fb0

 #define ARM_INST_RSB_I         0x02600000
+#define ARM_INST_RSBS_I                0x02700000
+#define ARM_INST_RSC_I         0x02e00000

 #define ARM_INST_SUB_R         0x00400000
+#define ARM_INST_SUBS_R                0x00500000
+#define ARM_INST_RSB_R         0x00600000
 #define ARM_INST_SUB_I         0x02400000
+#define ARM_INST_SUBS_I                0x02500000
+#define ARM_INST_SBC_I         0x02c00000
+#define ARM_INST_SBC_R         0x00c00000
+#define ARM_INST_SBCS_R                0x00d00000

 #define ARM_INST_STR_I         0x05800000
+#define ARM_INST_STRB_I                0x05c00000
+#define ARM_INST_STRH_I                0x01c000b0

 #define ARM_INST_TST_R         0x01100000
 #define ARM_INST_TST_I         0x03100000
@@ -117,6 +137,8 @@

 #define ARM_INST_MLS           0x00600090

+#define ARM_INST_UXTH          0x06ff0070
+
 /*
  * Use a suitable undefined instruction to use for ARM/Thumb2 faulting.
  * We need to be careful not to conflict with those used by other modules
@@ -135,9 +157,15 @@
 #define _AL3_R(op, rd, rn, rm) ((op ## _R) | (rd) << 12 | (rn) << 16 | (rm))
 /* immediate */
 #define _AL3_I(op, rd, rn, imm)        ((op ## _I) | (rd) << 12 | (rn) << 16 | (imm))
+/* register with register-shift */
+#define _AL3_SR(inst)  (inst | (1 << 4))

 #define ARM_ADD_R(rd, rn, rm)  _AL3_R(ARM_INST_ADD, rd, rn, rm)
+#define ARM_ADDS_R(rd, rn, rm) _AL3_R(ARM_INST_ADDS, rd, rn, rm)
 #define ARM_ADD_I(rd, rn, imm) _AL3_I(ARM_INST_ADD, rd, rn, imm)
+#define ARM_ADDS_I(rd, rn, imm)        _AL3_I(ARM_INST_ADDS, rd, rn, imm)
+#define ARM_ADC_R(rd, rn, rm)  _AL3_R(ARM_INST_ADC, rd, rn, rm)
+#define ARM_ADC_I(rd, rn, imm) _AL3_I(ARM_INST_ADC, rd, rn, imm)

 #define ARM_AND_R(rd, rn, rm)  _AL3_R(ARM_INST_AND, rd, rn, rm)
 #define ARM_AND_I(rd, rn, imm) _AL3_I(ARM_INST_AND, rd, rn, imm)
@@ -156,7 +184,9 @@
 #define ARM_EOR_I(rd, rn, imm) _AL3_I(ARM_INST_EOR, rd, rn, imm)

 #define ARM_LDR_I(rt, rn, off) (ARM_INST_LDR_I | (rt) << 12 | (rn) << 16 \
-                                | (off))
+                                | ((off) & 0xfff))
+#define ARM_LDR_R(rt, rn, rm)  (ARM_INST_LDR_R | (rt) << 12 | (rn) << 16 \
+                                | (rm))
 #define ARM_LDRB_I(rt, rn, off)        (ARM_INST_LDRB_I | (rt) << 12 | (rn) << 16 \
                                 | (off))
 #define ARM_LDRB_R(rt, rn, rm) (ARM_INST_LDRB_R | (rt) << 12 | (rn) << 16 \
@@ -167,15 +197,23 @@
                                 | (rm))

 #define ARM_LDM(rn, regs)      (ARM_INST_LDM | (rn) << 16 | (regs))
+#define ARM_LDM_IA(rn, regs)   (ARM_INST_LDM_IA | (rn) << 16 | (regs))

 #define ARM_LSL_R(rd, rn, rm)  (_AL3_R(ARM_INST_LSL, rd, 0, rn) | (rm) << 8)
 #define ARM_LSL_I(rd, rn, imm) (_AL3_I(ARM_INST_LSL, rd, 0, rn) | (imm) << 7)

 #define ARM_LSR_R(rd, rn, rm)  (_AL3_R(ARM_INST_LSR, rd, 0, rn) | (rm) << 8)
 #define ARM_LSR_I(rd, rn, imm) (_AL3_I(ARM_INST_LSR, rd, 0, rn) | (imm) << 7)
+#define ARM_ASR_R(rd, rn, rm)   (_AL3_R(ARM_INST_ASR, rd, 0, rn) | (rm) << 8)
+#define ARM_ASR_I(rd, rn, imm)  (_AL3_I(ARM_INST_ASR, rd, 0, rn) | (imm) << 7)

 #define ARM_MOV_R(rd, rm)      _AL3_R(ARM_INST_MOV, rd, 0, rm)
+#define ARM_MOVS_R(rd, rm)     _AL3_R(ARM_INST_MOVS, rd, 0, rm)
 #define ARM_MOV_I(rd, imm)     _AL3_I(ARM_INST_MOV, rd, 0, imm)
+#define ARM_MOV_SR(rd, rm, type, rs)   \
+       (_AL3_SR(ARM_MOV_R(rd, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_MOV_SI(rd, rm, type, imm6) \
+       (ARM_MOV_R(rd, rm) | (type) << 5 | (imm6) << 7)

 #define ARM_MOVW(rd, imm)      \
        (ARM_INST_MOVW | ((imm) >> 12) << 16 | (rd) << 12 | ((imm) & 0x0fff))
@@ -190,19 +228,38 @@

 #define ARM_ORR_R(rd, rn, rm)  _AL3_R(ARM_INST_ORR, rd, rn, rm)
 #define ARM_ORR_I(rd, rn, imm) _AL3_I(ARM_INST_ORR, rd, rn, imm)
-#define ARM_ORR_S(rd, rn, rm, type, rs)        \
-       (ARM_ORR_R(rd, rn, rm) | (type) << 5 | (rs) << 7)
+#define ARM_ORR_SR(rd, rn, rm, type, rs)       \
+       (_AL3_SR(ARM_ORR_R(rd, rn, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_ORRS_R(rd, rn, rm) _AL3_R(ARM_INST_ORRS, rd, rn, rm)
+#define ARM_ORRS_SR(rd, rn, rm, type, rs)      \
+       (_AL3_SR(ARM_ORRS_R(rd, rn, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_ORR_SI(rd, rn, rm, type, imm6)     \
+       (ARM_ORR_R(rd, rn, rm) | (type) << 5 | (imm6) << 7)
+#define ARM_ORRS_SI(rd, rn, rm, type, imm6)    \
+       (ARM_ORRS_R(rd, rn, rm) | (type) << 5 | (imm6) << 7)

 #define ARM_REV(rd, rm)                (ARM_INST_REV | (rd) << 12 | (rm))
 #define ARM_REV16(rd, rm)      (ARM_INST_REV16 | (rd) << 12 | (rm))

 #define ARM_RSB_I(rd, rn, imm) _AL3_I(ARM_INST_RSB, rd, rn, imm)
+#define ARM_RSBS_I(rd, rn, imm)        _AL3_I(ARM_INST_RSBS, rd, rn, imm)
+#define ARM_RSC_I(rd, rn, imm) _AL3_I(ARM_INST_RSC, rd, rn, imm)

 #define ARM_SUB_R(rd, rn, rm)  _AL3_R(ARM_INST_SUB, rd, rn, rm)
+#define ARM_SUBS_R(rd, rn, rm) _AL3_R(ARM_INST_SUBS, rd, rn, rm)
+#define ARM_RSB_R(rd, rn, rm)  _AL3_R(ARM_INST_RSB, rd, rn, rm)
+#define ARM_SBC_R(rd, rn, rm)  _AL3_R(ARM_INST_SBC, rd, rn, rm)
+#define ARM_SBCS_R(rd, rn, rm) _AL3_R(ARM_INST_SBCS, rd, rn, rm)
 #define ARM_SUB_I(rd, rn, imm) _AL3_I(ARM_INST_SUB, rd, rn, imm)
+#define ARM_SUBS_I(rd, rn, imm)        _AL3_I(ARM_INST_SUBS, rd, rn, imm)
+#define ARM_SBC_I(rd, rn, imm) _AL3_I(ARM_INST_SBC, rd, rn, imm)

 #define ARM_STR_I(rt, rn, off) (ARM_INST_STR_I | (rt) << 12 | (rn) << 16 \
-                                | (off))
+                                | ((off) & 0xfff))
+#define ARM_STRH_I(rt, rn, off)        (ARM_INST_STRH_I | (rt) << 12 | (rn) << 16 \
+                                | (((off) & 0xf0) << 4) | ((off) & 0xf))
+#define ARM_STRB_I(rt, rn, off)        (ARM_INST_STRB_I | (rt) << 12 | (rn) << 16 \
+                                | ((off) & 0xfff))

 #define ARM_TST_R(rn, rm)      _AL3_R(ARM_INST_TST, 0, rn, rm)
 #define ARM_TST_I(rn, imm)     _AL3_I(ARM_INST_TST, 0, rn, imm)
@@ -214,5 +271,6 @@

 #define ARM_MLS(rd, rn, rm, ra)        (ARM_INST_MLS | (rd) << 16 | (rn) | (rm) << 8 \
                                 | (ra) << 12)
+#define ARM_UXTH(rd, rm)       (ARM_INST_UXTH | (rd) << 12 | (rm))

 #endif /* PFILTER_OPCODES_ARM_H */
--
2.7.4
Best,
Shubham Bansal


On Tue, May 23, 2017 at 11:05 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 10:03 PM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> On Tue, May 23, 2017 at 9:52 AM, Kees Cook <keescook@chromium.org> wrote:
>>> On Mon, May 22, 2017 at 8:34 PM, Shubham Bansal
>>> <illusionist.neo@gmail.com> wrote:
>>>> I would post them as soon as I test them on ARMv5 and ARMv6. If you
>>>> can help me with that, please let me know.
>>>
>>> Please post what you have: it would be better to see what you've got
>>> now in case additional changes are needed so you don't have to do it
>>> again on v5 and v6. Also, it means other people with real v5 and v6
>>> hardware could test for you if they were so inclined, and you won't
>>> need to be blocked on doing the tests in qemu.
>>>
>>> You can send it as an "RFC" in the subject, just to make sure people
>>> know it's not considered fully done. :)
>>
>> I already have ARMv5 and ARMv6 code written. I just haven't tested it
>> yet. Should I send the patch with those as well?
>
> Sure, just to have a version up for people to examine. If there are
> bugs, that's fine, we'll iron them out.
>
> -Kees
>
> --
> Kees Cook
> Pixel Security


* [kernel-hardening] Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-23 18:39                                 ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-23 18:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

Hi Kees, Daniel, Mircea and David,

Here is the patch I sent to the arm mailing list.
Any comments are welcome.
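
For anyone reviewing the imm8m() helper in the patch: it tests whether a
constant fits ARM's "8-bit value rotated right by an even amount" immediate
form. Below is a minimal standalone sketch of the same check in plain C. The
names imm8m_sketch(), ror32_sketch() and rol32_sketch() are hypothetical
stand-ins for illustration, not the kernel helpers themselves.

```c
#include <stdint.h>

/* Rotate helpers written so that a rotate by 0 is well defined. */
static uint32_t ror32_sketch(uint32_t v, unsigned int n)
{
	n &= 31;
	return n ? (v >> n) | (v << (32 - n)) : v;
}

static uint32_t rol32_sketch(uint32_t v, unsigned int n)
{
	n &= 31;
	return n ? (v << n) | (v >> (32 - n)) : v;
}

/*
 * Returns the 12-bit field (rot << 8 | imm8) if x can be written as
 * imm8 rotated right by 2 * rot, or -1 if no such encoding exists.
 */
static int imm8m_sketch(uint32_t x)
{
	unsigned int rot;

	for (rot = 0; rot < 16; rot++)
		if ((x & ~ror32_sketch(0xff, 2 * rot)) == 0)
			return (int)(rol32_sketch(x, 2 * rot) | (rot << 8));
	return -1;
}
```

With this sketch, 0x100 encodes as imm8 = 1 with rot = 12, while 0x101 has
set bits spanning nine positions and cannot be encoded, so a JIT would have
to load such a constant some other way.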

---------- Forwarded message ----------
From: Shubham Bansal <illusionist.neo@gmail.com>
Date: Wed, May 24, 2017 at 12:03 AM
Subject: [PATCH] RFC: arm: eBPF JIT compiler
To: linux@armlinux.org.uk
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Shubham Bansal
<illusionist.neo@gmail.com>


The JIT compiler emits ARM 32-bit instructions. Currently, it supports
eBPF only; classic BPF is still supported because the BPF core converts
it to eBPF.
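
Because eBPF registers are 64 bits wide and ARM registers are 32 bits wide,
the patch keeps each eBPF register as a high/low pair (see the bpf2a32[]
table in the diff), and a 64-bit ALU operation becomes two 32-bit ones. As a
rough illustration only, the sketch below models in plain C what an ADDS on
the low words followed by an ADC on the high words computes. struct reg64 and
add64() are hypothetical names and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical model of one eBPF register held as two 32-bit words. */
struct reg64 {
	uint32_t hi;
	uint32_t lo;
};

/* dst += src, split the way an ADDS (low) / ADC (high) pair would do it. */
static struct reg64 add64(struct reg64 dst, struct reg64 src)
{
	uint32_t lo = dst.lo + src.lo;		/* ADDS: low words, may wrap */
	uint32_t carry = lo < dst.lo;		/* carry out of the low word */

	dst.hi = dst.hi + src.hi + carry;	/* ADC: high words plus carry */
	dst.lo = lo;
	return dst;
}
```

Adding {hi = 1, lo = 0xffffffff} and {hi = 0, lo = 1} this way gives
{hi = 2, lo = 0}, which is why the flag-setting variants (ADDS, SUBS and so
on) appear in bpf_jit_32.h alongside ADC and SBC.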

JIT is enabled with

        echo 1 > /proc/sys/net/core/bpf_jit_enable

Constant Blinding can be enabled along with JIT using

        echo 1 > /proc/sys/net/core/bpf_jit_enable
        echo 2 > /proc/sys/net/core/bpf_jit_harden

See Documentation/networking/filter.txt for more information.
Tested on ARMv7 with CONFIG_FRAME_POINTER enabled.

Results:

1. Interpreter:

        [   93.551176] test_bpf: Summary: 314 PASSED, 0 FAILED, [0/306 JIT'ed]

2. JIT enabled:

        [   92.913931] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

3. JIT + blinding enabled:

        [  109.414506] test_bpf: Summary: 314 PASSED, 0 FAILED, [278/306 JIT'ed]

Currently, the following eBPF instructions are not JITed:

        BPF_ALU64 | BPF_DIV | BPF_K
        BPF_ALU64 | BPF_DIV | BPF_X
        BPF_ALU64 | BPF_MOD | BPF_K
        BPF_ALU64 | BPF_MOD | BPF_X
        BPF_STX | BPF_XADD | BPF_W
        BPF_STX | BPF_XADD | BPF_DW
        BPF_JMP | BPF_CALL
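
A program containing one of these instructions is not rejected. In the diff,
build_insn() returns an error for it and bpf_int_jit_compile() hands back the
original program, so the interpreter runs it instead. The sketch below models
only that control flow in standalone C. struct prog, enum op, jit_insn() and
jit_compile() are hypothetical stand-ins, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for eBPF opcodes and struct bpf_prog. */
enum op { OP_ADD64, OP_DIV64, OP_EXIT };

struct prog {
	const enum op *insns;
	size_t len;
	bool jited;
};

/* Returns 0 if the opcode can be translated, -1 if it is not supported yet. */
static int jit_insn(enum op op)
{
	return op == OP_DIV64 ? -1 : 0;
}

/*
 * Marks the program as JITed only if every instruction translates.
 * Otherwise the program is returned untouched for the interpreter.
 */
static struct prog *jit_compile(struct prog *p)
{
	size_t i;

	for (i = 0; i < p->len; i++)
		if (jit_insn(p->insns[i]))
			return p;	/* fall back: jited stays false */
	p->jited = true;
	return p;
}
```

The all-or-nothing choice keeps the sketch simple: a single unsupported
instruction leaves the whole program to the interpreter.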

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
---
 arch/arm/net/bpf_jit_32.c | 2410 ++++++++++++++++++++++++++++++---------------
 arch/arm/net/bpf_jit_32.h |  108 +-
 2 files changed, 1716 insertions(+), 802 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 93d0b6d..338d352 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1,13 +1,16 @@
 /*
- * Just-In-Time compiler for BPF filters on 32bit ARM
+ * Just-In-Time compiler for eBPF filters on 32bit ARM
  *
  * Copyright (c) 2011 Mircea Gherzan <mgherzan@gmail.com>
+ * Copyright (c) 2017 Shubham Bansal <illusionist.neo@gmail.com>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
  * Free Software Foundation; version 2 of the License.
  */
+#define pr_fmt(fmt) "bpf_jit: " fmt

+#include <linux/bpf.h>
 #include <linux/bitops.h>
 #include <linux/compiler.h>
 #include <linux/errno.h>
@@ -23,44 +26,91 @@

 #include "bpf_jit_32.h"

+int bpf_jit_enable __read_mostly;
+
+#define STACK_OFFSET(k)        (k)
+#define TMP_REG_1      (MAX_BPF_JIT_REG + 0)   /* TEMP Register 1 */
+#define TMP_REG_2      (MAX_BPF_JIT_REG + 1)   /* TEMP Register 2 */
+#define TCALL_CNT      (MAX_BPF_JIT_REG + 2)   /* Tail Call Count */
+
+/* Flags used for JIT optimization */
+#define SEEN_CALL      (1 << 0)
+
+#define FLAG_IMM_OVERFLOW      (1 << 0)
+
 /*
- * ABI:
+ * Map eBPF registers to ARM 32bit registers or stack scratch space.
+ *
+ * 1. First argument is passed using the arm 32bit registers and rest of the
+ * arguments are passed on stack scratch space.
+ * 2. First callee-saved argument is mapped to arm 32 bit registers and rest
+ * of the arguments are mapped to scratch space on stack.
+ * 3. We need two 64 bit temp registers to do complex operations on eBPF
+ * registers.
+ *
+ * As the eBPF registers are all 64 bit registers and arm has only 32 bit
+ * registers, we have to map each eBPF registers with two arm 32 bit regs or
+ * scratch memory space and we have to build eBPF 64 bit register from those.
  *
- * r0  scratch register
- * r4  BPF register A
- * r5  BPF register X
- * r6  pointer to the skb
- * r7  skb->data
- * r8  skb_headlen(skb)
  */
+static const u8 bpf2a32[][2] = {
+       /* return value from in-kernel function, and exit value from eBPF */
+       [BPF_REG_0] = {ARM_R1, ARM_R0},
+       /* arguments from eBPF program to in-kernel function */
+       [BPF_REG_1] = {ARM_R3, ARM_R2},
+       /* Stored on stack scratch space */
+       [BPF_REG_2] = {STACK_OFFSET(0), STACK_OFFSET(4)},
+       [BPF_REG_3] = {STACK_OFFSET(8), STACK_OFFSET(12)},
+       [BPF_REG_4] = {STACK_OFFSET(16), STACK_OFFSET(20)},
+       [BPF_REG_5] = {STACK_OFFSET(24), STACK_OFFSET(28)},
+       /* callee saved registers that in-kernel function will preserve */
+       [BPF_REG_6] = {ARM_R5, ARM_R4},
+       /* Stored on stack scratch space */
+       [BPF_REG_7] = {STACK_OFFSET(32), STACK_OFFSET(36)},
+       [BPF_REG_8] = {STACK_OFFSET(40), STACK_OFFSET(44)},
+       [BPF_REG_9] = {STACK_OFFSET(48), STACK_OFFSET(52)},
+       /* Read only Frame Pointer to access Stack */
+       [BPF_REG_FP] = {STACK_OFFSET(56), STACK_OFFSET(60)},
+       /* Temporary Register for internal BPF JIT, can be used
+        * for constant blindings and others.
+        */
+       [TMP_REG_1] = {ARM_R7, ARM_R6},
+       [TMP_REG_2] = {ARM_R10, ARM_R8},
+       /* Tail call count. Stored on stack scratch space. */
+       [TCALL_CNT] = {STACK_OFFSET(64), STACK_OFFSET(68)},
+       /* temporary register for blinding constants.
+        * Stored on stack scratch space.
+        */
+       [BPF_REG_AX] = {STACK_OFFSET(72), STACK_OFFSET(76)},
+};

-#define r_scratch      ARM_R0
-/* r1-r3 are (also) used for the unaligned loads on the non-ARMv7 slowpath */
-#define r_off          ARM_R1
-#define r_A            ARM_R4
-#define r_X            ARM_R5
-#define r_skb          ARM_R6
-#define r_skb_data     ARM_R7
-#define r_skb_hl       ARM_R8
-
-#define SCRATCH_SP_OFFSET      0
-#define SCRATCH_OFF(k)         (SCRATCH_SP_OFFSET + 4 * (k))
-
-#define SEEN_MEM               ((1 << BPF_MEMWORDS) - 1)
-#define SEEN_MEM_WORD(k)       (1 << (k))
-#define SEEN_X                 (1 << BPF_MEMWORDS)
-#define SEEN_CALL              (1 << (BPF_MEMWORDS + 1))
-#define SEEN_SKB               (1 << (BPF_MEMWORDS + 2))
-#define SEEN_DATA              (1 << (BPF_MEMWORDS + 3))
+#define        dst_lo  dst[1]
+#define dst_hi dst[0]
+#define src_lo src[1]
+#define src_hi src[0]

-#define FLAG_NEED_X_RESET      (1 << 0)
-#define FLAG_IMM_OVERFLOW      (1 << 1)
+/*
+ * JIT Context:
+ *
+ * prog                        :       bpf_prog
+ * idx                 :       index of current last JITed instruction.
+ * prologue_bytes      :       bytes used in prologue.
+ * epilogue_offset     :       offset of epilogue starting.
+ * seen                        :       bit mask used for JIT optimization.
+ * offsets             :       array of eBPF instruction offsets in
+ *                             JITed code.
+ * target              :       final JITed code.
+ * epilogue_bytes      :       no of bytes used in epilogue.
+ * imm_count           :       no of immediate counts used for global
+ *                             variables.
+ * imms                        :       array of global variable addresses.
+ */

 struct jit_ctx {
-       const struct bpf_prog *skf;
-       unsigned idx;
-       unsigned prologue_bytes;
-       int ret0_fp_idx;
+       const struct bpf_prog *prog;
+       unsigned int idx;
+       unsigned int prologue_bytes;
+       unsigned int epilogue_offset;
        u32 seen;
        u32 flags;
        u32 *offsets;
@@ -72,68 +122,16 @@ struct jit_ctx {
 #endif
 };

-int bpf_jit_enable __read_mostly;
-
-static inline int call_neg_helper(struct sk_buff *skb, int offset, void *ret,
-                     unsigned int size)
-{
-       void *ptr = bpf_internal_load_pointer_neg_helper(skb, offset, size);
-
-       if (!ptr)
-               return -EFAULT;
-       memcpy(ret, ptr, size);
-       return 0;
-}
-
-static u64 jit_get_skb_b(struct sk_buff *skb, int offset)
-{
-       u8 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 1);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 1);
-
-       return (u64)err << 32 | ret;
-}
-
-static u64 jit_get_skb_h(struct sk_buff *skb, int offset)
-{
-       u16 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 2);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 2);
-
-       return (u64)err << 32 | ntohs(ret);
-}
-
-static u64 jit_get_skb_w(struct sk_buff *skb, int offset)
-{
-       u32 ret;
-       int err;
-
-       if (offset < 0)
-               err = call_neg_helper(skb, offset, &ret, 4);
-       else
-               err = skb_copy_bits(skb, offset, &ret, 4);
-
-       return (u64)err << 32 | ntohl(ret);
-}
-
 /*
  * Wrappers which handle both OABI and EABI and assures Thumb2 interworking
  * (where the assembly routines like __aeabi_uidiv could cause problems).
  */
-static u32 jit_udiv(u32 dividend, u32 divisor)
+static u32 jit_udiv32(u32 dividend, u32 divisor)
 {
        return dividend / divisor;
 }

-static u32 jit_mod(u32 dividend, u32 divisor)
+static u32 jit_mod32(u32 dividend, u32 divisor)
 {
        return dividend % divisor;
 }
@@ -157,36 +155,22 @@ static inline void emit(u32 inst, struct jit_ctx *ctx)
        _emit(ARM_COND_AL, inst, ctx);
 }

-static u16 saved_regs(struct jit_ctx *ctx)
+/*
+ * Returns the 12-bit ARM immediate encoding (4-bit rotation, 8-bit value)
+ * of x, or -1 if x cannot be encoded that way.
+ */
+static int16_t imm8m(u32 x)
 {
-       u16 ret = 0;
-
-       if ((ctx->skf->len > 1) ||
-           (ctx->skf->insns[0].code == (BPF_RET | BPF_A)))
-               ret |= 1 << r_A;
-
-#ifdef CONFIG_FRAME_POINTER
-       ret |= (1 << ARM_FP) | (1 << ARM_IP) | (1 << ARM_LR) | (1 << ARM_PC);
-#else
-       if (ctx->seen & SEEN_CALL)
-               ret |= 1 << ARM_LR;
-#endif
-       if (ctx->seen & (SEEN_DATA | SEEN_SKB))
-               ret |= 1 << r_skb;
-       if (ctx->seen & SEEN_DATA)
-               ret |= (1 << r_skb_data) | (1 << r_skb_hl);
-       if (ctx->seen & SEEN_X)
-               ret |= 1 << r_X;
-
-       return ret;
-}
+       u32 rot;

-static inline int mem_words_used(struct jit_ctx *ctx)
-{
-       /* yes, we do waste some stack space IF there are "holes" in the set" */
-       return fls(ctx->seen & SEEN_MEM);
+       for (rot = 0; rot < 16; rot++)
+               if ((x & ~ror32(0xff, 2 * rot)) == 0)
+                       return rol32(x, 2 * rot) | (rot << 8);
+       return -1;
 }
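
For readers following the patch, the encoding that imm8m() searches for can be modelled in user space. The following is only a sketch: ror32()/rol32() are reimplemented here because the kernel helpers are not available outside the tree, and the name imm8m_sketch is hypothetical.

```c
#include <stdint.h>

/* Rotate helpers; the "& 31" keeps every shift count below 32. */
static uint32_t ror32(uint32_t x, unsigned int n)
{
	n &= 31;
	return (x >> n) | (x << ((32 - n) & 31));
}

static uint32_t rol32(uint32_t x, unsigned int n)
{
	n &= 31;
	return (x << n) | (x >> ((32 - n) & 31));
}

/* Return (rot << 8 | imm8) if x is an 8-bit value rotated right by
 * 2 * rot, or -1 if no such rotation exists.
 */
static int imm8m_sketch(uint32_t x)
{
	unsigned int rot;

	for (rot = 0; rot < 16; rot++)
		if ((x & ~ror32(0xff, 2 * rot)) == 0)
			return (int)(rol32(x, 2 * rot) | (rot << 8));
	return -1;
}
```

With this model, 0x00ff0000 encodes as 0x8ff (the value quoted in the removed emit_swap16() comment), while 0x101 has no encoding and would have to go through emit_mov_i_no8m().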

+/*
+ * Initializes the JIT space with undefined instructions.
+ */
 static void jit_fill_hole(void *area, unsigned int size)
 {
        u32 *ptr;
@@ -195,88 +179,34 @@ static void jit_fill_hole(void *area, unsigned int size)
                *ptr++ = __opcode_to_mem_arm(ARM_INST_UDF);
 }

-static void build_prologue(struct jit_ctx *ctx)
-{
-       u16 reg_set = saved_regs(ctx);
-       u16 off;
-
-#ifdef CONFIG_FRAME_POINTER
-       emit(ARM_MOV_R(ARM_IP, ARM_SP), ctx);
-       emit(ARM_PUSH(reg_set), ctx);
-       emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
-#else
-       if (reg_set)
-               emit(ARM_PUSH(reg_set), ctx);
-#endif
+/* Stack size must be a multiple of 16 bytes */
+#define STACK_ALIGN(sz) (((sz) + 15) & ~15)

-       if (ctx->seen & (SEEN_DATA | SEEN_SKB))
-               emit(ARM_MOV_R(r_skb, ARM_R0), ctx);
-
-       if (ctx->seen & SEEN_DATA) {
-               off = offsetof(struct sk_buff, data);
-               emit(ARM_LDR_I(r_skb_data, r_skb, off), ctx);
-               /* headlen = len - data_len */
-               off = offsetof(struct sk_buff, len);
-               emit(ARM_LDR_I(r_skb_hl, r_skb, off), ctx);
-               off = offsetof(struct sk_buff, data_len);
-               emit(ARM_LDR_I(r_scratch, r_skb, off), ctx);
-               emit(ARM_SUB_R(r_skb_hl, r_skb_hl, r_scratch), ctx);
-       }
-
-       if (ctx->flags & FLAG_NEED_X_RESET)
-               emit(ARM_MOV_I(r_X, 0), ctx);
-
-       /* do not leak kernel data to userspace */
-       if (bpf_needs_clear_a(&ctx->skf->insns[0]))
-               emit(ARM_MOV_I(r_A, 0), ctx);
-
-       /* stack space for the BPF_MEM words */
-       if (ctx->seen & SEEN_MEM)
-               emit(ARM_SUB_I(ARM_SP, ARM_SP, mem_words_used(ctx) * 4), ctx);
-}
-
-static void build_epilogue(struct jit_ctx *ctx)
-{
-       u16 reg_set = saved_regs(ctx);
-
-       if (ctx->seen & SEEN_MEM)
-               emit(ARM_ADD_I(ARM_SP, ARM_SP, mem_words_used(ctx) * 4), ctx);
-
-       reg_set &= ~(1 << ARM_LR);
-
-#ifdef CONFIG_FRAME_POINTER
-       /* the first instruction of the prologue was: mov ip, sp */
-       reg_set &= ~(1 << ARM_IP);
-       reg_set |= (1 << ARM_SP);
-       emit(ARM_LDM(ARM_SP, reg_set), ctx);
-#else
-       if (reg_set) {
-               if (ctx->seen & SEEN_CALL)
-                       reg_set |= 1 << ARM_PC;
-               emit(ARM_POP(reg_set), ctx);
-       }
+/* Stack space for BPF_REG_2, BPF_REG_3, BPF_REG_4,
+ * BPF_REG_5, BPF_REG_7, BPF_REG_8, BPF_REG_9,
+ * BPF_REG_FP and Tail call counts.
+ */
+#define SCRATCH_SIZE 80

-       if (!(ctx->seen & SEEN_CALL))
-               emit(ARM_BX(ARM_LR), ctx);
-#endif
-}
+/* total stack size used in JITed code */
+#define _STACK_SIZE \
+       (MAX_BPF_STACK + \
+        SCRATCH_SIZE + \
+        4 /* extra for skb_copy_bits buffer */)

-static int16_t imm8m(u32 x)
-{
-       u32 rot;
+#define STACK_SIZE STACK_ALIGN(_STACK_SIZE)

-       for (rot = 0; rot < 16; rot++)
-               if ((x & ~ror32(0xff, 2 * rot)) == 0)
-                       return rol32(x, 2 * rot) | (rot << 8);
+/* Get the offset of eBPF REGISTERs stored on scratch space. */
+#define STACK_VAR(off) (STACK_SIZE - (off) - 4)

-       return -1;
-}
+/* Offset of skb_copy_bits buffer */
+#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
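
To make the stack layout arithmetic concrete, here is a small stand-alone sketch of the macros above. It assumes MAX_BPF_STACK is 512, which is not stated in this hunk, and renames _STACK_SIZE to RAW_STACK_SIZE to stay clear of reserved identifiers in user space.

```c
/* Assumed value; the hunk does not define MAX_BPF_STACK. */
#define MAX_BPF_STACK 512
#define SCRATCH_SIZE 80

/* Round up to the next multiple of 16 bytes. */
#define STACK_ALIGN(sz) (((sz) + 15) & ~15)

/* 512 + 80 + 4 = 596 bytes before alignment. */
#define RAW_STACK_SIZE (MAX_BPF_STACK + SCRATCH_SIZE + 4)

/* 596 rounded up to a multiple of 16 is 608. */
#define STACK_SIZE STACK_ALIGN(RAW_STACK_SIZE)

/* Offset from SP of the scratch slot identified by 'off'. */
#define STACK_VAR(off) (STACK_SIZE - (off) - 4)

/* The skb_copy_bits buffer sits just below the scratch area. */
#define SKB_BUFFER STACK_VAR(SCRATCH_SIZE)
```

Under that assumption STACK_SIZE is 608, the first scratch slot STACK_VAR(0) lands at SP + 604, and SKB_BUFFER at SP + 524.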

 #if __LINUX_ARM_ARCH__ < 7

 static u16 imm_offset(u32 k, struct jit_ctx *ctx)
 {
-       unsigned i = 0, offset;
+       unsigned int i = 0, offset;
        u16 imm;

        /* on the "fake" run we just count them (duplicates included) */
@@ -295,7 +225,7 @@ static u16 imm_offset(u32 k, struct jit_ctx *ctx)
                ctx->imms[i] = k;

        /* constants go just after the epilogue */
-       offset =  ctx->offsets[ctx->skf->len];
+       offset =  ctx->offsets[ctx->prog->len];
        offset += ctx->prologue_bytes;
        offset += ctx->epilogue_bytes;
        offset += i * 4;
@@ -319,10 +249,22 @@ static u16 imm_offset(u32 k, struct jit_ctx *ctx)

 #endif /* __LINUX_ARM_ARCH__ */

+static inline int bpf2a32_offset(int bpf_to, int bpf_from,
+                                const struct jit_ctx *ctx) {
+       int to, from;
+
+       if (ctx->target == NULL)
+               return 0;
+       to = ctx->offsets[bpf_to];
+       from = ctx->offsets[bpf_from];
+
+       return to - from - 1;
+}
+
 /*
  * Move an immediate that's not an imm8m to a core register.
  */
-static inline void emit_mov_i_no8m(int rd, u32 val, struct jit_ctx *ctx)
+static inline void emit_mov_i_no8m(const u8 rd, u32 val, struct jit_ctx *ctx)
 {
 #if __LINUX_ARM_ARCH__ < 7
        emit(ARM_LDR_I(rd, ARM_PC, imm_offset(val, ctx)), ctx);
@@ -333,7 +275,7 @@ static inline void emit_mov_i_no8m(int rd, u32 val, struct jit_ctx *ctx)
 #endif
 }

-static inline void emit_mov_i(int rd, u32 val, struct jit_ctx *ctx)
+static inline void emit_mov_i(const u8 rd, u32 val, struct jit_ctx *ctx)
 {
        int imm12 = imm8m(val);

@@ -343,676 +285,1559 @@ static inline void emit_mov_i(int rd, u32 val, struct jit_ctx *ctx)
                emit_mov_i_no8m(rd, val, ctx);
 }

-#if __LINUX_ARM_ARCH__ < 6
-
-static void emit_load_be32(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
 {
-       _emit(cond, ARM_LDRB_I(ARM_R3, r_addr, 1), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R1, r_addr, 0), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R2, r_addr, 3), ctx);
-       _emit(cond, ARM_LSL_I(ARM_R3, ARM_R3, 16), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R0, r_addr, 2), ctx);
-       _emit(cond, ARM_ORR_S(ARM_R3, ARM_R3, ARM_R1, SRTYPE_LSL, 24), ctx);
-       _emit(cond, ARM_ORR_R(ARM_R3, ARM_R3, ARM_R2), ctx);
-       _emit(cond, ARM_ORR_S(r_res, ARM_R3, ARM_R0, SRTYPE_LSL, 8), ctx);
+       ctx->seen |= SEEN_CALL;
+#if __LINUX_ARM_ARCH__ < 5
+       emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+
+       if (elf_hwcap & HWCAP_THUMB)
+               emit(ARM_BX(tgt_reg), ctx);
+       else
+               emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
+#else
+       emit(ARM_BLX_R(tgt_reg), ctx);
+#endif
 }

-static void emit_load_be16(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+static inline int epilogue_offset(const struct jit_ctx *ctx)
 {
-       _emit(cond, ARM_LDRB_I(ARM_R1, r_addr, 0), ctx);
-       _emit(cond, ARM_LDRB_I(ARM_R2, r_addr, 1), ctx);
-       _emit(cond, ARM_ORR_S(r_res, ARM_R2, ARM_R1, SRTYPE_LSL, 8), ctx);
+       int to, from;
+       /* No need for 1st dummy run */
+       if (ctx->target == NULL)
+               return 0;
+       to = ctx->epilogue_offset;
+       from = ctx->idx;
+
+       return to - from - 2;
 }
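
The "- 2" in epilogue_offset() follows from how ARM computes branch targets: in ARM state the PC reads as the address of the branch instruction plus 8 bytes, i.e. two instructions ahead. A minimal sketch of that calculation in units of instructions (the function name is hypothetical):

```c
/* Branch immediate, counted in 32-bit instructions, for a branch located
 * at instruction index 'from' that should land on instruction index 'to'.
 */
static int branch_imm_sketch(int to, int from)
{
	/* PC = address of the branch + 8 bytes = from + 2 instructions */
	return to - from - 2;
}
```

So a branch to the immediately following instruction encodes -1, and a branch to itself encodes -2.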

-static inline void emit_swap16(u8 r_dst, u8 r_src, struct jit_ctx *ctx)
+static inline void emit_udivmod(u8 rd, u8 rm, u8 rn, struct jit_ctx *ctx, u8 op)
 {
-       /* r_dst = (r_src << 8) | (r_src >> 8) */
-       emit(ARM_LSL_I(ARM_R1, r_src, 8), ctx);
-       emit(ARM_ORR_S(r_dst, ARM_R1, r_src, SRTYPE_LSR, 8), ctx);
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       s32 jmp_offset;
+
+       /* Check whether the divisor is zero. If it is, set the
+        * return value to 0 and jump straight to the epilogue.
+        */
+       emit(ARM_CMP_I(rn, 0), ctx);
+       _emit(ARM_COND_EQ, ARM_MOV_I(ARM_R0, 0), ctx);
+       jmp_offset = epilogue_offset(ctx);
+       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+#if __LINUX_ARM_ARCH__ == 7
+       if (elf_hwcap & HWCAP_IDIVA) {
+               if (op == BPF_DIV)
+                       emit(ARM_UDIV(rd, rm, rn), ctx);
+               else {
+                       emit(ARM_UDIV(ARM_IP, rm, rn), ctx);
+                       emit(ARM_MLS(rd, rn, ARM_IP, rm), ctx);
+               }
+               return;
+       }
+#endif

        /*
-        * we need to mask out the bits set in r_dst[23:16] due to
-        * the first shift instruction.
-        *
-        * note that 0x8ff is the encoded immediate 0x00ff0000.
+        * For BPF_ALU | BPF_DIV | BPF_K instructions:
+        * ARM_R1 and ARM_R0 hold the 1st argument of the bpf
+        * function, so save them on the caller side to keep
+        * the callee from destroying them.
+        * After the call returns, restore ARM_R0 and
+        * ARM_R1.
         */
-       emit(ARM_BIC_I(r_dst, r_dst, 0x8ff), ctx);
-}
+       if (rn != ARM_R1) {
+               emit(ARM_MOV_R(tmp[0], ARM_R1), ctx);
+               emit(ARM_MOV_R(ARM_R1, rn), ctx);
+       }
+       if (rm != ARM_R0) {
+               emit(ARM_MOV_R(tmp[1], ARM_R0), ctx);
+               emit(ARM_MOV_R(ARM_R0, rm), ctx);
+       }

-#else  /* ARMv6+ */
+       /* Call appropriate function */
+       ctx->seen |= SEEN_CALL;
+       emit_mov_i(ARM_IP, op == BPF_DIV ?
+                  (u32)jit_udiv32 : (u32)jit_mod32, ctx);
+       emit_blx_r(ARM_IP, ctx);

-static void emit_load_be32(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
-{
-       _emit(cond, ARM_LDR_I(r_res, r_addr, 0), ctx);
-#ifdef __LITTLE_ENDIAN
-       _emit(cond, ARM_REV(r_res, r_res), ctx);
-#endif
+       /* Save return value */
+       if (rd != ARM_R0)
+               emit(ARM_MOV_R(rd, ARM_R0), ctx);
+
+       /* Restore ARM_R0 and ARM_R1 */
+       if (rn != ARM_R1)
+               emit(ARM_MOV_R(ARM_R1, tmp[0]), ctx);
+       if (rm != ARM_R0)
+               emit(ARM_MOV_R(ARM_R0, tmp[1]), ctx);
 }

-static void emit_load_be16(u8 cond, u8 r_res, u8 r_addr, struct jit_ctx *ctx)
+/* Checks whether BPF register is on scratch stack space or not. */
+static inline bool is_on_stack(u8 bpf_reg)
 {
-       _emit(cond, ARM_LDRH_I(r_res, r_addr, 0), ctx);
-#ifdef __LITTLE_ENDIAN
-       _emit(cond, ARM_REV16(r_res, r_res), ctx);
-#endif
+       static u8 stack_regs[] = {BPF_REG_AX, BPF_REG_3, BPF_REG_4, BPF_REG_5,
+                               BPF_REG_7, BPF_REG_8, BPF_REG_9, TCALL_CNT,
+                               BPF_REG_2, BPF_REG_FP};
+       int i, reg_len = ARRAY_SIZE(stack_regs);
+
+       for (i = 0 ; i < reg_len ; i++) {
+               if (bpf_reg == stack_regs[i])
+                       return true;
+       }
+       return false;
 }

-static inline void emit_swap16(u8 r_dst __maybe_unused,
-                              u8 r_src __maybe_unused,
-                              struct jit_ctx *ctx __maybe_unused)
+static inline void emit_a32_mov_i(const u8 dst, const u32 val,
+                                 bool dstk, struct jit_ctx *ctx)
 {
-#ifdef __LITTLE_ENDIAN
-       emit(ARM_REV16(r_dst, r_src), ctx);
-#endif
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+
+       if (dstk) {
+               emit_mov_i(tmp[1], val, ctx);
+               emit(ARM_STR_I(tmp[1], ARM_SP, STACK_VAR(dst)), ctx);
+       } else {
+               emit_mov_i(dst, val, ctx);
+       }
 }

-#endif /* __LINUX_ARM_ARCH__ < 6 */
+/* Sign extended move */
+static inline void emit_a32_mov_i64(const bool is64, const u8 dst[],
+                                 const u32 val, bool dstk,
+                                 struct jit_ctx *ctx) {
+       u32 hi = 0;

+       if (is64 && (val & (1<<31)))
+               hi = (u32)~0;
+       emit_a32_mov_i(dst_lo, val, dstk, ctx);
+       emit_a32_mov_i(dst_hi, hi, dstk, ctx);
+}
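
The sign extension done by emit_a32_mov_i64() can be checked with a small user-space model. This is a sketch only; mov_i64_sketch is a hypothetical name and the register pair is returned as one 64-bit value for convenience.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of loading a 32-bit immediate into a {hi, lo} register pair.
 * For 64-bit moves the high word replicates bit 31 of the immediate;
 * for 32-bit moves it is zeroed.
 */
static uint64_t mov_i64_sketch(bool is64, uint32_t val)
{
	uint32_t hi = 0;

	if (is64 && (val & (1u << 31)))
		hi = ~0u;
	return ((uint64_t)hi << 32) | val;
}
```

A 64-bit move of 0x80000000 therefore yields 0xffffffff80000000, while the same immediate in a 32-bit move yields 0x0000000080000000.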

-/* Compute the immediate value for a PC-relative branch. */
-static inline u32 b_imm(unsigned tgt, struct jit_ctx *ctx)
-{
-       u32 imm;
+static inline void emit_a32_add_r(const u8 dst, const u8 src,
+                             const bool is64, const bool hi,
+                             struct jit_ctx *ctx) {
+       /* 64 bit :
+        *      adds dst_lo, dst_lo, src_lo
+        *      adc dst_hi, dst_hi, src_hi
+        * 32 bit :
+        *      add dst_lo, dst_lo, src_lo
+        */
+       if (!hi && is64)
+               emit(ARM_ADDS_R(dst, dst, src), ctx);
+       else if (hi && is64)
+               emit(ARM_ADC_R(dst, dst, src), ctx);
+       else
+               emit(ARM_ADD_R(dst, dst, src), ctx);
+}

-       if (ctx->target == NULL)
-               return 0;
-       /*
-        * BPF allows only forward jumps and the offset of the target is
-        * still the one computed during the first pass.
+static inline void emit_a32_sub_r(const u8 dst, const u8 src,
+                                 const bool is64, const bool hi,
+                                 struct jit_ctx *ctx) {
+       /* 64 bit :
+        *      subs dst_lo, dst_lo, src_lo
+        *      sbc dst_hi, dst_hi, src_hi
+        * 32 bit :
+        *      sub dst_lo, dst_lo, src_lo
         */
-       imm  = ctx->offsets[tgt] + ctx->prologue_bytes - (ctx->idx * 4 + 8);
+       if (!hi && is64)
+               emit(ARM_SUBS_R(dst, dst, src), ctx);
+       else if (hi && is64)
+               emit(ARM_SBC_R(dst, dst, src), ctx);
+       else
+               emit(ARM_SUB_R(dst, dst, src), ctx);
+}
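
The adds/adc and subs/sbc pairs described in the comments above propagate a carry (or borrow) from the low word into the high word. A rough user-space model, with hypothetical function names and the carry flag represented as a plain integer:

```c
#include <stdint.h>

/* adds lo, lo, src_lo ; adc hi, hi, src_hi */
static uint64_t add64_sketch(uint32_t dlo, uint32_t dhi,
			     uint32_t slo, uint32_t shi)
{
	uint32_t lo = dlo + slo;
	uint32_t carry = lo < dlo;	/* unsigned wrap-around means carry out */
	uint32_t hi = dhi + shi + carry;

	return ((uint64_t)hi << 32) | lo;
}

/* subs lo, lo, src_lo ; sbc hi, hi, src_hi */
static uint64_t sub64_sketch(uint32_t dlo, uint32_t dhi,
			     uint32_t slo, uint32_t shi)
{
	uint32_t borrow = dlo < slo;	/* low word needs to borrow from high */
	uint32_t lo = dlo - slo;
	uint32_t hi = dhi - shi - borrow;

	return ((uint64_t)hi << 32) | lo;
}
```

This is why emit_alu_r() must be called for the low word first: the high-word instruction consumes the flag left behind by the low-word one.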

-       return imm >> 2;
+static inline void emit_alu_r(const u8 dst, const u8 src, const bool is64,
+                             const bool hi, const u8 op, struct jit_ctx *ctx){
+       switch (BPF_OP(op)) {
+       /* dst = dst + src */
+       case BPF_ADD:
+               emit_a32_add_r(dst, src, is64, hi, ctx);
+               break;
+       /* dst = dst - src */
+       case BPF_SUB:
+               emit_a32_sub_r(dst, src, is64, hi, ctx);
+               break;
+       /* dst = dst | src */
+       case BPF_OR:
+               emit(ARM_ORR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst & src */
+       case BPF_AND:
+               emit(ARM_AND_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst ^ src */
+       case BPF_XOR:
+               emit(ARM_EOR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst * src */
+       case BPF_MUL:
+               emit(ARM_MUL(dst, dst, src), ctx);
+               break;
+       /* dst = dst << src */
+       case BPF_LSH:
+               emit(ARM_LSL_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst >> src */
+       case BPF_RSH:
+               emit(ARM_LSR_R(dst, dst, src), ctx);
+               break;
+       /* dst = dst >> src (signed)*/
+       case BPF_ARSH:
+               emit(ARM_MOV_SR(dst, dst, SRTYPE_ASR, src), ctx);
+               break;
+       }
 }

-#define OP_IMM3(op, r1, r2, imm_val, ctx)                              \
-       do {                                                            \
-               imm12 = imm8m(imm_val);                                 \
-               if (imm12 < 0) {                                        \
-                       emit_mov_i_no8m(r_scratch, imm_val, ctx);       \
-                       emit(op ## _R((r1), (r2), r_scratch), ctx);     \
-               } else {                                                \
-                       emit(op ## _I((r1), (r2), imm12), ctx);         \
-               }                                                       \
-       } while (0)
-
-static inline void emit_err_ret(u8 cond, struct jit_ctx *ctx)
-{
-       if (ctx->ret0_fp_idx >= 0) {
-               _emit(cond, ARM_B(b_imm(ctx->ret0_fp_idx, ctx)), ctx);
-               /* NOP to keep the size constant between passes */
-               emit(ARM_MOV_R(ARM_R0, ARM_R0), ctx);
+/* ALU operation (32 bit)
+ * dst = dst (op) src
+ */
+static inline void emit_a32_alu_r(const u8 dst, const u8 src,
+                                 bool dstk, bool sstk,
+                                 struct jit_ctx *ctx, const bool is64,
+                                 const bool hi, const u8 op) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rn = sstk ? tmp[1] : src;
+
+       if (sstk)
+               emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src)), ctx);
+
+       /* ALU operation */
+       if (dstk) {
+               emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(dst)), ctx);
+               emit_alu_r(tmp[0], rn, is64, hi, op, ctx);
+               emit(ARM_STR_I(tmp[0], ARM_SP, STACK_VAR(dst)), ctx);
        } else {
-               _emit(cond, ARM_MOV_I(ARM_R0, 0), ctx);
-               _emit(cond, ARM_B(b_imm(ctx->skf->len, ctx)), ctx);
+               emit_alu_r(dst, rn, is64, hi, op, ctx);
        }
 }

-static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
-{
-#if __LINUX_ARM_ARCH__ < 5
-       emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+/* ALU operation (64 bit) */
+static inline void emit_a32_alu_r64(const bool is64, const u8 dst[],
+                                 const u8 src[], bool dstk,
+                                 bool sstk, struct jit_ctx *ctx,
+                                 const u8 op) {
+       emit_a32_alu_r(dst_lo, src_lo, dstk, sstk, ctx, is64, false, op);
+       if (is64)
+               emit_a32_alu_r(dst_hi, src_hi, dstk, sstk, ctx, is64, true, op);
+       else
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+}

-       if (elf_hwcap & HWCAP_THUMB)
-               emit(ARM_BX(tgt_reg), ctx);
+/* dst = src (4 bytes) */
+static inline void emit_a32_mov_r(const u8 dst, const u8 src,
+                                 bool dstk, bool sstk,
+                                 struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rt = sstk ? tmp[0] : src;
+
+       if (sstk)
+               emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(src)), ctx);
+       if (dstk)
+               emit(ARM_STR_I(rt, ARM_SP, STACK_VAR(dst)), ctx);
        else
-               emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
-#else
-       emit(ARM_BLX_R(tgt_reg), ctx);
-#endif
+               emit(ARM_MOV_R(dst, rt), ctx);
 }

-static inline void emit_udivmod(u8 rd, u8 rm, u8 rn, struct jit_ctx *ctx,
-                               int bpf_op)
-{
-#if __LINUX_ARM_ARCH__ == 7
-       if (elf_hwcap & HWCAP_IDIVA) {
-               if (bpf_op == BPF_DIV)
-                       emit(ARM_UDIV(rd, rm, rn), ctx);
-               else {
-                       emit(ARM_UDIV(ARM_R3, rm, rn), ctx);
-                       emit(ARM_MLS(rd, rn, ARM_R3, rm), ctx);
-               }
-               return;
+/* dst = src */
+static inline void emit_a32_mov_r64(const bool is64, const u8 dst[],
+                                 const u8 src[], bool dstk,
+                                 bool sstk, struct jit_ctx *ctx) {
+       emit_a32_mov_r(dst_lo, src_lo, dstk, sstk, ctx);
+       if (is64) {
+               /* complete 8 byte move */
+               emit_a32_mov_r(dst_hi, src_hi, dstk, sstk, ctx);
+       } else {
+               /* Zero out high 4 bytes */
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
        }
-#endif
+}

-       /*
-        * For BPF_ALU | BPF_DIV | BPF_K instructions, rm is ARM_R4
-        * (r_A) and rn is ARM_R0 (r_scratch) so load rn first into
-        * ARM_R1 to avoid accidentally overwriting ARM_R0 with rm
-        * before using it as a source for ARM_R1.
-        *
-        * For BPF_ALU | BPF_DIV | BPF_X rm is ARM_R4 (r_A) and rn is
-        * ARM_R5 (r_X) so there is no particular register overlap
-        * issues.
-        */
-       if (rn != ARM_R1)
-               emit(ARM_MOV_R(ARM_R1, rn), ctx);
-       if (rm != ARM_R0)
-               emit(ARM_MOV_R(ARM_R0, rm), ctx);
+/* Shift and negate operations with an immediate operand (32 bit) */
+static inline void emit_a32_alu_i(const u8 dst, const u32 val, bool dstk,
+                               struct jit_ctx *ctx, const u8 op) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[0] : dst;
+
+       if (dstk)
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+
+       /* Do shift operation */
+       switch (op) {
+       case BPF_LSH:
+               emit(ARM_LSL_I(rd, rd, val), ctx);
+               break;
+       case BPF_RSH:
+               emit(ARM_LSR_I(rd, rd, val), ctx);
+               break;
+       case BPF_NEG:
+               emit(ARM_RSB_I(rd, rd, val), ctx);
+               break;
+       }
+
+       if (dstk)
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+}
+
+/* dst = -dst (64 bit) */
+static inline void emit_a32_neg64(const u8 dst[], bool dstk,
+                               struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst[1];
+       u8 rm = dstk ? tmp[0] : dst[0];
+
+       /* Setup Operand */
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do Negate Operation */
+       emit(ARM_RSBS_I(rd, rd, 0), ctx);
+       emit(ARM_RSC_I(rm, rm, 0), ctx);
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
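
The rsbs/rsc pair computes a two's-complement negation (0 - dst) across both words, not a bitwise NOT. A minimal model, assuming nothing beyond standard C (neg64_sketch is a hypothetical name):

```c
#include <stdint.h>

/* rsbs lo, lo, #0 ; rsc hi, hi, #0 */
static uint64_t neg64_sketch(uint32_t lo, uint32_t hi)
{
	uint32_t borrow = lo != 0;	/* 0 - lo borrows unless lo is 0 */
	uint32_t rlo = 0u - lo;
	uint32_t rhi = 0u - hi - borrow;

	return ((uint64_t)rhi << 32) | rlo;
}
```

For example, negating 1 gives 0xffffffffffffffff, which is -1 as a signed 64-bit value.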
+
+/* dst = dst << src */
+static inline void emit_a32_lsh_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }

+       /* Do LSH operation */
+       emit(ARM_SUB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_RSB_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
        ctx->seen |= SEEN_CALL;
-       emit_mov_i(ARM_R3, bpf_op == BPF_DIV ? (u32)jit_udiv : (u32)jit_mod,
-                  ctx);
-       emit_blx_r(ARM_R3, ctx);
+       emit(ARM_MOV_SR(ARM_LR, rm, SRTYPE_ASL, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rd, SRTYPE_ASL, ARM_IP), ctx);
+       emit(ARM_ORR_SR(ARM_IP, ARM_LR, rd, SRTYPE_LSR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_ASL, rt), ctx);
+
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}
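
The four-instruction sequence in emit_a32_lsh_r64() relies on ARM register-controlled shifts using only the bottom byte of the shift register and producing 0 for LSL/LSR amounts of 32 or more. The sketch below models that behaviour in portable C and compares it with a native 64-bit shift; all names are hypothetical.

```c
#include <stdint.h>

/* ARM register-controlled LSL: bottom byte of 'amount', 0 if >= 32. */
static uint32_t arm_lsl(uint32_t x, uint32_t amount)
{
	amount &= 0xff;
	return amount < 32 ? x << amount : 0;
}

/* ARM register-controlled LSR: bottom byte of 'amount', 0 if >= 32. */
static uint32_t arm_lsr(uint32_t x, uint32_t amount)
{
	amount &= 0xff;
	return amount < 32 ? x >> amount : 0;
}

/* {hi, lo} <<= n, for n in 0..63, built from 32-bit operations only. */
static uint64_t lsh64_sketch(uint32_t lo, uint32_t hi, uint32_t n)
{
	uint32_t new_hi = arm_lsl(hi, n);	/* mov lr, rm, lsl rt */

	new_hi |= arm_lsl(lo, n - 32);		/* contributes when n >= 32 */
	new_hi |= arm_lsr(lo, 32 - n);		/* contributes when 0 < n < 32 */
	return ((uint64_t)new_hi << 32) | arm_lsl(lo, n);
}

/* Returns 1 if the model agrees with a native shift for every n in 0..63. */
static int lsh64_matches_native(uint32_t lo, uint32_t hi)
{
	uint64_t v = ((uint64_t)hi << 32) | lo;
	uint32_t n;

	for (n = 0; n < 64; n++)
		if (lsh64_sketch(lo, hi, n) != v << n)
			return 0;
	return 1;
}
```

Only one of the two "lo" terms is non-zero for any given n, which is what lets the JIT avoid a branch on the shift amount.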

-       if (rd != ARM_R0)
-               emit(ARM_MOV_R(rd, ARM_R0), ctx);
+/* dst = dst >> src (signed)*/
+static inline void emit_a32_arsh_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do the ARSH operation */
+       emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
+       _emit(ARM_COND_MI, ARM_B(0), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_IP, rm, SRTYPE_ASR, rt), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}
+
+/* dst = dst >> src */
+static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
+                                    bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup Operands */
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (sstk)
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do RSH operation */
+       emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
+       emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
+       emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_LSR, tmp2[0]), ctx);
+       emit(ARM_MOV_SR(ARM_IP, rm, SRTYPE_LSR, rt), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_LR, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_LR), ctx);
+               emit(ARM_MOV_R(rm, ARM_IP), ctx);
+       }
+}
+
+/* dst = dst << val */
+static inline void emit_a32_lsh_i64(const u8 dst[], bool dstk,
+                                    const u32 val, struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do LSH operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[0], rm, SRTYPE_ASL, val), ctx);
+               emit(ARM_ORR_SI(rm, tmp2[0], rd, SRTYPE_LSR, 32 - val), ctx);
+               emit(ARM_MOV_SI(rd, rd, SRTYPE_ASL, val), ctx);
+       } else {
+               if (val == 32)
+                       emit(ARM_MOV_R(rm, rd), ctx);
+               else
+                       emit(ARM_MOV_SI(rm, rd, SRTYPE_ASL, val - 32), ctx);
+               emit(ARM_EOR_R(rd, rd, rd), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+/* dst = dst >> val */
+static inline void emit_a32_lsr_i64(const u8 dst[], bool dstk,
+                                   const u32 val, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do LSR operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[1], rd, SRTYPE_LSR, val), ctx);
+               emit(ARM_ORR_SI(rd, tmp2[1], rm, SRTYPE_ASL, 32 - val), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_LSR, val), ctx);
+       } else if (val == 32) {
+               emit(ARM_MOV_R(rd, rm), ctx);
+               emit(ARM_MOV_I(rm, 0), ctx);
+       } else {
+               emit(ARM_MOV_SI(rd, rm, SRTYPE_LSR, val - 32), ctx);
+               emit(ARM_MOV_I(rm, 0), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
+
+/* dst = dst >> val (signed) */
+static inline void emit_a32_arsh_i64(const u8 dst[], bool dstk,
+                                    const u32 val, struct jit_ctx *ctx){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+
+       /* Do ARSH operation */
+       if (val < 32) {
+               emit(ARM_MOV_SI(tmp2[1], rd, SRTYPE_LSR, val), ctx);
+               emit(ARM_ORR_SI(rd, tmp2[1], rm, SRTYPE_ASL, 32 - val), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, val), ctx);
+       } else if (val == 32) {
+               emit(ARM_MOV_R(rd, rm), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, 31), ctx);
+       } else {
+               emit(ARM_MOV_SI(rd, rm, SRTYPE_ASR, val - 32), ctx);
+               emit(ARM_MOV_SI(rm, rm, SRTYPE_ASR, 31), ctx);
+       }
+
+       if (dstk) {
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+}
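
The three cases in emit_a32_arsh_i64() (val < 32, val == 32, val > 32) can be modelled as follows. This is a sketch with hypothetical names; it avoids right-shifting signed integers, which is implementation-defined in C, and treats a shift by 0 as a no-op, which is an assumption of the sketch rather than something the hunk spells out.

```c
#include <stdint.h>

/* 32-bit arithmetic shift right for n in 1..31, without signed shifts. */
static uint32_t asr32(uint32_t x, unsigned int n)
{
	uint32_t sign = (x & 0x80000000u) ? ~(0xffffffffu >> n) : 0;

	return (x >> n) | sign;
}

/* {hi, lo} >>= n (arithmetic), for n in 0..63. */
static uint64_t arsh64_imm_sketch(uint32_t lo, uint32_t hi, unsigned int n)
{
	/* Equivalent of "mov rm, rm, asr #31": all ones or all zeroes. */
	uint32_t fill = (hi & 0x80000000u) ? 0xffffffffu : 0;

	if (n == 0) {
		/* nothing to do (assumption of this sketch) */
	} else if (n < 32) {
		lo = (lo >> n) | (hi << (32 - n));
		hi = asr32(hi, n);
	} else if (n == 32) {
		lo = hi;
		hi = fill;
	} else {
		lo = asr32(hi, n - 32);
		hi = fill;
	}
	return ((uint64_t)hi << 32) | lo;
}
```

The logical variant emit_a32_lsr_i64() has the same shape, with the high word filled with zeroes instead of copies of the sign bit.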
+
+static inline void emit_a32_mul_r64(const u8 dst[], const u8 src[], bool dstk,
+                                   bool sstk, struct jit_ctx *ctx) {
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       /* Setup operands for multiplication */
+       u8 rd = dstk ? tmp[1] : dst_lo;
+       u8 rm = dstk ? tmp[0] : dst_hi;
+       u8 rt = sstk ? tmp2[1] : src_lo;
+       u8 rn = sstk ? tmp2[0] : src_hi;
+
+       if (dstk) {
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       }
+       if (sstk) {
+               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)), ctx);
+               emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_hi)), ctx);
+       }
+
+       /* Do Multiplication */
+       emit(ARM_MUL(ARM_IP, rd, rn), ctx);
+       emit(ARM_MUL(ARM_LR, rm, rt), ctx);
+       /* As we are using ARM_LR */
+       ctx->seen |= SEEN_CALL;
+       emit(ARM_ADD_R(ARM_LR, ARM_IP, ARM_LR), ctx);
+
+       emit(ARM_UMULL(ARM_IP, rm, rd, rt), ctx);
+       emit(ARM_ADD_R(rm, ARM_LR, rm), ctx);
+       if (dstk) {
+               emit(ARM_STR_I(ARM_IP, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit(ARM_STR_I(rm, ARM_SP, STACK_VAR(dst_hi)), ctx);
+       } else {
+               emit(ARM_MOV_R(rd, ARM_IP), ctx);
+       }
 }
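
The multiplication above uses the usual decomposition of a 64x64 product truncated to 64 bits: the two cross products only affect the high word, and a single widening multiply (umull) supplies the low word plus the carry into the high word. A user-space sketch with hypothetical names:

```c
#include <stdint.h>

/* (dhi:dlo) * (shi:slo) mod 2^64, using 32-bit pieces. */
static uint64_t mul64_sketch(uint32_t dlo, uint32_t dhi,
			     uint32_t slo, uint32_t shi)
{
	/* mul ip, rd, rn ; mul lr, rm, rt ; add lr, ip, lr */
	uint32_t cross = dlo * shi + dhi * slo;
	/* umull: full 64-bit product of the two low words */
	uint64_t low = (uint64_t)dlo * slo;
	/* add rm, lr, rm */
	uint32_t hi = cross + (uint32_t)(low >> 32);

	return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

The dhi * shi term is omitted because it would only contribute to bits 64 and above.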

-static inline void update_on_xread(struct jit_ctx *ctx)
+/* *(size *)(dst + off) = src */
+static inline void emit_str_r(const u8 dst, const u8 src, bool dstk,
+                             const s32 off, struct jit_ctx *ctx, const u8 sz){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst;
+
+       if (dstk)
+               emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+       if (off) {
+               emit_a32_mov_i(tmp[0], off, false, ctx);
+               emit(ARM_ADD_R(tmp[0], rd, tmp[0]), ctx);
+               rd = tmp[0];
+       }
+       switch (sz) {
+       case BPF_W:
+               /* Store a Word */
+               emit(ARM_STR_I(src, rd, 0), ctx);
+               break;
+       case BPF_H:
+               /* Store a HalfWord */
+               emit(ARM_STRH_I(src, rd, 0), ctx);
+               break;
+       case BPF_B:
+               /* Store a Byte */
+               emit(ARM_STRB_I(src, rd, 0), ctx);
+               break;
+       }
+}
+
+/* dst = *(size*)(src + off) */
+static inline void emit_ldx_r(const u8 dst, const u8 src, bool dstk,
+                             const s32 off, struct jit_ctx *ctx, const u8 sz){
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       u8 rd = dstk ? tmp[1] : dst;
+       u8 rm = src;
+
+       if (off) {
+               emit_a32_mov_i(tmp[0], off, false, ctx);
+               emit(ARM_ADD_R(tmp[0], tmp[0], src), ctx);
+               rm = tmp[0];
+       }
+       switch (sz) {
+       case BPF_W:
+               /* Load a Word */
+               emit(ARM_LDR_I(rd, rm, 0), ctx);
+               break;
+       case BPF_H:
+               /* Load a HalfWord */
+               emit(ARM_LDRH_I(rd, rm, 0), ctx);
+               break;
+       case BPF_B:
+               /* Load a Byte */
+               emit(ARM_LDRB_I(rd, rm, 0), ctx);
+               break;
+       }
+       if (dstk)
+               emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+}
+
+/* Arithmetic Operation */
+static inline void emit_ar_r(const u8 rd, const u8 rt, const u8 rm,
+                            const u8 rn, struct jit_ctx *ctx, u8 op) {
+       switch (op) {
+       case BPF_JSET:
+               ctx->seen |= SEEN_CALL;
+               emit(ARM_AND_R(ARM_IP, rt, rn), ctx);
+               emit(ARM_AND_R(ARM_LR, rd, rm), ctx);
+               emit(ARM_ORRS_R(ARM_IP, ARM_LR, ARM_IP), ctx);
+               break;
+       case BPF_JEQ:
+       case BPF_JNE:
+       case BPF_JGT:
+       case BPF_JGE:
+               emit(ARM_CMP_R(rd, rm), ctx);
+               _emit(ARM_COND_EQ, ARM_CMP_R(rt, rn), ctx);
+               break;
+       case BPF_JSGT:
+               emit(ARM_CMP_R(rn, rt), ctx);
+               emit(ARM_SBCS_R(ARM_IP, rm, rd), ctx);
+               break;
+       case BPF_JSGE:
+               emit(ARM_CMP_R(rt, rn), ctx);
+               emit(ARM_SBCS_R(ARM_IP, rd, rm), ctx);
+               break;
+       }
+}
+
+static int out_offset = -1; /* initialized on the first pass of build_body() */
+static int emit_bpf_tail_call(struct jit_ctx *ctx)
 {
-       if (!(ctx->seen & SEEN_X))
-               ctx->flags |= FLAG_NEED_X_RESET;

-       ctx->seen |= SEEN_X;
+       /* bpf_tail_call(void *prog_ctx, struct bpf_array *array, u64 index) */
+       const u8 *r2 = bpf2a32[BPF_REG_2];
+       const u8 *r3 = bpf2a32[BPF_REG_3];
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       const u8 *tcc = bpf2a32[TCALL_CNT];
+       const int idx0 = ctx->idx;
+#define cur_offset (ctx->idx - idx0)
+#define jmp_offset (out_offset - (cur_offset))
+       u32 off, lo, hi;
+
+       /* if (index >= array->map.max_entries)
+        *      goto out;
+        */
+       off = offsetof(struct bpf_array, map.max_entries);
+       /* array->map.max_entries */
+       emit_a32_mov_i(tmp[1], off, false, ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
+       /* index (64 bit) */
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
+       /* index >= array->map.max_entries */
+       emit(ARM_CMP_R(tmp2[1], tmp[1]), ctx);
+       _emit(ARM_COND_CS, ARM_B(jmp_offset), ctx);
+
+       /* if (tail_call_cnt > MAX_TAIL_CALL_CNT)
+        *      goto out;
+        * tail_call_cnt++;
+        */
+       lo = (u32)MAX_TAIL_CALL_CNT;
+       hi = (u32)((u64)MAX_TAIL_CALL_CNT >> 32);
+       emit(ARM_LDR_I(tmp[1], ARM_SP, STACK_VAR(tcc[1])), ctx);
+       emit(ARM_LDR_I(tmp[0], ARM_SP, STACK_VAR(tcc[0])), ctx);
+       emit(ARM_CMP_I(tmp[0], hi), ctx);
+       _emit(ARM_COND_EQ, ARM_CMP_I(tmp[1], lo), ctx);
+       _emit(ARM_COND_HI, ARM_B(jmp_offset), ctx);
+       emit(ARM_ADDS_I(tmp[1], tmp[1], 1), ctx);
+       emit(ARM_ADC_I(tmp[0], tmp[0], 0), ctx);
+       emit(ARM_STR_I(tmp[1], ARM_SP, STACK_VAR(tcc[1])), ctx);
+       emit(ARM_STR_I(tmp[0], ARM_SP, STACK_VAR(tcc[0])), ctx);
+
+       /* prog = array->ptrs[index]
+        * if (prog == NULL)
+        *      goto out;
+        */
+       off = offsetof(struct bpf_array, ptrs);
+       emit_a32_mov_i(tmp[1], off, false, ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
+       emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
+       emit(ARM_MOV_SI(tmp[0], tmp2[1], SRTYPE_ASL, 2), ctx);
+       emit(ARM_LDR_R(tmp[1], tmp[1], tmp[0]), ctx);
+       emit(ARM_CMP_I(tmp[1], 0), ctx);
+       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+
+       /* goto *(prog->bpf_func + prologue_size); */
+       off = offsetof(struct bpf_prog, bpf_func);
+       emit_a32_mov_i(tmp2[1], off, false, ctx);
+       emit(ARM_LDR_R(tmp[1], tmp[1], tmp2[1]), ctx);
+       emit(ARM_ADD_I(tmp[1], tmp[1], ctx->prologue_bytes), ctx);
+       emit(ARM_BX(tmp[1]), ctx);
+
+       /* out: */
+       if (out_offset == -1)
+               out_offset = cur_offset;
+       if (cur_offset != out_offset) {
+               pr_err_once("tail_call out_offset = %d, expected %d!\n",
+                           cur_offset, out_offset);
+               return -1;
+       }
+       return 0;
+#undef cur_offset
+#undef jmp_offset
 }

-static int build_body(struct jit_ctx *ctx)
+/* 0xabcd => 0xcdab */
+static inline void emit_rev16(const u8 rd, const u8 rn, struct jit_ctx *ctx)
 {
-       void *load_func[] = {jit_get_skb_b, jit_get_skb_h, jit_get_skb_w};
-       const struct bpf_prog *prog = ctx->skf;
-       const struct sock_filter *inst;
-       unsigned i, load_order, off, condt;
-       int imm12;
-       u32 k;
+#if __LINUX_ARM_ARCH__ < 6
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       emit(ARM_AND_I(tmp2[1], rn, 0xff), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 8), ctx);
+       emit(ARM_AND_I(tmp2[0], tmp2[0], 0xff), ctx);
+       emit(ARM_ORR_SI(rd, tmp2[0], tmp2[1], SRTYPE_LSL, 8), ctx);
+#else /* ARMv6+ */
+       emit(ARM_REV16(rd, rn), ctx);
+#endif
+}

-       for (i = 0; i < prog->len; i++) {
-               u16 code;
+/* 0xabcdefgh => 0xghefcdab */
+static inline void emit_rev32(const u8 rd, const u8 rn, struct jit_ctx *ctx)
+{
+#if __LINUX_ARM_ARCH__ < 6
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+
+       emit(ARM_AND_I(tmp2[1], rn, 0xff), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 24), ctx);
+       emit(ARM_ORR_SI(ARM_IP, tmp2[0], tmp2[1], SRTYPE_LSL, 24), ctx);
+
+       emit(ARM_AND_I(tmp2[1], rn, 0xff00), ctx);
+       emit(ARM_MOV_S(tmp2[0], rn, SRTYPE_LSR, 8), ctx);
+       emit(ARM_AND_I(tmp2[0], tmp2[0], 0xff00), ctx);
+       emit(ARM_ORR_SI(tmp2[0], tmp2[0], tmp2[1], SRTYPE_LSL, 8), ctx);
+       emit(ARM_ORR_R(rd, ARM_IP, tmp2[0]), ctx);
+#else /* ARMv6+ */
+       emit(ARM_REV(rd, rn), ctx);
+#endif
+}

-               inst = &(prog->insns[i]);
-               /* K as an immediate value operand */
-               k = inst->k;
-               code = bpf_anc_helper(inst);
+static void build_prologue(struct jit_ctx *ctx)
+{
+       const u8 r0 = bpf2a32[BPF_REG_0][1];
+       const u8 r2 = bpf2a32[BPF_REG_1][1];
+       const u8 r3 = bpf2a32[BPF_REG_1][0];
+       const u8 r4 = bpf2a32[BPF_REG_6][1];
+       const u8 r5 = bpf2a32[BPF_REG_6][0];
+       const u8 r6 = bpf2a32[TMP_REG_1][1];
+       const u8 r7 = bpf2a32[TMP_REG_1][0];
+       const u8 r8 = bpf2a32[TMP_REG_2][1];
+       const u8 r10 = bpf2a32[TMP_REG_2][0];
+       const u8 fplo = bpf2a32[BPF_REG_FP][1];
+       const u8 fphi = bpf2a32[BPF_REG_FP][0];
+       const u8 sp = ARM_SP;
+       const u8 *tcc = bpf2a32[TCALL_CNT];
+
+       u16 reg_set = 0;

-               /* compute offsets only in the fake pass */
-               if (ctx->target == NULL)
-                       ctx->offsets[i] = ctx->idx * 4;
+       /*
+        * eBPF prog stack layout
+        *
+        *                         high
+        * original ARM_SP =>     +-----+ eBPF prologue
+        *                        |FP/LR|
+        * current ARM_FP =>      +-----+
+        *                        | ... | callee saved registers
+        * eBPF fp register =>    +-----+ <= (BPF_FP)
+        *                        | ... | eBPF JIT scratch space
+        *                        |     | eBPF prog stack
+        *                        +-----+
+        *                        |RSVD | JIT scratchpad
+        * current ARM_SP =>      +-----+ <= (BPF_FP - STACK_SIZE)
+        *                        |     |
+        *                        | ... | Function call stack
+        *                        |     |
+        *                        +-----+
+        *                          low
+        */

-               switch (code) {
-               case BPF_LD | BPF_IMM:
-                       emit_mov_i(r_A, k, ctx);
+       /* Save callee saved registers. */
+       reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
+#ifdef CONFIG_FRAME_POINTER
+       reg_set |= (1<<ARM_FP) | (1<<ARM_IP) | (1<<ARM_LR) | (1<<ARM_PC);
+       emit(ARM_MOV_R(ARM_IP, sp), ctx);
+       emit(ARM_PUSH(reg_set), ctx);
+       emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
+#else
+       /* Check if call instruction exists in BPF body */
+       if (ctx->seen & SEEN_CALL)
+               reg_set |= (1<<ARM_LR);
+       emit(ARM_PUSH(reg_set), ctx);
+#endif
+       /* Save frame pointer for later */
+       emit(ARM_SUB_I(ARM_IP, sp, SCRATCH_SIZE), ctx);
+
+       /* Set up function call stack */
+       emit(ARM_SUB_I(ARM_SP, ARM_SP, imm8m(STACK_SIZE)), ctx);
+
+       /* Set up BPF prog stack base register */
+       emit_a32_mov_r(fplo, ARM_IP, true, false, ctx);
+       emit_a32_mov_i(fphi, 0, true, ctx);
+
+       /* mov r4, 0 */
+       emit(ARM_MOV_I(r4, 0), ctx);
+       /* MOV bpf_ctx pointer to BPF_R1 */
+       emit(ARM_MOV_R(r3, r4), ctx);
+       emit(ARM_MOV_R(r2, r0), ctx);
+       /* Initialize Tail Count */
+       emit(ARM_STR_I(r4, ARM_SP, STACK_VAR(tcc[0])), ctx);
+       emit(ARM_STR_I(r4, ARM_SP, STACK_VAR(tcc[1])), ctx);
+       /* end of prologue */
+}
+
+static void build_epilogue(struct jit_ctx *ctx)
+{
+       const u8 r4 = bpf2a32[BPF_REG_6][1];
+       const u8 r5 = bpf2a32[BPF_REG_6][0];
+       const u8 r6 = bpf2a32[TMP_REG_1][1];
+       const u8 r7 = bpf2a32[TMP_REG_1][0];
+       const u8 r8 = bpf2a32[TMP_REG_2][1];
+       const u8 r10 = bpf2a32[TMP_REG_2][0];
+       u16 reg_set = 0;
+
+       /* unwind function call stack */
+       emit(ARM_ADD_I(ARM_SP, ARM_SP, imm8m(STACK_SIZE)), ctx);
+
+       /* restore callee saved registers. */
+       reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
+#ifdef CONFIG_FRAME_POINTER
+       /* the first instruction of the prologue was: mov ip, sp */
+       reg_set |= (1<<ARM_FP) | (1<<ARM_SP) | (1<<ARM_PC);
+       emit(ARM_LDM(ARM_SP, reg_set), ctx);
+#else
+       if (ctx->seen & SEEN_CALL)
+               reg_set |= (1<<ARM_PC);
+       /* Restore callee saved registers. */
+       emit(ARM_POP(reg_set), ctx);
+       /* Return to the caller */
+       if (!(ctx->seen & SEEN_CALL))
+               emit(ARM_BX(ARM_LR), ctx);
+#endif
+}
+
+/*
+ * Convert an eBPF instruction to a native instruction, i.e.
+ * JIT an eBPF instruction.
+ * Returns :
+ *     0  - Successfully JITed an 8-byte eBPF instruction
+ *     >0 - Successfully JITed a 16-byte eBPF instruction
+ *     <0 - Failed to JIT.
+ */
+static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
+{
+       const u8 code = insn->code;
+       const u8 *dst = bpf2a32[insn->dst_reg];
+       const u8 *src = bpf2a32[insn->src_reg];
+       const u8 *tmp = bpf2a32[TMP_REG_1];
+       const u8 *tmp2 = bpf2a32[TMP_REG_2];
+       const s16 off = insn->off;
+       const s32 imm = insn->imm;
+       const int i = insn - ctx->prog->insnsi;
+       const bool is64 = BPF_CLASS(code) == BPF_ALU64;
+       const bool dstk = is_on_stack(insn->dst_reg);
+       const bool sstk = is_on_stack(insn->src_reg);
+       u8 rd, rt, rm, rn;
+       s32 jmp_offset;
+
+#define check_imm(bits, imm) do {                              \
+       if ((((imm) > 0) && ((imm) >> (bits))) ||               \
+           (((imm) < 0) && (~(imm) >> (bits)))) {              \
+               pr_info("[%2d] imm=%d(0x%x) out of range\n",    \
+                       i, imm, imm);                           \
+               return -EINVAL;                                 \
+       }                                                       \
+} while (0)
+#define check_imm24(imm) check_imm(24, imm)
+
+       switch (code) {
+       /* ALU operations */
+
+       /* dst = src */
+       case BPF_ALU | BPF_MOV | BPF_K:
+       case BPF_ALU | BPF_MOV | BPF_X:
+       case BPF_ALU64 | BPF_MOV | BPF_K:
+       case BPF_ALU64 | BPF_MOV | BPF_X:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_mov_r64(is64, dst, src, dstk, sstk, ctx);
                        break;
-               case BPF_LD | BPF_W | BPF_LEN:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, len) != 4);
-                       emit(ARM_LDR_I(r_A, r_skb,
-                                      offsetof(struct sk_buff, len)), ctx);
+               case BPF_K:
+                       /* Sign-extend immediate value to destination reg */
+                       emit_a32_mov_i64(is64, dst, imm, dstk, ctx);
                        break;
-               case BPF_LD | BPF_MEM:
-                       /* A = scratch[k] */
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_LDR_I(r_A, ARM_SP, SCRATCH_OFF(k)), ctx);
+               }
+               break;
+       /* dst = dst + src/imm */
+       /* dst = dst - src/imm */
+       /* dst = dst | src/imm */
+       /* dst = dst & src/imm */
+       /* dst = dst ^ src/imm */
+       /* dst = dst * src/imm */
+       /* dst = dst << src */
+       /* dst = dst >> src */
+       case BPF_ALU | BPF_ADD | BPF_K:
+       case BPF_ALU | BPF_ADD | BPF_X:
+       case BPF_ALU | BPF_SUB | BPF_K:
+       case BPF_ALU | BPF_SUB | BPF_X:
+       case BPF_ALU | BPF_OR | BPF_K:
+       case BPF_ALU | BPF_OR | BPF_X:
+       case BPF_ALU | BPF_AND | BPF_K:
+       case BPF_ALU | BPF_AND | BPF_X:
+       case BPF_ALU | BPF_XOR | BPF_K:
+       case BPF_ALU | BPF_XOR | BPF_X:
+       case BPF_ALU | BPF_MUL | BPF_K:
+       case BPF_ALU | BPF_MUL | BPF_X:
+       case BPF_ALU | BPF_LSH | BPF_X:
+       case BPF_ALU | BPF_RSH | BPF_X:
+       case BPF_ALU | BPF_ARSH | BPF_K:
+       case BPF_ALU | BPF_ARSH | BPF_X:
+       case BPF_ALU64 | BPF_ADD | BPF_K:
+       case BPF_ALU64 | BPF_ADD | BPF_X:
+       case BPF_ALU64 | BPF_SUB | BPF_K:
+       case BPF_ALU64 | BPF_SUB | BPF_X:
+       case BPF_ALU64 | BPF_OR | BPF_K:
+       case BPF_ALU64 | BPF_OR | BPF_X:
+       case BPF_ALU64 | BPF_AND | BPF_K:
+       case BPF_ALU64 | BPF_AND | BPF_X:
+       case BPF_ALU64 | BPF_XOR | BPF_K:
+       case BPF_ALU64 | BPF_XOR | BPF_X:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_alu_r64(is64, dst, src, dstk, sstk,
+                                        ctx, BPF_OP(code));
                        break;
-               case BPF_LD | BPF_W | BPF_ABS:
-                       load_order = 2;
-                       goto load;
-               case BPF_LD | BPF_H | BPF_ABS:
-                       load_order = 1;
-                       goto load;
-               case BPF_LD | BPF_B | BPF_ABS:
-                       load_order = 0;
-load:
-                       emit_mov_i(r_off, k, ctx);
-load_common:
-                       ctx->seen |= SEEN_DATA | SEEN_CALL;
-
-                       if (load_order > 0) {
-                               emit(ARM_SUB_I(r_scratch, r_skb_hl,
-                                              1 << load_order), ctx);
-                               emit(ARM_CMP_R(r_scratch, r_off), ctx);
-                               condt = ARM_COND_GE;
-                       } else {
-                               emit(ARM_CMP_R(r_skb_hl, r_off), ctx);
-                               condt = ARM_COND_HI;
-                       }
-
-                       /*
-                        * test for negative offset, only if we are
-                        * currently scheduled to take the fast
-                        * path. this will update the flags so that
-                        * the slowpath instruction are ignored if the
-                        * offset is negative.
-                        *
-                        * for loard_order == 0 the HI condition will
-                        * make loads at offset 0 take the slow path too.
+               case BPF_K:
+                       /* Move the immediate value to the temporary
+                        * register first and then do the ALU operation
+                        * on that register: the move sign-extends the
+                        * immediate, so it is then safe to operate on
+                        * it.
                         */
-                       _emit(condt, ARM_CMP_I(r_off, 0), ctx);
-
-                       _emit(condt, ARM_ADD_R(r_scratch, r_off, r_skb_data),
-                             ctx);
-
-                       if (load_order == 0)
-                               _emit(condt, ARM_LDRB_I(r_A, r_scratch, 0),
-                                     ctx);
-                       else if (load_order == 1)
-                               emit_load_be16(condt, r_A, r_scratch, ctx);
-                       else if (load_order == 2)
-                               emit_load_be32(condt, r_A, r_scratch, ctx);
-
-                       _emit(condt, ARM_B(b_imm(i + 1, ctx)), ctx);
-
-                       /* the slowpath */
-                       emit_mov_i(ARM_R3, (u32)load_func[load_order], ctx);
-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       /* the offset is already in R1 */
-                       emit_blx_r(ARM_R3, ctx);
-                       /* check the result of skb_copy_bits */
-                       emit(ARM_CMP_I(ARM_R1, 0), ctx);
-                       emit_err_ret(ARM_COND_NE, ctx);
-                       emit(ARM_MOV_R(r_A, ARM_R0), ctx);
+                       emit_a32_mov_i64(is64, tmp2, imm, false, ctx);
+                       emit_a32_alu_r64(is64, dst, tmp2, dstk, false,
+                                        ctx, BPF_OP(code));
                        break;
-               case BPF_LD | BPF_W | BPF_IND:
-                       load_order = 2;
-                       goto load_ind;
-               case BPF_LD | BPF_H | BPF_IND:
-                       load_order = 1;
-                       goto load_ind;
-               case BPF_LD | BPF_B | BPF_IND:
-                       load_order = 0;
-load_ind:
-                       update_on_xread(ctx);
-                       OP_IMM3(ARM_ADD, r_off, r_X, k, ctx);
-                       goto load_common;
-               case BPF_LDX | BPF_IMM:
-                       ctx->seen |= SEEN_X;
-                       emit_mov_i(r_X, k, ctx);
+               }
+               break;
+       /* dst = dst / src(imm) */
+       /* dst = dst % src(imm) */
+       case BPF_ALU | BPF_DIV | BPF_K:
+       case BPF_ALU | BPF_DIV | BPF_X:
+       case BPF_ALU | BPF_MOD | BPF_K:
+       case BPF_ALU | BPF_MOD | BPF_X:
+               rt = src_lo;
+               rd = dstk ? tmp2[1] : dst_lo;
+               if (dstk)
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       rt = sstk ? tmp2[0] : rt;
+                       if (sstk)
+                               emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(src_lo)),
+                                    ctx);
                        break;
-               case BPF_LDX | BPF_W | BPF_LEN:
-                       ctx->seen |= SEEN_X | SEEN_SKB;
-                       emit(ARM_LDR_I(r_X, r_skb,
-                                      offsetof(struct sk_buff, len)), ctx);
+               case BPF_K:
+                       rt = tmp2[0];
+                       emit_a32_mov_i(rt, imm, false, ctx);
                        break;
-               case BPF_LDX | BPF_MEM:
-                       ctx->seen |= SEEN_X | SEEN_MEM_WORD(k);
-                       emit(ARM_LDR_I(r_X, ARM_SP, SCRATCH_OFF(k)), ctx);
+               }
+               emit_udivmod(rd, rd, rt, ctx, BPF_OP(code));
+               if (dstk)
+                       emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_lo)), ctx);
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       case BPF_ALU64 | BPF_DIV | BPF_K:
+       case BPF_ALU64 | BPF_DIV | BPF_X:
+       case BPF_ALU64 | BPF_MOD | BPF_K:
+       case BPF_ALU64 | BPF_MOD | BPF_X:
+               goto notyet;
+       /* dst = dst >> imm */
+       /* dst = dst << imm */
+       case BPF_ALU | BPF_RSH | BPF_K:
+       case BPF_ALU | BPF_LSH | BPF_K:
+               if (unlikely(imm > 31))
+                       return -EINVAL;
+               if (imm)
+                       emit_a32_alu_i(dst_lo, imm, dstk, ctx, BPF_OP(code));
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       /* dst = dst << imm */
+       case BPF_ALU64 | BPF_LSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_lsh_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = dst >> imm */
+       case BPF_ALU64 | BPF_RSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_lsr_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = dst << src */
+       case BPF_ALU64 | BPF_LSH | BPF_X:
+               emit_a32_lsh_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> src */
+       case BPF_ALU64 | BPF_RSH | BPF_X:
+               emit_a32_lsr_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> src (signed) */
+       case BPF_ALU64 | BPF_ARSH | BPF_X:
+               emit_a32_arsh_r64(dst, src, dstk, sstk, ctx);
+               break;
+       /* dst = dst >> imm (signed) */
+       case BPF_ALU64 | BPF_ARSH | BPF_K:
+               if (unlikely(imm > 63))
+                       return -EINVAL;
+               emit_a32_arsh_i64(dst, dstk, imm, ctx);
+               break;
+       /* dst = -dst */
+       case BPF_ALU | BPF_NEG:
+               emit_a32_alu_i(dst_lo, 0, dstk, ctx, BPF_OP(code));
+               emit_a32_mov_i(dst_hi, 0, dstk, ctx);
+               break;
+       /* dst = -dst (64 bit) */
+       case BPF_ALU64 | BPF_NEG:
+               emit_a32_neg64(dst, dstk, ctx);
+               break;
+       /* dst = dst * src/imm */
+       case BPF_ALU64 | BPF_MUL | BPF_X:
+       case BPF_ALU64 | BPF_MUL | BPF_K:
+               switch (BPF_SRC(code)) {
+               case BPF_X:
+                       emit_a32_mul_r64(dst, src, dstk, sstk, ctx);
                        break;
-               case BPF_LDX | BPF_B | BPF_MSH:
-                       /* x = ((*(frame + k)) & 0xf) << 2; */
-                       ctx->seen |= SEEN_X | SEEN_DATA | SEEN_CALL;
-                       /* the interpreter should deal with the negative K */
-                       if ((int)k < 0)
-                               return -1;
-                       /* offset in r1: we might have to take the slow path */
-                       emit_mov_i(r_off, k, ctx);
-                       emit(ARM_CMP_R(r_skb_hl, r_off), ctx);
-
-                       /* load in r0: common with the slowpath */
-                       _emit(ARM_COND_HI, ARM_LDRB_R(ARM_R0, r_skb_data,
-                                                     ARM_R1), ctx);
-                       /*
-                        * emit_mov_i() might generate one or two instructions,
-                        * the same holds for emit_blx_r()
+               case BPF_K:
+                       /* Move the immediate value to the temporary
+                        * register first and then do the multiplication
+                        * on that register: the move sign-extends the
+                        * immediate, so it is then safe to operate on
+                        * it.
                         */
-                       _emit(ARM_COND_HI, ARM_B(b_imm(i + 1, ctx) - 2), ctx);
-
-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       /* r_off is r1 */
-                       emit_mov_i(ARM_R3, (u32)jit_get_skb_b, ctx);
-                       emit_blx_r(ARM_R3, ctx);
-                       /* check the return value of skb_copy_bits */
-                       emit(ARM_CMP_I(ARM_R1, 0), ctx);
-                       emit_err_ret(ARM_COND_NE, ctx);
-
-                       emit(ARM_AND_I(r_X, ARM_R0, 0x00f), ctx);
-                       emit(ARM_LSL_I(r_X, r_X, 2), ctx);
-                       break;
-               case BPF_ST:
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_STR_I(r_A, ARM_SP, SCRATCH_OFF(k)), ctx);
-                       break;
-               case BPF_STX:
-                       update_on_xread(ctx);
-                       ctx->seen |= SEEN_MEM_WORD(k);
-                       emit(ARM_STR_I(r_X, ARM_SP, SCRATCH_OFF(k)), ctx);
-                       break;
-               case BPF_ALU | BPF_ADD | BPF_K:
-                       /* A += K */
-                       OP_IMM3(ARM_ADD, r_A, r_A, k, ctx);
-                       break;
-               case BPF_ALU | BPF_ADD | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_ADD_R(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_SUB | BPF_K:
-                       /* A -= K */
-                       OP_IMM3(ARM_SUB, r_A, r_A, k, ctx);
-                       break;
-               case BPF_ALU | BPF_SUB | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_SUB_R(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_MUL | BPF_K:
-                       /* A *= K */
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit(ARM_MUL(r_A, r_A, r_scratch), ctx);
-                       break;
-               case BPF_ALU | BPF_MUL | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_MUL(r_A, r_A, r_X), ctx);
-                       break;
-               case BPF_ALU | BPF_DIV | BPF_K:
-                       if (k == 1)
-                               break;
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit_udivmod(r_A, r_A, r_scratch, ctx, BPF_DIV);
-                       break;
-               case BPF_ALU | BPF_DIV | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_I(r_X, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-                       emit_udivmod(r_A, r_A, r_X, ctx, BPF_DIV);
-                       break;
-               case BPF_ALU | BPF_MOD | BPF_K:
-                       if (k == 1) {
-                               emit_mov_i(r_A, 0, ctx);
-                               break;
-                       }
-                       emit_mov_i(r_scratch, k, ctx);
-                       emit_udivmod(r_A, r_A, r_scratch, ctx, BPF_MOD);
+                       emit_a32_mov_i64(is64, tmp2, imm, false, ctx);
+                       emit_a32_mul_r64(dst, tmp2, dstk, false, ctx);
                        break;
-               case BPF_ALU | BPF_MOD | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_I(r_X, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-                       emit_udivmod(r_A, r_A, r_X, ctx, BPF_MOD);
-                       break;
-               case BPF_ALU | BPF_OR | BPF_K:
-                       /* A |= K */
-                       OP_IMM3(ARM_ORR, r_A, r_A, k, ctx);
+               }
+               break;
+       /* dst = htole(dst) */
+       /* dst = htobe(dst) */
+       case BPF_ALU | BPF_END | BPF_FROM_LE:
+       case BPF_ALU | BPF_END | BPF_FROM_BE:
+               rd = dstk ? tmp[0] : dst_hi;
+               rt = dstk ? tmp[1] : dst_lo;
+               if (dstk) {
+                       emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+#ifdef CONFIG_CPU_BIG_ENDIAN
+               if (BPF_SRC(code) == BPF_FROM_BE)
+                       goto emit_bswap_uxt;
+#else /* !CONFIG_CPU_BIG_ENDIAN */
+               if (BPF_SRC(code) == BPF_FROM_LE)
+                       goto emit_bswap_uxt;
+#endif
+               switch (imm) {
+               case 16:
+                       emit_rev16(rt, rt, ctx);
+                       goto emit_bswap_uxt;
+               case 32:
+                       emit_rev32(rt, rt, ctx);
+                       goto emit_bswap_uxt;
+               case 64:
+                       /* Because of the usage of ARM_LR */
+                       ctx->seen |= SEEN_CALL;
+                       emit_rev32(ARM_LR, rt, ctx);
+                       emit_rev32(rt, rd, ctx);
+                       emit(ARM_MOV_R(rd, ARM_LR), ctx);
                        break;
-               case BPF_ALU | BPF_OR | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_ORR_R(r_A, r_A, r_X), ctx);
+               }
+               goto exit;
+emit_bswap_uxt:
+               switch (imm) {
+               case 16:
+                       /* zero-extend 16 bits into 64 bits */
+#if __LINUX_ARM_ARCH__ < 6
+                       emit_a32_mov_i(tmp2[1], 0xffff, false, ctx);
+                       emit(ARM_AND_R(rt, rt, tmp2[1]), ctx);
+#else /* ARMv6+ */
+                       emit(ARM_UXTH(rt, rt), ctx);
+#endif
+                       emit(ARM_EOR_R(rd, rd, rd), ctx);
                        break;
-               case BPF_ALU | BPF_XOR | BPF_K:
-                       /* A ^= K; */
-                       OP_IMM3(ARM_EOR, r_A, r_A, k, ctx);
+               case 32:
+                       /* zero-extend 32 bits into 64 bits */
+                       emit(ARM_EOR_R(rd, rd, rd), ctx);
                        break;
-               case BPF_ANC | SKF_AD_ALU_XOR_X:
-               case BPF_ALU | BPF_XOR | BPF_X:
-                       /* A ^= X */
-                       update_on_xread(ctx);
-                       emit(ARM_EOR_R(r_A, r_A, r_X), ctx);
+               case 64:
+                       /* nop */
                        break;
-               case BPF_ALU | BPF_AND | BPF_K:
-                       /* A &= K */
-                       OP_IMM3(ARM_AND, r_A, r_A, k, ctx);
+               }
+exit:
+               if (dstk) {
+                       emit(ARM_STR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+               break;
+       /* dst = imm64 */
+       case BPF_LD | BPF_IMM | BPF_DW:
+       {
+               const struct bpf_insn insn1 = insn[1];
+               u32 hi, lo = imm;
+
+               if (insn1.code != 0 || insn1.src_reg != 0 ||
+                   insn1.dst_reg != 0 || insn1.off != 0) {
+                       /* Note: the verifier in the BPF core must catch
+                        * invalid instructions.
+                        */
+                       pr_err_once("Invalid BPF_LD_IMM64 instruction\n");
+                       return -EINVAL;
+               }
+               hi = insn1.imm;
+               emit_a32_mov_i(dst_lo, lo, dstk, ctx);
+               emit_a32_mov_i(dst_hi, hi, dstk, ctx);
+
+               return 1;
+       }
+       /* LDX: dst = *(size *)(src + off) */
+       case BPF_LDX | BPF_MEM | BPF_W:
+       case BPF_LDX | BPF_MEM | BPF_H:
+       case BPF_LDX | BPF_MEM | BPF_B:
+       case BPF_LDX | BPF_MEM | BPF_DW:
+               rn = sstk ? tmp2[1] : src_lo;
+               if (sstk)
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       /* Load a Word */
+               case BPF_H:
+                       /* Load a Half-Word */
+               case BPF_B:
+                       /* Load a Byte */
+                       emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_SIZE(code));
+                       emit_a32_mov_i(dst_hi, 0, dstk, ctx);
                        break;
-               case BPF_ALU | BPF_AND | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_AND_R(r_A, r_A, r_X), ctx);
+               case BPF_DW:
+                       /* Load a double word */
+                       emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_W);
+                       emit_ldx_r(dst_hi, rn, dstk, off+4, ctx, BPF_W);
                        break;
-               case BPF_ALU | BPF_LSH | BPF_K:
-                       if (unlikely(k > 31))
-                               return -1;
-                       emit(ARM_LSL_I(r_A, r_A, k), ctx);
+               }
+               break;
+       /* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
+       case BPF_LD | BPF_ABS | BPF_W:
+       case BPF_LD | BPF_ABS | BPF_H:
+       case BPF_LD | BPF_ABS | BPF_B:
+       /* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + src + imm)) */
+       case BPF_LD | BPF_IND | BPF_W:
+       case BPF_LD | BPF_IND | BPF_H:
+       case BPF_LD | BPF_IND | BPF_B:
+       {
+               const u8 r4 = bpf2a32[BPF_REG_6][1]; /* r4 = ptr to sk_buff */
+               const u8 r0 = bpf2a32[BPF_REG_0][1]; /* r0: struct sk_buff *skb */
+                                                    /* and return value */
+               const u8 r1 = bpf2a32[BPF_REG_0][0]; /* r1: int k */
+               const u8 r2 = bpf2a32[BPF_REG_1][1]; /* r2: unsigned int size */
+               const u8 r3 = bpf2a32[BPF_REG_1][0]; /* r3: void *buffer */
+               const u8 r6 = bpf2a32[TMP_REG_1][1]; /* r6: void *(*func)(..) */
+               int size;
+
+               /* Setting up first argument */
+               emit(ARM_MOV_R(r0, r4), ctx);
+
+               /* Setting up second argument */
+               emit_a32_mov_i(r1, imm, false, ctx);
+               if (BPF_MODE(code) == BPF_IND)
+                       emit_a32_alu_r(r1, src_lo, false, sstk, ctx,
+                                      false, false, BPF_ADD);
+
+               /* Setting up third argument */
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       size = 4;
                        break;
-               case BPF_ALU | BPF_LSH | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_LSL_R(r_A, r_A, r_X), ctx);
+               case BPF_H:
+                       size = 2;
                        break;
-               case BPF_ALU | BPF_RSH | BPF_K:
-                       if (unlikely(k > 31))
-                               return -1;
-                       if (k)
-                               emit(ARM_LSR_I(r_A, r_A, k), ctx);
+               case BPF_B:
+                       size = 1;
                        break;
-               case BPF_ALU | BPF_RSH | BPF_X:
-                       update_on_xread(ctx);
-                       emit(ARM_LSR_R(r_A, r_A, r_X), ctx);
+               default:
+                       return -EINVAL;
+               }
+               emit_a32_mov_i(r2, size, false, ctx);
+
+               /* Setting up fourth argument */
+               emit(ARM_ADD_I(r3, ARM_SP, imm8m(SKB_BUFFER)), ctx);
+
+               /* Setting up function pointer to call */
+               emit_a32_mov_i(r6, (unsigned int)bpf_load_pointer, false, ctx);
+               emit_blx_r(r6, ctx);
+
+               emit(ARM_EOR_R(r1, r1, r1), ctx);
+               /* Check whether the returned pointer is NULL.
+                * If it is NULL, jump to the epilogue;
+                * otherwise load the value from the returned address.
+                */
+               emit(ARM_CMP_I(r0, 0), ctx);
+               jmp_offset = epilogue_offset(ctx);
+               check_imm24(jmp_offset);
+               _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
+
+               /* Load value from the address */
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       emit(ARM_LDR_I(r0, r0, 0), ctx);
+#ifndef CONFIG_CPU_BIG_ENDIAN
+                       emit_rev32(r0, r0, ctx);
+#endif
                        break;
-               case BPF_ALU | BPF_NEG:
-                       /* A = -A */
-                       emit(ARM_RSB_I(r_A, r_A, 0), ctx);
+               case BPF_H:
+                       emit(ARM_LDRH_I(r0, r0, 0), ctx);
+#ifndef CONFIG_CPU_BIG_ENDIAN
+                       emit_rev16(r0, r0, ctx);
+#endif
                        break;
-               case BPF_JMP | BPF_JA:
-                       /* pc += K */
-                       emit(ARM_B(b_imm(i + k + 1, ctx)), ctx);
+               case BPF_B:
+                       emit(ARM_LDRB_I(r0, r0, 0), ctx);
+                       /* No need to reverse */
                        break;
-               case BPF_JMP | BPF_JEQ | BPF_K:
-                       /* pc += (A == K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_EQ;
-                       goto cmp_imm;
-               case BPF_JMP | BPF_JGT | BPF_K:
-                       /* pc += (A > K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_HI;
-                       goto cmp_imm;
-               case BPF_JMP | BPF_JGE | BPF_K:
-                       /* pc += (A >= K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_HS;
-cmp_imm:
-                       imm12 = imm8m(k);
-                       if (imm12 < 0) {
-                               emit_mov_i_no8m(r_scratch, k, ctx);
-                               emit(ARM_CMP_R(r_A, r_scratch), ctx);
-                       } else {
-                               emit(ARM_CMP_I(r_A, imm12), ctx);
-                       }
-cond_jump:
-                       if (inst->jt)
-                               _emit(condt, ARM_B(b_imm(i + inst->jt + 1,
-                                                  ctx)), ctx);
-                       if (inst->jf)
-                               _emit(condt ^ 1, ARM_B(b_imm(i + inst->jf + 1,
-                                                            ctx)), ctx);
+               }
+               break;
+       }
+       /* ST: *(size *)(dst + off) = imm */
+       case BPF_ST | BPF_MEM | BPF_W:
+       case BPF_ST | BPF_MEM | BPF_H:
+       case BPF_ST | BPF_MEM | BPF_B:
+       case BPF_ST | BPF_MEM | BPF_DW:
+               switch (BPF_SIZE(code)) {
+               case BPF_DW:
+                       /* Sign-extend immediate value into temp reg */
+                       emit_a32_mov_i64(true, tmp2, imm, false, ctx);
+                       emit_str_r(dst_lo, tmp2[1], dstk, off, ctx, BPF_W);
+                       emit_str_r(dst_lo, tmp2[0], dstk, off+4, ctx, BPF_W);
                        break;
-               case BPF_JMP | BPF_JEQ | BPF_X:
-                       /* pc += (A == X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_EQ;
-                       goto cmp_x;
-               case BPF_JMP | BPF_JGT | BPF_X:
-                       /* pc += (A > X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_HI;
-                       goto cmp_x;
-               case BPF_JMP | BPF_JGE | BPF_X:
-                       /* pc += (A >= X) ? pc->jt : pc->jf */
-                       condt   = ARM_COND_CS;
-cmp_x:
-                       update_on_xread(ctx);
-                       emit(ARM_CMP_R(r_A, r_X), ctx);
-                       goto cond_jump;
-               case BPF_JMP | BPF_JSET | BPF_K:
-                       /* pc += (A & K) ? pc->jt : pc->jf */
-                       condt  = ARM_COND_NE;
-                       /* not set iff all zeroes iff Z==1 iff EQ */
-
-                       imm12 = imm8m(k);
-                       if (imm12 < 0) {
-                               emit_mov_i_no8m(r_scratch, k, ctx);
-                               emit(ARM_TST_R(r_A, r_scratch), ctx);
-                       } else {
-                               emit(ARM_TST_I(r_A, imm12), ctx);
-                       }
-                       goto cond_jump;
-               case BPF_JMP | BPF_JSET | BPF_X:
-                       /* pc += (A & X) ? pc->jt : pc->jf */
-                       update_on_xread(ctx);
-                       condt  = ARM_COND_NE;
-                       emit(ARM_TST_R(r_A, r_X), ctx);
-                       goto cond_jump;
-               case BPF_RET | BPF_A:
-                       emit(ARM_MOV_R(ARM_R0, r_A), ctx);
-                       goto b_epilogue;
-               case BPF_RET | BPF_K:
-                       if ((k == 0) && (ctx->ret0_fp_idx < 0))
-                               ctx->ret0_fp_idx = i;
-                       emit_mov_i(ARM_R0, k, ctx);
-b_epilogue:
-                       if (i != ctx->skf->len - 1)
-                               emit(ARM_B(b_imm(prog->len, ctx)), ctx);
+               case BPF_W:
+               case BPF_H:
+               case BPF_B:
+                       emit_a32_mov_i(tmp2[1], imm, false, ctx);
+                       emit_str_r(dst_lo, tmp2[1], dstk, off, ctx,
+                                  BPF_SIZE(code));
                        break;
-               case BPF_MISC | BPF_TAX:
-                       /* X = A */
-                       ctx->seen |= SEEN_X;
-                       emit(ARM_MOV_R(r_X, r_A), ctx);
+               }
+               break;
+       /* STX XADD: lock *(u32 *)(dst + off) += src */
+       case BPF_STX | BPF_XADD | BPF_W:
+       /* STX XADD: lock *(u64 *)(dst + off) += src */
+       case BPF_STX | BPF_XADD | BPF_DW:
+               goto notyet;
+       /* STX: *(size *)(dst + off) = src */
+       case BPF_STX | BPF_MEM | BPF_W:
+       case BPF_STX | BPF_MEM | BPF_H:
+       case BPF_STX | BPF_MEM | BPF_B:
+       case BPF_STX | BPF_MEM | BPF_DW:
+       {
+               u8 sz = BPF_SIZE(code);
+
+               rn = sstk ? tmp2[1] : src_lo;
+               rm = sstk ? tmp2[0] : src_hi;
+               if (!sstk)
+                       goto do_store;
+               switch (BPF_SIZE(code)) {
+               case BPF_W:
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+               case BPF_H:
+                       emit(ARM_LDRH_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+               case BPF_B:
+                       emit(ARM_LDRB_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       goto empty_hi;
+empty_hi:
+                       emit(ARM_EOR_R(rm, rm, rm), ctx);
+                       break;
+               case BPF_DW:
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(src_hi)), ctx);
+                       sz = BPF_W;
                        break;
-               case BPF_MISC | BPF_TXA:
-                       /* A = X */
-                       update_on_xread(ctx);
-                       emit(ARM_MOV_R(r_A, r_X), ctx);
+               }
+
+do_store:
+               /* A double word is stored as two words: low, then high */
+               if (BPF_SIZE(code) == BPF_DW) {
+                       emit_str_r(dst_lo, rn, dstk, off, ctx, BPF_W);
+                       emit_str_r(dst_lo, rm, dstk, off+4, ctx, BPF_W);
+               } else {
+                       emit_str_r(dst_lo, rn, dstk, off, ctx, sz);
+               }
+               break;
+       }
+       /* PC += off if dst == src */
+       /* PC += off if dst > src */
+       /* PC += off if dst >= src */
+       /* PC += off if dst != src */
+       /* PC += off if dst > src (signed) */
+       /* PC += off if dst >= src (signed) */
+       /* PC += off if dst & src */
+       case BPF_JMP | BPF_JEQ | BPF_X:
+       case BPF_JMP | BPF_JGT | BPF_X:
+       case BPF_JMP | BPF_JGE | BPF_X:
+       case BPF_JMP | BPF_JNE | BPF_X:
+       case BPF_JMP | BPF_JSGT | BPF_X:
+       case BPF_JMP | BPF_JSGE | BPF_X:
+       case BPF_JMP | BPF_JSET | BPF_X:
+               /* Setup source registers */
+               rm = sstk ? tmp2[0] : src_hi;
+               rn = sstk ? tmp2[1] : src_lo;
+               if (sstk) {
+                       emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
+                       emit(ARM_LDR_I(rm, ARM_SP, STACK_VAR(src_hi)), ctx);
+               }
+               goto go_jmp;
+       /* PC += off if dst == imm */
+       /* PC += off if dst > imm */
+       /* PC += off if dst >= imm */
+       /* PC += off if dst != imm */
+       /* PC += off if dst > imm (signed) */
+       /* PC += off if dst >= imm (signed) */
+       /* PC += off if dst & imm */
+       case BPF_JMP | BPF_JEQ | BPF_K:
+       case BPF_JMP | BPF_JGT | BPF_K:
+       case BPF_JMP | BPF_JGE | BPF_K:
+       case BPF_JMP | BPF_JNE | BPF_K:
+       case BPF_JMP | BPF_JSGT | BPF_K:
+       case BPF_JMP | BPF_JSGE | BPF_K:
+       case BPF_JMP | BPF_JSET | BPF_K:
+               if (off == 0)
                        break;
-               case BPF_ANC | SKF_AD_PROTOCOL:
-                       /* A = ntohs(skb->protocol) */
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 protocol) != 2);
-                       off = offsetof(struct sk_buff, protocol);
-                       emit(ARM_LDRH_I(r_scratch, r_skb, off), ctx);
-                       emit_swap16(r_A, r_scratch, ctx);
+               rm = tmp2[0];
+               rn = tmp2[1];
+               /* Sign-extend immediate value */
+               emit_a32_mov_i64(true, tmp2, imm, false, ctx);
+go_jmp:
+               /* Setup destination register */
+               rd = dstk ? tmp[0] : dst_hi;
+               rt = dstk ? tmp[1] : dst_lo;
+               if (dstk) {
+                       emit(ARM_LDR_I(rt, ARM_SP, STACK_VAR(dst_lo)), ctx);
+                       emit(ARM_LDR_I(rd, ARM_SP, STACK_VAR(dst_hi)), ctx);
+               }
+
+               /* Check for the condition */
+               emit_ar_r(rd, rt, rm, rn, ctx, BPF_OP(code));
+
+               /* Setup JUMP instruction */
+               jmp_offset = bpf2a32_offset(i+off, i, ctx);
+               switch (BPF_OP(code)) {
+               case BPF_JNE:
+               case BPF_JSET:
+                       _emit(ARM_COND_NE, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_CPU:
-                       /* r_scratch = current_thread_info() */
-                       OP_IMM3(ARM_BIC, r_scratch, ARM_SP, THREAD_SIZE - 1, ctx);
-                       /* A = current_thread_info()->cpu */
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct thread_info, cpu) != 4);
-                       off = offsetof(struct thread_info, cpu);
-                       emit(ARM_LDR_I(r_A, r_scratch, off), ctx);
+               case BPF_JEQ:
+                       _emit(ARM_COND_EQ, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_IFINDEX:
-               case BPF_ANC | SKF_AD_HATYPE:
-                       /* A = skb->dev->ifindex */
-                       /* A = skb->dev->type */
-                       ctx->seen |= SEEN_SKB;
-                       off = offsetof(struct sk_buff, dev);
-                       emit(ARM_LDR_I(r_scratch, r_skb, off), ctx);
-
-                       emit(ARM_CMP_I(r_scratch, 0), ctx);
-                       emit_err_ret(ARM_COND_EQ, ctx);
-
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct net_device,
-                                                 ifindex) != 4);
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct net_device,
-                                                 type) != 2);
-
-                       if (code == (BPF_ANC | SKF_AD_IFINDEX)) {
-                               off = offsetof(struct net_device, ifindex);
-                               emit(ARM_LDR_I(r_A, r_scratch, off), ctx);
-                       } else {
-                               /*
-                                * offset of field "type" in "struct
-                                * net_device" is above what can be
-                                * used in the ldrh rd, [rn, #imm]
-                                * instruction, so load the offset in
-                                * a register and use ldrh rd, [rn, rm]
-                                */
-                               off = offsetof(struct net_device, type);
-                               emit_mov_i(ARM_R3, off, ctx);
-                               emit(ARM_LDRH_R(r_A, r_scratch, ARM_R3), ctx);
-                       }
+               case BPF_JGT:
+                       _emit(ARM_COND_HI, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_MARK:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, mark) != 4);
-                       off = offsetof(struct sk_buff, mark);
-                       emit(ARM_LDR_I(r_A, r_skb, off), ctx);
+               case BPF_JGE:
+                       _emit(ARM_COND_CS, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_RXHASH:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, hash) != 4);
-                       off = offsetof(struct sk_buff, hash);
-                       emit(ARM_LDR_I(r_A, r_skb, off), ctx);
+               case BPF_JSGT:
+                       _emit(ARM_COND_LT, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_VLAN_TAG:
-               case BPF_ANC | SKF_AD_VLAN_TAG_PRESENT:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff, vlan_tci) != 2);
-                       off = offsetof(struct sk_buff, vlan_tci);
-                       emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
-                       if (code == (BPF_ANC | SKF_AD_VLAN_TAG))
-                               OP_IMM3(ARM_AND, r_A, r_A, ~VLAN_TAG_PRESENT, ctx);
-                       else {
-                               OP_IMM3(ARM_LSR, r_A, r_A, 12, ctx);
-                               OP_IMM3(ARM_AND, r_A, r_A, 0x1, ctx);
-                       }
+               case BPF_JSGE:
+                       _emit(ARM_COND_GE, ARM_B(jmp_offset), ctx);
                        break;
-               case BPF_ANC | SKF_AD_PKTTYPE:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 __pkt_type_offset[0]) != 1);
-                       off = PKT_TYPE_OFFSET();
-                       emit(ARM_LDRB_I(r_A, r_skb, off), ctx);
-                       emit(ARM_AND_I(r_A, r_A, PKT_TYPE_MAX), ctx);
-#ifdef __BIG_ENDIAN_BITFIELD
-                       emit(ARM_LSR_I(r_A, r_A, 5), ctx);
-#endif
+               }
+               break;
+       /* JMP OFF */
+       case BPF_JMP | BPF_JA:
+       {
+               if (off == 0)
                        break;
-               case BPF_ANC | SKF_AD_QUEUE:
-                       ctx->seen |= SEEN_SKB;
-                       BUILD_BUG_ON(FIELD_SIZEOF(struct sk_buff,
-                                                 queue_mapping) != 2);
-                       BUILD_BUG_ON(offsetof(struct sk_buff,
-                                             queue_mapping) > 0xff);
-                       off = offsetof(struct sk_buff, queue_mapping);
-                       emit(ARM_LDRH_I(r_A, r_skb, off), ctx);
+               jmp_offset = bpf2a32_offset(i+off, i, ctx);
+               check_imm24(jmp_offset);
+               emit(ARM_B(jmp_offset), ctx);
+               break;
+       }
+       /* tail call */
+       case BPF_JMP | BPF_CALL | BPF_X:
+               if (emit_bpf_tail_call(ctx))
+                       return -EFAULT;
+               break;
+       /* function call */
+       case BPF_JMP | BPF_CALL:
+               goto notyet;
+       /* function return */
+       case BPF_JMP | BPF_EXIT:
+               /* Optimization: when last instruction is EXIT
+                * simply fallthrough to epilogue.
+                */
+               if (i == ctx->prog->len - 1)
                        break;
-               case BPF_ANC | SKF_AD_PAY_OFFSET:
-                       ctx->seen |= SEEN_SKB | SEEN_CALL;
+               jmp_offset = epilogue_offset(ctx);
+               check_imm24(jmp_offset);
+               emit(ARM_B(jmp_offset), ctx);
+               break;
+notyet:
+               pr_info_once("*** NOT YET: opcode %02x ***\n", code);
+               return -EFAULT;
+       default:
+               pr_err_once("unknown opcode %02x\n", code);
+               return -EINVAL;
+       }

-                       emit(ARM_MOV_R(ARM_R0, r_skb), ctx);
-                       emit_mov_i(ARM_R3, (unsigned int)skb_get_poff, ctx);
-                       emit_blx_r(ARM_R3, ctx);
-                       emit(ARM_MOV_R(r_A, ARM_R0), ctx);
-                       break;
-               case BPF_LDX | BPF_W | BPF_ABS:
-                       /*
-                        * load a 32bit word from struct seccomp_data.
-                        * seccomp_check_filter() will already have checked
-                        * that k is 32bit aligned and lies within the
-                        * struct seccomp_data.
-                        */
-                       ctx->seen |= SEEN_SKB;
-                       emit(ARM_LDR_I(r_A, r_skb, k), ctx);
-                       break;
-               default:
-                       return -1;
+       if (ctx->flags & FLAG_IMM_OVERFLOW)
+               /*
+                * this instruction generated an overflow when
+                * trying to access the literal pool, so
+                * delegate this filter to the kernel interpreter.
+                */
+               return -1;
+       return 0;
+}
+
+static int build_body(struct jit_ctx *ctx)
+{
+       const struct bpf_prog *prog = ctx->prog;
+       unsigned int i;
+
+       for (i = 0; i < prog->len; i++) {
+               const struct bpf_insn *insn = &(prog->insnsi[i]);
+               int ret;
+
+               emit(ARM_MOV_R(ARM_IP, ARM_PC), ctx);
+               ret = build_insn(insn, ctx);
+
+               /* ret > 0: a 64-bit immediate load used two insns; skip one. */
+               if (ret > 0) {
+                       i++;
+                       if (ctx->target == NULL)
+                               ctx->offsets[i] = ctx->idx;
+                       continue;
                }

-               if (ctx->flags & FLAG_IMM_OVERFLOW)
-                       /*
-                        * this instruction generated an overflow when
-                        * trying to access the literal pool, so
-                        * delegate this filter to the kernel interpreter.
-                        */
-                       return -1;
+               if (ctx->target == NULL)
+                       ctx->offsets[i] = ctx->idx;
+
+               /* If unsuccessful, return with error code */
+               if (ret)
+                       return ret;
        }
+       return 0;
+}

-       /* compute offsets only during the first pass */
-       if (ctx->target == NULL)
-               ctx->offsets[i] = ctx->idx * 4;
+static int validate_code(struct jit_ctx *ctx)
+{
+       int i;
+
+       for (i = 0; i < ctx->idx; i++) {
+               u32 a32_insn = le32_to_cpu(ctx->target[i]);
+
+               if (a32_insn == ARM_INST_UDF)
+                       return -1;
+       }

        return 0;
 }

+void bpf_jit_compile(struct bpf_prog *prog)
+{
+       /* Nothing to do here. We support Internal BPF. */
+}

-void bpf_jit_compile(struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
+       struct bpf_prog *tmp, *orig_prog = prog;
        struct bpf_binary_header *header;
+       bool tmp_blinded = false;
        struct jit_ctx ctx;
-       unsigned tmp_idx;
-       unsigned alloc_size;
-       u8 *target_ptr;
+       unsigned int tmp_idx;
+       unsigned int image_size;
+       u8 *image_ptr;

+       /* If BPF JIT was not enabled then we must fall back to
+        * the interpreter.
+        */
        if (!bpf_jit_enable)
-               return;
+               return orig_prog;

-       memset(&ctx, 0, sizeof(ctx));
-       ctx.skf         = fp;
-       ctx.ret0_fp_idx = -1;
+       /* If constant blinding was enabled and we failed during blinding
+        * then we must fall back to the interpreter. Otherwise, we save
+        * the new JITed code.
+        */
+       tmp = bpf_jit_blind_constants(prog);

-       ctx.offsets = kzalloc(4 * (ctx.skf->len + 1), GFP_KERNEL);
-       if (ctx.offsets == NULL)
-               return;
+       if (IS_ERR(tmp))
+               return orig_prog;
+       if (tmp != prog) {
+               tmp_blinded = true;
+               prog = tmp;
+       }
+
+       memset(&ctx, 0, sizeof(ctx));
+       ctx.prog = prog;

-       /* fake pass to fill in the ctx->seen */
-       if (unlikely(build_body(&ctx)))
+       /* If we cannot allocate memory for offsets[], we must
+        * fall back to the interpreter.
+        */
+       ctx.offsets = kcalloc(prog->len, sizeof(int), GFP_KERNEL);
+       if (ctx.offsets == NULL) {
+               prog = orig_prog;
                goto out;
+       }
+
+       /* 1) fake pass to find the length of the JITed code,
+        * to compute ctx->offsets and other context variables
+        * needed to compute final JITed code.
+        * Also, calculate random starting pointer/start of JITed code
+        * which is prefixed by random number of fault instructions.
+        *
+        * If the first pass fails then there is no chance of it
+        * being successful in the second pass, so just fall back
+        * to the interpreter.
+        */
+       if (build_body(&ctx)) {
+               prog = orig_prog;
+               goto out_off;
+       }

        tmp_idx = ctx.idx;
        build_prologue(&ctx);
        ctx.prologue_bytes = (ctx.idx - tmp_idx) * 4;

+       ctx.epilogue_offset = ctx.idx;
+
 #if __LINUX_ARM_ARCH__ < 7
        tmp_idx = ctx.idx;
        build_epilogue(&ctx);
@@ -1020,64 +1845,95 @@ void bpf_jit_compile(struct bpf_prog *fp)

        ctx.idx += ctx.imm_count;
        if (ctx.imm_count) {
-               ctx.imms = kzalloc(4 * ctx.imm_count, GFP_KERNEL);
-               if (ctx.imms == NULL)
-                       goto out;
+               ctx.imms = kcalloc(ctx.imm_count, sizeof(u32), GFP_KERNEL);
+               if (ctx.imms == NULL) {
+                       prog = orig_prog;
+                       goto out_off;
+               }
        }
 #else
-       /* there's nothing after the epilogue on ARMv7 */
+       /* there's nothing after the epilogue on ARMv7 */
        build_epilogue(&ctx);
 #endif
-       alloc_size = 4 * ctx.idx;
-       header = bpf_jit_binary_alloc(alloc_size, &target_ptr,
-                                     4, jit_fill_hole);
-       if (header == NULL)
-               goto out;
+       /* Now we can get the actual image size of the JITed arm code.
+        * Currently, we are not considering the THUMB-2 instructions
+        * for jit, although it can decrease the size of the image.
+        *
+        * As each arm instruction is 32 bits long, we translate the
+        * number of JITed instructions into the size required to store
+        * the JITed code.
+        */
+       image_size = sizeof(u32) * ctx.idx;

-       ctx.target = (u32 *) target_ptr;
+       /* Now we know the size of the structure to make */
+       header = bpf_jit_binary_alloc(image_size, &image_ptr,
+                                     sizeof(u32), jit_fill_hole);
+       /* If we cannot allocate memory for the structure, we must
+        * fall back to the interpreter.
+        */
+       if (header == NULL) {
+               prog = orig_prog;
+               goto out_imms;
+       }
+
+       /* 2.) Actual pass to generate final JIT code */
+       ctx.target = (u32 *) image_ptr;
        ctx.idx = 0;

        build_prologue(&ctx);
+
+       /* If building the body of the JITed code fails somehow,
+        * we fall back to the interpreter.
+        */
        if (build_body(&ctx) < 0) {
-#if __LINUX_ARM_ARCH__ < 7
-               if (ctx.imm_count)
-                       kfree(ctx.imms);
-#endif
+               image_ptr = NULL;
                bpf_jit_binary_free(header);
-               goto out;
+               prog = orig_prog;
+               goto out_imms;
        }
        build_epilogue(&ctx);

+       /* 3.) Extra pass to validate JITed Code */
+       if (validate_code(&ctx)) {
+               image_ptr = NULL;
+               bpf_jit_binary_free(header);
+               prog = orig_prog;
+               goto out_imms;
+       }
        flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));

-#if __LINUX_ARM_ARCH__ < 7
-       if (ctx.imm_count)
-               kfree(ctx.imms);
-#endif
-
        if (bpf_jit_enable > 1)
                /* there are 2 passes here */
-               bpf_jit_dump(fp->len, alloc_size, 2, ctx.target);
+               bpf_jit_dump(prog->len, image_size, 2, ctx.target);

        set_memory_ro((unsigned long)header, header->pages);
-       fp->bpf_func = (void *)ctx.target;
-       fp->jited = 1;
-out:
+       prog->bpf_func = (void *)ctx.target;
+       prog->jited = 1;
+out_imms:
+#if __LINUX_ARM_ARCH__ < 7
+       if (ctx.imm_count)
+               kfree(ctx.imms);
+#endif
+out_off:
        kfree(ctx.offsets);
-       return;
+out:
+       if (tmp_blinded)
+               bpf_jit_prog_release_other(prog, prog == orig_prog ?
+                                          tmp : orig_prog);
+       return prog;
 }

-void bpf_jit_free(struct bpf_prog *fp)
+void bpf_jit_free(struct bpf_prog *prog)
 {
-       unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
+       unsigned long addr = (unsigned long)prog->bpf_func & PAGE_MASK;
        struct bpf_binary_header *header = (void *)addr;

-       if (!fp->jited)
+       if (!prog->jited)
                goto free_filter;

        set_memory_rw(addr, header->pages);
        bpf_jit_binary_free(header);

 free_filter:
-       bpf_prog_unlock_free(fp);
+       bpf_prog_unlock_free(prog);
 }
diff --git a/arch/arm/net/bpf_jit_32.h b/arch/arm/net/bpf_jit_32.h
index c46fca2..d5cf5f6 100644
--- a/arch/arm/net/bpf_jit_32.h
+++ b/arch/arm/net/bpf_jit_32.h
@@ -11,6 +11,7 @@
 #ifndef PFILTER_OPCODES_ARM_H
 #define PFILTER_OPCODES_ARM_H

+/* ARM 32bit Registers */
 #define ARM_R0 0
 #define ARM_R1 1
 #define ARM_R2 2
@@ -22,38 +23,43 @@
 #define ARM_R8 8
 #define ARM_R9 9
 #define ARM_R10        10
-#define ARM_FP 11
-#define ARM_IP 12
-#define ARM_SP 13
-#define ARM_LR 14
-#define ARM_PC 15
-
-#define ARM_COND_EQ            0x0
-#define ARM_COND_NE            0x1
-#define ARM_COND_CS            0x2
+#define ARM_FP 11      /* Frame Pointer */
+#define ARM_IP 12      /* Intra-procedure scratch register */
+#define ARM_SP 13      /* Stack pointer: as load/store base reg */
+#define ARM_LR 14      /* Link Register */
+#define ARM_PC 15      /* Program counter */
+
+#define ARM_COND_EQ            0x0     /* == */
+#define ARM_COND_NE            0x1     /* != */
+#define ARM_COND_CS            0x2     /* unsigned >= */
 #define ARM_COND_HS            ARM_COND_CS
-#define ARM_COND_CC            0x3
+#define ARM_COND_CC            0x3     /* unsigned < */
 #define ARM_COND_LO            ARM_COND_CC
-#define ARM_COND_MI            0x4
-#define ARM_COND_PL            0x5
-#define ARM_COND_VS            0x6
-#define ARM_COND_VC            0x7
-#define ARM_COND_HI            0x8
-#define ARM_COND_LS            0x9
-#define ARM_COND_GE            0xa
-#define ARM_COND_LT            0xb
-#define ARM_COND_GT            0xc
-#define ARM_COND_LE            0xd
-#define ARM_COND_AL            0xe
+#define ARM_COND_MI            0x4     /* < 0 */
+#define ARM_COND_PL            0x5     /* >= 0 */
+#define ARM_COND_VS            0x6     /* Signed Overflow */
+#define ARM_COND_VC            0x7     /* No Signed Overflow */
+#define ARM_COND_HI            0x8     /* unsigned > */
+#define ARM_COND_LS            0x9     /* unsigned <= */
+#define ARM_COND_GE            0xa     /* Signed >= */
+#define ARM_COND_LT            0xb     /* Signed < */
+#define ARM_COND_GT            0xc     /* Signed > */
+#define ARM_COND_LE            0xd     /* Signed <= */
+#define ARM_COND_AL            0xe     /* None */

 /* register shift types */
 #define SRTYPE_LSL             0
 #define SRTYPE_LSR             1
 #define SRTYPE_ASR             2
 #define SRTYPE_ROR             3
+#define SRTYPE_ASL             (SRTYPE_LSL)

 #define ARM_INST_ADD_R         0x00800000
+#define ARM_INST_ADDS_R                0x00900000
+#define ARM_INST_ADC_R         0x00a00000
+#define ARM_INST_ADC_I         0x02a00000
 #define ARM_INST_ADD_I         0x02800000
+#define ARM_INST_ADDS_I                0x02900000

 #define ARM_INST_AND_R         0x00000000
 #define ARM_INST_AND_I         0x02000000
@@ -76,8 +82,10 @@
 #define ARM_INST_LDRH_I                0x01d000b0
 #define ARM_INST_LDRH_R                0x019000b0
 #define ARM_INST_LDR_I         0x05900000
+#define ARM_INST_LDR_R         0x07900000

 #define ARM_INST_LDM           0x08900000
+#define ARM_INST_LDM_IA                0x08b00000

 #define ARM_INST_LSL_I         0x01a00000
 #define ARM_INST_LSL_R         0x01a00010
@@ -86,6 +94,7 @@
 #define ARM_INST_LSR_R         0x01a00030

 #define ARM_INST_MOV_R         0x01a00000
+#define ARM_INST_MOVS_R                0x01b00000
 #define ARM_INST_MOV_I         0x03a00000
 #define ARM_INST_MOVW          0x03000000
 #define ARM_INST_MOVT          0x03400000
@@ -96,17 +105,28 @@
 #define ARM_INST_PUSH          0x092d0000

 #define ARM_INST_ORR_R         0x01800000
+#define ARM_INST_ORRS_R                0x01900000
 #define ARM_INST_ORR_I         0x03800000

 #define ARM_INST_REV           0x06bf0f30
 #define ARM_INST_REV16         0x06bf0fb0

 #define ARM_INST_RSB_I         0x02600000
+#define ARM_INST_RSBS_I                0x02700000
+#define ARM_INST_RSC_I         0x02e00000

 #define ARM_INST_SUB_R         0x00400000
+#define ARM_INST_SUBS_R                0x00500000
+#define ARM_INST_RSB_R         0x00600000
 #define ARM_INST_SUB_I         0x02400000
+#define ARM_INST_SUBS_I                0x02500000
+#define ARM_INST_SBC_I         0x02c00000
+#define ARM_INST_SBC_R         0x00c00000
+#define ARM_INST_SBCS_R                0x00d00000

 #define ARM_INST_STR_I         0x05800000
+#define ARM_INST_STRB_I                0x05c00000
+#define ARM_INST_STRH_I                0x01c000b0

 #define ARM_INST_TST_R         0x01100000
 #define ARM_INST_TST_I         0x03100000
@@ -117,6 +137,8 @@

 #define ARM_INST_MLS           0x00600090

+#define ARM_INST_UXTH          0x06ff0070
+
 /*
  * Use a suitable undefined instruction to use for ARM/Thumb2 faulting.
  * We need to be careful not to conflict with those used by other modules
@@ -135,9 +157,15 @@
 #define _AL3_R(op, rd, rn, rm) ((op ## _R) | (rd) << 12 | (rn) << 16 | (rm))
 /* immediate */
 #define _AL3_I(op, rd, rn, imm)        ((op ## _I) | (rd) << 12 | (rn) << 16 | (imm))
+/* register with register-shift */
+#define _AL3_SR(inst)  (inst | (1 << 4))

 #define ARM_ADD_R(rd, rn, rm)  _AL3_R(ARM_INST_ADD, rd, rn, rm)
+#define ARM_ADDS_R(rd, rn, rm) _AL3_R(ARM_INST_ADDS, rd, rn, rm)
 #define ARM_ADD_I(rd, rn, imm) _AL3_I(ARM_INST_ADD, rd, rn, imm)
+#define ARM_ADDS_I(rd, rn, imm)        _AL3_I(ARM_INST_ADDS, rd, rn, imm)
+#define ARM_ADC_R(rd, rn, rm)  _AL3_R(ARM_INST_ADC, rd, rn, rm)
+#define ARM_ADC_I(rd, rn, imm) _AL3_I(ARM_INST_ADC, rd, rn, imm)

 #define ARM_AND_R(rd, rn, rm)  _AL3_R(ARM_INST_AND, rd, rn, rm)
 #define ARM_AND_I(rd, rn, imm) _AL3_I(ARM_INST_AND, rd, rn, imm)
@@ -156,7 +184,9 @@
 #define ARM_EOR_I(rd, rn, imm) _AL3_I(ARM_INST_EOR, rd, rn, imm)

 #define ARM_LDR_I(rt, rn, off) (ARM_INST_LDR_I | (rt) << 12 | (rn) << 16 \
-                                | (off))
+                                | ((off) & 0xfff))
+#define ARM_LDR_R(rt, rn, rm)  (ARM_INST_LDR_R | (rt) << 12 | (rn) << 16 \
+                                | (rm))
 #define ARM_LDRB_I(rt, rn, off)        (ARM_INST_LDRB_I | (rt) << 12 | (rn) << 16 \
                                 | (off))
 #define ARM_LDRB_R(rt, rn, rm) (ARM_INST_LDRB_R | (rt) << 12 | (rn) << 16 \
@@ -167,15 +197,23 @@
                                 | (rm))

 #define ARM_LDM(rn, regs)      (ARM_INST_LDM | (rn) << 16 | (regs))
+#define ARM_LDM_IA(rn, regs)   (ARM_INST_LDM_IA | (rn) << 16 | (regs))

 #define ARM_LSL_R(rd, rn, rm)  (_AL3_R(ARM_INST_LSL, rd, 0, rn) | (rm) << 8)
 #define ARM_LSL_I(rd, rn, imm) (_AL3_I(ARM_INST_LSL, rd, 0, rn) | (imm) << 7)

 #define ARM_LSR_R(rd, rn, rm)  (_AL3_R(ARM_INST_LSR, rd, 0, rn) | (rm) << 8)
 #define ARM_LSR_I(rd, rn, imm) (_AL3_I(ARM_INST_LSR, rd, 0, rn) | (imm) << 7)
+#define ARM_ASR_R(rd, rn, rm)   (_AL3_R(ARM_INST_ASR, rd, 0, rn) | (rm) << 8)
+#define ARM_ASR_I(rd, rn, imm)  (_AL3_I(ARM_INST_ASR, rd, 0, rn) | (imm) << 7)

 #define ARM_MOV_R(rd, rm)      _AL3_R(ARM_INST_MOV, rd, 0, rm)
+#define ARM_MOVS_R(rd, rm)     _AL3_R(ARM_INST_MOVS, rd, 0, rm)
 #define ARM_MOV_I(rd, imm)     _AL3_I(ARM_INST_MOV, rd, 0, imm)
+#define ARM_MOV_SR(rd, rm, type, rs)   \
+       (_AL3_SR(ARM_MOV_R(rd, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_MOV_SI(rd, rm, type, imm6) \
+       (ARM_MOV_R(rd, rm) | (type) << 5 | (imm6) << 7)

 #define ARM_MOVW(rd, imm)      \
        (ARM_INST_MOVW | ((imm) >> 12) << 16 | (rd) << 12 | ((imm) & 0x0fff))
@@ -190,19 +228,38 @@

 #define ARM_ORR_R(rd, rn, rm)  _AL3_R(ARM_INST_ORR, rd, rn, rm)
 #define ARM_ORR_I(rd, rn, imm) _AL3_I(ARM_INST_ORR, rd, rn, imm)
-#define ARM_ORR_S(rd, rn, rm, type, rs)        \
-       (ARM_ORR_R(rd, rn, rm) | (type) << 5 | (rs) << 7)
+#define ARM_ORR_SR(rd, rn, rm, type, rs)       \
+       (_AL3_SR(ARM_ORR_R(rd, rn, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_ORRS_R(rd, rn, rm) _AL3_R(ARM_INST_ORRS, rd, rn, rm)
+#define ARM_ORRS_SR(rd, rn, rm, type, rs)      \
+       (_AL3_SR(ARM_ORRS_R(rd, rn, rm)) | (type) << 5 | (rs) << 8)
+#define ARM_ORR_SI(rd, rn, rm, type, imm6)     \
+       (ARM_ORR_R(rd, rn, rm) | (type) << 5 | (imm6) << 7)
+#define ARM_ORRS_SI(rd, rn, rm, type, imm6)    \
+       (ARM_ORRS_R(rd, rn, rm) | (type) << 5 | (imm6) << 7)

 #define ARM_REV(rd, rm)                (ARM_INST_REV | (rd) << 12 | (rm))
 #define ARM_REV16(rd, rm)      (ARM_INST_REV16 | (rd) << 12 | (rm))

 #define ARM_RSB_I(rd, rn, imm) _AL3_I(ARM_INST_RSB, rd, rn, imm)
+#define ARM_RSBS_I(rd, rn, imm)        _AL3_I(ARM_INST_RSBS, rd, rn, imm)
+#define ARM_RSC_I(rd, rn, imm) _AL3_I(ARM_INST_RSC, rd, rn, imm)

 #define ARM_SUB_R(rd, rn, rm)  _AL3_R(ARM_INST_SUB, rd, rn, rm)
+#define ARM_SUBS_R(rd, rn, rm) _AL3_R(ARM_INST_SUBS, rd, rn, rm)
+#define ARM_RSB_R(rd, rn, rm)  _AL3_R(ARM_INST_RSB, rd, rn, rm)
+#define ARM_SBC_R(rd, rn, rm)  _AL3_R(ARM_INST_SBC, rd, rn, rm)
+#define ARM_SBCS_R(rd, rn, rm) _AL3_R(ARM_INST_SBCS, rd, rn, rm)
 #define ARM_SUB_I(rd, rn, imm) _AL3_I(ARM_INST_SUB, rd, rn, imm)
+#define ARM_SUBS_I(rd, rn, imm)        _AL3_I(ARM_INST_SUBS, rd, rn, imm)
+#define ARM_SBC_I(rd, rn, imm) _AL3_I(ARM_INST_SBC, rd, rn, imm)

 #define ARM_STR_I(rt, rn, off) (ARM_INST_STR_I | (rt) << 12 | (rn) << 16 \
-                                | (off))
+                                | ((off) & 0xfff))
+#define ARM_STRH_I(rt, rn, off)        (ARM_INST_STRH_I | (rt) << 12 | (rn) << 16 \
+                                | (((off) & 0xf0) << 4) | ((off) & 0xf))
+#define ARM_STRB_I(rt, rn, off)        (ARM_INST_STRB_I | (rt) << 12 | (rn) << 16 \
+                                | (((off) & 0xf0) << 4) | ((off) & 0xf))

 #define ARM_TST_R(rn, rm)      _AL3_R(ARM_INST_TST, 0, rn, rm)
 #define ARM_TST_I(rn, imm)     _AL3_I(ARM_INST_TST, 0, rn, imm)
@@ -214,5 +271,6 @@

 #define ARM_MLS(rd, rn, rm, ra)        (ARM_INST_MLS | (rd) << 16 | (rn) | (rm) << 8 \
                                 | (ra) << 12)
+#define ARM_UXTH(rd, rm)       (ARM_INST_UXTH | (rd) << 12 | (rm))

 #endif /* PFILTER_OPCODES_ARM_H */
--
2.7.4
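
A note on the new flag-setting opcodes in the header: the sketch below is
one way to read them, assuming (as the {hi, lo} register pairs in bpf2a32
suggest) that a 64-bit eBPF add is emitted as ADDS on the low words
followed by ADC on the high words. The opcode values and the _AL3_R
builder are copied from the patch; with_cond() and emit_add64() are
hypothetical helpers written for this illustration and are not taken from
bpf_jit_32.c.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode templates and builders copied from the header changes above. */
#define ARM_COND_AL     0xe
#define ARM_INST_ADD_R  0x00800000
#define ARM_INST_ADDS_R 0x00900000
#define ARM_INST_ADC_R  0x00a00000
#define _AL3_R(op, rd, rn, rm) ((op ## _R) | (rd) << 12 | (rn) << 16 | (rm))
#define ARM_ADD_R(rd, rn, rm)  _AL3_R(ARM_INST_ADD, rd, rn, rm)
#define ARM_ADDS_R(rd, rn, rm) _AL3_R(ARM_INST_ADDS, rd, rn, rm)
#define ARM_ADC_R(rd, rn, rm)  _AL3_R(ARM_INST_ADC, rd, rn, rm)

/* Hypothetical helper: place the condition code in bits 31:28. */
static uint32_t with_cond(uint32_t cond, uint32_t inst)
{
	assert(cond <= 0xf);
	return (cond << 28) | inst;
}

/*
 * Hypothetical helper: build the two words for a 64-bit "dst += src",
 * where each operand lives in a {hi, lo} pair of 32-bit registers.
 * ADDS sets the carry flag from the low words; ADC adds it into the
 * high words.
 */
static void emit_add64(uint32_t out[2], int dst_hi, int dst_lo,
		       int src_hi, int src_lo)
{
	out[0] = with_cond(ARM_COND_AL, ARM_ADDS_R(dst_lo, dst_lo, src_lo));
	out[1] = with_cond(ARM_COND_AL, ARM_ADC_R(dst_hi, dst_hi, src_hi));
}
```

With ARM_COND_AL, ARM_ADD_R(0, 0, 6) comes out as 0xe0800006, which is the
little-endian word "06 00 80 e0" visible in the JIT dump elsewhere in this
thread.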
Best,
Shubham Bansal


On Tue, May 23, 2017 at 11:05 AM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 22, 2017 at 10:03 PM, Shubham Bansal
> <illusionist.neo@gmail.com> wrote:
>> On Tue, May 23, 2017 at 9:52 AM, Kees Cook <keescook@chromium.org> wrote:
>>> On Mon, May 22, 2017 at 8:34 PM, Shubham Bansal
>>> <illusionist.neo@gmail.com> wrote:
>>>> I will post them as soon as I have tested them on ARMv5 and ARMv6. If you
>>>> can help me with that, please let me know.
>>>
>>> Please post what you have: it would be better to see what you've got
>>> now in case additional changes are needed so you don't have to do it
>>> again on v5 and v6. Also, it means other people with real v5 and v6
>>> hardware could test for you if they were so inclined, and you won't
>>> need to be blocked on doing the tests in qemu.
>>>
>>> You can send it as an "RFC" in the subject, just to make sure people
>>> know it's not considered fully done. :)
>>
>> I already have ARMv5 and ARMv6 code written. I just haven't tested it
>> yet. Should I send the patch with those as well?
>
> Sure, just to have a version up for people to examine. If there are
> bugs, that's fine, we'll iron them out.
>
> -Kees
>
> --
> Kees Cook
> Pixel Security

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-23 18:39                                 ` [kernel-hardening] " Shubham Bansal
  (?)
@ 2017-05-23 19:32                                   ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-05-23 19:32 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Florian Fainelli, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, David Miller,
	linux-arm-kernel, Nicolas Schichan, andrew

On Tue, May 23, 2017 at 11:39 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Here is the patch I sent to the arm mailing list.

Thanks for sending this!

I'm trying to figure out the best way to split this up, but the bulk
of it (the actual JIT changes) seems like it needs to land all at the
same time. Any thoughts on this, Daniel?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-09 20:12                           ` Shubham Bansal
@ 2017-05-09 20:25                             ` Daniel Borkmann
  -1 siblings, 0 replies; 99+ messages in thread
From: Daniel Borkmann @ 2017-05-09 20:25 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

On 05/09/2017 10:12 PM, Shubham Bansal wrote:
> Hi Daniel,
>
> I just tried running test_bpf.ko module.
>
> $ echo 2 >>  /proc/sys/net/core/bpf_jit_enable
> $ insmod test_bpf.ko
>
> test_bpf: #0 TAX
> bpf_jit: flen=14 proglen=212 pass=2 image=7f15a83c from=insmod pid=730
> JIT code: 00000000: f0 05 2d e9 40 d2 4d e2 00 40 a0 e3 0c 42 8d e5
> JIT code: 00000010: 08 42 8d e5 00 00 20 e0 01 10 21 e0 20 62 9d e5
> JIT code: 00000020: 20 72 9d e5 06 70 27 e0 20 72 8d e5 24 62 9d e5
> JIT code: 00000030: 24 72 9d e5 06 70 27 e0 24 72 8d e5 00 40 a0 e1
> JIT code: 00000040: 01 50 a0 e1 01 00 a0 e3 00 10 a0 e3 20 02 8d e5
> JIT code: 00000050: 24 12 8d e5 02 00 a0 e3 00 10 a0 e3 20 62 9d e5
> JIT code: 00000060: 06 00 80 e0 00 10 a0 e3 00 00 60 e2 00 10 a0 e3
> JIT code: 00000070: 20 02 8d e5 24 12 8d e5 54 40 90 e5 20 62 9d e5
> JIT code: 00000080: 06 00 80 e0 00 10 a0 e3 20 02 8d e5 24 12 8d e5
> JIT code: 00000090: 04 00 a0 e1 01 10 a0 e3 20 62 9d e5 06 10 81 e0
> JIT code: 000000a0: 01 20 a0 e3 04 32 8d e2 bc 68 0a e3 11 60 48 e3
> JIT code: 000000b0: 36 ff 2f e1 01 10 21 e0 00 00 50 e3 04 00 00 0a
> JIT code: 000000c0: 00 00 d0 e5 01 00 00 ea 40 d2 8d e2 f0 05 bd e8
> JIT code: 000000d0: 1e ff 2f e1
> jited:1
> Unhandled fault: page domain fault (0x01b) at 0x00000051
> pgd = 871d0000
> [00000051] *pgd=671b7831, *pte=00000000, *ppte=00000000
> Internal error: : 1b [#1] SMP ARM
> Modules linked in: test_bpf(+)
> CPU: 0 PID: 730 Comm: insmod Not tainted 4.11.0+ #5
> Hardware name: ARM-Versatile Express
> task: 87023700 task.stack: 8718a000
> PC is at 0x7f15a8b4
> LR is at test_bpf_init+0x5bc/0x1000 [test_bpf]
> pc : [<7f15a8b4>]    lr : [<7f1575bc>]    psr: 80000013
> sp : 8718bd7c  ip : 00000015  fp : 7f005008
> r10: 7f005094  r9 : 893ba020  r8 : 893ba000
> r7 : 00000000  r6 : 00000001  r5 : 00000000  r4 : 00000000
> r3 : 7f15a83c  r2 : 893ba020  r1 : 00000000  r0 : fffffffd
> Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 671d0059  DAC: 00000051
> Process insmod (pid: 730, stack limit = 0x8718a210)
> Stack: (0x8718bd7c to 0x8718c000)
> bd60:                                                                00000000
> bd80: 00002710 870db300 c302e7e8 7f004010 893ba000 7f005094 00000000 00000000
> bda0: 00000000 00000000 00000000 00000001 00000001 00000000 014000c0 00150628
> bdc0: 7f0050ac 7f154840 1234aaaa 1234aaab c302e7e8 0000000f 00000000 893ba000
> bde0: 0000000b 7f004010 87fd54a0 ffffe000 7f157000 00000000 871b6fc0 00000001
> be00: 78e4905c 00000024 7f154640 8010179c 80a06544 8718a000 00000001 80a54980
> be20: 80a3066c 00000007 809685c0 80a54700 80a54700 07551000 80a54700 60070013
> be40: 7f154640 801f3fc8 78e4905c 7f154640 00000001 871b6fe4 7f154640 00000001
> be60: 871b6b00 00000001 78e4905c 801eaa94 00000001 871b6fe4 8718bf44 00000001
> be80: 871b6fe4 80196e4c 7f15464c 00007fff 7f154640 80193f10 87127000 7f154640
> bea0: 7f154688 80703800 7f154770 807037e4 8081b184 807bec60 807becc4 807bec6c
> bec0: 7f15481c 8010c1b8 93600000 76ed8028 00000f60 00000000 00000000 00000000
> bee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> bf00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00003f80
> bf20: 76f5cf88 00000000 93684f80 8718a000 00160fda 00000051 00000000 801973b0
> bf40: 87671a00 93501000 00183f80 93684760 93684574 936788e0 00155000 00155290
> bf60: 00000000 00000000 00000000 00001f64 00000032 00000033 0000001d 00000000
> bf80: 00000017 00000000 00000000 00183f80 756e694c 00000080 80107684 fffffffd
> bfa0: 00000000 801074c0 00000000 00183f80 76dd9008 00183f80 00160fda 00000000
> bfc0: 00000000 00183f80 756e694c 00000080 00000001 7eabae2c 00172f8c 00000000
> bfe0: 7eabaae0 7eabaad0 0004017f 00013172 60070030 76dd9008 00000000 00000000
> [<7f1575bc>] (test_bpf_init [test_bpf]) from [<7f157000>] (test_bpf_init+0x0/0x1000 [test_bpf])
> [<7f157000>] (test_bpf_init [test_bpf]) from [<78e4905c>] (0x78e4905c)
> Code: e2600000 e3a01000 e58d0220 e58d1224 (e5904054)
> ---[ end trace a36398923b914fe2 ]---
> Segmentation fault
>
> Why is it trying to execute TAX, which is a cBPF instruction?

The kernel translates this to eBPF internally (bpf_prepare_filter() ->
bpf_migrate_filter()), so the JIT never sees cBPF directly.

Is your implementation still using the bpf_jit_compile() callback as
opposed to bpf_int_jit_compile()?!
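
To make the translation step concrete, here is a minimal sketch of what
happens to the TAX instruction from test #0. It is a simplified
illustration, not the kernel's conversion code: the structs are cut-down
stand-ins for struct sock_filter and struct bpf_insn, and convert_one() is
a hypothetical name. The opcode values and the A -> R0, X -> R7 mapping
follow the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode bits as defined in the BPF uapi headers. */
#define BPF_MISC   0x07	/* cBPF instruction class */
#define BPF_TAX    0x00	/* cBPF: X = A */
#define BPF_ALU64  0x07	/* eBPF instruction class */
#define BPF_MOV    0xb0
#define BPF_X      0x08	/* source operand is a register */

/* cBPF's A and X live in eBPF R0 and R7 (kernel-internal convention). */
enum { REG_A = 0, REG_X = 7 };

/* Simplified stand-ins for struct sock_filter and struct bpf_insn. */
struct cbpf_insn { uint16_t code; uint8_t jt, jf; uint32_t k; };
struct ebpf_insn { uint8_t code; uint8_t dst_reg, src_reg; int16_t off; int32_t imm; };

/* Translate one cBPF instruction; returns 0 on success, -1 if unhandled. */
static int convert_one(const struct cbpf_insn *in, struct ebpf_insn *out)
{
	assert(in && out);
	if (in->code == (BPF_MISC | BPF_TAX)) {
		/* TAX becomes a 64-bit register move: X = A */
		out->code = BPF_ALU64 | BPF_MOV | BPF_X;
		out->dst_reg = REG_X;
		out->src_reg = REG_A;
		out->off = 0;
		out->imm = 0;
		return 0;
	}
	return -1;	/* every other opcode is out of scope for this sketch */
}
```

For BPF_MISC | BPF_TAX this yields opcode 0xbf with dst_reg 7 and src_reg
0, so what reaches the JIT is an ordinary eBPF register move.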

Cheers,
Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-09 20:12                           ` Shubham Bansal
@ 2017-05-09 20:19                             ` David Miller
  -1 siblings, 0 replies; 99+ messages in thread
From: David Miller @ 2017-05-09 20:19 UTC (permalink / raw)
  To: illusionist.neo
  Cc: daniel, keescook, mgherzan, netdev, kernel-hardening,
	linux-arm-kernel, ast

From: Shubham Bansal <illusionist.neo@gmail.com>
Date: Wed, 10 May 2017 01:42:10 +0530

> Why is it trying to execute TAX, which is a cBPF instruction?

Because some of the tests are classic BPF programs which
get translated into eBPF ones and sent to the JIT for
compilation.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-04-06 12:51                         ` Daniel Borkmann
@ 2017-05-09 20:12                           ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-09 20:12 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel,

I just tried running test_bpf.ko module.

$ echo 2 >>  /proc/sys/net/core/bpf_jit_enable
$ insmod test_bpf.ko

test_bpf: #0 TAX
bpf_jit: flen=14 proglen=212 pass=2 image=7f15a83c from=insmod pid=730
JIT code: 00000000: f0 05 2d e9 40 d2 4d e2 00 40 a0 e3 0c 42 8d e5
JIT code: 00000010: 08 42 8d e5 00 00 20 e0 01 10 21 e0 20 62 9d e5
JIT code: 00000020: 20 72 9d e5 06 70 27 e0 20 72 8d e5 24 62 9d e5
JIT code: 00000030: 24 72 9d e5 06 70 27 e0 24 72 8d e5 00 40 a0 e1
JIT code: 00000040: 01 50 a0 e1 01 00 a0 e3 00 10 a0 e3 20 02 8d e5
JIT code: 00000050: 24 12 8d e5 02 00 a0 e3 00 10 a0 e3 20 62 9d e5
JIT code: 00000060: 06 00 80 e0 00 10 a0 e3 00 00 60 e2 00 10 a0 e3
JIT code: 00000070: 20 02 8d e5 24 12 8d e5 54 40 90 e5 20 62 9d e5
JIT code: 00000080: 06 00 80 e0 00 10 a0 e3 20 02 8d e5 24 12 8d e5
JIT code: 00000090: 04 00 a0 e1 01 10 a0 e3 20 62 9d e5 06 10 81 e0
JIT code: 000000a0: 01 20 a0 e3 04 32 8d e2 bc 68 0a e3 11 60 48 e3
JIT code: 000000b0: 36 ff 2f e1 01 10 21 e0 00 00 50 e3 04 00 00 0a
JIT code: 000000c0: 00 00 d0 e5 01 00 00 ea 40 d2 8d e2 f0 05 bd e8
JIT code: 000000d0: 1e ff 2f e1
jited:1
Unhandled fault: page domain fault (0x01b) at 0x00000051
pgd = 871d0000
[00000051] *pgd=671b7831, *pte=00000000, *ppte=00000000
Internal error: : 1b [#1] SMP ARM
Modules linked in: test_bpf(+)
CPU: 0 PID: 730 Comm: insmod Not tainted 4.11.0+ #5
Hardware name: ARM-Versatile Express
task: 87023700 task.stack: 8718a000
PC is at 0x7f15a8b4
LR is at test_bpf_init+0x5bc/0x1000 [test_bpf]
pc : [<7f15a8b4>]    lr : [<7f1575bc>]    psr: 80000013
sp : 8718bd7c  ip : 00000015  fp : 7f005008
r10: 7f005094  r9 : 893ba020  r8 : 893ba000
r7 : 00000000  r6 : 00000001  r5 : 00000000  r4 : 00000000
r3 : 7f15a83c  r2 : 893ba020  r1 : 00000000  r0 : fffffffd
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 671d0059  DAC: 00000051
Process insmod (pid: 730, stack limit = 0x8718a210)
Stack: (0x8718bd7c to 0x8718c000)
bd60:                                                                00000000
bd80: 00002710 870db300 c302e7e8 7f004010 893ba000 7f005094 00000000 00000000
bda0: 00000000 00000000 00000000 00000001 00000001 00000000 014000c0 00150628
bdc0: 7f0050ac 7f154840 1234aaaa 1234aaab c302e7e8 0000000f 00000000 893ba000
bde0: 0000000b 7f004010 87fd54a0 ffffe000 7f157000 00000000 871b6fc0 00000001
be00: 78e4905c 00000024 7f154640 8010179c 80a06544 8718a000 00000001 80a54980
be20: 80a3066c 00000007 809685c0 80a54700 80a54700 07551000 80a54700 60070013
be40: 7f154640 801f3fc8 78e4905c 7f154640 00000001 871b6fe4 7f154640 00000001
be60: 871b6b00 00000001 78e4905c 801eaa94 00000001 871b6fe4 8718bf44 00000001
be80: 871b6fe4 80196e4c 7f15464c 00007fff 7f154640 80193f10 87127000 7f154640
bea0: 7f154688 80703800 7f154770 807037e4 8081b184 807bec60 807becc4 807bec6c
bec0: 7f15481c 8010c1b8 93600000 76ed8028 00000f60 00000000 00000000 00000000
bee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bf00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00003f80
bf20: 76f5cf88 00000000 93684f80 8718a000 00160fda 00000051 00000000 801973b0
bf40: 87671a00 93501000 00183f80 93684760 93684574 936788e0 00155000 00155290
bf60: 00000000 00000000 00000000 00001f64 00000032 00000033 0000001d 00000000
bf80: 00000017 00000000 00000000 00183f80 756e694c 00000080 80107684 fffffffd
bfa0: 00000000 801074c0 00000000 00183f80 76dd9008 00183f80 00160fda 00000000
bfc0: 00000000 00183f80 756e694c 00000080 00000001 7eabae2c 00172f8c 00000000
bfe0: 7eabaae0 7eabaad0 0004017f 00013172 60070030 76dd9008 00000000 00000000
[<7f1575bc>] (test_bpf_init [test_bpf]) from [<7f157000>] (test_bpf_init+0x0/0x1000 [test_bpf])
[<7f157000>] (test_bpf_init [test_bpf]) from [<78e4905c>] (0x78e4905c)
Code: e2600000 e3a01000 e58d0220 e58d1224 (e5904054)
---[ end trace a36398923b914fe2 ]---
Segmentation fault
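
As a cross-check on the oops above, the sketch below decodes the faulting
word from the "Code:" line and recomputes the address it touched. It
assumes the word in parentheses is an ARM LDR with an immediate offset (A1
encoding); struct ldr_imm, decode_ldr_imm() and effective_addr() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an ARM "LDR Rt, [Rn, #imm12]" word (A1 encoding). */
struct ldr_imm {
	unsigned rt;	/* destination register */
	unsigned rn;	/* base register */
	uint32_t imm12;	/* unsigned 12-bit offset */
	int add;	/* U bit: 1 = add the offset, 0 = subtract it */
};

/* Decode LDR (immediate, offset addressing); returns -1 for anything else. */
static int decode_ldr_imm(uint32_t inst, struct ldr_imm *out)
{
	assert(out);
	/* bits 27:25 = 010, P = 1, B = 0, W = 0, L = 1; U is left free */
	if ((inst & 0x0f700000u) != 0x05100000u)
		return -1;
	out->rt = (inst >> 12) & 0xf;
	out->rn = (inst >> 16) & 0xf;
	out->imm12 = inst & 0xfff;
	out->add = (inst >> 23) & 1;
	return 0;
}

/* Address the load touches; 32-bit arithmetic wraps like the CPU's does. */
static uint32_t effective_addr(const struct ldr_imm *d, uint32_t base)
{
	return d->add ? base + d->imm12 : base - d->imm12;
}
```

Fed the faulting word 0xe5904054 and r0 = 0xfffffffd from the register
dump, this gives ldr r4, [r0, #0x54] and an effective address of
0x00000051, which matches the reported fault address.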

Why is it trying to execute TAX, which is a cBPF instruction?

Best,
Shubham Bansal


On Thu, Apr 6, 2017 at 6:21 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 04/06/2017 01:05 PM, Shubham Bansal wrote:
>>
>> Gentle Reminder.
>
>
> Sorry for the late reply.
>
>> Can anybody tell me how to test the JIT compiler?
>
>
> There's lib/test_bpf.c, see Documentation/networking/filter.txt +1349
> for some more information. It contains various test cases whose
> purpose is to exercise the JIT with corner cases. If you see a useful
> test missing, please send a patch for it, so all other JITs can benefit
> from this as well. For extracting disassembly from a generated test case,
> check out bpf_jit_disasm (Documentation/networking/filter.txt +486).
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-05-09 20:12                           ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-09 20:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel,

I just tried running test_bpf.ko module.

$ echo 2 >>  /proc/sys/net/core/bpf_jit_enable
$ insmod test_bpf.ko

test_bpf: #0 TAX
bpf_jit: flen=14 proglen=212 pass=2 image=7f15a83c from=insmod pid=730
JIT code: 00000000: f0 05 2d e9 40 d2 4d e2 00 40 a0 e3 0c 42 8d e5
JIT code: 00000010: 08 42 8d e5 00 00 20 e0 01 10 21 e0 20 62 9d e5
JIT code: 00000020: 20 72 9d e5 06 70 27 e0 20 72 8d e5 24 62 9d e5
JIT code: 00000030: 24 72 9d e5 06 70 27 e0 24 72 8d e5 00 40 a0 e1
JIT code: 00000040: 01 50 a0 e1 01 00 a0 e3 00 10 a0 e3 20 02 8d e5
JIT code: 00000050: 24 12 8d e5 02 00 a0 e3 00 10 a0 e3 20 62 9d e5
JIT code: 00000060: 06 00 80 e0 00 10 a0 e3 00 00 60 e2 00 10 a0 e3
JIT code: 00000070: 20 02 8d e5 24 12 8d e5 54 40 90 e5 20 62 9d e5
JIT code: 00000080: 06 00 80 e0 00 10 a0 e3 20 02 8d e5 24 12 8d e5
JIT code: 00000090: 04 00 a0 e1 01 10 a0 e3 20 62 9d e5 06 10 81 e0
JIT code: 000000a0: 01 20 a0 e3 04 32 8d e2 bc 68 0a e3 11 60 48 e3
JIT code: 000000b0: 36 ff 2f e1 01 10 21 e0 00 00 50 e3 04 00 00 0a
JIT code: 000000c0: 00 00 d0 e5 01 00 00 ea 40 d2 8d e2 f0 05 bd e8
JIT code: 000000d0: 1e ff 2f e1
jited:1
Unhandled fault: page domain fault (0x01b) at 0x00000051
pgd = 871d0000
[00000051] *pgd=671b7831, *pte=00000000, *ppte=00000000
Internal error: : 1b [#1] SMP ARM
Modules linked in: test_bpf(+)
CPU: 0 PID: 730 Comm: insmod Not tainted 4.11.0+ #5
Hardware name: ARM-Versatile Express
task: 87023700 task.stack: 8718a000
PC is at 0x7f15a8b4
LR is at test_bpf_init+0x5bc/0x1000 [test_bpf]
pc : [<7f15a8b4>]    lr : [<7f1575bc>]    psr: 80000013
sp : 8718bd7c  ip : 00000015  fp : 7f005008
r10: 7f005094  r9 : 893ba020  r8 : 893ba000
r7 : 00000000  r6 : 00000001  r5 : 00000000  r4 : 00000000
r3 : 7f15a83c  r2 : 893ba020  r1 : 00000000  r0 : fffffffd
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 671d0059  DAC: 00000051
Process insmod (pid: 730, stack limit = 0x8718a210)
Stack: (0x8718bd7c to 0x8718c000)
bd60:                                                                00000000
bd80: 00002710 870db300 c302e7e8 7f004010 893ba000 7f005094 00000000 00000000
bda0: 00000000 00000000 00000000 00000001 00000001 00000000 014000c0 00150628
bdc0: 7f0050ac 7f154840 1234aaaa 1234aaab c302e7e8 0000000f 00000000 893ba000
bde0: 0000000b 7f004010 87fd54a0 ffffe000 7f157000 00000000 871b6fc0 00000001
be00: 78e4905c 00000024 7f154640 8010179c 80a06544 8718a000 00000001 80a54980
be20: 80a3066c 00000007 809685c0 80a54700 80a54700 07551000 80a54700 60070013
be40: 7f154640 801f3fc8 78e4905c 7f154640 00000001 871b6fe4 7f154640 00000001
be60: 871b6b00 00000001 78e4905c 801eaa94 00000001 871b6fe4 8718bf44 00000001
be80: 871b6fe4 80196e4c 7f15464c 00007fff 7f154640 80193f10 87127000 7f154640
bea0: 7f154688 80703800 7f154770 807037e4 8081b184 807bec60 807becc4 807bec6c
bec0: 7f15481c 8010c1b8 93600000 76ed8028 00000f60 00000000 00000000 00000000
bee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bf00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00003f80
bf20: 76f5cf88 00000000 93684f80 8718a000 00160fda 00000051 00000000 801973b0
bf40: 87671a00 93501000 00183f80 93684760 93684574 936788e0 00155000 00155290
bf60: 00000000 00000000 00000000 00001f64 00000032 00000033 0000001d 00000000
bf80: 00000017 00000000 00000000 00183f80 756e694c 00000080 80107684 fffffffd
bfa0: 00000000 801074c0 00000000 00183f80 76dd9008 00183f80 00160fda 00000000
bfc0: 00000000 00183f80 756e694c 00000080 00000001 7eabae2c 00172f8c 00000000
bfe0: 7eabaae0 7eabaad0 0004017f 00013172 60070030 76dd9008 00000000 00000000
[<7f1575bc>] (test_bpf_init [test_bpf]) from [<7f157000>]
(test_bpf_init+0x0/0x1000 [test_bpf])
[<7f157000>] (test_bpf_init [test_bpf]) from [<78e4905c>] (0x78e4905c)
Code: e2600000 e3a01000 e58d0220 e58d1224 (e5904054)
---[ end trace a36398923b914fe2 ]---
Segmentation fault

Why is it trying to execute TAX, which is a cBPF instruction?

Best,
Shubham Bansal


On Thu, Apr 6, 2017 at 6:21 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 04/06/2017 01:05 PM, Shubham Bansal wrote:
>>
>> Gentle Reminder.
>
>
> Sorry for late reply.
>
>> Can anybody tell me how to test the JIT compiler?
>
>
> There's lib/test_bpf.c, see Documentation/networking/filter.txt +1349
> for some more information. It basically contains various test cases that
> have the purpose to test the JIT with corner cases. If you see a useful
> test missing, please send a patch for it, so all other JITs can benefit
> from this as well. For extracting disassembly from a generated test case,
> check out bpf_jit_disasm (Documentation/networking/filter.txt +486).
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-06 20:27                               ` Shubham Bansal
@ 2017-05-06 22:17                                 ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-06 22:17 UTC (permalink / raw)
  To: David Miller
  Cc: Daniel Borkmann, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Okay. My mistake. I just checked the verify function.

Apologies.
Best,
Shubham Bansal


On Sun, May 7, 2017 at 1:57 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Thanks David.
>
> Hi all,
>
> I have two questions about the code at arch/arm64/net/bpf_jit_comp.c.
>
> 1. At line 708, " const u8 r1 = bpf2a64[BPF_REG_1]; /* r1: struct
> sk_buff *skb */ ".
>     Why is this code using BPF_REG_1 before saving it? As far as I
> know, BPF_REG_1 holds the pointer to the BPF program context, and this
> code is clearly overwriting that pointer, which makes it useless for
> future use. It looks like a bug.
>
> 2. At line 256, " emit(A64_LDR64(prg, tmp, r3), ctx); ".
>     This line loads an element from an array of pointers, where r3 is
> the index into that array. Shouldn't r3 be shifted left by 3 (that is,
> multiplied by 8) to get the right address in that array of pointers?
>
> Apologies if either of the above questions is a stupid one.
>
> Best,
> Shubham
> Best,
> Shubham Bansal
>
>
> On Sun, May 7, 2017 at 12:08 AM, David Miller <davem@davemloft.net> wrote:
>> From: Shubham Bansal <illusionist.neo@gmail.com>
>> Date: Sat, 6 May 2017 22:18:16 +0530
>>
>>> Hi Daniel,
>>>
>>> Thanks for the last reply about the testing of eBPF JIT.
>>>
>>> I have one issue, though: I am not able to find what the BPF_ABS
>>> and BPF_IND instructions do exactly.
>>
>> They are not instructions, they are modifiers for the BPF_LD
>> instruction which indicate an SKB load is to be performed.
>>
>> You never need to ask what a BPF instruction does, it is clearly
>> defined in the BPF interpreter found in kernel/bpf/core.c.
>>
>> Look for the case statement LD_ABS_W and friends in __bpf_prog_run().

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-06 18:38                             ` David Miller
@ 2017-05-06 20:27                               ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-06 20:27 UTC (permalink / raw)
  To: David Miller
  Cc: Kees Cook, Daniel Borkmann, kernel-hardening,
	Network Development, ast, Mircea Gherzan, linux-arm-kernel

Thanks David.

Hi all,

I have two questions about the code at arch/arm64/net/bpf_jit_comp.c.

1. At line 708, " const u8 r1 = bpf2a64[BPF_REG_1]; /* r1: struct
sk_buff *skb */ ".
    Why is this code using BPF_REG_1 before saving it? As far as I
know, BPF_REG_1 holds the pointer to the BPF program context, and this
code is clearly overwriting that pointer, which makes it useless for
future use. It looks like a bug.

2. At line 256, " emit(A64_LDR64(prg, tmp, r3), ctx); ".
    This line loads an element from an array of pointers, where r3 is
the index into that array. Shouldn't r3 be shifted left by 3 (that is,
multiplied by 8) to get the right address in that array of pointers?
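
For reference, the scaling asked about in question 2 can be sketched in plain C. This only illustrates how an index maps to a byte offset in an array of 8-byte entries; it says nothing about whether the arm64 JIT already applies that scaling somewhere else. The helper names slot_offset() and load64() are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of entry `index` in an array of 8-byte entries: index << 3. */
static uint64_t slot_offset(uint64_t index)
{
	return index << 3;
}

/*
 * Load the 8-byte value at base + offset, the way a register-offset
 * load without built-in scaling would address memory.
 */
static uint64_t load64(const void *base, uint64_t offset)
{
	uint64_t value;

	assert(base != NULL);
	memcpy(&value, (const unsigned char *)base + offset, sizeof(value));
	return value;
}
```

With a table of four 8-byte entries, load64(table, slot_offset(2)) reads the third entry, while passing the raw index 2 as the offset would read 8 bytes starting in the middle of the first entry.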

Apologies if either of the above questions is a stupid one.

Best,
Shubham
Best,
Shubham Bansal


On Sun, May 7, 2017 at 12:08 AM, David Miller <davem@davemloft.net> wrote:
> From: Shubham Bansal <illusionist.neo@gmail.com>
> Date: Sat, 6 May 2017 22:18:16 +0530
>
>> Hi Daniel,
>>
>> Thanks for the last reply about the testing of eBPF JIT.
>>
>> I have one issue, though: I am not able to find what the BPF_ABS
>> and BPF_IND instructions do exactly.
>
> They are not instructions, they are modifiers for the BPF_LD
> instruction which indicate an SKB load is to be performed.
>
> You never need to ask what a BPF instruction does, it is clearly
> defined in the BPF interpreter found in kernel/bpf/core.c.
>
> Look for the case statement LD_ABS_W and friends in __bpf_prog_run().

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-05-06 16:48                           ` Shubham Bansal
@ 2017-05-06 18:38                             ` David Miller
  -1 siblings, 0 replies; 99+ messages in thread
From: David Miller @ 2017-05-06 18:38 UTC (permalink / raw)
  To: illusionist.neo
  Cc: daniel, keescook, mgherzan, netdev, kernel-hardening,
	linux-arm-kernel, ast

From: Shubham Bansal <illusionist.neo@gmail.com>
Date: Sat, 6 May 2017 22:18:16 +0530

> Hi Daniel,
> 
> Thanks for the last reply about the testing of eBPF JIT.
> 
> I have one issue, though: I am not able to find what the BPF_ABS
> and BPF_IND instructions do exactly.

They are not instructions, they are modifiers for the BPF_LD
instruction which indicate an SKB load is to be performed.

You never need to ask what a BPF instruction does, it is clearly
defined in the BPF interpreter found in kernel/bpf/core.c.

Look for the case statement LD_ABS_W and friends in __bpf_prog_run().
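
To make the point about modifiers concrete, here is a small sketch of how a BPF_LD opcode byte is composed from a class, a size and a mode field. The field values are the ones defined in include/uapi/linux/bpf_common.h; the helpers is_skb_load() and load_width() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode fields, values as in include/uapi/linux/bpf_common.h. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_LD   0x00
#define BPF_LDX  0x01
#define BPF_SIZE(code) ((code) & 0x18)
#define BPF_W    0x00	/* 32-bit word */
#define BPF_H    0x08	/* 16-bit half word */
#define BPF_B    0x10	/* byte */
#define BPF_MODE(code) ((code) & 0xe0)
#define BPF_IMM  0x00
#define BPF_ABS  0x20	/* load from skb at a fixed offset */
#define BPF_IND  0x40	/* load from skb at register + offset */
#define BPF_MEM  0x60

/* LD_ABS_W is simply class | size | mode packed into one byte. */
static_assert((BPF_LD | BPF_W | BPF_ABS) == 0x20, "LD_ABS_W opcode");

/* Hypothetical helper: true for the skb loads (LD_ABS_x and LD_IND_x). */
static bool is_skb_load(uint8_t code)
{
	return BPF_CLASS(code) == BPF_LD &&
	       (BPF_MODE(code) == BPF_ABS || BPF_MODE(code) == BPF_IND);
}

/* Hypothetical helper: access width in bytes taken from the size field. */
static int load_width(uint8_t code)
{
	switch (BPF_SIZE(code)) {
	case BPF_W: return 4;
	case BPF_H: return 2;
	case BPF_B: return 1;
	default:    return 8;	/* 0x18 is BPF_DW, eBPF only */
	}
}
```

A JIT would typically switch on the whole opcode byte, the way the interpreter does, rather than decode the fields one by one; the decoding here is only meant to show where BPF_ABS and BPF_IND sit in the encoding.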

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-04-06 12:51                         ` Daniel Borkmann
@ 2017-05-06 16:48                           ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-05-06 16:48 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Daniel,

Thanks for the last reply about the testing of eBPF JIT.

I have one issue, though: I am not able to find what the BPF_ABS and
BPF_IND instructions do exactly. They are not described at this link
either: https://www.kernel.org/doc/Documentation/networking/filter.txt
Can you please tell me where I could find a description of these
instructions?
Best,
Shubham Bansal


On Thu, Apr 6, 2017 at 6:21 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 04/06/2017 01:05 PM, Shubham Bansal wrote:
>>
>> Gentle Reminder.
>
>
> Sorry for late reply.
>
>> Can anybody tell me how to test the JIT compiler?
>
>
> There's lib/test_bpf.c, see Documentation/networking/filter.txt +1349
> for some more information. It basically contains various test cases that
> have the purpose to test the JIT with corner cases. If you see a useful
> test missing, please send a patch for it, so all other JITs can benefit
> from this as well. For extracting disassembly from a generated test case,
> check out bpf_jit_disasm (Documentation/networking/filter.txt +486).
>
> Thanks,
> Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-04-06 11:05                       ` Shubham Bansal
@ 2017-04-06 12:51                         ` Daniel Borkmann
  -1 siblings, 0 replies; 99+ messages in thread
From: Daniel Borkmann @ 2017-04-06 12:51 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

On 04/06/2017 01:05 PM, Shubham Bansal wrote:
> Gentle Reminder.

Sorry for late reply.

> Can anybody tell me how to test the JIT compiler?

There's lib/test_bpf.c, see Documentation/networking/filter.txt +1349
for some more information. It basically contains various test cases that
have the purpose to test the JIT with corner cases. If you see a useful
test missing, please send a patch for it, so all other JITs can benefit
from this as well. For extracting disassembly from a generated test case,
check out bpf_jit_disasm (Documentation/networking/filter.txt +486).

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-03-30 14:04                     ` Shubham Bansal
@ 2017-04-06 11:05                       ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-04-06 11:05 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Gentle Reminder.

Can anybody tell me how to test the JIT compiler?
Best,
Shubham Bansal


On Thu, Mar 30, 2017 at 7:34 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Thanks Daniel.
>
> Can you tell me how to test the eBPF JIT compiler? It would be great
> if you could tell me starting from compiling to proper testing.
> Best,
> Shubham Bansal
>
>
> On Wed, Mar 29, 2017 at 5:30 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>> Hi Shubham,
>>
>> On 03/28/2017 10:49 PM, Shubham Bansal wrote:
>> [...]
>>>
>>> Do you have any document that explains how tail calls work? I
>>> looked at your presentations, but they seemed confusing to me. Anything
>>> simple would be great, just about the tail calls. I don't think I need
>>> examples; I can get them from your presentations. I just need a very
>>> general idea. Maybe you know where in the kernel code they are
>>> implemented.
>>
>>
>> Sure, it's in __bpf_prog_run(), see the JMP_TAIL_CALL (kernel/bpf/core.c +1019).
>> That's effectively what JITs implement. [1] page 3 has a high-level description
>> as well, hope that helps.
>>
>> Thanks,
>> Daniel
>>
>>   [1] http://www.netdevconf.org/1.1/proceedings/papers/On-getting-tc-classifier-fully-programmable-with-cls-bpf.pdf

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-03-29  0:00                   ` Daniel Borkmann
@ 2017-03-30 14:04                     ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-03-30 14:04 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Thanks Daniel.

Can you tell me how to test the eBPF JIT compiler? It would be great
if you could tell me starting from compiling to proper testing.
Best,
Shubham Bansal


On Wed, Mar 29, 2017 at 5:30 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Hi Shubham,
>
> On 03/28/2017 10:49 PM, Shubham Bansal wrote:
> [...]
>>
>> Do you have any document that explains how tail calls work? I
>> looked at your presentations, but they seemed confusing to me. Anything
>> simple would be great, just about the tail calls. I don't think I need
>> examples; I can get them from your presentations. I just need a very
>> general idea. Maybe you know where in the kernel code they are
>> implemented.
>
>
> Sure, it's in __bpf_prog_run(), see the JMP_TAIL_CALL (kernel/bpf/core.c +1019).
> That's effectively what JITs implement. [1] page 3 has a high-level description
> as well, hope that helps.
>
> Thanks,
> Daniel
>
>   [1] http://www.netdevconf.org/1.1/proceedings/papers/On-getting-tc-classifier-fully-programmable-with-cls-bpf.pdf

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-03-28 20:49                 ` Shubham Bansal
@ 2017-03-29  0:00                   ` Daniel Borkmann
  -1 siblings, 0 replies; 99+ messages in thread
From: Daniel Borkmann @ 2017-03-29  0:00 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: David Miller, Kees Cook, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel, ast

Hi Shubham,

On 03/28/2017 10:49 PM, Shubham Bansal wrote:
[...]
> Do you have any document that explains how tail calls work? I
> looked at your presentations, but they seemed confusing to me. Anything
> simple would be great, just about the tail calls. I don't think I need
> examples; I can get them from your presentations. I just need a very
> general idea. Maybe you know where in the kernel code they are
> implemented.

Sure, it's in __bpf_prog_run(), see the JMP_TAIL_CALL (kernel/bpf/core.c +1019).
That's effectively what JITs implement. [1] page 3 has a high-level description
as well, hope that helps.

Thanks,
Daniel

   [1] http://www.netdevconf.org/1.1/proceedings/papers/On-getting-tc-classifier-fully-programmable-with-cls-bpf.pdf
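
As a rough user-space model of what the JMP_TAIL_CALL case does (bounds check on the index, a cap on the number of chained tail calls, then a NULL check on the slot), something like the sketch below may help. The types struct prog and struct prog_array and the function tail_call() are simplified stand-ins invented for this sketch, not the kernel's struct bpf_prog and struct bpf_array, and the limit of 32 is assumed from the kernel sources of that time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TAIL_CALL_CNT 32	/* assumed to match the kernel's limit */

/* Hypothetical stand-in for struct bpf_prog. */
struct prog {
	int id;
};

/* Hypothetical stand-in for struct bpf_array holding program pointers. */
struct prog_array {
	uint32_t max_entries;
	struct prog *ptrs[4];
};

/*
 * Returns the program to jump to, or NULL when the tail call is skipped
 * and execution simply continues with the caller's next instruction.
 */
static struct prog *tail_call(const struct prog_array *array, uint64_t index,
			      uint32_t *tail_call_cnt)
{
	assert(array != NULL && tail_call_cnt != NULL);

	if (index >= array->max_entries)
		return NULL;		/* index out of bounds */
	if (*tail_call_cnt > MAX_TAIL_CALL_CNT)
		return NULL;		/* too many chained tail calls */

	(*tail_call_cnt)++;

	/* An empty slot yields NULL, which also means "fall through". */
	return array->ptrs[index];
}
```

A JIT has to emit the same three checks in machine code and, on success, jump into the target program without setting up a new stack frame.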

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-03-15 21:55               ` David Miller
@ 2017-03-28 20:49                 ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-03-28 20:49 UTC (permalink / raw)
  To: David Miller
  Cc: Kees Cook, Daniel Borkmann, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel

Thanks David. It helped a lot.

Hi Daniel,

Do you have any document that explains how tail calls work? I
looked at your presentations, but they seemed confusing to me. Anything
simple would be great, just about the tail calls. I don't think I need
examples; I can get them from your presentations. I just need a very
general idea. Maybe you know where in the kernel code they are
implemented.
Thanks for the help, Daniel.

Best,
Shubham Bansal


On Thu, Mar 16, 2017 at 3:25 AM, David Miller <davem@davemloft.net> wrote:
> From: Shubham Bansal <illusionist.neo@gmail.com>
> Date: Wed, 15 Mar 2017 17:43:44 +0530
>
>>> You can't truncate, but you'll have to build 64-bit ops using 32-bit registers.
>>
>> A small example would help a lot.
>
> You can simply perform 64-bit operations in C code and see what gcc
> outputs for that code on this 32-bit target.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-03-15 12:13             ` Shubham Bansal
@ 2017-03-15 21:55               ` David Miller
  -1 siblings, 0 replies; 99+ messages in thread
From: David Miller @ 2017-03-15 21:55 UTC (permalink / raw)
  To: illusionist.neo
  Cc: keescook, daniel, mgherzan, netdev, kernel-hardening, linux-arm-kernel

From: Shubham Bansal <illusionist.neo@gmail.com>
Date: Wed, 15 Mar 2017 17:43:44 +0530

>> You can't truncate, but you'll have to build 64-bit ops using 32-bit registers.
> 
> A small example would help a lot.

You can simply perform 64-bit operations in C code and see what gcc
outputs for that code on this 32-bit target.
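
As a sketch of the kind of code that experiment leads to, 64-bit add, subtract and left shift can be written by hand on a pair of 32-bit halves. The struct reg64 and the function names below are invented for illustration; on ARM the add and subtract roughly correspond to an ADDS/ADC and a SUBS/SBC pair, though what gcc actually emits should be checked on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 32-bit halves holding one 64-bit eBPF register. */
struct reg64 {
	uint32_t lo;
	uint32_t hi;
};

static struct reg64 from_u64(uint64_t v)
{
	struct reg64 r = { (uint32_t)v, (uint32_t)(v >> 32) };
	return r;
}

static uint64_t to_u64(struct reg64 r)
{
	return ((uint64_t)r.hi << 32) | r.lo;
}

/* 64-bit add: add the low words, then add the high words plus carry. */
static struct reg64 add64(struct reg64 a, struct reg64 b)
{
	struct reg64 r;

	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);	/* carry out of the low word */
	return r;
}

/* 64-bit subtract: subtract the low words, then the high words minus borrow. */
static struct reg64 sub64(struct reg64 a, struct reg64 b)
{
	struct reg64 r;

	r.lo = a.lo - b.lo;
	r.hi = a.hi - b.hi - (a.lo < b.lo);	/* borrow from the high word */
	return r;
}

/* 64-bit left shift by n, 0 <= n < 64; bits cross from lo into hi. */
static struct reg64 lsh64(struct reg64 a, unsigned int n)
{
	struct reg64 r;

	assert(n < 64);
	if (n == 0)
		return a;
	if (n >= 32) {
		r.hi = a.lo << (n - 32);
		r.lo = 0;
	} else {
		r.hi = (a.hi << n) | (a.lo >> (32 - n));
		r.lo = a.lo << n;
	}
	return r;
}
```

The same pattern applies when one or both halves live in stack slots instead of registers: load the halves, operate on them with carry or borrow, and store them back.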

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-02-08 19:41           ` Kees Cook
@ 2017-03-15 12:13             ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-03-15 12:13 UTC (permalink / raw)
  To: Kees Cook
  Cc: Daniel Borkmann, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel

Hi Kees,

> It seems like you're suggesting truncating the 64-bit register values?
> I think your best solution is going to be to use a memory scratch
> space and build 64-bit operations using 32-bit registers and memory
> operations.

Yes. I was suggesting truncating the 64-bit register values, but
for 32-bit operands only. So when I truncate the BPF register, I
only discard bytes that are not used.

Can you explain how to use a memory scratch space and build 64-bit
operations using 32-bit registers and memory operations? A small
example would help a lot.

>> - Similarly, For all BPF_ALU class instructions.
>> - For BPF_ADD, I will mask the addition result to 32 bit only.
>>  I am not sure, Overflow might be a problem.
>> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
>>  I am not sure, Underflow might be problem.
>> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
>> - For BPF_DIV, 32 bit masking should be fine, I guess.
>> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>>  masking should be fine.
>> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
>> - For BPF_END, 32 bit masking should work fine.
>>  Let me know if any of the above point is wrong or need your suggestion.
>>
>> - Although, for ALU instructions, there is a big problem of register
>>   flag manipulations. Generally, architecture's ABI takes care of this
>>   part but as we are doing 64 bit Instructions emulation(kind of) on 32
>>   bit machine, it needs to be done manually. Does that sound correct ?
>
> You can't truncate, but you'll have to build 64-bit ops using 32-bit registers.

A small example would help a lot.
>
>>
>> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>>   take care of atomic instructions and race conditions with these
>>   instruction which looks complicated to me as of now. Will try to figure out
>>   this part and implement it later. Currently, I will just let it be
>>   interpreted by the ebpf interpreter.
>>
>> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>>   the address pointers on 32 bit arch like arm will be of 32 bit only.
>>   So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>>   should do the trick and no address will be corrupted in this way. Am I
>>   correct to assume this ?
>>   Also, I need to check for address getting out of the allowed memory
>>   range.
>
> That's probably true, but the JIT should likely detect a truncation
> here, if you're going to depend on it, and reject the BPF.

Okay. So I guess I have to use memory for this as well?
An example would be great.

>
>> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>>   assuming the same thing as above - All addresses and pointers are 32
>>   bit - which can be taken care of just by masking the eBPF register
>>   values. Does that sound correct ?
>>   Also, I need to check for the address overflow, address getting out
>>   of the allowed memory range and things like that.
>
> I'd say, get something working and send a patch -- that's likely the
> best way to get more detailed feedback. :)

I would love to, but I have to understand what to implement first.


-Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-02-01 13:01         ` Shubham Bansal
@ 2017-02-08 19:41           ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-02-08 19:41 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, Mircea Gherzan, Network Development,
	kernel-hardening, linux-arm-kernel

On Wed, Feb 1, 2017 at 5:01 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Hi Kees & Daniel,
>
> On Tue, Jan 31, 2017 at 09:44:56AM -0800, Kees Cook wrote:
>> >> > 1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
>> >> > registers with 32 bit arm registers which looks wrong to me. Does anybody
>> >> > have any idea about how to map eBPF->arm 32 bit registers ?
>> >>
>> >> I was going to say "look at the x86 32-bit implementation." ... But
>> >> there isn't one. :( I'm going to guess that there isn't a very good
>> >> answer here. I assume you'll have to build some kind of stack scratch
>> >> space to load/save.
>> >
>> >
>> > Now I see why nobody has implemented eBPF JIT for the 32 bit systems. I
>> > think it's very difficult to implement it without any complications and
>> > errors.
>>
>> Yeah, that does seem to make it much more difficult.
> I was thinking of first implementing only instructions with 32 bit
> register operands. It will hugely decrease the surface area of eBPF
> instructions that I have to cover for the first patch.

I don't know much about eBPF internals, but I can take a crack at
answering this... I assume whatever you implement would need to pass
the BPF regression tests...

> So, What I am thinking is something like this :
>
> - bpf_mov r0(64),r1(64) will be JITed like this :
> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
> bit and store it in arm register(ar1).
> - Do MOV ar0(32),ar1(32) as an ARM instruction.
> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
> and store it in 64 bit ebpf register r0.

It seems like you're suggesting truncating the 64-bit register values?
I think your best solution is going to be to use a memory scratch
space and build 64-bit operations using 32-bit registers and memory
operations.
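
To give a rough idea of what I mean by scratch space, here is a minimal
sketch in plain C (not ARM assembly, and not taken from any existing
JIT). It assumes every 64-bit eBPF register is backed by two 32-bit
memory slots; struct scratch, load_lo(), load_hi(), store64(),
emu_mov64() and emu_mov32() are hypothetical names used only for this
illustration.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BPF_REGS 11	/* eBPF R0..R10 */

/* Hypothetical scratch area, e.g. carved out of the JITed program's
 * stack frame: two 32-bit slots per eBPF register. */
struct scratch {
	uint32_t slot[NUM_BPF_REGS][2];	/* [reg][0] = low, [reg][1] = high */
};

/* Each helper stands for one 32-bit load or a pair of 32-bit stores. */
static uint32_t load_lo(const struct scratch *s, int reg)
{
	return s->slot[reg][0];
}

static uint32_t load_hi(const struct scratch *s, int reg)
{
	return s->slot[reg][1];
}

static void store64(struct scratch *s, int reg, uint32_t lo, uint32_t hi)
{
	s->slot[reg][0] = lo;
	s->slot[reg][1] = hi;
}

/* 64-bit "mov dst, src": both halves are copied, nothing is dropped. */
static void emu_mov64(struct scratch *s, int dst, int src)
{
	uint32_t lo = load_lo(s, src);
	uint32_t hi = load_hi(s, src);

	store64(s, dst, lo, hi);
}

/* 32-bit "mov dst, src": low half copied, high half zeroed, which is
 * the zero-extension eBPF defines for 32-bit ALU results. */
static void emu_mov32(struct scratch *s, int dst, int src)
{
	store64(s, dst, load_lo(s, src), 0);
}
```

A real JIT would presumably keep the hottest eBPF registers in pairs of
machine registers and only spill the rest to slots like these, but the
two-halves view is the same in both cases.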

> - Similarly, For all BPF_ALU class instructions.
> - For BPF_ADD, I will mask the addition result to 32 bit only.
>  I am not sure, Overflow might be a problem.
> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
>  I am not sure, Underflow might be a problem.
> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
> - For BPF_DIV, 32 bit masking should be fine, I guess.
> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>  masking should be fine.
> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
> - For BPF_END, 32 bit masking should work fine.
>  Let me know if any of the above point is wrong or need your suggestion.
>
> - Although, for ALU instructions, there is a big problem of register
>   flag manipulations. Generally, architecture's ABI takes care of this
>   part but as we are doing 64 bit Instructions emulation(kind of) on 32
>   bit machine, it needs to be done manually. Does that sound correct ?

You can't truncate, but you'll have to build 64-bit ops using 32-bit registers.
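
As an illustration only (plain C, with made-up names struct reg64,
add64(), sub64() and add32()), a 64-bit add or subtract can be built
from two 32-bit operations by carrying or borrowing between the halves.
On ARM the same idea maps naturally onto the ADDS/ADC and SUBS/SBC
instruction pairs, though the exact sequence a JIT would emit is not
shown here.

```c
#include <stdint.h>

/* Hypothetical representation: one 64-bit eBPF register as two words. */
struct reg64 {
	uint32_t lo;
	uint32_t hi;
};

/* 64-bit add: add the low words, then add the high words plus carry. */
static struct reg64 add64(struct reg64 a, struct reg64 b)
{
	struct reg64 r;
	uint32_t carry;

	r.lo = a.lo + b.lo;
	carry = r.lo < a.lo;	/* unsigned wrap-around means carry out */
	r.hi = a.hi + b.hi + carry;
	return r;
}

/* 64-bit subtract: subtract the low words, then the high words minus
 * the borrow. */
static struct reg64 sub64(struct reg64 a, struct reg64 b)
{
	struct reg64 r;
	uint32_t borrow = a.lo < b.lo;

	r.lo = a.lo - b.lo;
	r.hi = a.hi - b.hi - borrow;
	return r;
}

/* 32-bit BPF_ALU add: only the low words take part and the result is
 * zero-extended, so here ignoring the high words is correct. */
static struct reg64 add32(struct reg64 a, struct reg64 b)
{
	struct reg64 r = { a.lo + b.lo, 0 };

	return r;
}
```

For example, adding 1 to 0x00000000ffffffff with add64() gives
0x0000000100000000, whereas masking the result to 32 bits would give 0.
That is only acceptable for the 32-bit BPF_ALU class, as in add32().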

>
> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>   take care of atomic instructions and race conditions with these
>   instruction which looks complicated to me as of now. Will try to figure out
>   this part and implement it later. Currently, I will just let it be
>   interpreted by the ebpf interpreter.
>
> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>   the address pointers on 32 bit arch like arm will be of 32 bit only.
>   So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>   should do the trick and no address will be corrupted in this way. Am I
>   correct to assume this ?
>   Also, I need to check for address getting out of the allowed memory
>   range.

That's probably true, but the JIT should likely detect a truncation
here, if you're going to depend on it, and reject the BPF.
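
A minimal sketch of such a check, assuming it runs at JIT time on a
64-bit value that is about to be used as a 32-bit address. The helper
names fits_in_u32() and emit_addr32() and the 0 / -1 return convention
are hypothetical, chosen only for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the 64-bit value survives being narrowed to 32 bits. */
static bool fits_in_u32(uint64_t v)
{
	return (v >> 32) == 0;
}

/* Hypothetical JIT-time helper: narrows an address to 32 bits, or
 * reports failure so the caller can reject the program (and, for
 * example, fall back to the interpreter). */
static int emit_addr32(uint64_t addr, uint32_t *out)
{
	if (!fits_in_u32(addr))
		return -1;	/* truncation would corrupt the address */

	*out = (uint32_t)addr;
	return 0;
}
```

The idea is simply that the narrowing is never silent: either the upper
32 bits are known to be zero, or the program is not JITed.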

> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>   assuming the same thing as above - All addresses and pointers are 32
>   bit - which can be taken care of just by masking the eBPF register
>   values. Does that sound correct ?
>   Also, I need to check for the address overflow, address getting out
>   of the allowed memory range and things like that.

I'd say, get something working and send a patch -- that's likely the
best way to get more detailed feedback. :)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-02-01 13:01         ` Shubham Bansal
@ 2017-02-08  7:29           ` Shubham Bansal
  -1 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-02-08  7:29 UTC (permalink / raw)
  To: Kees Cook
  Cc: Daniel Borkmann, Mircea Gherzan, netdev, kernel-hardening,
	linux-arm-kernel

Is anybody willing to take a swing at my comments below?

On Wed, Feb 1, 2017 at 6:31 PM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Hi Kees & Daniel,
>
> On Tue, Jan 31, 2017 at 09:44:56AM -0800, Kees Cook wrote:
>> >> > 1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
>> >> > registers with 32 bit arm registers which looks wrong to me. Does anybody
>> >> > have any idea about how to map eBPF->arm 32 bit registers ?
>> >>
>> >> I was going to say "look at the x86 32-bit implementation." ... But
>> >> there isn't one. :( I'm going to guess that there isn't a very good
>> >> answer here. I assume you'll have to build some kind of stack scratch
>> >> space to load/save.
>> >
>> >
>> > Now I see why nobody has implemented eBPF JIT for the 32 bit systems. I
>> > think it's very difficult to implement it without any complications and
>> > errors.
>>
>> Yeah, that does seem to make it much more difficult.
> I was thinking of first implementing only instructions with 32 bit
> register operands. It will hugely decrease the surface area of eBPF
> instructions that I have to cover for the first patch.
>
> So, What I am thinking is something like this :
>
> - bpf_mov r0(64),r1(64) will be JITed like this :
> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
> bit and store it in arm register(ar1).
> - Do MOV ar0(32),ar1(32) as an ARM instruction.
> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
> and store it in 64 bit ebpf register r0.
>
> - Similarly, For all BPF_ALU class instructions.
> - For BPF_ADD, I will mask the addition result to 32 bit only.
>  I am not sure, Overflow might be a problem.
> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
>  I am not sure, Underflow might be a problem.
> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
> - For BPF_DIV, 32 bit masking should be fine, I guess.
> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>  masking should be fine.
> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
> - For BPF_END, 32 bit masking should work fine.
>  Let me know if any of the above point is wrong or need your suggestion.
>
> - Although, for ALU instructions, there is a big problem of register
>   flag manipulations. Generally, architecture's ABI takes care of this
>   part but as we are doing 64 bit Instructions emulation(kind of) on 32
>   bit machine, it needs to be done manually. Does that sound correct ?
>
> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>   take care of atomic instructions and race conditions with these
>   instruction which looks complicated to me as of now. Will try to figure out
>   this part and implement it later. Currently, I will just let it be
>   interpreted by the ebpf interpreter.
>
> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>   the address pointers on 32 bit arch like arm will be of 32 bit only.
>   So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>   should do the trick and no address will be corrupted in this way. Am I
>   correct to assume this ?
>   Also, I need to check for address getting out of the allowed memory
>   range.
>
> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>   assuming the same thing as above - All addresses and pointers are 32
>   bit - which can be taken care of just by masking the eBPF register
>   values. Does that sound correct ?
>   Also, I need to check for the address overflow, address getting out
>   of the allowed memory range and things like that.
>
>> > Do you have any code references for me to take a look? Otherwise, I think
>> > its not possible for me to implement it without using any reference.
>>
>> I don't know anything else, no.
>
> I think, I will give it a try. Otherwise, my last 1 month which I used
> to read about eBPF, eBPF linux code and arm32 ABI would be a complete
> waste.
>
>> >>
>> >>
>> >> > 2.) Also, is my current mapping good enough to make the JIT fast enough
>> >> > ?
>> >> > because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
>> >> > its instructions with native instructions.
>> >>
>> >> I don't know -- it might be tricky with needing to deal with 64-bit
>> >> registers. But if you can make it faster than the non-JIT, it should
>> >> be a win. :) Yay assembly.
>
> Well, As I mentioned above about my thinking towards the implementation,
> I am not sure it would be faster than non-JIT or even correct for that matter.
> It might be but I don't think I have enough knowledge to benchmark the
> implementation as of now.
>
>
> -Shubham Bansal

-Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-02-03  7:04             ` Shubham Bansal
@ 2017-02-03  8:25               ` nick viljoen
  -1 siblings, 0 replies; 99+ messages in thread
From: nick viljoen @ 2017-02-03  8:25 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Kees Cook, Daniel Borkmann, Mircea Gherzan, netdev,
	kernel-hardening, linux-arm-kernel



> On Feb 2, 2017, at 11:04 PM, Shubham Bansal <illusionist.neo@gmail.com> wrote:
> 
> Hi Nick,
> 
> On Thu, Feb 2, 2017 at 12:59 PM, nick viljoen
> <nick.viljoen@netronome.com> wrote:
>> Hey Shubham,
>> 
>> I have been doing some similar work-might be worth pooling
>> resource if there is interest?
> 
> Sure. That sounds great.
> 
>> 
>> We made a presentation at the previous netdev conference about
>> what we are doing-you can check it out here :)
>> 
>> https://www.youtube.com/watch?v=-5BzT1ch19s&t=45s
> 
> Sorry for the late reply. I had to watch the whole video. It was fun.
> Now, it seems like a small part of your complete project was related to
> eBPF 64 bit register to 32 bit register mapping, although I don't have
> any knowledge about the hardware aspect of it.
> Now, getting back to your slides, on Page 7 you are mapping eBPF 64
> bit register to 32 bit register.
> 
> 1. Can you explain that to me? I didn't get this part from your presentation.
> 2. How are you taking care of Race Condition on 64 bit eBPF registers
> Read/Write as you are using 32 bit registers to emulate them ?
> 
>> 
>> What is your reason for looking at these problems?
> 
> I just wanted to contribute toward linux kernel. This is the only
> reason I think.
Some emails seem to have been tied together here. My previous email
ended here, but in my mail client it currently appears as though the
text below is my email. As you have implied, I presume the text below
is you replying to yourself.
-----------------------------------------
> 
>> I was thinking of first implementing only instructions with 32 bit
>> register operands. It will hugely decrease the surface area of eBPF
>> instructions that I have to cover for the first patch.
>> 
>> So, What I am thinking is something like this :
>> 
>> - bpf_mov r0(64),r1(64) will be JITed like this :
>> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
>> bit and store it in arm register(ar1).
>> - Do MOV ar0(32),ar1(32) as an ARM instruction.
>> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
>> and store it in 64 bit ebpf register r0.
> 
> What about this ? Does this makes sense to you ?
>> 
>> - Similarly, For all BPF_ALU class instructions.
>> - For BPF_ADD, I will mask the addition result to 32 bit only.
>> I am not sure, Overflow might be a problem.
>> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
>> I am not sure, Underflow might be problem.
>> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
>> - For BPF_DIV, 32 bit masking should be fine, I guess.
>> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>> masking should be fine.
>> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
>> - For BPF_END, 32 bit masking should work fine.
>> Let me know if any of the above point is wrong or need your suggestion.
> What about this ?
>> 
>> - Although, for ALU instructions, there is a big problem of register
>> flag manipulations. Generally, architecture's ABI takes care of this
>> part but as we are doing 64 bit Instructions emulation(kind of) on 32
>> bit machine, it needs to be done manually. Does that sound correct ?
>> 
>> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>> take care of atomic instructions and race conditions with these
>> instruction which looks complicated to me as of now. Will try to figure out
>> this part and implement it later. Currently, I will just let it be
>> interpreted by the ebpf interpreter.
>> 
>> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>> the address pointers on 32 bit arch like arm will be of 32 bit only.
>> So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>> should do the trick and no address will be corrupted in this way. Am I
>> correct to assume this ?
>> Also, I need to check for address getting out of the allowed memory
>> range.
>> 
>> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>> assuming the same thing as above - All addresses and pointers are 32
>> bit - which can be taken care of just by masking the eBPF register
>> values. Does that sound correct ?
>> Also, I need to check for the address overflow, address getting out
>> of the allowed memory range and things like that.
>> 

> Nick, It would be great if you could give me your comments/suggestions
> on all of the above points for JIT implementation.

As we are selectively offloading to an NPU-based NIC, we can avoid some of
the problems you have mentioned, so I am afraid I don't have all the
answers.

While we have stated publicly that we are doing this work and aren't trying
to hide anything, the reason I replied to you in private is that it is
generally not a good idea to share half-baked ideas on the mailing list, as
it wastes people's time :).

The best approach is to wait until you are able to post an RFC patch for
public discussion.
> 
>> Do you have any code references for me to take a look? Otherwise, I think
>> its not possible for me to implement it without using any reference.
>> 
>> 
>> I don't know anything else, no.
> 
> +Kees,
> 
> I think drivers/net/ethernet/netronome/nfp/ could be a good reference for this.
> 
>> 
>> 
>> I think, I will give it a try. Otherwise, my last 1 month which I used
>> to read about eBPF, eBPF linux code and arm32 ABI would be a complete
>> waste.
>> 
>> 
>> 
>> 2.) Also, is my current mapping good enough to make the JIT fast enough
>> ?
>> because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
>> its instructions with native instructions.
>> 
>> 
>> I don't know -- it might be tricky with needing to deal with 64-bit
>> registers. But if you can make it faster than the non-JIT, it should
>> be a win. :) Yay assembly.
>> 
>> 
>> Well, As I mentioned above about my thinking towards the implementation,
>> I am not sure it would be faster than non-JIT or even correct for that
>> matter.
>> It might be but I don't think I have enough knowledge to benchmark the
>> implementation as of now.
> 
> Nick, How fast was your JIT as compared to interpreter if you had the
> chance to benchmark them?
> 
> -Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-02-03  8:25               ` nick viljoen
  0 siblings, 0 replies; 99+ messages in thread
From: nick viljoen @ 2017-02-03  8:25 UTC (permalink / raw)
  To: linux-arm-kernel



> On Feb 2, 2017, at 11:04 PM, Shubham Bansal <illusionist.neo@gmail.com> wrote:
> 
> Hi Nick,
> 
> On Thu, Feb 2, 2017 at 12:59 PM, nick viljoen
> <nick.viljoen@netronome.com> wrote:
>> Hey Shubham,
>> 
>> I have been doing some similar work-might be worth pooling
>> resource if there is interest?
> 
> Sure. That sounds great.
> 
>> 
>> We made a presentation at the previous netdev conference about
>> what we are doing-you can check it out here :)
>> 
>> https://www.youtube.com/watch?v=-5BzT1ch19s&t=45s
> 
> Sorry for the late reply. I had to watch the whole video. It was fun.
> Now, it seems like a small part of your complete project was related to
> eBPF 64 bit register to 32 bit register mapping, although I don't have
> any knowledge about the hardware aspect of it.
> Now, getting back to your slides, on Page 7 you are mapping eBPF 64
> bit register to 32 bit register.
> 
> 1. Can you explain that to me? I didn't get this part from your presentation.
> 2. How are you taking care of Race Condition on 64 bit eBPF registers
> Read/Write as you are using 32 bit registers to emulate them ?
> 
>> 
>> What is your reason for looking at these problems?
> 
> I just wanted to contribute toward linux kernel. This is the only
> reason I think.

There seems to have been some tangling of emails here. My previous
email ended here, but in my mail client it currently appears as though the
text below is my email. As you have implied, I presume the text below is you
replying to yourself.
-----------------------------------------
> 
>> I was thinking of first implementing only instructions with 32 bit
>> register operands. It will hugely decrease the surface area of eBPF
>> instructions that I have to cover for the first patch.
>> 
>> So, What I am thinking is something like this :
>> 
>> - bpf_mov r0(64),r1(64) will be JITed like this :
>> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
>> bit and store it in arm register(ar1).
>> - Do MOV ar0(32),ar1(32) as an ARM instruction.
>> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
>> and store it in 64 bit ebpf register r0.
> 
> What about this ? Does this makes sense to you ?
>> 
>> - Similarly, For all BPF_ALU class instructions.
>> - For BPF_ADD, I will mask the addition result to 32 bit only.
>> I am not sure, Overflow might be a problem.
>> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
>> I am not sure, Underflow might be problem.
>> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
>> - For BPF_DIV, 32 bit masking should be fine, I guess.
>> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
>> masking should be fine.
>> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
>> - For BPF_END, 32 bit masking should work fine.
>> Let me know if any of the above point is wrong or need your suggestion.
> What about this ?
>> 
>> - Although, for ALU instructions, there is a big problem of register
>> flag manipulations. Generally, architecture's ABI takes care of this
>> part but as we are doing 64 bit Instructions emulation(kind of) on 32
>> bit machine, it needs to be done manually. Does that sound correct ?
>> 
>> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>> take care of atomic instructions and race conditions with these
>> instruction which looks complicated to me as of now. Will try to figure out
>> this part and implement it later. Currently, I will just let it be
>> interpreted by the ebpf interpreter.
>> 
>> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>> the address pointers on 32 bit arch like arm will be of 32 bit only.
>> So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>> should do the trick and no address will be corrupted in this way. Am I
>> correct to assume this ?
>> Also, I need to check for address getting out of the allowed memory
>> range.
>> 
>> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>> assuming the same thing as above - All addresses and pointers are 32
>> bit - which can be taken care of just by masking the eBPF register
>> values. Does that sound correct?
>> Also, I need to check for the address overflow, address getting out
>> of the allowed memory range and things like that.
>> 

> Nick, It would be great if you could give me your comments/suggestions
> on all of the above points for JIT implementation.

As we are selectively offloading to an NPU-based NIC, we can avoid some of
the problems you have mentioned, so I am afraid I don't have all the
answers.

While we have stated publicly that we are doing this work and aren't trying to
hide anything, the reason I replied to you in private is that it is generally
not a good idea to share half-baked ideas on the mailing list, as it wastes
people's time :).

The best approach is to wait until you are able to post an RFC patch for
public discussion.
> 
>> Do you have any code references for me to take a look? Otherwise, I think
>> its not possible for me to implement it without using any reference.
>> 
>> 
>> I don't know anything else, no.
> 
> +Kees,
> 
> I think drivers/net/ethernet/netronome/nfp/ could be a good reference for this.
> 
>> 
>> 
>> I think, I will give it a try. Otherwise, my last 1 month which I used
>> to read about eBPF, eBPF linux code and arm32 ABI would be a complete
>> waste.
>> 
>> 
>> 
>> 2.) Also, is my current mapping good enough to make the JIT fast enough
>> ?
>> because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
>> its instructions with native instructions.
>> 
>> 
>> I don't know -- it might be tricky with needing to deal with 64-bit
>> registers. But if you can make it faster than the non-JIT, it should
>> be a win. :) Yay assembly.
>> 
>> 
>> Well, As I mentioned above about my thinking towards the implementation,
>> I am not sure it would be faster than non-JIT or even correct for that
>> matter.
>> It might be but I don't think I have enough knowledge to benchmark the
>> implementation as of now.
> 
> Nick, How fast was your JIT as compared to interpreter if you had the
> chance to benchmark them?
> 
> -Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
       [not found]         ` <76621BFF-B30B-4417-AB2B-DB21CA6092D9@netronome.com>
@ 2017-02-03  7:04             ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-02-03  7:04 UTC (permalink / raw)
  To: nick viljoen, Kees Cook
  Cc: Daniel Borkmann, Mircea Gherzan, netdev, kernel-hardening,
	linux-arm-kernel

Hi Nick,

On Thu, Feb 2, 2017 at 12:59 PM, nick viljoen
<nick.viljoen@netronome.com> wrote:
> Hey Shubham,
>
> I have been doing some similar work-might be worth pooling
> resource if there is interest?

Sure. That sounds great.

>
> We made a presentation at the previous netdev conference about
> what we are doing-you can check it out here :)
>
> https://www.youtube.com/watch?v=-5BzT1ch19s&t=45s

Sorry for the late reply. I had to watch the whole video. It was fun.
Now, it seems like a small part of your complete project was related to
eBPF 64 bit register to 32 bit register mapping, although I don't have
any knowledge about the hardware aspect of it.
Now, getting back to your slides, on page 7 you are mapping eBPF 64
bit registers to 32 bit registers.

1. Can you explain that to me? I didn't get this part from your presentation.
2. How are you taking care of race conditions on 64 bit eBPF register
reads/writes, as you are using 32 bit registers to emulate them?

>
> What is your reason for looking at these problems?

I just want to contribute to the Linux kernel. That is the only
reason, I think.

> I was thinking of first implementing only instructions with 32 bit
> register operands. It will hugely decrease the surface area of eBPF
> instructions that I have to cover for the first patch.
>
> So, What I am thinking is something like this :
>
> - bpf_mov r0(64),r1(64) will be JITed like this :
> - ar1(32) <- r1(64). Convert/Mask 64 bit ebpf register(r1) value into 32
> bit and store it in arm register(ar1).
> - Do MOV ar0(32),ar1(32) as an ARM instruction.
> - ar0(32) -> r0(64). Zero Extend the ar0 32 bit register value
> and store it in 64 bit ebpf register r0.

What about this? Does this make sense to you?
>
> - Similarly, For all BPF_ALU class instructions.
> - For BPF_ADD, I will mask the addition result to 32 bit only.
> I am not sure, Overflow might be a problem.
> - For BPF_SUB, I will mask the subtraction result to 32 bit only.
> I am not sure, Underflow might be problem.
> - For BPF_MUL, similar to BPF_ADD. Overflow Problem ?
> - For BPF_DIV, 32 bit masking should be fine, I guess.
> - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD 32 bit
> masking should be fine.
> - For BPF_NEG and BPF_ARSH, might be a problem because of the sign bit.
> - For BPF_END, 32 bit masking should work fine.
> Let me know if any of the above point is wrong or need your suggestion.
What about this ?
>
> - Although, for ALU instructions, there is a big problem of register
>  flag manipulations. Generally, architecture's ABI takes care of this
>  part but as we are doing 64 bit Instructions emulation(kind of) on 32
>  bit machine, it needs to be done manually. Does that sound correct ?
>
> - I am not JITing BPF_ALU64 class instructions as of now. As we have to
>  take care of atomic instructions and race conditions with these
>  instruction which looks complicated to me as of now. Will try to figure out
>  this part and implement it later. Currently, I will just let it be
>  interpreted by the ebpf interpreter.
>
> - For BPF_JMP class, I am assuming that, although eBPF is 64 bit ABI,
>  the address pointers on 32 bit arch like arm will be of 32 bit only.
>  So, for BPF_JMP, masking the 64 bit destination address to 32 bit
>  should do the trick and no address will be corrupted in this way. Am I
>  correct to assume this ?
>  Also, I need to check for address getting out of the allowed memory
>  range.
>
> - For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
>  assuming the same thing as above - All addresses and pointers are 32
>  bit - which can be taken care of just by masking the eBPF register
>  values. Does that sound correct?
>  Also, I need to check for the address overflow, address getting out
>  of the allowed memory range and things like that.
>

Nick, It would be great if you could give me your comments/suggestions
on all of the above points for JIT implementation.
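
On the BPF_ALU64 point quoted above: eBPF registers are private to each
run of a program, so plain 64 bit arithmetic should not need any atomicity.
What it does need on a 32 bit machine is carry propagation between the two
halves. A minimal sketch of that idea in plain C follows; the names
reg_pair, pair_add64, pair_sub64 and pair_to_u64 are hypothetical and do
not come from the kernel or from this thread.

```c
#include <stdint.h>

/* Hypothetical model: one 64 bit eBPF register held in two 32 bit words. */
struct reg_pair {
	uint32_t lo;	/* low word  */
	uint32_t hi;	/* high word */
};

/* dst += src. On ARM this corresponds to ADDS (low words) then ADC (high words). */
static void pair_add64(struct reg_pair *dst, const struct reg_pair *src)
{
	uint32_t lo = dst->lo + src->lo;	/* wraps modulo 2^32 */
	uint32_t carry = lo < dst->lo;		/* 1 if the low add overflowed */

	dst->hi = dst->hi + src->hi + carry;
	dst->lo = lo;
}

/* dst -= src. On ARM this corresponds to SUBS (low words) then SBC (high words). */
static void pair_sub64(struct reg_pair *dst, const struct reg_pair *src)
{
	uint32_t borrow = dst->lo < src->lo;	/* 1 if the low subtract underflows */

	dst->lo = dst->lo - src->lo;
	dst->hi = dst->hi - src->hi - borrow;
}

/* Helper for checking results against native 64 bit arithmetic. */
static uint64_t pair_to_u64(const struct reg_pair *p)
{
	return ((uint64_t)p->hi << 32) | p->lo;
}
```

The same pattern would extend to the other ALU64 operations, although
multiplication and shifts need more than two native instructions.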

> Do you have any code references for me to take a look? Otherwise, I think
> its not possible for me to implement it without using any reference.
>
>
> I don't know anything else, no.

+Kees,

I think drivers/net/ethernet/netronome/nfp/ could be a good reference for this.

>
>
> I think, I will give it a try. Otherwise, my last 1 month which I used
> to read about eBPF, eBPF linux code and arm32 ABI would be a complete
> waste.
>
>
>
> 2.) Also, is my current mapping good enough to make the JIT fast enough
> ?
> because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
> its instructions with native instructions.
>
>
> I don't know -- it might be tricky with needing to deal with 64-bit
> registers. But if you can make it faster than the non-JIT, it should
> be a win. :) Yay assembly.
>
>
> Well, As I mentioned above about my thinking towards the implementation,
> I am not sure it would be faster than non-JIT or even correct for that
> matter.
> It might be but I don't think I have enough knowledge to benchmark the
> implementation as of now.

Nick, how fast was your JIT compared to the interpreter, if you had the
chance to benchmark them?

-Shubham

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
       [not found]     ` <CAGXu5j+NSLomuSgD40kys+pWc+J9aB6Bbk_gSP9Lp_ScimQn_w@mail.gmail.com>
@ 2017-02-01 13:01         ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-02-01 13:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Daniel Borkmann, Mircea Gherzan, netdev, kernel-hardening,
	linux-arm-kernel

Hi Kees & Daniel,

On Tue, Jan 31, 2017 at 09:44:56AM -0800, Kees Cook wrote:
> >> > 1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
> >> > registers with 32 bit arm registers which looks wrong to me. Do anybody
> >> > have some idea about how to map eBPF->arm 32 bit registers ?
> >>
> >> I was going to say "look at the x86 32-bit implementation." ... But
> >> there isn't one. :( I'm going to guess that there isn't a very good
> >> answer here. I assume you'll have to build some kind of stack scratch
> >> space to load/save.
> >
> >
> > Now I see why nobody has implemented eBPF JIT for the 32 bit systems. I
> > think its very difficult to implement it without any complications and
> > errors.
>
> Yeah, that does seem to make it much more difficult.

I was thinking of first implementing only the instructions with 32 bit
register operands. It will hugely decrease the surface area of eBPF
instructions that I have to cover for the first patch.

So, what I am thinking is something like this:

- bpf_mov r0(64),r1(64) will be JITed like this:
  - ar1(32) <- r1(64). Convert/mask the 64 bit eBPF register (r1) value
    into 32 bits and store it in the arm register (ar1).
  - Do MOV ar0(32),ar1(32) as an ARM instruction.
  - ar0(32) -> r0(64). Zero extend the 32 bit value in ar0 and store it
    in the 64 bit eBPF register r0.

- Similarly, for all BPF_ALU class instructions:
  - For BPF_ADD, I will mask the addition result to 32 bits only.
    I am not sure, overflow might be a problem.
  - For BPF_SUB, I will mask the subtraction result to 32 bits only.
    I am not sure, underflow might be a problem.
  - For BPF_MUL, similar to BPF_ADD. Overflow problem?
  - For BPF_DIV, 32 bit masking should be fine, I guess.
  - For BPF_OR, BPF_AND, BPF_XOR, BPF_LSH, BPF_RSH, BPF_MOD, 32 bit
    masking should be fine.
  - For BPF_NEG and BPF_ARSH, there might be a problem because of the
    sign bit.
  - For BPF_END, 32 bit masking should work fine.
  Let me know if any of the above points is wrong or needs your suggestion.

- Although, for ALU instructions, there is a big problem of register
  flag manipulation. Generally, the architecture's ABI takes care of this
  part, but as we are doing (a kind of) 64 bit instruction emulation on a
  32 bit machine, it needs to be done manually. Does that sound correct?

- I am not JITing BPF_ALU64 class instructions as of now, as we have to
  take care of atomic instructions and race conditions with these
  instructions, which looks complicated to me as of now. I will try to
  figure out this part and implement it later. Currently, I will just let
  them be interpreted by the eBPF interpreter.

- For the BPF_JMP class, I am assuming that, although eBPF is a 64 bit ABI,
  the address pointers on a 32 bit arch like arm will be 32 bit only.
  So, for BPF_JMP, masking the 64 bit destination address to 32 bits
  should do the trick, and no address will be corrupted in this way. Am I
  correct to assume this?
  Also, I need to check for the address getting out of the allowed memory
  range.

- For BPF_LD, BPF_LDX, BPF_ST and BPF_STX class instructions, I am
  assuming the same thing as above (all addresses and pointers are 32
  bit), which can be taken care of just by masking the eBPF register
  values. Does that sound correct?
  Also, I need to check for address overflow, the address getting out
  of the allowed memory range and things like that.
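
Regarding the BPF_ALU points in the list above: the eBPF interpreter
treats the 32 bit ALU class as "operate on the low 32 bits, then zero
extend the result into the 64 bit register", so wrap-around on add, sub
and mul is the expected behaviour rather than an error case. A minimal
model of that semantics in plain C follows; the alu32_* helper names are
hypothetical and exist only for illustration.

```c
#include <stdint.h>

/*
 * Model of BPF_ALU (32 bit) semantics on a 64 bit eBPF register:
 * only the low 32 bits take part, and the result is zero extended.
 */
static uint64_t alu32_mov(uint64_t src)
{
	return (uint32_t)src;
}

static uint64_t alu32_add(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst + (uint32_t)src);	/* wraps modulo 2^32 */
}

static uint64_t alu32_sub(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst - (uint32_t)src);	/* wraps modulo 2^32 */
}

static uint64_t alu32_mul(uint64_t dst, uint64_t src)
{
	return (uint32_t)((uint32_t)dst * (uint32_t)src);	/* keeps the low 32 bits */
}

static uint64_t alu32_neg(uint64_t dst)
{
	return (uint32_t)(0u - (uint32_t)dst);	/* two's complement negate */
}
```

Under this model a 32 bit ARM ADD, SUB or MUL already produces the right
low word; the JIT only has to make sure the high word of the destination
ends up as zero.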

> > Do you have any code references for me to take a look? Otherwise, I think
> > its not possible for me to implement it without using any reference.
>
> I don't know anything else, no.

I think I will give it a try. Otherwise, the last month, which I spent
reading about eBPF, the eBPF Linux code and the arm32 ABI, would be a
complete waste.

> >>
> >>
> >> > 2.) Also, is my current mapping good enough to make the JIT fast enough
> >> > ?
> >> > because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
> >> > its instructions with native instructions.
> >>
> >> I don't know -- it might be tricky with needing to deal with 64-bit
> >> registers. But if you can make it faster than the non-JIT, it should
> >> be a win. :) Yay assembly.

Well, as I mentioned above about my thinking towards the implementation,
I am not sure it would be faster than non-JIT, or even correct for that matter.
It might be, but I don't think I have enough knowledge to benchmark the
implementation as of now.


-Shubham Bansal

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
  2017-01-30 10:38 ` Shubham Bansal
@ 2017-01-30 21:57   ` Kees Cook
  -1 siblings, 0 replies; 99+ messages in thread
From: Kees Cook @ 2017-01-30 21:57 UTC (permalink / raw)
  To: Shubham Bansal
  Cc: Daniel Borkmann, Mircea Gherzan, Network Development,
	kernel-hardening, Russell King, linux-arm-kernel

On Mon, Jan 30, 2017 at 2:38 AM, Shubham Bansal
<illusionist.neo@gmail.com> wrote:
> Hi all,
>
> Please ignore last copy of this mail. Kernel mailing lists bounced my
> last mail back because of HTML content.
>
> Just starting a new thread with proper heading on the main kernel
> hardening and net-dev mailing list so that other people can be involved
> in this. Please don't take this as a personal mail.
>
> I am working on conversion of arm32 cBPF into eBPF JIT. I wanted some
> help, regarding understanding of kernel code, from the dev available on
> the mailing list. If you look at the ./arch/arm/net/bpf_jit_32.c code,
> you will see jit_ctx structure. If anybody could help me understand what
> each fields of this structure represent then it would be great.
>
> Also, currently I am mapping the eBPF registers to arm 32 bit registers
> in the following way.
>
>> static const int bpf2a32[] = {
>>         /* return value from in-kernel function, and exit value from eBPF */
>>         [BPF_REG_0] = ARM_R0,
>>         /* arguments from eBPF program to in-kernel function */
>>         [BPF_REG_1] = ARM_R1,
>>         [BPF_REG_2] = ARM_R2,
>>         [BPF_REG_3] = ARM_R3,
>>         [BPF_REG_4] = ARM_R4,
>>         [BPF_REG_5] = ARM_R5,
>>         /* callee saved registers that in-kernel function will preserve */
>>         [BPF_REG_6] = ARM_R6,
>>         [BPF_REG_7] = ARM_R7,
>>         [BPF_REG_8] = ARM_R8,
>>         [BPF_REG_9] = ARM_R9,
>>         /* read-only frame pointer to access the stack */
>>         [BPF_REG_FP] = ARM_FP,
>>         /* temporary register for internal BPF JIT */
>>         [TMP_REG_1] = ARM_R11,
>>         /* temporary register for blinding constants */
>>         [BPF_REG_AX] = ARM_R10,
>> };
>
> But I have some question if anybody could help with those.
>
> 1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
> registers with 32 bit arm registers which looks wrong to me. Do anybody
> have some idea about how to map eBPF->arm 32 bit registers ?

I was going to say "look at the x86 32-bit implementation." ... But
there isn't one. :( I'm going to guess that there isn't a very good
answer here. I assume you'll have to build some kind of stack scratch
space to load/save.
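
One way to picture the "stack scratch space" idea is a per-register
descriptor that says whether a 64 bit eBPF register lives in a pair of
ARM registers or in an 8 byte stack slot. The sketch below is only an
illustration under that assumption; loc_kind, bpf_reg_loc and
assign_scratch_slots are hypothetical names, not existing kernel code.

```c
#include <stddef.h>
#include <stdint.h>

enum loc_kind {
	LOC_REG_PAIR,	/* lives in two 32 bit ARM registers */
	LOC_STACK	/* lives in an 8 byte scratch slot on the stack */
};

/* Hypothetical descriptor: where one 64 bit eBPF register lives on arm32. */
struct bpf_reg_loc {
	enum loc_kind kind;
	uint8_t lo_reg;		/* valid when kind == LOC_REG_PAIR */
	uint8_t hi_reg;		/* valid when kind == LOC_REG_PAIR */
	uint16_t stack_off;	/* valid when kind == LOC_STACK: byte offset of the slot */
};

/*
 * Give every stack-resident register its own consecutive 8 byte slot.
 * Returns the number of scratch bytes a prologue would have to reserve.
 */
static size_t assign_scratch_slots(struct bpf_reg_loc *map, size_t n)
{
	size_t off = 0;

	for (size_t i = 0; i < n; i++) {
		if (map[i].kind != LOC_STACK)
			continue;
		map[i].stack_off = (uint16_t)off;
		off += 8;	/* one 64 bit value per slot */
	}
	return off;
}
```

With such a table, each emitted operation would first load stack-resident
operands into temporary ARM registers and store the result back afterwards,
which is the load/save cost mentioned above.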

> 2.) Also, is my current mapping good enough to make the JIT fast enough ?
> because as you might know, eBPF JIT mostly depends on 1-to-1 mapping of
> its instructions with native instructions.

I don't know -- it might be tricky with needing to deal with 64-bit
registers. But if you can make it faster than the non-JIT, it should
be a win. :) Yay assembly.
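
As an illustration of why 64 bit registers make a 1-to-1 mapping hard:
BPF_JMP conditions compare full 64 bit values, so one eBPF compare turns
into two 32 bit compares on arm32. A minimal sketch of an unsigned
"greater than" on register halves, with the hypothetical helper name
jgt64, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Unsigned 64 bit "dst > src" using only 32 bit comparisons.
 * The high words decide unless they are equal, in which case
 * the low words decide.
 */
static bool jgt64(uint32_t dst_hi, uint32_t dst_lo,
		  uint32_t src_hi, uint32_t src_lo)
{
	if (dst_hi != src_hi)
		return dst_hi > src_hi;
	return dst_lo > src_lo;
}
```

Signed comparisons follow the same shape, except that only the high words
are compared as signed values.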

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 99+ messages in thread

* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-01-30 10:38 ` Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-01-30 10:38 UTC (permalink / raw)
  To: Kees Cook, Daniel Borkmann, Mircea Gherzan
  Cc: netdev, kernel-hardening, linux, linux-arm-kernel

Hi all,

Please ignore the last copy of this mail. The kernel mailing lists
bounced it because of HTML content.

I am starting a new thread with a proper heading on the main
kernel-hardening and netdev mailing lists so that other people can be
involved in this. Please don't take this as a personal mail.

I am working on converting the arm32 cBPF JIT into an eBPF JIT. I would
like some help understanding the kernel code from the developers on
the mailing list. If you look at the ./arch/arm/net/bpf_jit_32.c code,
you will see the jit_ctx structure. If anybody could help me understand
what each field of this structure represents, that would be great.

Also, I am currently mapping the eBPF registers to arm 32 bit registers
in the following way.

> static const int bpf2a32[] = {
>         /* return value from in-kernel function, and exit value from eBPF */
>         [BPF_REG_0] = ARM_R0,
>
>         /* arguments from eBPF program to in-kernel function */
>         [BPF_REG_1] = ARM_R1,
>         [BPF_REG_2] = ARM_R2,
>         [BPF_REG_3] = ARM_R3,
>         [BPF_REG_4] = ARM_R4,
>         [BPF_REG_5] = ARM_R5,
>
>         /* callee saved registers that in-kernel function will preserve */
>         [BPF_REG_6] = ARM_R6,
>         [BPF_REG_7] = ARM_R7,
>         [BPF_REG_8] = ARM_R8,
>         [BPF_REG_9] = ARM_R9,
>
>         /* Read only Frame Pointer to access Stack */
>         [BPF_REG_FP] = ARM_FP,
>
>         /* Temporary register for internal BPF JIT */
>         [TMP_REG_1] = ARM_R11,
>
>         /* temporary register for blinding constants */
>         [BPF_REG_AX] = ARM_R10,
> };

But I have some questions, if anybody could help with those.

1.) Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
registers to 32 bit arm registers, which looks wrong to me. Does anybody
have an idea about how to map eBPF->arm 32 bit registers ?
2.) Also, is my current mapping good enough to make the JIT fast enough ?
I ask because, as you might know, the eBPF JIT mostly depends on a 1-to-1
mapping of its instructions to native instructions.
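
To make the concern in both questions concrete, here is a minimal sketch of what a single 64-bit eBPF add would have to become when each register is held as two 32-bit halves: an add of the low words followed by an add-with-carry of the high words (on ARM, an ADDS followed by an ADC). The `pair32` type and the function names are hypothetical and only illustrate the arithmetic; this is not the JIT's code.

```c
#include <stdint.h>

/* Hypothetical representation of one 64-bit eBPF register as two words. */
struct pair32 {
	uint32_t hi;	/* upper 32 bits */
	uint32_t lo;	/* lower 32 bits */
};

/* 64-bit addition done with 32-bit operations only. */
struct pair32 add64_pair(struct pair32 a, struct pair32 b)
{
	struct pair32 r;
	uint32_t carry;

	r.lo = a.lo + b.lo;		/* like ADDS: wraps modulo 2^32 */
	carry = r.lo < a.lo;		/* 1 if the low-word add overflowed */
	r.hi = a.hi + b.hi + carry;	/* like ADC: add with carry-in */
	return r;
}

/* Helper to view a pair as a plain 64-bit value for checking. */
uint64_t pair_to_u64(struct pair32 p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}
```

So a strict 1-to-1 instruction mapping is not available for 64-bit ALU operations on a 32-bit target; each one expands to at least two native instructions, plus any loads and stores needed if a half is not currently in a register.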

I would appreciate help from anybody on the mailing list.

Best,
Shubham Bansal


* arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit
@ 2017-01-30 10:16 Shubham Bansal
  0 siblings, 0 replies; 99+ messages in thread
From: Shubham Bansal @ 2017-01-30 10:16 UTC (permalink / raw)
  To: Kees Cook, Daniel Borkmann, Mircea Gherzan
  Cc: netdev, kernel-hardening, linux, linux-arm-kernel

Hi all,

I am starting a new thread with a proper heading on the main kernel-hardening
and netdev mailing lists so that other people can be involved in this.
Please don't take this as a personal mail.

I am working on converting the arm32 cBPF JIT into an eBPF JIT. I would like
some help understanding the kernel code from the developers on the
mailing list. If you look at the ./arch/arm/net/bpf_jit_32.c code, you will
see the jit_ctx structure. If anybody could help me understand what each
field of this structure represents, that would be great.

Also, I am currently mapping the eBPF registers to arm 32 bit registers in
the following way.

> static const int bpf2a32[] = {
>         /* return value from in-kernel function, and exit value from eBPF */
>         [BPF_REG_0] = ARM_R0,
>
>         /* arguments from eBPF program to in-kernel function */
>         [BPF_REG_1] = ARM_R1,
>         [BPF_REG_2] = ARM_R2,
>         [BPF_REG_3] = ARM_R3,
>         [BPF_REG_4] = ARM_R4,
>         [BPF_REG_5] = ARM_R5,
>
>         /* callee saved registers that in-kernel function will preserve */
>         [BPF_REG_6] = ARM_R6,
>         [BPF_REG_7] = ARM_R7,
>         [BPF_REG_8] = ARM_R8,
>         [BPF_REG_9] = ARM_R9,
>
>         /* Read only Frame Pointer to access Stack */
>         [BPF_REG_FP] = ARM_FP,
>
>         /* Temporary register for internal BPF JIT */
>         [TMP_REG_1] = ARM_R11,
>
>         /* temporary register for blinding constants */
>         [BPF_REG_AX] = ARM_R10,
> };


But I have some questions, if anybody could help with those.

   - Currently, as eBPF uses 64 bit registers, I am mapping 64 bit eBPF
   registers to 32 bit arm registers, which looks wrong to me. Does anybody
   have an idea about how to map eBPF->arm 32 bit registers ?
   - Also, is my current mapping good enough to make the JIT fast enough ?
   I ask because, as you might know, the eBPF JIT mostly depends on a 1-to-1
   mapping of its instructions to native instructions.


I would appreciate help from anybody on the mailing list.

Best,
Shubham Bansal


end of thread, other threads:[~2017-05-23 19:33 UTC | newest]

Thread overview: 99+ messages
     [not found] <CAHgaXdKsO2xoKYp7g91g+n+d_1KHSSByLjzBB-WjVXSjhB7qxw@mail.gmail.com>
     [not found] ` <20170510.212952.1440495072777358778.davem@davemloft.net>
     [not found]   ` <CAHgaXdK8LEEUPm4jTRRzCnjwdWAauHmmB=caZsSFY8MmStH89Q@mail.gmail.com>
     [not found]     ` <20170510.215218.2185526627014393313.davem@davemloft.net>
     [not found]       ` <CAHgaXdKZ_v+iO7uqEDx7PA7D+xcp1FngGvJ1SRSsGXNQ-iWWDQ@mail.gmail.com>
2017-05-11  9:32         ` arch: arm: bpf: Converting cBPF to eBPF for arm 32 bit Shubham Bansal
2017-05-11  9:32           ` [kernel-hardening] " Shubham Bansal
2017-05-11  9:32           ` Shubham Bansal
2017-05-11 15:30           ` Kees Cook
2017-05-11 15:30             ` [kernel-hardening] " Kees Cook
2017-05-11 15:30             ` Kees Cook
2017-05-13 21:38             ` Shubham Bansal
2017-05-13 21:38               ` [kernel-hardening] " Shubham Bansal
2017-05-13 21:38               ` Shubham Bansal
2017-05-15 17:44               ` Kees Cook
2017-05-15 17:44                 ` [kernel-hardening] " Kees Cook
2017-05-15 17:44                 ` Kees Cook
2017-05-15 19:55               ` Daniel Borkmann
2017-05-15 19:55                 ` [kernel-hardening] " Daniel Borkmann
2017-05-15 19:55                 ` Daniel Borkmann
2017-05-20 20:01                 ` Shubham Bansal
2017-05-20 20:01                   ` [kernel-hardening] " Shubham Bansal
2017-05-20 20:01                   ` Shubham Bansal
2017-05-22 13:01                   ` Daniel Borkmann
2017-05-22 13:01                     ` [kernel-hardening] " Daniel Borkmann
2017-05-22 13:01                     ` Daniel Borkmann
2017-05-22 17:04                     ` Shubham Bansal
2017-05-22 17:04                       ` [kernel-hardening] " Shubham Bansal
2017-05-22 17:04                       ` Shubham Bansal
2017-05-22 20:05                       ` Kees Cook
2017-05-22 20:05                         ` [kernel-hardening] " Kees Cook
2017-05-22 20:05                         ` Kees Cook
2017-05-23  2:58                         ` Shubham Bansal
2017-05-23  2:58                           ` [kernel-hardening] " Shubham Bansal
2017-05-23  2:58                           ` Shubham Bansal
2017-05-23  4:27                           ` Kees Cook
2017-05-23  4:27                             ` [kernel-hardening] " Kees Cook
2017-05-23  4:27                             ` Kees Cook
2017-05-22 18:58                   ` Kees Cook
2017-05-22 18:58                     ` [kernel-hardening] " Kees Cook
2017-05-22 18:58                     ` Kees Cook
2017-05-22 19:08                     ` Florian Fainelli
2017-05-22 19:08                       ` [kernel-hardening] " Florian Fainelli
2017-05-22 19:08                       ` Florian Fainelli
2017-05-23  3:34                       ` Shubham Bansal
2017-05-23  3:34                         ` [kernel-hardening] " Shubham Bansal
2017-05-23  3:34                         ` Shubham Bansal
2017-05-23  4:22                         ` Kees Cook
2017-05-23  4:22                           ` [kernel-hardening] " Kees Cook
2017-05-23  4:22                           ` Kees Cook
2017-05-23  5:03                           ` Shubham Bansal
2017-05-23  5:03                             ` [kernel-hardening] " Shubham Bansal
2017-05-23  5:03                             ` Shubham Bansal
2017-05-23  5:35                             ` Kees Cook
2017-05-23  5:35                               ` [kernel-hardening] " Kees Cook
2017-05-23  5:35                               ` Kees Cook
2017-05-23 18:39                               ` Shubham Bansal
2017-05-23 18:39                                 ` [kernel-hardening] " Shubham Bansal
2017-05-23 19:32                                 ` Kees Cook
2017-05-23 19:32                                   ` [kernel-hardening] " Kees Cook
2017-05-23 19:32                                   ` Kees Cook
2017-01-30 10:38 Shubham Bansal
2017-01-30 10:38 ` Shubham Bansal
2017-01-30 21:57 ` Kees Cook
2017-01-30 21:57   ` Kees Cook
     [not found]   ` <CAHgaXd+nj69n-Xf46N=4M-j-0hKHVrrLfsvRZCG=2CCAtVF6ZA@mail.gmail.com>
     [not found]     ` <CAGXu5j+NSLomuSgD40kys+pWc+J9aB6Bbk_gSP9Lp_ScimQn_w@mail.gmail.com>
2017-02-01 13:01       ` Shubham Bansal
2017-02-01 13:01         ` Shubham Bansal
     [not found]         ` <76621BFF-B30B-4417-AB2B-DB21CA6092D9@netronome.com>
2017-02-03  7:04           ` Shubham Bansal
2017-02-03  7:04             ` Shubham Bansal
2017-02-03  8:25             ` nick viljoen
2017-02-03  8:25               ` nick viljoen
2017-02-08  7:29         ` Shubham Bansal
2017-02-08  7:29           ` Shubham Bansal
2017-02-08 19:41         ` Kees Cook
2017-02-08 19:41           ` Kees Cook
2017-03-15 12:13           ` Shubham Bansal
2017-03-15 12:13             ` Shubham Bansal
2017-03-15 21:55             ` David Miller
2017-03-15 21:55               ` David Miller
2017-03-28 20:49               ` Shubham Bansal
2017-03-28 20:49                 ` Shubham Bansal
2017-03-29  0:00                 ` Daniel Borkmann
2017-03-29  0:00                   ` Daniel Borkmann
2017-03-30 14:04                   ` Shubham Bansal
2017-03-30 14:04                     ` Shubham Bansal
2017-04-06 11:05                     ` Shubham Bansal
2017-04-06 11:05                       ` Shubham Bansal
2017-04-06 12:51                       ` Daniel Borkmann
2017-04-06 12:51                         ` Daniel Borkmann
2017-05-06 16:48                         ` Shubham Bansal
2017-05-06 16:48                           ` Shubham Bansal
2017-05-06 18:38                           ` David Miller
2017-05-06 18:38                             ` David Miller
2017-05-06 20:27                             ` Shubham Bansal
2017-05-06 20:27                               ` Shubham Bansal
2017-05-06 22:17                               ` Shubham Bansal
2017-05-06 22:17                                 ` Shubham Bansal
2017-05-09 20:12                         ` Shubham Bansal
2017-05-09 20:12                           ` Shubham Bansal
2017-05-09 20:19                           ` David Miller
2017-05-09 20:19                             ` David Miller
2017-05-09 20:25                           ` Daniel Borkmann
2017-05-09 20:25                             ` Daniel Borkmann
  -- strict thread matches above, loose matches on Subject: below --
2017-01-30 10:16 Shubham Bansal
