All of lore.kernel.org
 help / color / mirror / Atom feed
* Clang | llc incorrect jumps
@ 2020-08-25 16:33 Kevin Sheldrake
  2020-08-25 16:48 ` Alexei Starovoitov
  0 siblings, 1 reply; 3+ messages in thread
From: Kevin Sheldrake @ 2020-08-25 16:33 UTC (permalink / raw)
  To: bpf

Hello

I've got some odd behaviour and I'd like to ask if anyone could spare a few moments to offer suggestions of what my error might be?  Maybe there is a different mailing list that this would be more appropriate to send to?

I'm building a relatively complex eBPF program that attaches to the raw tracepoints, sys_enter and sys_exit, compiled with clang and llc using the same switches and includes as the kernel samples (kernel v5.3 on ubuntu 18.04), loaded using the latest libbpf git sources.  The sys_exit program is quite big at 55045 instructions (unrolled loops so the same code will run on kernel v5.1) and contains illegal jumps.  Everything is inlined - using __attribute__((always_inline)) - and the programs themselves have the __attribute__((flatten)) applied.  The verifier complains:
jump out of range from insn 15 to -10500

If I remove one section of code, unrelated to where the illegal jumps are, reducing the overall size to 24480 instructions, the illegal jumps disappear.  If I reenable that section of code and remove a different section of code, also unrelated to the illegal jumps, reducing the overall size to 47444 instructions, the illegal jumps disappear again.

The illegal jumps look like (there are many, these are a sample):
      15:       if r0 == 0 goto -10516 <LBB1_8987+0xfffffffffff7ffd0>
    1285:       goto -11827 <LBB1_8987+0xfffffffffff7fe88>
   54994:       goto +28205 <LBB1_8987+0x36ff0>

Note the first two jump to before the program and the final one jumps to after the program.  Elsewhere there are jumps to the actual label:
   47846:       if r3 == r4 goto +7195 <LBB1_8987>

And the label itself is the last in the program and doesn't appear special:
LBB1_8987:
   55042:       r1 = *(u32 *)(r1 + 0)
   55043:       r8 += r1
   55044:       r3 = r8
   55045:       goto -7149 <LBB1_5498>
<END OF CODE>

FYI, the code was dumped with:
llvm-objdump -S -no-show-raw-insn EBPF_OBJECT.o

In my limited understanding, it appears that the length of the code has somehow tricked clang/llc into thinking there are codes fragment in illegal locations.  I'm not certain, but I believe these jumps are all related to exit conditions.

Am I missing something?  Does anyone have any suggestions?  I can share the entire codebase including the build process if that would help?

Many thanks for any help or interest you can offer,

Kevin Sheldrake


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Clang | llc incorrect jumps
  2020-08-25 16:33 Clang | llc incorrect jumps Kevin Sheldrake
@ 2020-08-25 16:48 ` Alexei Starovoitov
  2020-08-26 12:19   ` [EXTERNAL] " Kevin Sheldrake
  0 siblings, 1 reply; 3+ messages in thread
From: Alexei Starovoitov @ 2020-08-25 16:48 UTC (permalink / raw)
  To: Kevin Sheldrake, Yonghong Song, John Fastabend; +Cc: bpf

On Tue, Aug 25, 2020 at 9:34 AM Kevin Sheldrake
<Kevin.Sheldrake@microsoft.com> wrote:
>
> Hello
>
> I've got some odd behaviour and I'd like to ask if anyone could spare a few moments to offer suggestions of what my error might be?  Maybe there is a different mailing list that this would be more appropriate to send to?
>
> I'm building a relatively complex eBPF program that attaches to the raw tracepoints, sys_enter and sys_exit, compiled with clang and llc using the same switches and includes as the kernel samples (kernel v5.3 on ubuntu 18.04), loaded using the latest libbpf git sources.  The sys_exit program is quite big at 55045 instructions (unrolled loops so the same code will run on kernel v5.1) and contains illegal jumps.  Everything is inlined - using __attribute__((always_inline)) - and the programs themselves have the __attribute__((flatten)) applied.  The verifier complains:
> jump out of range from insn 15 to -10500
>
> If I remove one section of code, unrelated to where the illegal jumps are, reducing the overall size to 24480 instructions, the illegal jumps disappear.  If I reenable that section of code and remove a different section of code, also unrelated to the illegal jumps, reducing the overall size to 47444 instructions, the illegal jumps disappear again.

I suspect it's a bug in llvm. It doesn't have a check that the branch
target fits into 16-bits.
It simply does:
llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
    Value = (uint16_t)((Value - 8) / 8);
    support::endian::write<uint16_t>(&Data[Fixup.getOffset() + 2], Value,
                                     Endian);
Could you add a log around that line to double check that's the case?
May be you could send a patch to add an assert there?
It will help others avoid this debugging.

In the past we've talked about extending BPF ISA with 32-bit
unconditional jump instruction.
But no one didn't come around to actually implementing it.
Once we have such insn llvm should be able to detect this 16-bit overflow
and use this new jmp insn.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [EXTERNAL] Re: Clang | llc incorrect jumps
  2020-08-25 16:48 ` Alexei Starovoitov
@ 2020-08-26 12:19   ` Kevin Sheldrake
  0 siblings, 0 replies; 3+ messages in thread
From: Kevin Sheldrake @ 2020-08-26 12:19 UTC (permalink / raw)
  To: Alexei Starovoitov, Yonghong Song, John Fastabend; +Cc: bpf


On Tue, Aug 25, 2020 at 5:49 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Tue, Aug 25, 2020 at 9:34 AM Kevin Sheldrake
> <Kevin.Sheldrake@microsoft.com> wrote:
<SNIP>
> > If I remove one section of code, unrelated to where the illegal jumps are,
> reducing the overall size to 24480 instructions, the illegal jumps disappear.  If
> I reenable that section of code and remove a different section of code, also
> unrelated to the illegal jumps, reducing the overall size to 47444 instructions,
> the illegal jumps disappear again.
> 
> I suspect it's a bug in llvm. It doesn't have a check that the branch target fits
> into 16-bits.
> It simply does:
> llvm/lib/Target/BPF/MCTargetDesc/BPFAsmBackend.cpp
>     Value = (uint16_t)((Value - 8) / 8);
>     support::endian::write<uint16_t>(&Data[Fixup.getOffset() + 2], Value,
>                                      Endian); Could you add a log around that line to double
> check that's the case?
> May be you could send a patch to add an assert there?
> It will help others avoid this debugging.
> 
> In the past we've talked about extending BPF ISA with 32-bit unconditional
> jump instruction.
> But no one didn't come around to actually implementing it.
> Once we have such insn llvm should be able to detect this 16-bit overflow
> and use this new jmp insn.

Hello Alexei

Thank you for this insight - that's really helpful.  I'll get on it when I'm back at work on Friday and I'll submit a patch.

Thanks

Kevin Sheldrake


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-08-26 12:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-25 16:33 Clang | llc incorrect jumps Kevin Sheldrake
2020-08-25 16:48 ` Alexei Starovoitov
2020-08-26 12:19   ` [EXTERNAL] " Kevin Sheldrake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.