* BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: John Mazzie @ 2022-05-26 19:15 UTC
  To: bpf, John Mazzie (jmazzie)

While attempting to learn more about BPF and libbpf, I ran into an
issue I can't quite seem to resolve.

While writing some tools to practice tracing with libbpf, I came
across a situation where I get an error when using BPF_CORE_READ,
which appears to be that CO-RE relocation failed to find a
corresponding field. Compilation doesn't complain; the error only appears
when I try to execute.

Error Message:
---------------------------------------------
8: (85) call unknown#195896080
invalid func unknown#195896080

I'm using the Makefile from libbpf-bootstrap to build my program. The
other example programs build and execute properly, and I've also
successfully used tracepoints to trace the nvme_setup_cmd and
nvme_complete_rq functions. My issue appears to be when I attempt to
use kprobes for the nvme_submit_cmd function.

In the program I'm attempting to trace the nvme_command structure to
get the opcode of the command in the function nvme_submit_cmd. I'm
using Rocky Linux (RedHat based distro) with their kernel version of
4.18. I compared the structures and interfaces in its source code against
the default 4.18 version of the kernel and made the appropriate
changes.

traceopcode.bpf.c
---------------------------------------------
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
#include "traceopcode.h"

char LICENSE[] SEC("license") = "Dual BSD/GPL";

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1024 * 1024);
} ring_buffer SEC(".maps");

struct nvme_common_command {
    __u8         opcode;
} __attribute__((preserve_access_index));

struct nvme_command {
    union {
        struct nvme_common_command common;
    };
} __attribute__((preserve_access_index));

SEC("kprobe/nvme_submit_cmd")
int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command *cmd,
               bool write_sq)
{
    struct opcode_event *e;

    /* Reserve one event slot; skip this sample if the ring buffer is full. */
    e = bpf_ringbuf_reserve(&ring_buffer, sizeof(*e), 0);
    if (!e)
        return 0;

    /* CO-RE read of cmd->common.opcode; the offset is relocated at load time. */
    e->opcode = BPF_CORE_READ(cmd, common.opcode);
    //e->opcode = cmd->common.opcode;
    bpf_ringbuf_submit(e, 0);

    return 0;
}


traceopcode.h
---------------------------------------------
#ifndef __TRACEOPCODE_H
#define __TRACEOPCODE_H

struct opcode_event {
    __u8 opcode;
};

#endif


My userspace code is basically the same as the bootstrap example, with
a modification to the handler that just prints out the opcode from the
opcode_event structure. My guess is that I have some problem with how
I'm defining the structs that I'm using for nvme_command, as they
aren't part of vmlinux and need to be defined in my bpf program.

Any help or guidance would be appreciated.

Thanks,
John Mazzie

* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: Andrii Nakryiko @ 2022-06-01  0:22 UTC
  To: John Mazzie; +Cc: bpf, John Mazzie (jmazzie)

On Fri, May 27, 2022 at 3:07 AM John Mazzie <john.p.mazzie@gmail.com> wrote:
>
> While attempting to learn more about BPF and libbpf, I ran into an
> issue I can't quite seem to resolve.
>
> While writing some tools to practice tracing with libbpf, I came
> across a situation where I get an error when using BPF_CORE_READ,
> which appears to be that CO-RE relocation failed to find a
> corresponding field. Compilation doesn't complain, just when I try to
> execute.
>
> Error Message:
> ---------------------------------------------
> 8: (85) call unknown#195896080
> invalid func unknown#195896080

This means the CO-RE relocation failed. If you update the libbpf submodule
(or maybe we already did that for libbpf-bootstrap recently), you'll get a
more meaningful error with details. But basically, as far as libbpf can
tell, there is no cmd->common.opcode in the running kernel.
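
The helper number in that error is not random. libbpf is known to replace an instruction whose CO-RE relocation failed with a call to helper ID 195896080, which is 0xbad2310 in hex (read it as "bad relo"), so that the verifier rejects the program at exactly that instruction. The sketch below is plain userspace C for recognizing that marker in a verifier log; the function and macro names are made up for illustration and are not part of libbpf, and the constant should be treated as an assumption if your libbpf version differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed poison value: 195896080 == 0xbad2310, i.e. "bad relo". */
#define CORE_POISON_HELPER_ID 195896080L

/* Hypothetical helper: true if a verifier log line such as
 * "invalid func unknown#195896080" names the poison helper ID. */
static bool is_core_poison_line(const char *line)
{
    const char *p;

    assert(line != NULL);
    p = strstr(line, "unknown#");
    if (!p)
        return false;
    return strtol(p + strlen("unknown#"), NULL, 10) == CORE_POISON_HELPER_ID;
}

/* Hypothetical helper: formats a helper ID as hex so the mnemonic is visible. */
static void helper_id_to_hex(long id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "0x%lx", id);
}
```

Passing the line "invalid func unknown#195896080" from the error above to is_core_poison_line() returns true, which points at a failed CO-RE relocation rather than a genuinely unknown helper.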

* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: John Mazzie @ 2022-06-01  2:16 UTC
  To: Andrii Nakryiko; +Cc: bpf, John Mazzie (jmazzie)

I pulled the latest libbpf-bootstrap and rebuilt my programs. The
error message is clearer now. I think the last time I tried,
libbpf-bootstrap was still linked to libbpf 0.7 instead of 0.8.

The new message is below; it makes sense given what you said.

<invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [14] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

This struct is part of the nvme driver, which is running on this
system as it only has nvme devices (including boot device). I've been
able to access this data using bpftrace on the same system. If I don't
try to access this struct I can count the correct number of
nvme_submit_cmd triggers, so I believe the probe is working correctly.
Is this a case where I need to define more/all of the struct?

* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: Andrii Nakryiko @ 2022-06-01  4:51 UTC
  To: John Mazzie; +Cc: bpf, John Mazzie (jmazzie)

On Tue, May 31, 2022 at 7:16 PM John Mazzie <john.p.mazzie@gmail.com> wrote:
>
> I pulled the latest libbpf-bootstrap and rebuilt my programs. The
> error message is clearer now. I think last time I tried
> libbpf-bootstrap was still linked to 0.7 instead of 0.8.
>
> The new message is the following which makes sense in regard to what you said.
>
> <invalid CO-RE relocation>
> failed to resolve CO-RE relocation <byte_off> [14] struct
> nvme_command.common.opcode (0:0:0:0 @ offset 0)
> processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0
> peak_states 0 mark_read 0
>
> This struct is part of the nvme driver, which is running on this
> system as it only has nvme devices (including boot device). I've been
> able to access this data using bpftrace on the same system. If I don't
> try to access this struct I can count the correct number of
> nvme_submit_cmd triggers, so I believe the probe is working correctly.
> Is this a case where I need to define more/all of the struct?
>

Look at the debug logs from libbpf. I tried a simplified version of your
program and it all works for me.

struct nvme_common_command {
    __u8         opcode;
} __attribute__((preserve_access_index));

struct nvme_command {
    union {
        struct nvme_common_command common;
    };
} __attribute__((preserve_access_index));

SEC("kprobe/nvme_submit_cmd")
int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command *cmd,
               bool write_sq)
{
    bpf_printk("OPCODE %d", BPF_CORE_READ(cmd, common.opcode));

    return 0;
}


Libbpf logs:

libbpf: sec 'kprobe/nvme_submit_cmd': found 2 CO-RE relocations
libbpf: CO-RE relocating [6] struct pt_regs: found target candidate [226] struct pt_regs in [vmlinux]
libbpf: prog 'nvme_submit_cmd': relo #0: kind <byte_off> (0), spec is [6] struct pt_regs.si (0:13 @ offset 104)
libbpf: prog 'nvme_submit_cmd': relo #0: matching candidate #0 [226] struct pt_regs.si (0:13 @ offset 104)
libbpf: prog 'nvme_submit_cmd': relo #0: patched insn #0 (LDX/ST/STX) off 104 -> 104
libbpf: CO-RE relocating [10] struct nvme_command: found target candidate [107390] struct nvme_command in [nvme_core]
libbpf: CO-RE relocating [10] struct nvme_command: found target candidate [106451] struct nvme_command in [nvme]
libbpf: prog 'nvme_submit_cmd': relo #1: kind <byte_off> (0), spec is [10] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
libbpf: prog 'nvme_submit_cmd': relo #1: matching candidate #0 [107390] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
libbpf: prog 'nvme_submit_cmd': relo #1: matching candidate #1 [106451] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
libbpf: prog 'nvme_submit_cmd': relo #1: patched insn #1 (ALU/ALU64) imm 0 -> 0
Successfully started! Please run `sudo cat /sys/kernel/debug/tracing/trace_pipe` to see output of the BPF programs.
..............^C
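
The detail worth noticing in this log is where each candidate type was found: struct pt_regs comes from [vmlinux], while struct nvme_command comes from [nvme_core] and [nvme]. A minimal sketch of pulling that source name out of a "found target candidate" line is shown here; it is plain userspace C, the helper name is hypothetical, and it assumes any wrapped log line has already been joined back into a single line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the BTF source name (the text inside the
 * brackets after " in ") from a libbpf "found target candidate" debug
 * line into out. Returns 1 on success, 0 if the line does not match or
 * the name does not fit into out. */
static int candidate_btf_source(const char *line, char *out, size_t out_len)
{
    const char *start, *end;
    size_t n;

    assert(line != NULL && out != NULL);
    if (!strstr(line, "found target candidate"))
        return 0;
    start = strstr(line, " in [");
    if (!start)
        return 0;
    start += strlen(" in [");
    end = strchr(start, ']');
    if (!end)
        return 0;
    n = (size_t)(end - start);
    if (n == 0 || n >= out_len)
        return 0;
    memcpy(out, start, n);
    out[n] = '\0';
    return 1;
}
```

If no line for the type you are reading yields a source name, libbpf found no candidate for it in any BTF it could see.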



* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: John Mazzie @ 2022-06-01 18:06 UTC
  To: Andrii Nakryiko; +Cc: bpf, John Mazzie (jmazzie)

It appears that it might be some kind of kernel dependency. I tested
the simplified code on Rocky Linux (RHEL-based image) with kernel 4.18
and on Ubuntu 20.04 (kernel 5.4) and hit the same issue on both.

Error
-----------------------
libbpf: sec 'kprobe/nvme_submit_cmd': found 2 CO-RE relocations
libbpf: CO-RE relocating [2] struct pt_regs: found target candidate [202] struct pt_regs in [vmlinux]
libbpf: prog 'nvme_submit_cmd': relo #0: <byte_off> [2] struct pt_regs.si (0:13 @ offset 104)
libbpf: prog 'nvme_submit_cmd': relo #0: matching candidate #0 <byte_off> [202] struct pt_regs.si (0:13 @ offset 104)
libbpf: prog 'nvme_submit_cmd': relo #0: patched insn #0 (LDX/ST/STX) off 104 -> 104
libbpf: prog 'nvme_submit_cmd': relo #1: <byte_off> [7] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
libbpf: prog 'nvme_submit_cmd': relo #1: no matching targets found
libbpf: prog 'nvme_submit_cmd': relo #1: substituting insn #1 w/ invalid insn
libbpf: prog 'nvme_submit_cmd': BPF program load failed: Invalid argument
libbpf: prog 'nvme_submit_cmd': -- BEGIN PROG LOAD LOG --
Unrecognized arg#0 type PTR
; int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command *cmd, bool write_sq)
0: (79) r3 = *(u64 *)(r1 +104)
1: <invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [7] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --

I did have a breakthrough when upgrading Ubuntu to the HWE kernel
(5.13) where the tool worked properly. We can start using the HWE
Kernel for our development and make progress with our tools, but I
would still like to try to understand why it may not be working on
Ubuntu 20.04 Kernel 5.4 or RedHat's version of 4.18.

I verified the following kernel configuration parameters.

CONFIG_KPROBES=y
CONFIG_UPROBES=y
CONFIG_DEBUG_FS=y
CONFIG_FTRACE=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBE_EVENTS=y
CONFIG_BPF_EVENTS=y

Are there other config settings that I might not be thinking of for
these kernels?


* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: Andrii Nakryiko @ 2022-06-01 21:43 UTC
  To: John Mazzie; +Cc: bpf, John Mazzie (jmazzie)

On Wed, Jun 1, 2022 at 11:06 AM John Mazzie <john.p.mazzie@gmail.com> wrote:
>
> It appears that it might be some kind of kernel dependency. I tested
> on Rocky Linux (RHEL based image) with Kernel 4.18 and Ubuntu 20.04
> (Kernel 5.4) with the same issue running the simplified code.
>
> Error
> -----------------------
> libbpf: sec 'kprobe/nvme_submit_cmd': found 2 CO-RE relocations
> libbpf: CO-RE relocating [2] struct pt_regs: found target candidate
> [202] struct pt_regs in [vmlinux]
> libbpf: prog 'nvme_submit_cmd': relo #0: <byte_off> [2] struct
> pt_regs.si (0:13 @ offset 104)
> libbpf: prog 'nvme_submit_cmd': relo #0: matching candidate #0
> <byte_off> [202] struct pt_regs.si (0:13 @ offset 104)
> libbpf: prog 'nvme_submit_cmd': relo #0: patched insn #0 (LDX/ST/STX)
> off 104 -> 104
> libbpf: prog 'nvme_submit_cmd': relo #1: <byte_off> [7] struct
> nvme_command.common.opcode (0:0:0:0 @ offset 0)
> libbpf: prog 'nvme_submit_cmd': relo #1: no matching targets found
> libbpf: prog 'nvme_submit_cmd': relo #1: substituting insn #1 w/ invalid insn
> libbpf: prog 'nvme_submit_cmd': BPF program load failed: Invalid argument
> libbpf: prog 'nvme_submit_cmd': -- BEGIN PROG LOAD LOG --
> Unrecognized arg#0 type PTR
> ; int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command
> *cmd, bool write_sq)
> 0: (79) r3 = *(u64 *)(r1 +104)
> 1: <invalid CO-RE relocation>
> failed to resolve CO-RE relocation <byte_off> [7] struct
> nvme_command.common.opcode (0:0:0:0 @ offset 0)
> processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> peak_states 0 mark_read 0
> -- END PROG LOAD LOG --
>
> I did have a breakthrough when upgrading Ubuntu to the HWE kernel
> (5.13) where the tool worked properly. We can start using the HWE
> Kernel for our development and make progress with our tools, but I
> would still like to try to understand why it may not be working on
> Ubuntu 20.04 Kernel 5.4 or RedHat's version of 4.18.
>
> I verified the following kernel configuration parameters.
>
> CONFIG_KPROBES=y
> CONFIG_UPROBES=y
> CONFIG_DEBUG_FS=y
> CONFIG_FTRACE=y
> CONFIG_FTRACE_SYSCALLS=y
> CONFIG_KPROBE_EVENTS=y
> CONFIG_UPROBE_EVENTS=y
> CONFIG_BPF_EVENTS=y
>
> Are there other config settings that I might not be thinking of for
> these kernels?
>

Oh, I think I know what it is. In my case struct nvme_command comes
from the kernel modules nvme and nvme_core. But for that to work, your
kernel needs to have kernel module BTF generated, and support for that
was added a bit later than vmlinux (kernel) BTF itself. You can check in
/sys/kernel/btf: if the only entry you see there is vmlinux, then you
most probably don't have kernel module BTF.
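
The /sys/kernel/btf check can also be done programmatically. This is a minimal sketch in userspace C, assuming each module with BTF shows up as a file named after the module in that directory; the function names are hypothetical and not part of libbpf.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: counts entries other than "vmlinux" in a BTF
 * sysfs directory. Returns -1 if the directory cannot be opened. */
static int count_module_btf(const char *btf_dir)
{
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    assert(btf_dir != NULL);
    dir = opendir(btf_dir);
    if (!dir)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')          /* skip ".", ".." and hidden */
            continue;
        if (strcmp(ent->d_name, "vmlinux") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}

/* Hypothetical helper: returns 1 if btf_dir contains an entry for the
 * given module name, 0 otherwise. */
static int has_module_btf(const char *btf_dir, const char *module)
{
    char path[512];
    int len;

    assert(btf_dir != NULL && module != NULL);
    len = snprintf(path, sizeof(path), "%s/%s", btf_dir, module);
    if (len < 0 || (size_t)len >= sizeof(path))
        return 0;
    return access(path, F_OK) == 0;
}
```

On a system like the failing ones in this thread, count_module_btf("/sys/kernel/btf") would be expected to return 0, and has_module_btf("/sys/kernel/btf", "nvme_core") would return 0 as well.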

So your realistic options are either compiling in the nvme driver (so
that all its types are present in vmlinux BTF) or upgrading to a kernel
version that supports kernel module BTF (and making sure that you have
a recent enough pahole that supports generating module BTF). You need
to have:

$ zcat /proc/config.gz | grep BTF_MODULE
CONFIG_DEBUG_INFO_BTF_MODULES=y
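
The same check can be sketched in userspace C. Since /proc/config.gz is gzip-compressed, this example assumes the config has already been obtained as plain text (for example the output of the zcat command above); the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the kernel config text contains a
 * line that is exactly "<option>=y", 0 otherwise. Lines such as
 * "# CONFIG_FOO is not set" or a longer option name do not match. */
static int config_option_enabled(const char *text, const char *option)
{
    size_t optlen;
    const char *line;

    assert(text != NULL && option != NULL);
    optlen = strlen(option);
    line = text;
    while (line && *line) {
        if (strncmp(line, option, optlen) == 0 &&
            strncmp(line + optlen, "=y", 2) == 0 &&
            (line[optlen + 2] == '\n' || line[optlen + 2] == '\0'))
            return 1;
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return 0;
}
```

Matching the whole option name matters here because CONFIG_DEBUG_INFO_BTF=y (vmlinux BTF) can be set on kernels where CONFIG_DEBUG_INFO_BTF_MODULES is absent, which is the situation described above.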


* Re: BPF_CORE_READ issue with nvme_submit_cmd kprobe.
From: John Mazzie @ 2022-06-02  0:03 UTC
  To: Andrii Nakryiko; +Cc: bpf, John Mazzie (jmazzie)

Thanks for the help. That makes sense and was our suspicion, though we
weren't familiar enough to prove it or know what to look for. We are
moving to kernel 5.13 on Ubuntu with HWE, which is working and has the
proper configuration.

On Wed, Jun 1, 2022 at 4:43 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Jun 1, 2022 at 11:06 AM John Mazzie <john.p.mazzie@gmail.com> wrote:
> >
> > It appears that it might be some kind of kernel dependency. I tested
> > on Rocky Linux (RHEL based image) with Kernel 4.18 and Ubuntu 20.04
> > (Kernel 5.4) with the same issue running the simplified code.
> >
> > Error
> > -----------------------
> > libbpf: sec 'kprobe/nvme_submit_cmd': found 2 CO-RE relocations
> > libbpf: CO-RE relocating [2] struct pt_regs: found target candidate
> > [202] struct pt_regs in [vmlinux]
> > libbpf: prog 'nvme_submit_cmd': relo #0: <byte_off> [2] struct
> > pt_regs.si (0:13 @ offset 104)
> > libbpf: prog 'nvme_submit_cmd': relo #0: matching candidate #0
> > <byte_off> [202] struct pt_regs.si (0:13 @ offset 104)
> > libbpf: prog 'nvme_submit_cmd': relo #0: patched insn #0 (LDX/ST/STX)
> > off 104 -> 104
> > libbpf: prog 'nvme_submit_cmd': relo #1: <byte_off> [7] struct
> > nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > libbpf: prog 'nvme_submit_cmd': relo #1: no matching targets found
> > libbpf: prog 'nvme_submit_cmd': relo #1: substituting insn #1 w/ invalid insn
> > libbpf: prog 'nvme_submit_cmd': BPF program load failed: Invalid argument
> > libbpf: prog 'nvme_submit_cmd': -- BEGIN PROG LOAD LOG --
> > Unrecognized arg#0 type PTR
> > ; int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command
> > *cmd, bool write_sq)
> > 0: (79) r3 = *(u64 *)(r1 +104)
> > 1: <invalid CO-RE relocation>
> > failed to resolve CO-RE relocation <byte_off> [7] struct
> > nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > peak_states 0 mark_read 0
> > -- END PROG LOAD LOG --
> >
> > I did have a breakthrough when upgrading Ubuntu to the HWE kernel
> > (5.13) where the tool worked properly. We can start using the HWE
> > Kernel for our development and make progress with our tools, but I
> > would still like to try to understand why it may not be working on
> > Ubuntu 20.04 Kernel 5.4 or RedHat's version of 4.18.
> >
> > I verified the following kernel configuration parameters.
> >
> > CONFIG_KPROBES=y
> > CONFIG_UPROBES=y
> > CONFIG_DEBUG_FS=y
> > CONFIG_FTRACE=y
> > CONFIG_FTRACE_SYSCALLS=y
> > CONFIG_KPROBE_EVENTS=y
> > CONFIG_UPROBE_EVENTS=y
> > CONFIG_BPF_EVENTS=y
> >
> > Are there other config settings that I might not be thinking of for
> > these kernels?
> >
>
> Oh, I think I know what it is. In my case struct nvme_command comes
> from kernel modules nvme and nvme_core. But for that to work your
> kernel should have kernel module BTF generated. Which was added a bit
> later than vmlinux (kernel) BTF itself. You can check in
> /sys/kernel/btf, if the only entry you see there is vmlinux then you
> most probably don't have kernel modules.
>
> So your realistic options are either compiling in the nvme driver (and
> thus all its types will be present in vmlinux BTF) or upgrading the
> kernel to the version that supports kernel modules (plus make sure
> that you have recent enough pahole that supports generation of module
> BTFs). You need to have:
>
> $ zcat /proc/config.gz | grep BTF_MODULE
> CONFIG_DEBUG_INFO_BTF_MODULES=y
>
>
> >
> > On Tue, May 31, 2022 at 11:51 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, May 31, 2022 at 7:16 PM John Mazzie <john.p.mazzie@gmail.com> wrote:
> > > >
> > > > I pulled the latest libbpf-bootstrap and rebuilt my programs. The
> > > > error message is clearer now. I think last time I tried
> > > > libbpf-bootstrap was still linked to 0.7 instead of 0.8.
> > > >
> > > > The new message is the following which makes sense in regard to what you said.
> > > >
> > > > <invalid CO-RE relocation>
> > > > failed to resolve CO-RE relocation <byte_off> [14] struct
> > > > nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > > > processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 0
> > > > peak_states 0 mark_read 0
> > > >
> > > > This struct is part of the nvme driver, which is running on this
> > > > system as it only has nvme devices (including boot device). I've been
> > > > able to access this data using bpftrace on the same system. If I don't
> > > > try to access this struct I can count the correct number of
> > > > nvme_submit_cmd triggers, so I believe the probe is working correctly.
> > > > Is this a case where I need to define more/all of the struct?
> > > >
> > >
> > > Look at the debug logs from libbpf. I tried a simplified version of
> > > your program and it all works for me.
> > >
> > > struct nvme_common_command {
> > >     __u8         opcode;
> > > } __attribute__((preserve_access_index));
> > >
> > > struct nvme_command {
> > >     union {
> > >         struct nvme_common_command common;
> > >     };
> > > } __attribute__((preserve_access_index));
> > >
> > > SEC("kprobe/nvme_submit_cmd")
> > > int BPF_KPROBE(nvme_submit_cmd, void *nvmeq, struct nvme_command *cmd,
> > >                bool write_sq)
> > > {
> > >     bpf_printk("OPCODE %d", BPF_CORE_READ(cmd, common.opcode));
> > >
> > >     return 0;
> > > }
> > >
> > >
> > > Libbpf logs:
> > >
> > > libbpf: sec 'kprobe/nvme_submit_cmd': found 2 CO-RE relocations
> > > libbpf: CO-RE relocating [6] struct pt_regs: found target candidate
> > > [226] struct pt_regs in [vmlinux]
> > > libbpf: prog 'nvme_submit_cmd': relo #0: kind <byte_off> (0), spec is
> > > [6] struct pt_regs.si (0:13 @ offset 104)
> > > libbpf: prog 'nvme_submit_cmd': relo #0: matching candidate #0 [226]
> > > struct pt_regs.si (0:13 @ offset 104)
> > > libbpf: prog 'nvme_submit_cmd': relo #0: patched insn #0 (LDX/ST/STX)
> > > off 104 -> 104
> > > libbpf: CO-RE relocating [10] struct nvme_command: found target
> > > candidate [107390] struct nvme_command in [nvme_core]
> > > libbpf: CO-RE relocating [10] struct nvme_command: found target
> > > candidate [106451] struct nvme_command in [nvme]
> > > libbpf: prog 'nvme_submit_cmd': relo #1: kind <byte_off> (0), spec is
> > > [10] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > > libbpf: prog 'nvme_submit_cmd': relo #1: matching candidate #0
> > > [107390] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > > libbpf: prog 'nvme_submit_cmd': relo #1: matching candidate #1
> > > [106451] struct nvme_command.common.opcode (0:0:0:0 @ offset 0)
> > > libbpf: prog 'nvme_submit_cmd': relo #1: patched insn #1 (ALU/ALU64) imm 0 -> 0
> > > Successfully started! Please run `sudo cat
> > > /sys/kernel/debug/tracing/trace_pipe` to see output of the BPF
> > > programs.
> > > ..............^C
> > >
> > >
> > >
> > > > On Tue, May 31, 2022 at 7:22 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Fri, May 27, 2022 at 3:07 AM John Mazzie <john.p.mazzie@gmail.com> wrote:
> > > > > >
> > > > > > While attempting to learn more about BPF and libbpf, I ran into an
> > > > > > issue I can't quite seem to resolve.
> > > > > >
> > > > > > While writing some tools to practice tracing with libbpf, I came
> > > > > > across a situation where I get an error when using BPF_CORE_READ,
> > > > > > which appears to be that CO-RE relocation failed to find a
> > > > > > corresponding field. Compilation doesn't complain, just when I try to
> > > > > > execute.
> > > > > >
> > > > > > Error Message:
> > > > > > ---------------------------------------------
> > > > > > 8: (85) call unknown#195896080
> > > > > > invalid func unknown#195896080
> > > > >
> > > > > This means the CO-RE relocation failed. If you update the libbpf
> > > > > submodule (or maybe we already did it for libbpf-bootstrap
> > > > > recently), you'll get a more meaningful error with details. But
> > > > > basically, in the running kernel there is no cmd->common.opcode.
> > > > >
> > > > > >
> > > > > > I'm using the Makefile from libbpf-bootstrap to build my program. The
> > > > > > other example programs build and execute properly, and I've also
> > > > > > successfully used tracepoints to trace the nvme_setup_cmd and
> > > > > > nvme_complete_rq functions. My issue appears to be when I attempt to
> > > > > > use kprobes for the nvme_submit_cmd function.
> > > > > >
> > > > >
> > > > > [...]

Thread overview: 7+ messages
2022-05-26 19:15 BPF_CORE_READ issue with nvme_submit_cmd kprobe John Mazzie
2022-06-01  0:22 ` Andrii Nakryiko
2022-06-01  2:16   ` John Mazzie
2022-06-01  4:51     ` Andrii Nakryiko
2022-06-01 18:06       ` John Mazzie
2022-06-01 21:43         ` Andrii Nakryiko
2022-06-02  0:03           ` John Mazzie
