* [PATCH bpf-next 0/3] Implement runqslower BCC tool with BPF CO-RE @ 2019-12-19 7:06 Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command Andrii Nakryiko ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 7:06 UTC (permalink / raw) To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko Based on recent BPF CO-RE, tp_btf, and skeleton changes, re-implement BCC-based runqslower tool as portable pre-compiled BPF CO-RE-based tool. Make sure it's built as part of selftests to ensure it doesn't bit rot. As part of this patch set, also introduce new `format core` to `bpftool btf dump` sub-command. It generates same compilable C header file with all the types from BTF, but additionally ensures seamless use of generated header with BPF CO-RE. Currently `format core` applies preserve_access_index attribute (if supported by Clang) to all structs and unions, to improve user experience of writing TRACING programs with direct kernel memory read access. Andrii Nakryiko (3): bpftool: add extra CO-RE mode to btf dump command libbpf/tools: add runqslower tool to libbpf selftests/bpf: build runqslower from selftests .../bpf/bpftool/Documentation/bpftool-btf.rst | 7 +- tools/bpf/bpftool/bash-completion/bpftool | 2 +- tools/bpf/bpftool/btf.c | 24 ++- tools/lib/bpf/tools/runqslower/.gitignore | 2 + tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ tools/testing/selftests/bpf/Makefile | 7 +- 9 files changed, 395 insertions(+), 8 deletions(-) create mode 100644 tools/lib/bpf/tools/runqslower/.gitignore create mode 100644 tools/lib/bpf/tools/runqslower/Makefile create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.bpf.c create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.c create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.h -- 2.17.1 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-19 7:06 [PATCH bpf-next 0/3] Implement runqslower BCC tool with BPF CO-RE Andrii Nakryiko @ 2019-12-19 7:06 ` Andrii Nakryiko 2019-12-19 17:06 ` Alexei Starovoitov 2019-12-19 7:06 ` [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 3/3] selftests/bpf: build runqslower from selftests Andrii Nakryiko 2 siblings, 1 reply; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 7:06 UTC (permalink / raw) To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko Add special BPF CO-RE mode to `btf dump` sub-command, which emits C dump of provided BTF data, but with additional provisions for use with BPF CO-RE. The first special BPF CO-RE specific feature added is applying preserve_access_index attribute to all structs/unions to make them automatically relocatable. This is especially useful for tp_btf/fentry/fexit BPF program types. They allow direct memory access, so BPF C code just uses straightfoward a->b->c access pattern to read data from kernel. But without kernel structs marked as CO-RE relocatable through preserve_access_index attribute, one has to enclose all the data reads into a special __builtin_preserve_access_index code block, like so: __builtin_preserve_access_index(({ x = p->pid; /* where p is struct task_struct *, for example */ })); This is very inconvenient and obscures the logic quite a bit. By marking all auto-generated types with preserve_access_index attribute the above code is reduced to just clean and natural `x = p->pid;`. Having a special `format core`, as opposed to extending existing `format c` mode, allows us to do more improvements like above, knowing that intended use case is for BPF CO-RE. And still have a clean and unassuming plain C mode for any other generic use case. Signed-off-by: Andrii Nakryiko <andriin@fb.com> --- .../bpf/bpftool/Documentation/bpftool-btf.rst | 7 ++++-- tools/bpf/bpftool/bash-completion/bpftool | 2 +- tools/bpf/bpftool/btf.c | 24 +++++++++++++++---- 3 files changed, 26 insertions(+), 7 deletions(-) diff --git a/tools/bpf/bpftool/Documentation/bpftool-btf.rst b/tools/bpf/bpftool/Documentation/bpftool-btf.rst index 39615f8e145b..101a91280f3d 100644 --- a/tools/bpf/bpftool/Documentation/bpftool-btf.rst +++ b/tools/bpf/bpftool/Documentation/bpftool-btf.rst @@ -24,7 +24,7 @@ BTF COMMANDS | **bpftool** **btf help** | | *BTF_SRC* := { **id** *BTF_ID* | **prog** *PROG* | **map** *MAP* [{**key** | **value** | **kv** | **all**}] | **file** *FILE* } -| *FORMAT* := { **raw** | **c** } +| *FORMAT* := { **raw** | **c** | **core** } | *MAP* := { **id** *MAP_ID* | **pinned** *FILE* } | *PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* } @@ -59,7 +59,10 @@ DESCRIPTION **format** option can be used to override default (raw) output format. Raw (**raw**) or C-syntax (**c**) output - formats are supported. + formats are supported. There is also a special BPF CO-RE + mode (**core**), which emits types in valid C syntax, but + with additional provisions for more seamless use with BPF + CO-RE. **bpftool btf help** Print short help message. diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool index 754d8395e451..8308ae5f5679 100644 --- a/tools/bpf/bpftool/bash-completion/bpftool +++ b/tools/bpf/bpftool/bash-completion/bpftool @@ -800,7 +800,7 @@ _bpftool() return 0 ;; format) - COMPREPLY=( $( compgen -W "c raw" -- "$cur" ) ) + COMPREPLY=( $( compgen -W "c core raw" -- "$cur" ) ) ;; *) # emit extra options diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c index e5bc97b71ceb..a57494493fec 100644 --- a/tools/bpf/bpftool/btf.c +++ b/tools/bpf/bpftool/btf.c @@ -361,7 +361,7 @@ static void __printf(2, 0) btf_dump_printf(void *ctx, } static int dump_btf_c(const struct btf *btf, - __u32 *root_type_ids, int root_type_cnt) + __u32 *root_type_ids, int root_type_cnt, bool core_mode) { struct btf_dump *d; int err = 0, i; @@ -370,6 +370,13 @@ static int dump_btf_c(const struct btf *btf, if (IS_ERR(d)) return PTR_ERR(d); + if (core_mode) { + printf("#if defined(__has_attribute) && __has_attribute(preserve_access_index)\n"); + printf("#define __CLANG_BPF_CORE_SUPPORTED\n"); + printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); + printf("#endif\n\n"); + } + if (root_type_cnt) { for (i = 0; i < root_type_cnt; i++) { err = btf_dump__dump_type(d, root_type_ids[i]); @@ -386,6 +393,12 @@ static int dump_btf_c(const struct btf *btf, } } + if (core_mode) { + printf("#ifdef __CLANG_BPF_CORE_SUPPORTED\n"); + printf("#pragma clang attribute pop\n"); + printf("#endif\n"); + } + done: btf_dump__free(d); return err; @@ -441,10 +454,10 @@ static bool is_btf_raw(const char *file) static int do_dump(int argc, char **argv) { + bool dump_c = false, core_mode = false; struct btf *btf = NULL; __u32 root_type_ids[2]; int root_type_cnt = 0; - bool dump_c = false; __u32 btf_id = -1; const char *src; int fd = -1; @@ -544,6 +557,9 @@ static int do_dump(int argc, char **argv) } if (strcmp(*argv, "c") == 0) { dump_c = true; + } else if (strcmp(*argv, "core") == 0) { + dump_c = true; + core_mode = true; } else if (strcmp(*argv, "raw") == 0) { dump_c = false; } else { @@ -577,7 +593,7 @@ static int do_dump(int argc, char **argv) err = -ENOTSUP; goto done; } - err = dump_btf_c(btf, root_type_ids, root_type_cnt); + err = dump_btf_c(btf, root_type_ids, root_type_cnt, core_mode); } else { err = dump_btf_raw(btf, root_type_ids, root_type_cnt); } @@ -925,7 +941,7 @@ static int do_help(int argc, char **argv) " %s btf help\n" "\n" " BTF_SRC := { id BTF_ID | prog PROG | map MAP [{key | value | kv | all}] | file FILE }\n" - " FORMAT := { raw | c }\n" + " FORMAT := { raw | c | core }\n" " " HELP_SPEC_MAP "\n" " " HELP_SPEC_PROGRAM "\n" " " HELP_SPEC_OPTIONS "\n" -- 2.17.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-19 7:06 ` [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command Andrii Nakryiko @ 2019-12-19 17:06 ` Alexei Starovoitov 2019-12-19 21:07 ` Andrii Nakryiko 0 siblings, 1 reply; 15+ messages in thread From: Alexei Starovoitov @ 2019-12-19 17:06 UTC (permalink / raw) To: Andrii Nakryiko; +Cc: bpf, netdev, ast, daniel, andrii.nakryiko, kernel-team On Wed, Dec 18, 2019 at 11:06:56PM -0800, Andrii Nakryiko wrote: > + if (core_mode) { > + printf("#if defined(__has_attribute) && __has_attribute(preserve_access_index)\n"); > + printf("#define __CLANG_BPF_CORE_SUPPORTED\n"); > + printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > + printf("#endif\n\n"); I think it's dangerous to automatically opt-out when clang is not new enough. bpf prog will compile fine, but it will be missing co-re relocations. How about doing something like: printf("#ifdef NEEDS_CO_RE\n"); printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); printf("#endif\n\n"); and emit it always when 'format c'. Then on the program side it will look: #define NEEDS_CO_RE #include "vmlinux.h" If clang is too old there will be a compile time error which is a good thing. Future features will have different NEEDS_ macros. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-19 17:06 ` Alexei Starovoitov @ 2019-12-19 21:07 ` Andrii Nakryiko 2019-12-19 22:04 ` Alexei Starovoitov 0 siblings, 1 reply; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 21:07 UTC (permalink / raw) To: Alexei Starovoitov Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team On Thu, Dec 19, 2019 at 9:06 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, Dec 18, 2019 at 11:06:56PM -0800, Andrii Nakryiko wrote: > > + if (core_mode) { > > + printf("#if defined(__has_attribute) && __has_attribute(preserve_access_index)\n"); > > + printf("#define __CLANG_BPF_CORE_SUPPORTED\n"); > > + printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > > + printf("#endif\n\n"); > > I think it's dangerous to automatically opt-out when clang is not new enough. > bpf prog will compile fine, but it will be missing co-re relocations. > How about doing something like: > printf("#ifdef NEEDS_CO_RE\n"); > printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > printf("#endif\n\n"); > and emit it always when 'format c'. > Then on the program side it will look: > #define NEEDS_CO_RE > #include "vmlinux.h" > If clang is too old there will be a compile time error which is a good thing. > Future features will have different NEEDS_ macros. Wouldn't it be cleaner to separate vanilla C types dump vs CO-RE-specific one? I'd prefer to have them separate and not require every application to specify this #define NEEDS_CO_RE macro. Furthermore, later we probably are going to add some additional auto-generated types, definitions, etc, so plain C types dump and CO-RE-specific one will deviate quite a bit. So it feels cleaner to separate them now instead of polluting `format c` with irrelevant noise. I can unconditionally assume preserve_access_index availability, though, because Clang 10 release is going to have all those features needed for BPF CO-RE. I can also add nicer compiler error, if this feature is not detected. Ok? BTW, the reason I added this opt-out is because if you use bpf_core_read() and BPF_CORE_READ() macros, you don't really need those structs marked as relocatable. But again, I think it's fine to just assume it has to be supported by compiler. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-19 21:07 ` Andrii Nakryiko @ 2019-12-19 22:04 ` Alexei Starovoitov 2019-12-20 17:40 ` Andrii Nakryiko 0 siblings, 1 reply; 15+ messages in thread From: Alexei Starovoitov @ 2019-12-19 22:04 UTC (permalink / raw) To: Andrii Nakryiko Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team On Thu, Dec 19, 2019 at 01:07:38PM -0800, Andrii Nakryiko wrote: > On Thu, Dec 19, 2019 at 9:06 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Wed, Dec 18, 2019 at 11:06:56PM -0800, Andrii Nakryiko wrote: > > > + if (core_mode) { > > > + printf("#if defined(__has_attribute) && __has_attribute(preserve_access_index)\n"); > > > + printf("#define __CLANG_BPF_CORE_SUPPORTED\n"); > > > + printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > > > + printf("#endif\n\n"); > > > > I think it's dangerous to automatically opt-out when clang is not new enough. > > bpf prog will compile fine, but it will be missing co-re relocations. > > How about doing something like: > > printf("#ifdef NEEDS_CO_RE\n"); > > printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > > printf("#endif\n\n"); > > and emit it always when 'format c'. > > Then on the program side it will look: > > #define NEEDS_CO_RE > > #include "vmlinux.h" > > If clang is too old there will be a compile time error which is a good thing. > > Future features will have different NEEDS_ macros. > > Wouldn't it be cleaner to separate vanilla C types dump vs > CO-RE-specific one? I'd prefer to have them separate and not require > every application to specify this #define NEEDS_CO_RE macro. > Furthermore, later we probably are going to add some additional > auto-generated types, definitions, etc, so plain C types dump and > CO-RE-specific one will deviate quite a bit. So it feels cleaner to > separate them now instead of polluting `format c` with irrelevant > noise. Say we do this 'format core' today then tomorrow another tweak to vmlinux.h would need 'format core2' ? I think adding new format to bpftool for every little feature will be annoying to users. I think the output should stay as 'format c' and that format should be extensible/customizable by bpf progs via #define NEEDS_FEATURE_X. Then these features can grow without a need to keep adding new cmd line args. This preserve_access_index feature makes up for less than 1% difference in generated vmlinux.h. If some feature extension would drastically change generated .h then it would justify new 'format'. This one is just a small tweak. Also #define NEEDS_CO_RE is probably too broad. I think #define CLANG_NEEDS_TO_EMIT_RELO would be more precise and less ambiguous. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-19 22:04 ` Alexei Starovoitov @ 2019-12-20 17:40 ` Andrii Nakryiko 2019-12-21 3:21 ` Alexei Starovoitov 0 siblings, 1 reply; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-20 17:40 UTC (permalink / raw) To: Alexei Starovoitov Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team On Thu, Dec 19, 2019 at 2:04 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Dec 19, 2019 at 01:07:38PM -0800, Andrii Nakryiko wrote: > > On Thu, Dec 19, 2019 at 9:06 AM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Wed, Dec 18, 2019 at 11:06:56PM -0800, Andrii Nakryiko wrote: > > > > + if (core_mode) { > > > > + printf("#if defined(__has_attribute) && __has_attribute(preserve_access_index)\n"); > > > > + printf("#define __CLANG_BPF_CORE_SUPPORTED\n"); > > > > + printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > > > > + printf("#endif\n\n"); > > > > > > I think it's dangerous to automatically opt-out when clang is not new enough. > > > bpf prog will compile fine, but it will be missing co-re relocations. > > > How about doing something like: > > > printf("#ifdef NEEDS_CO_RE\n"); > > > printf("#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)\n"); > > > printf("#endif\n\n"); > > > and emit it always when 'format c'. > > > Then on the program side it will look: > > > #define NEEDS_CO_RE > > > #include "vmlinux.h" > > > If clang is too old there will be a compile time error which is a good thing. > > > Future features will have different NEEDS_ macros. > > > > Wouldn't it be cleaner to separate vanilla C types dump vs > > CO-RE-specific one? I'd prefer to have them separate and not require > > every application to specify this #define NEEDS_CO_RE macro. > > Furthermore, later we probably are going to add some additional > > auto-generated types, definitions, etc, so plain C types dump and > > CO-RE-specific one will deviate quite a bit. So it feels cleaner to > > separate them now instead of polluting `format c` with irrelevant > > noise. > > Say we do this 'format core' today then tomorrow another tweak to vmlinux.h > would need 'format core2' ? I think adding new format to bpftool for every No, not at all, it will stay within `format core`. If we need to do some parameterized tweak to BPF CO-RE-targeted vmlinux.h, then we'll have to add this parameter (even though I'd try to avoid parameterizing it as much as possible, of course). > little feature will be annoying to users. I think the output should stay as > 'format c' and that format should be extensible/customizable by bpf progs via > #define NEEDS_FEATURE_X. Then these features can grow without a need to keep > adding new cmd line args. This preserve_access_index feature makes up for less > than 1% difference in generated vmlinux.h. If some feature extension would > drastically change generated .h then it would justify new 'format'. This one is > just a small tweak. Also #define NEEDS_CO_RE is probably too broad. I think This one is a small line-number-wise. But the big difference between `format c` and `format core` is that the latter assumes we are dumping *vmlinux's BTF* for use with *BPF CO-RE from BPF side*. `format c` doesn't make any assumptions and faithfully dumps whatever BTF information is provided, which can be some other BPF program, or just any userspace program, on which pahole -J was executed. This assumption is why I think we should separate those two formats. For `format core` we can start auto-generating extra helper types, similarly how BCC auto-generates them for tracepoint, for example. Technically, sure, we can always guard everything behind extra #ifdefs, but think about dumping BTF type info for your small BPF program, and instead of seeing clean dump of types, you see all those crazy #ifdefs and weird #pragma clang's, extra attributes and so on. Not a great user experience for sure. > #define CLANG_NEEDS_TO_EMIT_RELO would be more precise and less ambiguous. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-20 17:40 ` Andrii Nakryiko @ 2019-12-21 3:21 ` Alexei Starovoitov 2019-12-21 5:40 ` Andrii Nakryiko 0 siblings, 1 reply; 15+ messages in thread From: Alexei Starovoitov @ 2019-12-21 3:21 UTC (permalink / raw) To: Andrii Nakryiko Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team On Fri, Dec 20, 2019 at 09:40:31AM -0800, Andrii Nakryiko wrote: > > This one is a small line-number-wise. But the big difference between > `format c` and `format core` is that the latter assumes we are dumping > *vmlinux's BTF* for use with *BPF CO-RE from BPF side*. `format c` > doesn't make any assumptions and faithfully dumps whatever BTF > information is provided, which can be some other BPF program, or just > any userspace program, on which pahole -J was executed. When 'format c' was introduced it was specifically targeting CO-RE framework. It is useful with BPF_CORE_READ macro and with builtin_preserve_access_index. Then we realized that both macro and builtin(({ ... })) are quite cumbersome to use and came with new clang attribute((preserve_access_index)) which makes programs read like normal C without any extra gotchas. Obviously it's nice if vmlinux.h already contains this attribute. Hence the need to either add extra flag to bpftool to emit this attribute or just emit it by default. So introducing new 'format core' (which is 99% the same as 'format c') and deprecating 'format c' to 'this is just .h dump of BTF' when it was around for few month only is absolutely no go. You need to find a way to extend 'format c' without breaking existing users. Yes. Likely there are no such users, but that doesn't matter. Once new api is introduced we have to stick to it. 'format c' is such api. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command 2019-12-21 3:21 ` Alexei Starovoitov @ 2019-12-21 5:40 ` Andrii Nakryiko 0 siblings, 0 replies; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-21 5:40 UTC (permalink / raw) To: Alexei Starovoitov Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team On Fri, Dec 20, 2019 at 7:21 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Fri, Dec 20, 2019 at 09:40:31AM -0800, Andrii Nakryiko wrote: > > > > This one is a small line-number-wise. But the big difference between > > `format c` and `format core` is that the latter assumes we are dumping > > *vmlinux's BTF* for use with *BPF CO-RE from BPF side*. `format c` > > doesn't make any assumptions and faithfully dumps whatever BTF > > information is provided, which can be some other BPF program, or just > > any userspace program, on which pahole -J was executed. > > When 'format c' was introduced it was specifically targeting CO-RE framework. No it wasn't. Here's "motivational" part of BTF-to-C dump API patch set: "This patch set adds BTF-to-C dumping APIs to libbpf, allowing to output a subset of BTF types as a compilable C type definitions. This is useful by itself, as raw BTF output is not easy to inspect and comprehend. But it's also a big part of BPF CO-RE (compile once - run everywhere) initiative aimed at allowing to write relocatable BPF programs, that won't require on-the-host kernel headers (and would be able to inspect internal kernel structures, not exposed through kernel headers)." And here's `format c` patch commit message: "Utilize new libbpf's btf_dump API to emit BTF as a C definitions." It was never **just** for CO-RE framework, rather as a convenient C-syntax view of BTF types. > It is useful with BPF_CORE_READ macro and with builtin_preserve_access_index. > Then we realized that both macro and builtin(({ ... })) are quite cumbersome to > use and came with new clang attribute((preserve_access_index)) which makes > programs read like normal C without any extra gotchas. Obviously it's nice if > vmlinux.h already contains this attribute. Hence the need to either add extra > flag to bpftool to emit this attribute or just emit it by default. So > introducing new 'format core' (which is 99% the same as 'format c') and > deprecating 'format c' to 'this is just .h dump of BTF' when it was around for > few month only is absolutely no go. You need to find a way to extend 'format c' I found the way, that's not the point of this discussion and absolutely **not why** I'm adding `format core`. I feel like having plain C dump of BTF is useful by itself (at least as a diagnostics tool for BTF, similarly to objdump/readelf for ELF). But if no one else cares, sure, I'll just reuse `format c` as CO-RE-specific BTF dumper. > without breaking existing users. Yes. Likely there are no such users, but that > doesn't matter. Once new api is introduced we have to stick to it. 'format c' > is such api. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 7:06 [PATCH bpf-next 0/3] Implement runqslower BCC tool with BPF CO-RE Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command Andrii Nakryiko @ 2019-12-19 7:06 ` Andrii Nakryiko 2019-12-19 15:41 ` Daniel Borkmann 2019-12-19 18:13 ` Yonghong Song 2019-12-19 7:06 ` [PATCH bpf-next 3/3] selftests/bpf: build runqslower from selftests Andrii Nakryiko 2 siblings, 2 replies; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 7:06 UTC (permalink / raw) To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches its BCC-based counterpart 1-to-1, supporting all the same parameters and functionality. runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, as well as memory-mapped interface to global (read-only, in this case) data. Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is necessary for BTF-typed raw tracepoints with direct memory access. [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py Signed-off-by: Andrii Nakryiko <andriin@fb.com> --- tools/lib/bpf/tools/runqslower/.gitignore | 2 + tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ 5 files changed, 363 insertions(+) create mode 100644 tools/lib/bpf/tools/runqslower/.gitignore create mode 100644 tools/lib/bpf/tools/runqslower/Makefile create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.bpf.c create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.c create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.h diff --git a/tools/lib/bpf/tools/runqslower/.gitignore b/tools/lib/bpf/tools/runqslower/.gitignore new file mode 100644 index 000000000000..404942cc9371 --- /dev/null +++ b/tools/lib/bpf/tools/runqslower/.gitignore @@ -0,0 +1,2 @@ +/.output +/runqslower diff --git a/tools/lib/bpf/tools/runqslower/Makefile b/tools/lib/bpf/tools/runqslower/Makefile new file mode 100644 index 000000000000..b87b1f9fe9da --- /dev/null +++ b/tools/lib/bpf/tools/runqslower/Makefile @@ -0,0 +1,60 @@ +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +CLANG := clang +LLC := llc +LLVM_STRIP := llvm-strip +BPFTOOL := bpftool +LIBBPF_SRC := ../.. +CFLAGS := -g -Wall + +# Try to detect best kernel BTF source +KERNEL_REL := $(shell uname -r) +ifneq ("$(wildcard /sys/kenerl/btf/vmlinux)","") +VMLINUX_BTF := /sys/kernel/btf/vmlinux +else ifneq ("$(wildcard /boot/vmlinux-$(KERNEL_REL))","") +VMLINUX_BTF := /boot/vmlinux-$(KERNEL_REL) +else +$(error "Can't detect kernel BTF, use VMLINUX_BTF to specify it explicitly") +endif + +out := .output +abs_out := $(abspath $(out)) +libbpf_src := $(abspath $(LIBBPF_SRC)) + +.DELETE_ON_ERROR: + +.PHONY: all +all: runqslower + +.PHONY: clean +clean: + rm -rf $(out) runqslower + +runqslower: $(out)/runqslower.o $(out)/libbpf.a + $(CC) $(CFLAGS) -lelf -lz $^ -o $@ + +$(out)/vmlinux.h: $(VMLINUX_BTF) | $(out) + $(BPFTOOL) btf dump file $(VMLINUX_BTF) format core > $@ + +$(out)/libbpf.a: | $(out) + cd $(out) && \ + $(MAKE) -C $(libbpf_src) OUTPUT=$(abs_out)/ $(abs_out)/libbpf.a + +$(out)/runqslower.o: runqslower.h $(out)/runqslower.skel.h \ + $(out)/runqslower.bpf.o + +$(out)/runqslower.bpf.o: $(out)/vmlinux.h runqslower.h + +$(out)/%.skel.h: $(out)/%.bpf.o + $(BPFTOOL) gen skeleton $< > $@ + +$(out)/%.bpf.o: %.bpf.c | $(out) + $(CLANG) -g -O2 -target bpf -I$(out) -I$(LIBBPF_SRC) \ + -c $(filter %.c,$^) -o $@ && \ + $(LLVM_STRIP) -g $@ + +$(out)/%.o: %.c | $(out) + $(CC) $(CFLAGS) -I$(LIBBPF_SRC) -I$(out) -c $(filter %.c,$^) -o $@ + +$(out): + mkdir -p $(out) + diff --git a/tools/lib/bpf/tools/runqslower/runqslower.bpf.c b/tools/lib/bpf/tools/runqslower/runqslower.bpf.c new file mode 100644 index 000000000000..7f078e8b1365 --- /dev/null +++ b/tools/lib/bpf/tools/runqslower/runqslower.bpf.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (c) 2019 Facebook +#include "vmlinux.h" +#include "bpf_helpers.h" +#include "bpf_core_read.h" +#include "runqslower.h" + +#define TASK_RUNNING 0 + +#define BPF_F_INDEX_MASK 0xffffffffULL +#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK + +const volatile __u64 min_us = 0; +const volatile pid_t targ_pid = 0; + +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(max_entries, 10240); + __type(key, u32); + __type(value, u64); +} start SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); + __uint(key_size, sizeof(u32)); + __uint(value_size, sizeof(u32)); +} events SEC(".maps"); + +/* record enqueue timestamp */ +__always_inline +static int trace_enqueue(u32 tgid, u32 pid) +{ + u64 ts; + + if (!pid || (targ_pid && targ_pid != pid)) + return 0; + + ts = bpf_ktime_get_ns(); + bpf_map_update_elem(&start, &pid, &ts, 0); + return 0; +} + +SEC("tp_btf/sched_wakeup") +int handle__sched_wakeup(u64 *ctx) +{ + /* TP_PROTO(struct task_struct *p) */ + struct task_struct *p = (void *)ctx[0]; + + return trace_enqueue(p->tgid, p->pid); +} + +SEC("tp_btf/sched_wakeup_new") +int handle__sched_wakeup_new(u64 *ctx) +{ + /* TP_PROTO(struct task_struct *p) */ + struct task_struct *p = (void *)ctx[0]; + + return trace_enqueue(p->tgid, p->pid); +} + +SEC("tp_btf/sched_switch") +int handle__sched_switch(u64 *ctx) +{ + /* TP_PROTO(bool preempt, struct task_struct *prev, + * struct task_struct *next) + */ + struct task_struct *prev = (struct task_struct *)ctx[1]; + struct task_struct *next = (struct task_struct *)ctx[2]; + struct event event = {}; + u64 *tsp, delta_us; + long state; + u32 pid; + + /* ivcsw: treat like an enqueue event and store timestamp */ + if (prev->state == TASK_RUNNING) + trace_enqueue(prev->tgid, prev->pid); + + pid = next->pid; + + /* fetch timestamp and calculate delta */ + tsp = bpf_map_lookup_elem(&start, &pid); + if (!tsp) + return 0; /* missed enqueue */ + + delta_us = (bpf_ktime_get_ns() - *tsp) / 1000; + if (min_us && delta_us <= min_us) + return 0; + + event.pid = pid; + event.delta_us = delta_us; + bpf_get_current_comm(&event.task, sizeof(event.task)); + + /* output */ + bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, + &event, sizeof(event)); + + bpf_map_delete_elem(&start, &pid); + return 0; +} + +char LICENSE[] SEC("license") = "GPL"; diff --git a/tools/lib/bpf/tools/runqslower/runqslower.c b/tools/lib/bpf/tools/runqslower/runqslower.c new file mode 100644 index 000000000000..996f0e2c560e --- /dev/null +++ b/tools/lib/bpf/tools/runqslower/runqslower.c @@ -0,0 +1,187 @@ +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) +// Copyright (c) 2019 Facebook +#include <argp.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/resource.h> +#include <time.h> +#include <libbpf.h> +#include <bpf.h> +#include "runqslower.h" +#include "runqslower.skel.h" + +struct env { + pid_t pid; + __u64 min_us; + bool verbose; +} env = { + .min_us = 10000, +}; + +const char *argp_program_version = "runqslower 0.1"; +const char *argp_program_bug_address = "<bpf@vger.kernel.org>"; +const char argp_program_doc[] = +"runqslower Trace long process scheduling delays.\n" +" For Linux, uses eBPF, BPF CO-RE, libbpf, BTF.\n" +"\n" +"This script traces high scheduling delays between tasks being\n" +"ready to run and them running on CPU after that.\n" +"\n" +"USAGE: runqslower [-p PID] [min_us]\n" +"\n" +"EXAMPLES:\n" +" runqslower # trace run queue latency higher than 10000 us (default)\n" +" runqslower 1000 # trace run queue latency higher than 1000 us\n" +" runqslower -p 123 # trace pid 123 only\n"; + +static const struct argp_option opts[] = { + { "pid", 'p', "PID", 0, "Process PID to trace"}, + { "verbose", 'v', NULL, 0, "Verbose debug output" }, + {}, +}; + +static error_t parse_arg(int key, char *arg, struct argp_state *state) +{ + static int pos_args; + int pid; + long long min_us; + + switch (key) { + case 'v': + env.verbose = true; + break; + case 'p': + errno = 0; + pid = strtol(arg, NULL, 10); + if (errno || pid <= 0) { + fprintf(stderr, "Invalid PID: %s\n", arg); + argp_usage(state); + } + env.pid = pid; + break; + case ARGP_KEY_ARG: + if (pos_args++) { + fprintf(stderr, + "Unrecognized positional argument: %s\n", arg); + argp_usage(state); + } + errno = 0; + min_us = strtoll(arg, NULL, 10); + if (errno || min_us <= 0) { + fprintf(stderr, "Invalid delay (in us): %s\n", arg); + argp_usage(state); + } + env.min_us = min_us; + break; + default: + return ARGP_ERR_UNKNOWN; + } + return 0; +} + +int libbpf_print_fn(enum libbpf_print_level level, + const char *format, va_list args) +{ + if (level == LIBBPF_DEBUG && !env.verbose) + return 0; + return vfprintf(stderr, format, args); +} + +static int bump_memlock_rlimit(void) +{ + struct rlimit rlim_new = { + .rlim_cur = RLIM_INFINITY, + .rlim_max = RLIM_INFINITY, + }; + + return setrlimit(RLIMIT_MEMLOCK, &rlim_new); +} + +void handle_event(void *ctx, int cpu, void *data, __u32 data_sz) +{ + const struct event *e = data; + struct tm *tm; + char ts[32]; + time_t t; + + time(&t); + tm = localtime(&t); + strftime(ts, sizeof(ts), "%H:%M:%S", tm); + printf("%-8s %-16s %-6d %14llu\n", ts, e->task, e->pid, e->delta_us); +} + +void handle_lost_events(void *ctx, int cpu, __u64 lost_cnt) +{ + printf("Lost %llu events on CPU #%d!\n", lost_cnt, cpu); +} + +int main(int argc, char **argv) +{ + static const struct argp argp = { + .options = opts, + .parser = parse_arg, + .doc = argp_program_doc, + }; + struct perf_buffer_opts pb_opts; + struct perf_buffer *pb = NULL; + struct runqslower_bpf *obj; + int err; + + err = argp_parse(&argp, argc, argv, 0, NULL, NULL); + if (err) + return err; + + libbpf_set_print(libbpf_print_fn); + + err = bump_memlock_rlimit(); + if (err) { + fprintf(stderr, "failed to increase rlimit: %d", err); + return 1; + } + + obj = runqslower_bpf__open(); + if (!obj) { + fprintf(stderr, "failed to open and/or load BPF object\n"); + return 1; + } + + /* initialize global data (filtering options) */ + obj->rodata->targ_pid = env.pid; + obj->rodata->min_us = env.min_us; + + err = runqslower_bpf__load(obj); + if (err) { + fprintf(stderr, "failed to load BPF object: %d\n", err); + goto cleanup; + } + + err = runqslower_bpf__attach(obj); + if (err) { + fprintf(stderr, "failed to attach BPF programs\n"); + goto cleanup; + } + + printf("Tracing run queue latency higher than %llu us\n", env.min_us); + printf("%-8s %-16s %-6s %14s\n", "TIME", "COMM", "PID", "LAT(us)"); + + pb_opts.sample_cb = handle_event; + pb_opts.lost_cb = handle_lost_events; + pb = perf_buffer__new(bpf_map__fd(obj->maps.events), 64, &pb_opts); + err = libbpf_get_error(pb); + if (err) { + pb = NULL; + fprintf(stderr, "failed to open perf buffer: %d\n", err); + goto cleanup; + } + + while ((err = perf_buffer__poll(pb, 100)) >= 0) + ; + printf("Error polling perf buffer: %d\n", err); + +cleanup: + perf_buffer__free(pb); + runqslower_bpf__destroy(obj); + + return err != 0; +} diff --git a/tools/lib/bpf/tools/runqslower/runqslower.h b/tools/lib/bpf/tools/runqslower/runqslower.h new file mode 100644 index 000000000000..9db225425e5f --- /dev/null +++ b/tools/lib/bpf/tools/runqslower/runqslower.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +#ifndef __RUNQSLOWER_H +#define __RUNQSLOWER_H + +#define TASK_COMM_LEN 16 + +struct event { + char task[TASK_COMM_LEN]; + __u64 delta_us; + pid_t pid; +}; + +#endif /* __RUNQSLOWER_H */ -- 2.17.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 7:06 ` [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf Andrii Nakryiko @ 2019-12-19 15:41 ` Daniel Borkmann 2019-12-19 21:14 ` Andrii Nakryiko 2019-12-19 18:13 ` Yonghong Song 1 sibling, 1 reply; 15+ messages in thread From: Daniel Borkmann @ 2019-12-19 15:41 UTC (permalink / raw) To: Andrii Nakryiko; +Cc: bpf, netdev, ast, andrii.nakryiko, kernel-team On Wed, Dec 18, 2019 at 11:06:57PM -0800, Andrii Nakryiko wrote: > Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches > its BCC-based counterpart 1-to-1, supporting all the same parameters and > functionality. > > runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, > as well as memory-mapped interface to global (read-only, in this case) data. > Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is > necessary for BTF-typed raw tracepoints with direct memory access. > > [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py > > Signed-off-by: Andrii Nakryiko <andriin@fb.com> > --- > tools/lib/bpf/tools/runqslower/.gitignore | 2 + > tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ > .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ > tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ > tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ tools/lib/bpf/tools/ is rather weird, please add to tools/bpf/ which is the more appropriate place we have for small tools. Could also live directly in there, e.g. tools/bpf/runqslower.{c,h,bpf.c} and then built/run from selftests, but under libbpf directly is too odd. Thanks, Daniel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 15:41 ` Daniel Borkmann @ 2019-12-19 21:14 ` Andrii Nakryiko 2019-12-19 22:04 ` Alexei Starovoitov 0 siblings, 1 reply; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 21:14 UTC (permalink / raw) To: Daniel Borkmann Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Kernel Team On Thu, Dec 19, 2019 at 7:41 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On Wed, Dec 18, 2019 at 11:06:57PM -0800, Andrii Nakryiko wrote: > > Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches > > its BCC-based counterpart 1-to-1, supporting all the same parameters and > > functionality. > > > > runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, > > as well as memory-mapped interface to global (read-only, in this case) data. > > Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is > > necessary for BTF-typed raw tracepoints with direct memory access. > > > > [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py > > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com> > > --- > > tools/lib/bpf/tools/runqslower/.gitignore | 2 + > > tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ > > .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ > > tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ > > tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ > > tools/lib/bpf/tools/ is rather weird, please add to tools/bpf/ which is the > more appropriate place we have for small tools. Could also live directly in > there, e.g. tools/bpf/runqslower.{c,h,bpf.c} and then built/run from selftests, > but under libbpf directly is too odd. runqslower is as much as a showcase of how to build a stand-alone tool with libbpf and CO-RE, as a separate tool, which is why I put it under libbpf directory. It's also not really BPF-specific tool, wouldn't that make it weird to put it under tools/bpf along the bpftool? If we added few more such tools using BPF CO-RE + libbpf, would you feel it's still a good idea to put them under tools/bpf? I tried to follow BCC structure, which has tools/ subdirectory with all kinds of examples/tools built with BCC (but not necessarily used by BCC itself). This is the same idea here. > > Thanks, > Daniel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 21:14 ` Andrii Nakryiko @ 2019-12-19 22:04 ` Alexei Starovoitov 0 siblings, 0 replies; 15+ messages in thread From: Alexei Starovoitov @ 2019-12-19 22:04 UTC (permalink / raw) To: Andrii Nakryiko Cc: Daniel Borkmann, Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, Kernel Team On Thu, Dec 19, 2019 at 01:14:46PM -0800, Andrii Nakryiko wrote: > On Thu, Dec 19, 2019 at 7:41 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On Wed, Dec 18, 2019 at 11:06:57PM -0800, Andrii Nakryiko wrote: > > > Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches > > > its BCC-based counterpart 1-to-1, supporting all the same parameters and > > > functionality. > > > > > > runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, > > > as well as memory-mapped interface to global (read-only, in this case) data. > > > Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is > > > necessary for BTF-typed raw tracepoints with direct memory access. > > > > > > [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py > > > > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com> > > > --- > > > tools/lib/bpf/tools/runqslower/.gitignore | 2 + > > > tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ > > > .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ > > > tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ > > > tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ > > > > tools/lib/bpf/tools/ is rather weird, please add to tools/bpf/ which is the > > more appropriate place we have for small tools. Could also live directly in > > there, e.g. tools/bpf/runqslower.{c,h,bpf.c} and then built/run from selftests, > > but under libbpf directly is too odd. > > runqslower is as much as a showcase of how to build a stand-alone tool > with libbpf and CO-RE, as a separate tool, which is why I put it under > libbpf directory. It's also not really BPF-specific tool, wouldn't > that make it weird to put it under tools/bpf along the bpftool? If we > added few more such tools using BPF CO-RE + libbpf, would you feel > it's still a good idea to put them under tools/bpf? I agree with Daniel tools/bpf/ seems like better location. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 7:06 ` [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf Andrii Nakryiko 2019-12-19 15:41 ` Daniel Borkmann @ 2019-12-19 18:13 ` Yonghong Song 2019-12-19 21:16 ` Andrii Nakryiko 1 sibling, 1 reply; 15+ messages in thread From: Yonghong Song @ 2019-12-19 18:13 UTC (permalink / raw) To: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel Cc: andrii.nakryiko, Kernel Team On 12/18/19 11:06 PM, Andrii Nakryiko wrote: > Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches > its BCC-based counterpart 1-to-1, supporting all the same parameters and > functionality. > > runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, > as well as memory-mapped interface to global (read-only, in this case) data. > Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is > necessary for BTF-typed raw tracepoints with direct memory access. > > [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py > > Signed-off-by: Andrii Nakryiko <andriin@fb.com> > --- > tools/lib/bpf/tools/runqslower/.gitignore | 2 + > tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ > .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ > tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ > tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ > 5 files changed, 363 insertions(+) > create mode 100644 tools/lib/bpf/tools/runqslower/.gitignore > create mode 100644 tools/lib/bpf/tools/runqslower/Makefile > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.bpf.c > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.c > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.h > > diff --git a/tools/lib/bpf/tools/runqslower/.gitignore b/tools/lib/bpf/tools/runqslower/.gitignore > new file mode 100644 > index 000000000000..404942cc9371 > --- /dev/null > +++ b/tools/lib/bpf/tools/runqslower/.gitignore > @@ -0,0 +1,2 @@ > +/.output > +/runqslower > diff --git a/tools/lib/bpf/tools/runqslower/Makefile b/tools/lib/bpf/tools/runqslower/Makefile > new file mode 100644 > index 000000000000..b87b1f9fe9da > --- /dev/null > +++ b/tools/lib/bpf/tools/runqslower/Makefile > @@ -0,0 +1,60 @@ > +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) > +CLANG := clang > +LLC := llc > +LLVM_STRIP := llvm-strip > +BPFTOOL := bpftool Maybe it is better to use in-tree bpftool? This will ensure we use the one shipped together with the source which should have needed functionality. > +LIBBPF_SRC := ../.. > +CFLAGS := -g -Wall > + > +# Try to detect best kernel BTF source > +KERNEL_REL := $(shell uname -r) > +ifneq ("$(wildcard /sys/kenerl/btf/vmlinux)","") > +VMLINUX_BTF := /sys/kernel/btf/vmlinux > +else ifneq ("$(wildcard /boot/vmlinux-$(KERNEL_REL))","") > +VMLINUX_BTF := /boot/vmlinux-$(KERNEL_REL) > +else > +$(error "Can't detect kernel BTF, use VMLINUX_BTF to specify it explicitly") > +endif > + > +out := .output > +abs_out := $(abspath $(out)) > +libbpf_src := $(abspath $(LIBBPF_SRC)) > + > +.DELETE_ON_ERROR: > + > +.PHONY: all > +all: runqslower > + > +.PHONY: clean > +clean: > + rm -rf $(out) runqslower > + > +runqslower: $(out)/runqslower.o $(out)/libbpf.a > + $(CC) $(CFLAGS) -lelf -lz $^ -o $@ > + > +$(out)/vmlinux.h: $(VMLINUX_BTF) | $(out) > + $(BPFTOOL) btf dump file $(VMLINUX_BTF) format core > $@ > + > +$(out)/libbpf.a: | $(out) > + cd $(out) && \ > + $(MAKE) -C $(libbpf_src) OUTPUT=$(abs_out)/ $(abs_out)/libbpf.a > + > +$(out)/runqslower.o: runqslower.h $(out)/runqslower.skel.h \ > + $(out)/runqslower.bpf.o > + > +$(out)/runqslower.bpf.o: $(out)/vmlinux.h runqslower.h > + > +$(out)/%.skel.h: $(out)/%.bpf.o > + $(BPFTOOL) gen skeleton $< > $@ > + > +$(out)/%.bpf.o: %.bpf.c | $(out) > + $(CLANG) -g -O2 -target bpf -I$(out) -I$(LIBBPF_SRC) \ > + -c $(filter %.c,$^) -o $@ && \ > + $(LLVM_STRIP) -g $@ > + > +$(out)/%.o: %.c | $(out) > + $(CC) $(CFLAGS) -I$(LIBBPF_SRC) -I$(out) -c $(filter %.c,$^) -o $@ > + > +$(out): > + mkdir -p $(out) > + [...] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf 2019-12-19 18:13 ` Yonghong Song @ 2019-12-19 21:16 ` Andrii Nakryiko 0 siblings, 0 replies; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 21:16 UTC (permalink / raw) To: Yonghong Song Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team On Thu, Dec 19, 2019 at 10:13 AM Yonghong Song <yhs@fb.com> wrote: > > > > On 12/18/19 11:06 PM, Andrii Nakryiko wrote: > > Convert one of BCC tools (runqslower [0]) to BPF CO-RE + libbpf. It matches > > its BCC-based counterpart 1-to-1, supporting all the same parameters and > > functionality. > > > > runqslower tool utilizes BPF skeleton, auto-generated from BPF object file, > > as well as memory-mapped interface to global (read-only, in this case) data. > > Its makefile also ensures auto-generation of "relocatable" vmlinux.h, which is > > necessary for BTF-typed raw tracepoints with direct memory access. > > > > [0] https://github.com/iovisor/bcc/blob/11bf5d02c895df9646c117c713082eb192825293/tools/runqslower.py > > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com> > > --- > > tools/lib/bpf/tools/runqslower/.gitignore | 2 + > > tools/lib/bpf/tools/runqslower/Makefile | 60 ++++++ > > .../lib/bpf/tools/runqslower/runqslower.bpf.c | 101 ++++++++++ > > tools/lib/bpf/tools/runqslower/runqslower.c | 187 ++++++++++++++++++ > > tools/lib/bpf/tools/runqslower/runqslower.h | 13 ++ > > 5 files changed, 363 insertions(+) > > create mode 100644 tools/lib/bpf/tools/runqslower/.gitignore > > create mode 100644 tools/lib/bpf/tools/runqslower/Makefile > > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.bpf.c > > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.c > > create mode 100644 tools/lib/bpf/tools/runqslower/runqslower.h > > > > diff --git a/tools/lib/bpf/tools/runqslower/.gitignore b/tools/lib/bpf/tools/runqslower/.gitignore > > new file mode 100644 > > index 000000000000..404942cc9371 > > --- /dev/null > > +++ b/tools/lib/bpf/tools/runqslower/.gitignore > > @@ -0,0 +1,2 @@ > > +/.output > > +/runqslower > > diff --git a/tools/lib/bpf/tools/runqslower/Makefile b/tools/lib/bpf/tools/runqslower/Makefile > > new file mode 100644 > > index 000000000000..b87b1f9fe9da > > --- /dev/null > > +++ b/tools/lib/bpf/tools/runqslower/Makefile > > @@ -0,0 +1,60 @@ > > +# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) > > +CLANG := clang > > +LLC := llc > > +LLVM_STRIP := llvm-strip > > +BPFTOOL := bpftool > > Maybe it is better to use in-tree bpftool? This will ensure we use the > one shipped together with the source which should have needed functionality. I can do that as well, though I tried to keep it close enough to how libbpf+CO-RE applications are going to be built in the wild, so tried to minimize amount of in-kernel source tree assumptions. Ideally both bpftool is installed and libbpf is available as a dynamic library in the system. But surely I can add bpftool auto-build and use that one. > > > +LIBBPF_SRC := ../.. > > +CFLAGS := -g -Wall > > + > > +# Try to detect best kernel BTF source > > +KERNEL_REL := $(shell uname -r) > > +ifneq ("$(wildcard /sys/kenerl/btf/vmlinux)","") > > +VMLINUX_BTF := /sys/kernel/btf/vmlinux > > +else ifneq ("$(wildcard /boot/vmlinux-$(KERNEL_REL))","") > > +VMLINUX_BTF := /boot/vmlinux-$(KERNEL_REL) > > +else > > +$(error "Can't detect kernel BTF, use VMLINUX_BTF to specify it explicitly") > > +endif > > + > > +out := .output > > +abs_out := $(abspath $(out)) > > +libbpf_src := $(abspath $(LIBBPF_SRC)) > > + > > +.DELETE_ON_ERROR: > > + > > +.PHONY: all > > +all: runqslower > > + > > +.PHONY: clean > > +clean: > > + rm -rf $(out) runqslower > > + > > +runqslower: $(out)/runqslower.o $(out)/libbpf.a > > + $(CC) $(CFLAGS) -lelf -lz $^ -o $@ > > + > > +$(out)/vmlinux.h: $(VMLINUX_BTF) | $(out) > > + $(BPFTOOL) btf dump file $(VMLINUX_BTF) format core > $@ > > + > > +$(out)/libbpf.a: | $(out) > > + cd $(out) && \ > > + $(MAKE) -C $(libbpf_src) OUTPUT=$(abs_out)/ $(abs_out)/libbpf.a > > + > > +$(out)/runqslower.o: runqslower.h $(out)/runqslower.skel.h \ > > + $(out)/runqslower.bpf.o > > + > > +$(out)/runqslower.bpf.o: $(out)/vmlinux.h runqslower.h > > + > > +$(out)/%.skel.h: $(out)/%.bpf.o > > + $(BPFTOOL) gen skeleton $< > $@ > > + > > +$(out)/%.bpf.o: %.bpf.c | $(out) > > + $(CLANG) -g -O2 -target bpf -I$(out) -I$(LIBBPF_SRC) \ > > + -c $(filter %.c,$^) -o $@ && \ > > + $(LLVM_STRIP) -g $@ > > + > > +$(out)/%.o: %.c | $(out) > > + $(CC) $(CFLAGS) -I$(LIBBPF_SRC) -I$(out) -c $(filter %.c,$^) -o $@ > > + > > +$(out): > > + mkdir -p $(out) > > + > [...] ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next 3/3] selftests/bpf: build runqslower from selftests 2019-12-19 7:06 [PATCH bpf-next 0/3] Implement runqslower BCC tool with BPF CO-RE Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf Andrii Nakryiko @ 2019-12-19 7:06 ` Andrii Nakryiko 2 siblings, 0 replies; 15+ messages in thread From: Andrii Nakryiko @ 2019-12-19 7:06 UTC (permalink / raw) To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko Ensure runqslower tool from libbpf is built as part of selftests to prevent it from bit rotting. Signed-off-by: Andrii Nakryiko <andriin@fb.com> --- tools/testing/selftests/bpf/Makefile | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index c652bd84ef0e..1f8eb95d019c 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -75,7 +75,7 @@ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp -TEST_CUSTOM_PROGS = urandom_read +TEST_CUSTOM_PROGS = urandom_read runqslower # Emit succinct information message describing current building step # $1 - generic step name (e.g., CC, LINK, etc); @@ -93,6 +93,7 @@ OVERRIDE_TARGETS := 1 override define CLEAN $(call msg, CLEAN) $(RM) -r $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(EXTRA_CLEAN) + $(MAKE) -C $(BPFDIR)/tools/runqslower clean endef include ../lib.mk @@ -119,6 +120,10 @@ $(OUTPUT)/test_stub.o: test_stub.c $(call msg, CC,,$@) $(CC) -c $(CFLAGS) -o $@ $< +.PHONY: $(OUTPUT)/runqslower +$(OUTPUT)/runqslower: force + $(MAKE) -C $(BPFDIR)/tools/runqslower + BPFOBJ := $(OUTPUT)/libbpf.a $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED): $(OUTPUT)/test_stub.o $(BPFOBJ) -- 2.17.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2019-12-21 5:40 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-19 7:06 [PATCH bpf-next 0/3] Implement runqslower BCC tool with BPF CO-RE Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 1/3] bpftool: add extra CO-RE mode to btf dump command Andrii Nakryiko 2019-12-19 17:06 ` Alexei Starovoitov 2019-12-19 21:07 ` Andrii Nakryiko 2019-12-19 22:04 ` Alexei Starovoitov 2019-12-20 17:40 ` Andrii Nakryiko 2019-12-21 3:21 ` Alexei Starovoitov 2019-12-21 5:40 ` Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 2/3] libbpf/tools: add runqslower tool to libbpf Andrii Nakryiko 2019-12-19 15:41 ` Daniel Borkmann 2019-12-19 21:14 ` Andrii Nakryiko 2019-12-19 22:04 ` Alexei Starovoitov 2019-12-19 18:13 ` Yonghong Song 2019-12-19 21:16 ` Andrii Nakryiko 2019-12-19 7:06 ` [PATCH bpf-next 3/3] selftests/bpf: build runqslower from selftests Andrii Nakryiko
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.