* [bpf-next v2] bpftool: profile online CPUs instead of possible
@ 2023-02-01 12:24 tong
2023-02-02 11:15 ` Daniel Borkmann
0 siblings, 1 reply; 3+ messages in thread
From: tong @ 2023-02-01 12:24 UTC (permalink / raw)
To: bpf, quentin
Cc: Tonghao Zhang, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa
From: Tonghao Zhang <tong@infragraf.org>
The number of online cpu may be not equal to possible cpu.
bpftool prog profile, can not create pmu event on possible
but on online cpu.
$ dmidecode -s system-product-name
PowerEdge R620
$ cat /sys/devices/system/cpu/online
0-31
$ cat /sys/devices/system/cpu/possible
0-47
BTW, we can disable CPU dynamically:
$ echo 0 > /sys/devices/system/cpu/cpuX/online
If CPU is offline, perf_event_open will return ENODEV.
To fix this issue, check the value returned and skip
offline CPU.
Signed-off-by: Tonghao Zhang <tong@infragraf.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
---
v1:
https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-1-tong@infragraf.org/
https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-2-tong@infragraf.org/
---
tools/bpf/bpftool/prog.c | 36 ++++++++++++++++++++++++++++--------
1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index cfc9fdc1e863..f48067cb0496 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -2233,10 +2233,36 @@ static void profile_close_perf_events(struct profiler_bpf *obj)
profile_perf_event_cnt = 0;
}
+static int profile_open_perf_event(int mid, int cpu, int map_fd)
+{
+ int pmu_fd;
+
+ pmu_fd = syscall(__NR_perf_event_open, &metrics[mid].attr,
+ -1/*pid*/, cpu, -1/*group_fd*/, 0);
+ if (pmu_fd < 0) {
+ if (errno == ENODEV) {
+ p_info("cpu %d may be offline, skip %s metric profiling.",
+ cpu, metrics[mid].name);
+ profile_perf_event_cnt++;
+ return 0;
+ }
+ return -1;
+ }
+
+ if (bpf_map_update_elem(map_fd,
+ &profile_perf_event_cnt,
+ &pmu_fd, BPF_ANY) ||
+ ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0))
+ return -1;
+
+ profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
+ return 0;
+}
+
static int profile_open_perf_events(struct profiler_bpf *obj)
{
unsigned int cpu, m;
- int map_fd, pmu_fd;
+ int map_fd;
profile_perf_events = calloc(
sizeof(int), obj->rodata->num_cpu * obj->rodata->num_metric);
@@ -2255,17 +2281,11 @@ static int profile_open_perf_events(struct profiler_bpf *obj)
if (!metrics[m].selected)
continue;
for (cpu = 0; cpu < obj->rodata->num_cpu; cpu++) {
- pmu_fd = syscall(__NR_perf_event_open, &metrics[m].attr,
- -1/*pid*/, cpu, -1/*group_fd*/, 0);
- if (pmu_fd < 0 ||
- bpf_map_update_elem(map_fd, &profile_perf_event_cnt,
- &pmu_fd, BPF_ANY) ||
- ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0)) {
+ if (profile_open_perf_event(m, cpu, map_fd)) {
p_err("failed to create event %s on cpu %d",
metrics[m].name, cpu);
return -1;
}
- profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
}
}
return 0;
--
2.27.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [bpf-next v2] bpftool: profile online CPUs instead of possible
2023-02-01 12:24 [bpf-next v2] bpftool: profile online CPUs instead of possible tong
@ 2023-02-02 11:15 ` Daniel Borkmann
2023-02-02 11:36 ` Tonghao Zhang
0 siblings, 1 reply; 3+ messages in thread
From: Daniel Borkmann @ 2023-02-02 11:15 UTC (permalink / raw)
To: tong, bpf, quentin
Cc: Alexei Starovoitov, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, Jiri Olsa
On 2/1/23 1:24 PM, tong@infragraf.org wrote:
> From: Tonghao Zhang <tong@infragraf.org>
>
> The number of online cpu may be not equal to possible cpu.
> bpftool prog profile, can not create pmu event on possible
> but on online cpu.
>
> $ dmidecode -s system-product-name
> PowerEdge R620
> $ cat /sys/devices/system/cpu/online
> 0-31
> $ cat /sys/devices/system/cpu/possible
> 0-47
>
> BTW, we can disable CPU dynamically:
> $ echo 0 > /sys/devices/system/cpu/cpuX/online
>
> If CPU is offline, perf_event_open will return ENODEV.
> To fix this issue, check the value returned and skip
> offline CPU.
>
> Signed-off-by: Tonghao Zhang <tong@infragraf.org>
> Cc: Quentin Monnet <quentin@isovalent.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Song Liu <song@kernel.org>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Stanislav Fomichev <sdf@google.com>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> ---
> v1:
> https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-1-tong@infragraf.org/
> https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-2-tong@infragraf.org/
> ---
> tools/bpf/bpftool/prog.c | 36 ++++++++++++++++++++++++++++--------
> 1 file changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> index cfc9fdc1e863..f48067cb0496 100644
> --- a/tools/bpf/bpftool/prog.c
> +++ b/tools/bpf/bpftool/prog.c
> @@ -2233,10 +2233,36 @@ static void profile_close_perf_events(struct profiler_bpf *obj)
> profile_perf_event_cnt = 0;
> }
>
> +static int profile_open_perf_event(int mid, int cpu, int map_fd)
> +{
> + int pmu_fd;
> +
> + pmu_fd = syscall(__NR_perf_event_open, &metrics[mid].attr,
> + -1/*pid*/, cpu, -1/*group_fd*/, 0);
> + if (pmu_fd < 0) {
> + if (errno == ENODEV) {
> + p_info("cpu %d may be offline, skip %s metric profiling.",
> + cpu, metrics[mid].name);
> + profile_perf_event_cnt++;
> + return 0;
> + }
> + return -1;
> + }
> +
> + if (bpf_map_update_elem(map_fd,
> + &profile_perf_event_cnt,
> + &pmu_fd, BPF_ANY) ||
> + ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0))
> + return -1;
This leaks pmu_fd here, no? We should close fd on error as the later call to
profile_close_perf_events() only closes those which are in profile_perf_events[]
array.
> + profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
> + return 0;
> +}
> +
> static int profile_open_perf_events(struct profiler_bpf *obj)
> {
> unsigned int cpu, m;
> - int map_fd, pmu_fd;
> + int map_fd;
>
> profile_perf_events = calloc(
> sizeof(int), obj->rodata->num_cpu * obj->rodata->num_metric);
> @@ -2255,17 +2281,11 @@ static int profile_open_perf_events(struct profiler_bpf *obj)
> if (!metrics[m].selected)
> continue;
> for (cpu = 0; cpu < obj->rodata->num_cpu; cpu++) {
> - pmu_fd = syscall(__NR_perf_event_open, &metrics[m].attr,
> - -1/*pid*/, cpu, -1/*group_fd*/, 0);
> - if (pmu_fd < 0 ||
> - bpf_map_update_elem(map_fd, &profile_perf_event_cnt,
> - &pmu_fd, BPF_ANY) ||
> - ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0)) {
> + if (profile_open_perf_event(m, cpu, map_fd)) {
> p_err("failed to create event %s on cpu %d",
> metrics[m].name, cpu);
> return -1;
> }
> - profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
> }
> }
> return 0;
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [bpf-next v2] bpftool: profile online CPUs instead of possible
2023-02-02 11:15 ` Daniel Borkmann
@ 2023-02-02 11:36 ` Tonghao Zhang
0 siblings, 0 replies; 3+ messages in thread
From: Tonghao Zhang @ 2023-02-02 11:36 UTC (permalink / raw)
To: Daniel Borkmann
Cc: bpf, Quentin Monnet, Alexei Starovoitov, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa
> On Feb 2, 2023, at 7:15 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 2/1/23 1:24 PM, tong@infragraf.org wrote:
>> From: Tonghao Zhang <tong@infragraf.org>
>> The number of online cpu may be not equal to possible cpu.
>> bpftool prog profile, can not create pmu event on possible
>> but on online cpu.
>> $ dmidecode -s system-product-name
>> PowerEdge R620
>> $ cat /sys/devices/system/cpu/online
>> 0-31
>> $ cat /sys/devices/system/cpu/possible
>> 0-47
>> BTW, we can disable CPU dynamically:
>> $ echo 0 > /sys/devices/system/cpu/cpuX/online
>> If CPU is offline, perf_event_open will return ENODEV.
>> To fix this issue, check the value returned and skip
>> offline CPU.
>> Signed-off-by: Tonghao Zhang <tong@infragraf.org>
>> Cc: Quentin Monnet <quentin@isovalent.com>
>> Cc: Alexei Starovoitov <ast@kernel.org>
>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: Andrii Nakryiko <andrii@kernel.org>
>> Cc: Martin KaFai Lau <martin.lau@linux.dev>
>> Cc: Song Liu <song@kernel.org>
>> Cc: Yonghong Song <yhs@fb.com>
>> Cc: John Fastabend <john.fastabend@gmail.com>
>> Cc: KP Singh <kpsingh@kernel.org>
>> Cc: Stanislav Fomichev <sdf@google.com>
>> Cc: Hao Luo <haoluo@google.com>
>> Cc: Jiri Olsa <jolsa@kernel.org>
>> ---
>> v1:
>> https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-1-tong@infragraf.org/
>> https://patchwork.kernel.org/project/netdevbpf/patch/20230117044902.98938-2-tong@infragraf.org/
>> ---
>> tools/bpf/bpftool/prog.c | 36 ++++++++++++++++++++++++++++--------
>> 1 file changed, 28 insertions(+), 8 deletions(-)
>> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
>> index cfc9fdc1e863..f48067cb0496 100644
>> --- a/tools/bpf/bpftool/prog.c
>> +++ b/tools/bpf/bpftool/prog.c
>> @@ -2233,10 +2233,36 @@ static void profile_close_perf_events(struct profiler_bpf *obj)
>> profile_perf_event_cnt = 0;
>> }
>> +static int profile_open_perf_event(int mid, int cpu, int map_fd)
>> +{
>> + int pmu_fd;
>> +
>> + pmu_fd = syscall(__NR_perf_event_open, &metrics[mid].attr,
>> + -1/*pid*/, cpu, -1/*group_fd*/, 0);
>> + if (pmu_fd < 0) {
>> + if (errno == ENODEV) {
>> + p_info("cpu %d may be offline, skip %s metric profiling.",
>> + cpu, metrics[mid].name);
>> + profile_perf_event_cnt++;
>> + return 0;
>> + }
>> + return -1;
>> + }
>> +
>> + if (bpf_map_update_elem(map_fd,
>> + &profile_perf_event_cnt,
>> + &pmu_fd, BPF_ANY) ||
>> + ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0))
>> + return -1;
>
> This leaks pmu_fd here, no? We should close fd on error as the later call to
> profile_close_perf_events() only closes those which are in profile_perf_events[]
> array.
Yes. I will fix this issue. Seem this issue introduced by 47c09d6a9f67. I will add fix tag in next version.
Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command")
Cc: Song Liu <songliubraving@fb.com>
>
>> + profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
>> + return 0;
>> +}
>> +
>> static int profile_open_perf_events(struct profiler_bpf *obj)
>> {
>> unsigned int cpu, m;
>> - int map_fd, pmu_fd;
>> + int map_fd;
>> profile_perf_events = calloc(
>> sizeof(int), obj->rodata->num_cpu * obj->rodata->num_metric);
>> @@ -2255,17 +2281,11 @@ static int profile_open_perf_events(struct profiler_bpf *obj)
>> if (!metrics[m].selected)
>> continue;
>> for (cpu = 0; cpu < obj->rodata->num_cpu; cpu++) {
>> - pmu_fd = syscall(__NR_perf_event_open, &metrics[m].attr,
>> - -1/*pid*/, cpu, -1/*group_fd*/, 0);
>> - if (pmu_fd < 0 ||
>> - bpf_map_update_elem(map_fd, &profile_perf_event_cnt,
>> - &pmu_fd, BPF_ANY) ||
>> - ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0)) {
>> + if (profile_open_perf_event(m, cpu, map_fd)) {
>> p_err("failed to create event %s on cpu %d",
>> metrics[m].name, cpu);
>> return -1;
>> }
>> - profile_perf_events[profile_perf_event_cnt++] = pmu_fd;
>> }
>> }
>> return 0;
>
>
----
Best Regards, Tonghao <tong@infragraf.org>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-02-02 11:36 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-01 12:24 [bpf-next v2] bpftool: profile online CPUs instead of possible tong
2023-02-02 11:15 ` Daniel Borkmann
2023-02-02 11:36 ` Tonghao Zhang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.