bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile
@ 2020-11-19  4:23 Yonghong Song
  2020-11-19  5:03 ` Song Liu
  0 siblings, 1 reply; 3+ messages in thread
From: Yonghong Song @ 2020-11-19  4:23 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Daniel Borkmann, kernel-team, Song Liu

Commit 47c09d6a9f67("bpftool: Introduce "prog profile" command")
introduced "bpftool prog profile" command which can be used
to profile bpf program with metrics like # of instructions,

This patch added support for itlb_misses and dtlb_misses.
During an internal bpf program performance evaluation,
I found these two metrics are also very useful. The following
is an example output:

 $ bpftool prog profile id 324 duration 3 cycles itlb_misses

           1885029 run_cnt
        5134686073 cycles
            306893 itlb_misses
 $ bpftool prog profile id 324 duration 3 cycles dtlb_misses

           1827382 run_cnt
        4943593648 cycles
           5975636 dtlb_misses
 $ bpftool prog profile id 324 duration 3 cycles llc_misses

           1836527 run_cnt
        5019612972 cycles
           4161041 llc_misses

From the above, we can see quite some dtlb misses, 3 dtlb misses
perf prog run. This might be something worth further investigation.

Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index acdb2c245f0a..e33f27b950a5 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -1717,11 +1717,39 @@ struct profile_metric {
 		.ratio_desc = "LLC misses per million insns",
 		.ratio_mul = 1e6,
 	},
+	{
+		.name = "itlb_misses",
+		.attr = {
+			.type = PERF_TYPE_HW_CACHE,
+			.config =
+				PERF_COUNT_HW_CACHE_ITLB |
+				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
+				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
+			.exclude_user = 1
+		},
+		.ratio_metric = 2,
+		.ratio_desc = "itlb misses per million insns",
+		.ratio_mul = 1e6,
+	},
+	{
+		.name = "dtlb_misses",
+		.attr = {
+			.type = PERF_TYPE_HW_CACHE,
+			.config =
+				PERF_COUNT_HW_CACHE_DTLB |
+				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
+				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
+			.exclude_user = 1
+		},
+		.ratio_metric = 2,
+		.ratio_desc = "dtlb misses per million insns",
+		.ratio_mul = 1e6,
+	},
 };
 
 static __u64 profile_total_count;
 
-#define MAX_NUM_PROFILE_METRICS 4
+#define MAX_NUM_PROFILE_METRICS 6
 
 static int profile_parse_metrics(int argc, char **argv)
 {
@@ -2109,7 +2137,7 @@ static int do_help(int argc, char **argv)
 		"                 struct_ops | fentry | fexit | freplace | sk_lookup }\n"
 		"       ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n"
 		"                        flow_dissector }\n"
-		"       METRIC := { cycles | instructions | l1d_loads | llc_misses }\n"
+		"       METRIC := { cycles | instructions | l1d_loads | llc_misses | itlb_misses | dtlb_misses }\n"
 		"       " HELP_SPEC_OPTIONS "\n"
 		"",
 		bin_name, argv[-2]);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile
  2020-11-19  4:23 [PATCH bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile Yonghong Song
@ 2020-11-19  5:03 ` Song Liu
  2020-11-19  7:18   ` Yonghong Song
  0 siblings, 1 reply; 3+ messages in thread
From: Song Liu @ 2020-11-19  5:03 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team


> On Nov 18, 2020, at 8:23 PM, Yonghong Song <yhs@fb.com> wrote:

[...]

> 
> Cc: Song Liu <songliubraving@fb.com>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
> tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
> index acdb2c245f0a..e33f27b950a5 100644
> --- a/tools/bpf/bpftool/prog.c
> +++ b/tools/bpf/bpftool/prog.c
> @@ -1717,11 +1717,39 @@ struct profile_metric {
> 		.ratio_desc = "LLC misses per million insns",
> 		.ratio_mul = 1e6,
> 	},
> +	{
> +		.name = "itlb_misses",
> +		.attr = {
> +			.type = PERF_TYPE_HW_CACHE,
> +			.config =
> +				PERF_COUNT_HW_CACHE_ITLB |
> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
> +			.exclude_user = 1
> +		},
> +		.ratio_metric = 2,
> +		.ratio_desc = "itlb misses per million insns",
> +		.ratio_mul = 1e6,
> +	},
> +	{
> +		.name = "dtlb_misses",
> +		.attr = {
> +			.type = PERF_TYPE_HW_CACHE,
> +			.config =
> +				PERF_COUNT_HW_CACHE_DTLB |
> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
> +			.exclude_user = 1
> +		},
> +		.ratio_metric = 2,
> +		.ratio_desc = "dtlb misses per million insns",
> +		.ratio_mul = 1e6,
> +	},
> };
> 
> static __u64 profile_total_count;
> 
> -#define MAX_NUM_PROFILE_METRICS 4
> +#define MAX_NUM_PROFILE_METRICS 6

This change is not necessary. This is the max number of enabled metrics. 
We don't stop the perf counters, each enabled metric adds some error to 
the final reading. Therefore, we don't want to enable too many metrics
in the same run. If we don't have a use case for more metrics in one run, 
let's keep the cap of 4 metrics. 

If we do want to increase this, we should also increase MAX_NUM_MATRICS 
in profiler.bpf.c.

Other than this, 

Acked-by: Song Liu <songliubraving@fb.com>

Thanks!

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile
  2020-11-19  5:03 ` Song Liu
@ 2020-11-19  7:18   ` Yonghong Song
  0 siblings, 0 replies; 3+ messages in thread
From: Yonghong Song @ 2020-11-19  7:18 UTC (permalink / raw)
  To: Song Liu; +Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Kernel Team



On 11/18/20 9:03 PM, Song Liu wrote:
> 
>> On Nov 18, 2020, at 8:23 PM, Yonghong Song <yhs@fb.com> wrote:
> 
> [...]
> 
>>
>> Cc: Song Liu <songliubraving@fb.com>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>> tools/bpf/bpftool/prog.c | 32 ++++++++++++++++++++++++++++++--
>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
>> index acdb2c245f0a..e33f27b950a5 100644
>> --- a/tools/bpf/bpftool/prog.c
>> +++ b/tools/bpf/bpftool/prog.c
>> @@ -1717,11 +1717,39 @@ struct profile_metric {
>> 		.ratio_desc = "LLC misses per million insns",
>> 		.ratio_mul = 1e6,
>> 	},
>> +	{
>> +		.name = "itlb_misses",
>> +		.attr = {
>> +			.type = PERF_TYPE_HW_CACHE,
>> +			.config =
>> +				PERF_COUNT_HW_CACHE_ITLB |
>> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
>> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
>> +			.exclude_user = 1
>> +		},
>> +		.ratio_metric = 2,
>> +		.ratio_desc = "itlb misses per million insns",
>> +		.ratio_mul = 1e6,
>> +	},
>> +	{
>> +		.name = "dtlb_misses",
>> +		.attr = {
>> +			.type = PERF_TYPE_HW_CACHE,
>> +			.config =
>> +				PERF_COUNT_HW_CACHE_DTLB |
>> +				(PERF_COUNT_HW_CACHE_OP_READ << 8) |
>> +				(PERF_COUNT_HW_CACHE_RESULT_MISS << 16),
>> +			.exclude_user = 1
>> +		},
>> +		.ratio_metric = 2,
>> +		.ratio_desc = "dtlb misses per million insns",
>> +		.ratio_mul = 1e6,
>> +	},
>> };
>>
>> static __u64 profile_total_count;
>>
>> -#define MAX_NUM_PROFILE_METRICS 4
>> +#define MAX_NUM_PROFILE_METRICS 6
> 
> This change is not necessary. This is the max number of enabled metrics.
> We don't stop the perf counters, each enabled metric adds some error to
> the final reading. Therefore, we don't want to enable too many metrics
> in the same run. If we don't have a use case for more metrics in one run,
> let's keep the cap of 4 metrics.
> 
> If we do want to increase this, we should also increase MAX_NUM_MATRICS
> in profiler.bpf.c.

Thanks for explanation. I missed that we also need to change 
MAX_NUM_MATRICS to really make it work. Ideally it would be best if we
only need to change in one header and that header is shared by both
bpf program and non-bpf program but probably is a overkill here just
for a single macro.

Also, your concern makes sense. We do not want to enable too many
metrics as this will cause counter sharing and may sacrifice accuracy.
I will leave current max profile metrics 4 then.

Will respin v2.

> 
> Other than this,
> 
> Acked-by: Song Liu <songliubraving@fb.com>
> 
> Thanks!
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-11-19  7:19 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-19  4:23 [PATCH bpf-next] bpftool: add {i,d}tlb_misses support for bpftool profile Yonghong Song
2020-11-19  5:03 ` Song Liu
2020-11-19  7:18   ` Yonghong Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).