linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Stephane Eranian <eranian@google.com>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
	kim.phillips@amd.com, acme@redhat.com, jolsa@redhat.com,
	songliubraving@fb.com
Subject: Re: [PATCH v6 12/12] perf report: add addr_from/addr_to sort dimensions
Date: Wed, 16 Feb 2022 11:21:12 -0300	[thread overview]
Message-ID: <Yg0IWIjgpRcb8bZX@kernel.org> (raw)
In-Reply-To: <20220208211637.2221872-13-eranian@google.com>

Em Tue, Feb 08, 2022 at 01:16:37PM -0800, Stephane Eranian escreveu:
> With the existing symbol_from/symbol_to, branches captured in the same
> function would be collapsed into a single function if the latencies associated
> with the each branch (cycles) were all the same.  That is the case on Intel
> Broadwell, for instance. Since Intel Skylake, the latency is captured by
> hardware and therefore is used to disambiguate branches.
> 
> Add addr_from/addr_to sort dimensions to sort branches based on their
> addresses and not the function there are in. The output is still the function
> name but the offset within the function is provided to uniquely identify each
> branch.  These new sort dimensions also help with annotate because they create
> different entries in the histogram which, in turn, generates proper branch
> annotations.
> 
> Here is an example using AMD's branch sampling:

This can be cherry picked from this patchset, I'll try to do it now, and
also add the above explanation for the new sort dimensions to:

tools/perf/Documentation/perf-report.txt

Please update the documentation in the future when adding new sort
dimensions.

Thanks,

- Arnaldo
 
> $ perf record -a -b -c 1000037 -e cpu/branch-brs/ test_prg
> 
> $ perf report
> Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
> Overhead  Command          Source Shared Object  Source Symbol                                   Target Symbol                                   Basic Block Cycle
>   99.65%  test_prg	   test_prg              [.] test_thread                                 [.] test_thread                                 -
>    0.02%  test_prg         [kernel.vmlinux]      [k] asm_sysvec_apic_timer_interrupt             [k] error_entry                                 -
> 
> $ perf report -F overhead,comm,dso,addr_from,addr_to
> Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
> Overhead  Command          Shared Object     Source Address          Target Address
>    4.22%  test_prg         test_prg          [.] test_thread+0x3c    [.] test_thread+0x4
>    4.13%  test_prg         test_prg          [.] test_thread+0x4     [.] test_thread+0x3a
>    4.09%  test_prg         test_prg          [.] test_thread+0x3a    [.] test_thread+0x6
>    4.08%  test_prg         test_prg          [.] test_thread+0x2     [.] test_thread+0x3c
>    4.06%  test_prg         test_prg          [.] test_thread+0x3e    [.] test_thread+0x2
>    3.87%  test_prg         test_prg          [.] test_thread+0x6     [.] test_thread+0x38
>    3.84%  test_prg         test_prg          [.] test_thread         [.] test_thread+0x3e
>    3.76%  test_prg         test_prg          [.] test_thread+0x1e    [.] test_thread
>    3.76%  test_prg         test_prg          [.] test_thread+0x38    [.] test_thread+0x8
>    3.56%  test_prg         test_prg          [.] test_thread+0x22    [.] test_thread+0x1e
>    3.54%  test_prg         test_prg          [.] test_thread+0x8     [.] test_thread+0x36
>    3.47%  test_prg         test_prg          [.] test_thread+0x1c    [.] test_thread+0x22
>    3.45%  test_prg         test_prg          [.] test_thread+0x36    [.] test_thread+0xa
>    3.28%  test_prg         test_prg          [.] test_thread+0x24    [.] test_thread+0x1c
>    3.25%  test_prg         test_prg          [.] test_thread+0xa     [.] test_thread+0x34
>    3.24%  test_prg         test_prg          [.] test_thread+0x1a    [.] test_thread+0x24
>    3.20%  test_prg         test_prg          [.] test_thread+0x34    [.] test_thread+0xc
>    3.04%  test_prg         test_prg          [.] test_thread+0x26    [.] test_thread+0x1a
>    3.01%  test_prg         test_prg          [.] test_thread+0xc     [.] test_thread+0x32
>    2.98%  test_prg         test_prg          [.] test_thread+0x18    [.] test_thread+0x26
>    2.94%  test_prg         test_prg          [.] test_thread+0x32    [.] test_thread+0xe
>    2.76%  test_prg         test_prg          [.] test_thread+0x28    [.] test_thread+0x18
>    2.73%  test_prg         test_prg          [.] test_thread+0xe     [.] test_thread+0x30
>    2.67%  test_prg         test_prg          [.] test_thread+0x30    [.] test_thread+0x10
>    2.67%  test_prg         test_prg          [.] test_thread+0x16    [.] test_thread+0x28
>    2.46%  test_prg         test_prg          [.] test_thread+0x10    [.] test_thread+0x2e
>    2.44%  test_prg         test_prg          [.] test_thread+0x2a    [.] test_thread+0x16
>    2.38%  test_prg         test_prg          [.] test_thread+0x14    [.] test_thread+0x2a
>    2.32%  test_prg         test_prg          [.] test_thread+0x2e    [.] test_thread+0x12
>    2.28%  test_prg         test_prg          [.] test_thread+0x12    [.] test_thread+0x2c
>    2.16%  test_prg         test_prg          [.] test_thread+0x2c    [.] test_thread+0x14
>    0.02%  test_prg         [kernel.vmlinux]  [k] asm_sysvec_apic_ti+0x5  [k] error_entry
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/hist.c |   2 +
>  tools/perf/util/hist.h |   2 +
>  tools/perf/util/sort.c | 128 +++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/sort.h |   2 +
>  4 files changed, 134 insertions(+)
> 
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 0a8033b09e28..1c085ab56534 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -124,6 +124,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
>  		} else {
>  			symlen = unresolved_col_width + 4 + 2;
>  			hists__new_col_len(hists, HISTC_SYMBOL_FROM, symlen);
> +			hists__new_col_len(hists, HISTC_ADDR_FROM, symlen);
>  			hists__set_unres_dso_col_len(hists, HISTC_DSO_FROM);
>  		}
>  
> @@ -138,6 +139,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
>  		} else {
>  			symlen = unresolved_col_width + 4 + 2;
>  			hists__new_col_len(hists, HISTC_SYMBOL_TO, symlen);
> +			hists__new_col_len(hists, HISTC_ADDR_TO, symlen);
>  			hists__set_unres_dso_col_len(hists, HISTC_DSO_TO);
>  		}
>  
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 2a15e22fb89c..7ed4648d2fc2 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -77,6 +77,8 @@ enum hist_column {
>  	HISTC_GLOBAL_INS_LAT,
>  	HISTC_LOCAL_P_STAGE_CYC,
>  	HISTC_GLOBAL_P_STAGE_CYC,
> +	HISTC_ADDR_FROM,
> +	HISTC_ADDR_TO,
>  	HISTC_NR_COLS, /* Last entry */
>  };
>  
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 2da081ef532b..6d5588e80935 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -990,6 +990,128 @@ struct sort_entry sort_sym_to = {
>  	.se_width_idx	= HISTC_SYMBOL_TO,
>  };
>  
> +static int _hist_entry__addr_snprintf(struct map_symbol *ms,
> +				     u64 ip, char level, char *bf, size_t size,
> +				     unsigned int width)
> +{
> +	struct symbol *sym = ms->sym;
> +	struct map *map = ms->map;
> +	size_t ret = 0, offs;
> +
> +	ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
> +	if (sym && map) {
> +		if (sym->type == STT_OBJECT) {
> +			ret += repsep_snprintf(bf + ret, size - ret, "%s", sym->name);
> +			ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx",
> +					ip - map->unmap_ip(map, sym->start));
> +		} else {
> +			ret += repsep_snprintf(bf + ret, size - ret, "%.*s",
> +					       width - ret,
> +					       sym->name);
> +			offs = ip - sym->start;
> +			if (offs)
> +				ret += repsep_snprintf(bf + ret, size - ret, "+0x%llx", offs);
> +		}
> +	} else {
> +		size_t len = BITS_PER_LONG / 4;
> +		ret += repsep_snprintf(bf + ret, size - ret, "%-#.*llx",
> +				       len, ip);
> +	}
> +
> +	return ret;
> +}
> +
> +static int hist_entry__addr_from_snprintf(struct hist_entry *he, char *bf,
> +					 size_t size, unsigned int width)
> +{
> +	if (he->branch_info) {
> +		struct addr_map_symbol *from = &he->branch_info->from;
> +
> +		return _hist_entry__addr_snprintf(&from->ms, from->al_addr,
> +						 he->level, bf, size, width);
> +	}
> +
> +	return repsep_snprintf(bf, size, "%-*.*s", width, width, "N/A");
> +}
> +
> +static int hist_entry__addr_to_snprintf(struct hist_entry *he, char *bf,
> +				       size_t size, unsigned int width)
> +{
> +	if (he->branch_info) {
> +		struct addr_map_symbol *to = &he->branch_info->to;
> +
> +		return _hist_entry__addr_snprintf(&to->ms, to->al_addr,
> +						 he->level, bf, size, width);
> +	}
> +
> +	return repsep_snprintf(bf, size, "%-*.*s", width, width, "N/A");
> +}
> +
> +static int64_t
> +sort__addr_from_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct addr_map_symbol *from_l;
> +	struct addr_map_symbol *from_r;
> +	int64_t ret;
> +
> +	if (!left->branch_info || !right->branch_info)
> +		return cmp_null(left->branch_info, right->branch_info);
> +
> +	from_l = &left->branch_info->from;
> +	from_r = &right->branch_info->from;
> +
> +	/*
> +	 * comparing symbol address alone is not enough since it's a
> +	 * relative address within a dso.
> +	 */
> +	ret = _sort__dso_cmp(from_l->ms.map, from_r->ms.map);
> +	if (ret != 0)
> +		return ret;
> +
> +	return _sort__addr_cmp(from_l->addr, from_r->addr);
> +}
> +
> +static int64_t
> +sort__addr_to_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct addr_map_symbol *to_l;
> +	struct addr_map_symbol *to_r;
> +	int64_t ret;
> +
> +	if (!left->branch_info || !right->branch_info)
> +		return cmp_null(left->branch_info, right->branch_info);
> +
> +	to_l = &left->branch_info->to;
> +	to_r = &right->branch_info->to;
> +
> +	/*
> +	 * comparing symbol address alone is not enough since it's a
> +	 * relative address within a dso.
> +	 */
> +	ret = _sort__dso_cmp(to_l->ms.map, to_r->ms.map);
> +	if (ret != 0)
> +		return ret;
> +
> +	return _sort__addr_cmp(to_l->addr, to_r->addr);
> +}
> +
> +struct sort_entry sort_addr_from = {
> +	.se_header	= "Source Address",
> +	.se_cmp		= sort__addr_from_cmp,
> +	.se_snprintf	= hist_entry__addr_from_snprintf,
> +	.se_filter	= hist_entry__sym_from_filter, /* shared with sym_from */
> +	.se_width_idx	= HISTC_ADDR_FROM,
> +};
> +
> +struct sort_entry sort_addr_to = {
> +	.se_header	= "Target Address",
> +	.se_cmp		= sort__addr_to_cmp,
> +	.se_snprintf	= hist_entry__addr_to_snprintf,
> +	.se_filter	= hist_entry__sym_to_filter, /* shared with sym_to */
> +	.se_width_idx	= HISTC_ADDR_TO,
> +};
> +
> +
>  static int64_t
>  sort__mispredict_cmp(struct hist_entry *left, struct hist_entry *right)
>  {
> @@ -1893,6 +2015,8 @@ static struct sort_dimension bstack_sort_dimensions[] = {
>  	DIM(SORT_SRCLINE_FROM, "srcline_from", sort_srcline_from),
>  	DIM(SORT_SRCLINE_TO, "srcline_to", sort_srcline_to),
>  	DIM(SORT_SYM_IPC, "ipc_lbr", sort_sym_ipc),
> +	DIM(SORT_ADDR_FROM, "addr_from", sort_addr_from),
> +	DIM(SORT_ADDR_TO, "addr_to", sort_addr_to),
>  };
>  
>  #undef DIM
> @@ -3126,6 +3250,10 @@ static bool get_elide(int idx, FILE *output)
>  		return __get_elide(symbol_conf.dso_from_list, "dso_from", output);
>  	case HISTC_DSO_TO:
>  		return __get_elide(symbol_conf.dso_to_list, "dso_to", output);
> +	case HISTC_ADDR_FROM:
> +		return __get_elide(symbol_conf.sym_from_list, "addr_from", output);
> +	case HISTC_ADDR_TO:
> +		return __get_elide(symbol_conf.sym_to_list, "addr_to", output);
>  	default:
>  		break;
>  	}
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index f994261888e1..2ddc00d1c464 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -251,6 +251,8 @@ enum sort_type {
>  	SORT_SRCLINE_FROM,
>  	SORT_SRCLINE_TO,
>  	SORT_SYM_IPC,
> +	SORT_ADDR_FROM,
> +	SORT_ADDR_TO,
>  
>  	/* memory mode specific sort keys */
>  	__SORT_MEMORY_MODE,
> -- 
> 2.35.0.263.gb82422642f-goog

-- 

- Arnaldo

      reply	other threads:[~2022-02-16 14:21 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-08 21:16 [PATCH v6 00/12] perf/x86/amd: Add AMD Fam19h Branch Sampling support Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 01/12] perf/core: add perf_clear_branch_entry_bitfields() helper Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 02/12] x86/cpufeatures: add AMD Fam19h Branch Sampling feature Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 03/12] perf/x86/amd: add AMD Fam19h Branch Sampling support Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 04/12] perf/x86/amd: add branch-brs helper event for Fam19h BRS Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 05/12] perf/x86/amd: enable branch sampling priv level filtering Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 06/12] perf/x86/amd: add AMD branch sampling period adjustment Stephane Eranian
2022-02-09 15:32   ` Peter Zijlstra
2022-03-04 15:45     ` Peter Zijlstra
2022-03-09 23:03       ` Stephane Eranian
2022-03-15 12:08         ` Peter Zijlstra
2022-03-17 17:11           ` Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 07/12] perf/x86/amd: make Zen3 branch sampling opt-in Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 08/12] ACPI: add perf low power callback Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 09/12] perf/x86/amd: add idle hooks for branch sampling Stephane Eranian
2022-02-08 21:16 ` [PATCH v6 10/12] perf tools: Improve IBS error handling Stephane Eranian
2022-02-09 15:47   ` Arnaldo Carvalho de Melo
2022-03-15  6:49     ` Ravi Bangoria
2022-03-15  2:01   ` Stephane Eranian
2022-03-15  6:23     ` Ravi Bangoria
2022-03-15  7:12       ` Stephane Eranian
2022-03-15  7:45   ` Ravi Bangoria
2022-03-16  0:03     ` Stephane Eranian
2022-03-16 11:07       ` Ravi Bangoria
2022-03-16 11:16         ` Ravi Bangoria
2022-02-08 21:16 ` [PATCH v6 11/12] perf tools: Improve error handling of AMD Branch Sampling Stephane Eranian
2022-02-16 14:17   ` Arnaldo Carvalho de Melo
2022-02-08 21:16 ` [PATCH v6 12/12] perf report: add addr_from/addr_to sort dimensions Stephane Eranian
2022-02-16 14:21   ` Arnaldo Carvalho de Melo [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yg0IWIjgpRcb8bZX@kernel.org \
    --to=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=eranian@google.com \
    --cc=jolsa@redhat.com \
    --cc=kim.phillips@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=songliubraving@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).