* [PATCH 01/61] perf symbols: Do not open device files again
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-20 15:28 ` Arnaldo Carvalho de Melo
2016-09-19 13:09 ` [PATCH 02/61] perf tools: Remove superfluous initialization of weight Jiri Olsa
` (59 subsequent siblings)
60 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Moving the regular file check into the entry
of the dso__read_binary_type_filename function.
This way we can eliminate some calls and extend
the file check for all cases.
Link: http://lkml.kernel.org/n/tip-np802m7jwzd7fu09vx2tp23y@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/dso.c | 8 +++-----
tools/perf/util/symbol.c | 3 ---
2 files changed, 3 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 774f6ec884d5..9a027a0cc037 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -43,6 +43,9 @@ int dso__read_binary_type_filename(const struct dso *dso,
int ret = 0;
size_t len;
+ if (!is_regular_file(filename))
+ return -1;
+
switch (type) {
case DSO_BINARY_TYPE__DEBUGLINK: {
char *debuglink;
@@ -53,11 +56,6 @@ int dso__read_binary_type_filename(const struct dso *dso,
debuglink--;
if (*debuglink == '/')
debuglink++;
-
- ret = -1;
- if (!is_regular_file(filename))
- break;
-
ret = filename__read_debuglink(filename, debuglink,
size - (debuglink - filename));
}
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 19c9c558454f..827a58ce29f0 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1466,9 +1466,6 @@ int dso__load(struct dso *dso, struct map *map)
root_dir, name, PATH_MAX))
continue;
- if (!is_regular_file(name))
- continue;
-
/* Name is now the name of the next image to try */
if (symsrc__init(ss, dso, name, symtab_type) < 0)
continue;
--
2.7.4
* Re: [PATCH 01/61] perf symbols: Do not open device files again
From: Arnaldo Carvalho de Melo @ 2016-09-20 15:28 UTC
To: Jiri Olsa
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
On Mon, Sep 19, 2016 at 03:09:10PM +0200, Jiri Olsa wrote:
> Moving the regular file check into the entry
> of the dso__read_binary_type_filename function.
>
> This way we can eliminate some calls and extend
> the file check for all cases.
Bzzt:
[root@jouet ~]# perf test "Test dso"
8: Test dso data read : FAILED!
9: Test dso data cache : FAILED!
10: Test dso data reopen : FAILED!
[root@jouet ~]#
git bisect pointed to this patch, removing it for now, haven't tried to
fix, please take a look.
- Arnaldo
> Link: http://lkml.kernel.org/n/tip-np802m7jwzd7fu09vx2tp23y@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> tools/perf/util/dso.c | 8 +++-----
> tools/perf/util/symbol.c | 3 ---
> 2 files changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> index 774f6ec884d5..9a027a0cc037 100644
> --- a/tools/perf/util/dso.c
> +++ b/tools/perf/util/dso.c
> @@ -43,6 +43,9 @@ int dso__read_binary_type_filename(const struct dso *dso,
> int ret = 0;
> size_t len;
>
> + if (!is_regular_file(filename))
> + return -1;
> +
> switch (type) {
> case DSO_BINARY_TYPE__DEBUGLINK: {
> char *debuglink;
> @@ -53,11 +56,6 @@ int dso__read_binary_type_filename(const struct dso *dso,
> debuglink--;
> if (*debuglink == '/')
> debuglink++;
> -
> - ret = -1;
> - if (!is_regular_file(filename))
> - break;
> -
> ret = filename__read_debuglink(filename, debuglink,
> size - (debuglink - filename));
> }
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 19c9c558454f..827a58ce29f0 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -1466,9 +1466,6 @@ int dso__load(struct dso *dso, struct map *map)
> root_dir, name, PATH_MAX))
> continue;
>
> - if (!is_regular_file(name))
> - continue;
> -
> /* Name is now the name of the next image to try */
> if (symsrc__init(ss, dso, name, symtab_type) < 0)
> continue;
> --
> 2.7.4
* Re: [PATCH 01/61] perf symbols: Do not open device files again
From: Jiri Olsa @ 2016-09-20 15:36 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, lkml, Don Zickus, Joe Mario, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
On Tue, Sep 20, 2016 at 12:28:03PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Sep 19, 2016 at 03:09:10PM +0200, Jiri Olsa wrote:
> > Moving the regular file check into the entry
> > of the dso__read_binary_type_filename function.
> >
> > This way we can eliminate some calls and extend
> > the file check for all cases.
>
> Bzzt:
>
> [root@jouet ~]# perf test "Test dso"
> 8: Test dso data read : FAILED!
> 9: Test dso data cache : FAILED!
> 10: Test dso data reopen : FAILED!
> [root@jouet ~]#
ugh, will check.. thanks
jirka
* [PATCHv2 01/61] perf symbols: Do not open device files
From: Jiri Olsa @ 2016-09-20 16:12 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, lkml, Don Zickus, Joe Mario, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
On Tue, Sep 20, 2016 at 05:36:47PM +0200, Jiri Olsa wrote:
> On Tue, Sep 20, 2016 at 12:28:03PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Sep 19, 2016 at 03:09:10PM +0200, Jiri Olsa wrote:
> > > Moving the regular file check into the entry
> > > of the dso__read_binary_type_filename function.
> > >
> > > This way we can eliminate some calls and extend
> > > the file check for all cases.
> >
> > Bzzt:
> >
> > [root@jouet ~]# perf test "Test dso"
> > 8: Test dso data read : FAILED!
> > 9: Test dso data cache : FAILED!
> > 10: Test dso data reopen : FAILED!
> > [root@jouet ~]#
>
> ugh, will check.. thanks
ook, I confused this one with earlier version, sry.. correct version attached
it's pushed in the perf/c2c branch now
thanks,
jirka
---
The dso__read_binary_type_filename gets the dso's file name
to open. We need to check that it is a regular file before trying
to open it, otherwise we might get stuck on a device file.
Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/dso.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 774f6ec884d5..d2c6cdd9d42b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -363,6 +363,9 @@ static int __open_dso(struct dso *dso, struct machine *machine)
return -EINVAL;
}
+ if (!is_regular_file(name))
+ return -EINVAL;
+
fd = do_open(name);
free(name);
return fd;
--
2.7.4
* [tip:perf/core] perf symbols: Do not open device files
From: tip-bot for Jiri Olsa @ 2016-09-20 21:45 UTC
To: linux-tip-commits
Cc: acme, namhyung, jmario, hpa, a.p.zijlstra, mingo, jolsa,
linux-kernel, jolsa, tglx, dzickus, andi, dsahern
Commit-ID: 3c028a0cb5b71f47d523bc8ad2c597cb257f41fb
Gitweb: http://git.kernel.org/tip/3c028a0cb5b71f47d523bc8ad2c597cb257f41fb
Author: Jiri Olsa <jolsa@redhat.com>
AuthorDate: Tue, 20 Sep 2016 18:12:45 +0200
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 20 Sep 2016 16:20:21 -0300
perf symbols: Do not open device files
The dso__read_binary_type_filename gets the dso's file name to open. We
need to check that it is a regular file before trying to open it,
otherwise we might get stuck on a device file.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160920161245.GA8995@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/dso.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 774f6ec..d2c6cdd 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -363,6 +363,9 @@ static int __open_dso(struct dso *dso, struct machine *machine)
return -EINVAL;
}
+ if (!is_regular_file(name))
+ return -EINVAL;
+
fd = do_open(name);
free(name);
return fd;
* [PATCH 02/61] perf tools: Remove superfluous initialization of weight
From: Jiri Olsa @ 2016-09-19 13:09 UTC
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Removing superfluous initialization of weight,
it's already set to 0 via memset.
Link: http://lkml.kernel.org/n/tip-1fmf7sw8p16zwl9q6au10t7c@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/evsel.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 21fd573106ed..f3225a2e6eee 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1728,7 +1728,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
data->cpu = data->pid = data->tid = -1;
data->stream_id = data->id = data->time = -1ULL;
data->period = evsel->attr.sample_period;
- data->weight = 0;
data->cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
if (event->header.type != PERF_RECORD_SAMPLE) {
@@ -1935,7 +1934,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}
}
- data->weight = 0;
if (type & PERF_SAMPLE_WEIGHT) {
OVERFLOW_CHECK_u64(array);
data->weight = *array;
--
2.7.4
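The reasoning in this patch (a field cleared by memset does not need a second explicit zero assignment) can be shown with a small sketch. The struct below is a hypothetical, cut-down stand-in for perf's sample data, and sample_init() is an illustrative name, not a perf function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the sample structure. */
struct sample {
	uint64_t period;
	uint64_t weight;
	uint32_t cpu;
};

static void sample_init(struct sample *data, uint64_t period)
{
	assert(data != NULL);

	/* Every field, including weight, is zero after this call. */
	memset(data, 0, sizeof(*data));

	/* Only fields whose default is not zero need an assignment. */
	data->period = period;
	data->cpu = (uint32_t)-1;
	/* No "data->weight = 0;" here: memset already cleared it. */
}
```

For integer fields such as weight, all-bits-zero is the value 0, so the explicit assignment adds nothing once the memset is in place.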
* Re: [PATCH 02/61] perf tools: Remove superfluous initialization of weight
From: Arnaldo Carvalho de Melo @ 2016-09-21 15:15 UTC
To: Jiri Olsa
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
On Mon, Sep 19, 2016 at 03:09:11PM +0200, Jiri Olsa wrote:
> Removing superfluous initialization of weight,
> it's already set to 0 via memset.
Thanks, applied.
> Link: http://lkml.kernel.org/n/tip-1fmf7sw8p16zwl9q6au10t7c@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> tools/perf/util/evsel.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 21fd573106ed..f3225a2e6eee 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1728,7 +1728,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> data->cpu = data->pid = data->tid = -1;
> data->stream_id = data->id = data->time = -1ULL;
> data->period = evsel->attr.sample_period;
> - data->weight = 0;
> data->cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
>
> if (event->header.type != PERF_RECORD_SAMPLE) {
> @@ -1935,7 +1934,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> }
> }
>
> - data->weight = 0;
> if (type & PERF_SAMPLE_WEIGHT) {
> OVERFLOW_CHECK_u64(array);
> data->weight = *array;
> --
> 2.7.4
* [tip:perf/core] perf evsel: Remove superfluous initialization of weight
From: tip-bot for Jiri Olsa @ 2016-09-23 5:24 UTC
To: linux-tip-commits
Cc: jolsa, acme, dzickus, jmario, dsahern, namhyung, hpa, tglx,
linux-kernel, a.p.zijlstra, mingo, andi
Commit-ID: 82deb8a242cd8aceaf553c9fb731f91dbdc1f9a6
Gitweb: http://git.kernel.org/tip/82deb8a242cd8aceaf553c9fb731f91dbdc1f9a6
Author: Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 19 Sep 2016 15:09:11 +0200
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 21 Sep 2016 12:07:24 -0300
perf evsel: Remove superfluous initialization of weight
Removing superfluous initialization of weight, it's already set to 0 via
memset.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474290610-23241-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/evsel.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 21fd573..f3225a2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1728,7 +1728,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
data->cpu = data->pid = data->tid = -1;
data->stream_id = data->id = data->time = -1ULL;
data->period = evsel->attr.sample_period;
- data->weight = 0;
data->cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
if (event->header.type != PERF_RECORD_SAMPLE) {
@@ -1935,7 +1934,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}
}
- data->weight = 0;
if (type & PERF_SAMPLE_WEIGHT) {
OVERFLOW_CHECK_u64(array);
data->weight = *array;
* [PATCH 03/61] perf tools: Make hist_entry__snprintf work over struct perf_hpp_list
From: Jiri Olsa @ 2016-09-19 13:09 UTC
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Make hist_entry__snprintf take a perf_hpp_list as an argument
instead of using he->hists->hpp_list. This way we can display
an arbitrary list of entries regardless of the hists setup, which
will be useful in the following patches.
Link: http://lkml.kernel.org/n/tip-j2sizkyglam3narmndlj99xq@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/ui/stdio/hist.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index a57131e61fe3..cb0371106c21 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -373,7 +373,8 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
return 0;
}
-static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
+static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
+ struct perf_hpp_list *hpp_list)
{
const char *sep = symbol_conf.field_sep;
struct perf_hpp_fmt *fmt;
@@ -384,7 +385,7 @@ static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
if (symbol_conf.exclude_other && !he->parent)
return 0;
- hists__for_each_format(he->hists, fmt) {
+ perf_hpp_list__for_each_format(hpp_list, fmt) {
if (perf_hpp__should_skip(fmt, he->hists))
continue;
@@ -509,7 +510,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
if (symbol_conf.report_hierarchy)
return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);
- hist_entry__snprintf(he, &hpp);
+ hist_entry__snprintf(he, &hpp, hists->hpp_list);
ret = fprintf(fp, "%s\n", bf);
--
2.7.4
* Re: [PATCH 03/61] perf tools: Make hist_entry__snprintf work over struct perf_hpp_list
From: Arnaldo Carvalho de Melo @ 2016-09-21 15:14 UTC
To: Jiri Olsa
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
On Mon, Sep 19, 2016 at 03:09:12PM +0200, Jiri Olsa wrote:
> Make hist_entry__snprintf take a perf_hpp_list as an argument
> instead of using he->hists->hpp_list. This way we can display
> an arbitrary list of entries regardless of the hists setup, which
> will be useful in the following patches.
>
> Link: http://lkml.kernel.org/n/tip-j2sizkyglam3narmndlj99xq@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> tools/perf/ui/stdio/hist.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index a57131e61fe3..cb0371106c21 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -373,7 +373,8 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
> return 0;
> }
>
> -static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
> +static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
> + struct perf_hpp_list *hpp_list)
What I usually do in these cases is to keep the existing interface and
add a new one that is more low level, that exhibits more flexibility,
i.e.:
static int __hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
struct perf_hpp_list *hpp_list)
{
...
}
And:
static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
{
return __hist_entry__snprintf(he, hpp, he->hists->hpp_list);
}
This way no users of the existing function need to be changed, and new
ones can use the more flexible, lower level interface.
In this case there is just one such user, but the refactoring technique
could be consistently used, other people will not be left scratching
their heads asking why we pass something that can be obtained from
another parameter already in that function, while __ functions already
indicate they are more flexible and thus can flout that assumption.
- Arnaldo
> {
> const char *sep = symbol_conf.field_sep;
> struct perf_hpp_fmt *fmt;
> @@ -384,7 +385,7 @@ static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
> if (symbol_conf.exclude_other && !he->parent)
> return 0;
>
> - hists__for_each_format(he->hists, fmt) {
> + perf_hpp_list__for_each_format(hpp_list, fmt) {
> if (perf_hpp__should_skip(fmt, he->hists))
> continue;
>
> @@ -509,7 +510,7 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
> if (symbol_conf.report_hierarchy)
> return hist_entry__hierarchy_fprintf(he, &hpp, hists, fp);
>
> - hist_entry__snprintf(he, &hpp);
> + hist_entry__snprintf(he, &hpp, hists->hpp_list);
>
> ret = fprintf(fp, "%s\n", bf);
>
> --
> 2.7.4
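The refactoring technique described in this reply, keeping the existing entry point and adding a lower-level double-underscore variant that takes the extra argument, can be sketched in isolation. All types and names below (struct fmt_list, struct entry, entry__snprintf) are hypothetical stand-ins, not the perf structures.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the format list and the hist entry. */
struct fmt_list {
	const char *sep;
};

struct entry {
	const char *name;
	struct fmt_list *default_list;
};

/* Low-level variant: the caller chooses which list to format with.
 * The leading "__" follows the kernel naming convention mentioned in
 * the reply; strict ISO C reserves such names for the implementation. */
static int __entry__snprintf(const struct entry *e, char *buf, size_t size,
			     const struct fmt_list *list)
{
	assert(e != NULL && buf != NULL && list != NULL);
	return snprintf(buf, size, "%s%s", e->name, list->sep);
}

/* Existing interface, unchanged for current callers: it derives the
 * list from the entry and delegates to the low-level variant. */
static int entry__snprintf(const struct entry *e, char *buf, size_t size)
{
	return __entry__snprintf(e, buf, size, e->default_list);
}
```

Existing callers keep calling entry__snprintf() with the same arguments, while new code that needs a different list calls __entry__snprintf() directly.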
* Re: [PATCH 03/61] perf tools: Make hist_entry__snprintf work over struct perf_hpp_list
From: Jiri Olsa @ 2016-09-21 15:30 UTC
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, lkml, Don Zickus, Joe Mario, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
On Wed, Sep 21, 2016 at 12:14:32PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Sep 19, 2016 at 03:09:12PM +0200, Jiri Olsa wrote:
> > Make hist_entry__snprintf take a perf_hpp_list as an argument
> > instead of using he->hists->hpp_list. This way we can display
> > an arbitrary list of entries regardless of the hists setup, which
> > will be useful in the following patches.
> >
> > Link: http://lkml.kernel.org/n/tip-j2sizkyglam3narmndlj99xq@git.kernel.org
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> > tools/perf/ui/stdio/hist.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> > index a57131e61fe3..cb0371106c21 100644
> > --- a/tools/perf/ui/stdio/hist.c
> > +++ b/tools/perf/ui/stdio/hist.c
> > @@ -373,7 +373,8 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
> > return 0;
> > }
> >
> > -static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
> > +static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
> > + struct perf_hpp_list *hpp_list)
>
> What I usually do in these cases is to keep the existing interface and
> add a new one that is more low level, that exhibits more flexibility,
> i.e.:
>
> static int __hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
> struct perf_hpp_list *hpp_list)
> {
> ...
> }
>
> And:
>
> static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
> {
> return __hist_entry__snprintf(he, hpp, he->hists->hpp_list);
> }
>
> This way no users of the existing function need to be changed, and new
> ones can use the more flexible, lower level interface.
>
> In this case there is just one such user, but the refactoring technique
> could be consistently used, other people will not be left scratching
> their heads asking why we pass something that can be obtained from
> another parameter already in that function, while __ functions already
> indicate they are more flexible and thus can flout that assumption.
ook, will change
thanks,
jirka
* [PATCH 04/61] perf tools: Use bigger buffer for stdio headers
From: Jiri Olsa @ 2016-09-19 13:09 UTC
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
With the node column on servers with many CPUs we can run out
of stdio header space quite soon, so enlarge the header
buffer.
Link: http://lkml.kernel.org/n/tip-p55193hynw8mmok6i2o1wx6c@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/ui/stdio/hist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index cb0371106c21..0a32b48eda80 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -699,7 +699,7 @@ hists__fprintf_standard_headers(struct hists *hists,
static int hists__fprintf_headers(struct hists *hists, FILE *fp)
{
- char bf[96];
+ char bf[1024];
struct perf_hpp dummy_hpp = {
.buf = bf,
.size = sizeof(bf),
--
2.7.4
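A small sketch of why a 96-byte header buffer runs out quickly once per-node columns are added. The column layout used here (an 11-character "Node" column per node) is an assumption made for illustration and is not perf's real header format; format_headers() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Append ncols fixed-width column headers to buf.
 * Returns true if every column fit, false if the buffer was too small
 * for one of them (snprintf reports the length it wanted to write). */
static bool format_headers(char *buf, size_t size, int ncols)
{
	size_t used = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (int i = 0; i < ncols; i++) {
		/* 8 + 2 + 1 = 11 characters per column for i < 100 */
		int n = snprintf(buf + used, size - used, "%8s%02d ", "Node", i);

		if (n < 0 || (size_t)n >= size - used)
			return false;	/* this column did not fit */
		used += (size_t)n;
	}
	return true;
}
```

With this assumed layout a 96-byte buffer holds eight columns (88 characters) but not nine, while a 1024-byte buffer leaves room for many more.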
* Re: [PATCH 04/61] perf tools: Use bigger buffer for stdio headers
From: Arnaldo Carvalho de Melo @ 2016-09-21 15:15 UTC
To: Jiri Olsa
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
On Mon, Sep 19, 2016 at 03:09:13PM +0200, Jiri Olsa wrote:
> With the node column on servers with many CPUs we can run out
> of stdio header space quite soon, so enlarge the header
> buffer.
Applied.
- Arnaldo
> Link: http://lkml.kernel.org/n/tip-p55193hynw8mmok6i2o1wx6c@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> tools/perf/ui/stdio/hist.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index cb0371106c21..0a32b48eda80 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -699,7 +699,7 @@ hists__fprintf_standard_headers(struct hists *hists,
>
> static int hists__fprintf_headers(struct hists *hists, FILE *fp)
> {
> - char bf[96];
> + char bf[1024];
> struct perf_hpp dummy_hpp = {
> .buf = bf,
> .size = sizeof(bf),
> --
> 2.7.4
* [tip:perf/core] perf hists: Use bigger buffer for stdio headers
From: tip-bot for Jiri Olsa @ 2016-09-23 5:25 UTC
To: linux-tip-commits
Cc: dzickus, tglx, mingo, dsahern, a.p.zijlstra, namhyung, jolsa,
jmario, andi, acme, linux-kernel, hpa
Commit-ID: d5278220be663753a011910c194d50758cd8dc98
Gitweb: http://git.kernel.org/tip/d5278220be663753a011910c194d50758cd8dc98
Author: Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 19 Sep 2016 15:09:13 +0200
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 21 Sep 2016 12:14:59 -0300
perf hists: Use bigger buffer for stdio headers
With the node column on servers with many CPUs we can run out of stdio
header space quite soon, so enlarge the header buffer.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474290610-23241-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/ui/stdio/hist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 8e1840b..c8dca34 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -698,7 +698,7 @@ hists__fprintf_standard_headers(struct hists *hists,
static int hists__fprintf_headers(struct hists *hists, FILE *fp)
{
- char bf[96];
+ char bf[1024];
struct perf_hpp dummy_hpp = {
.buf = bf,
.size = sizeof(bf),
* [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
From: Jiri Olsa @ 2016-09-19 13:09 UTC
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Introducing c2c_decode_stats function, which decodes
data_src data into new struct c2c_stats.
Original-patch-by: Dick Fowles <rfowles@redhat.com>
Original-patch-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-7garqfmx5izaqysde9jik4iy@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/mem-events.c | 98 ++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/mem-events.h | 36 ++++++++++++++++
2 files changed, 134 insertions(+)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index bbc368e7d1e4..502fcee91973 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -9,6 +9,7 @@
#include "mem-events.h"
#include "debug.h"
#include "symbol.h"
+#include "sort.h"
unsigned int perf_mem_events__loads_ldlat = 30;
@@ -268,3 +269,100 @@ int perf_script__meminfo_scnprintf(char *out, size_t sz, struct mem_info *mem_in
return i;
}
+
+int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi)
+{
+ union perf_mem_data_src *data_src = &mi->data_src;
+ u64 daddr = mi->daddr.addr;
+ u64 op = data_src->mem_op;
+ u64 lvl = data_src->mem_lvl;
+ u64 snoop = data_src->mem_snoop;
+ u64 lock = data_src->mem_lock;
+ int err = 0;
+
+#define P(a, b) PERF_MEM_##a##_##b
+
+ stats->nr_entries++;
+
+ if (lock & P(LOCK, LOCKED)) stats->locks++;
+
+ if (op & P(OP, LOAD)) {
+ /* load */
+ stats->load++;
+
+ if (!daddr) {
+ stats->ld_noadrs++;
+ return -1;
+ }
+
+ if (lvl & P(LVL, HIT)) {
+ if (lvl & P(LVL, UNC)) stats->ld_uncache++;
+ if (lvl & P(LVL, IO)) stats->ld_io++;
+ if (lvl & P(LVL, LFB)) stats->ld_fbhit++;
+ if (lvl & P(LVL, L1 )) stats->ld_l1hit++;
+ if (lvl & P(LVL, L2 )) stats->ld_l2hit++;
+ if (lvl & P(LVL, L3 )) {
+ if (snoop & P(SNOOP, HITM))
+ stats->lcl_hitm++;
+ else
+ stats->ld_llchit++;
+ }
+
+ if (lvl & P(LVL, LOC_RAM)) {
+ stats->lcl_dram++;
+ if (snoop & P(SNOOP, HIT))
+ stats->ld_shared++;
+ else
+ stats->ld_excl++;
+ }
+
+ if ((lvl & P(LVL, REM_RAM1)) ||
+ (lvl & P(LVL, REM_RAM2))) {
+ stats->rmt_dram++;
+ if (snoop & P(SNOOP, HIT))
+ stats->ld_shared++;
+ else
+ stats->ld_excl++;
+ }
+ }
+
+ if ((lvl & P(LVL, REM_CCE1)) ||
+ (lvl & P(LVL, REM_CCE2))) {
+ if (snoop & P(SNOOP, HIT))
+ stats->rmt_hit++;
+ else if (snoop & P(SNOOP, HITM))
+ stats->rmt_hitm++;
+ }
+
+ if ((lvl & P(LVL, MISS)))
+ stats->ld_miss++;
+
+ } else if (op & P(OP, STORE)) {
+ /* store */
+ stats->store++;
+
+ if (!daddr) {
+ stats->st_noadrs++;
+ return -1;
+ }
+
+ if (lvl & P(LVL, HIT)) {
+ if (lvl & P(LVL, UNC)) stats->st_uncache++;
+ if (lvl & P(LVL, L1 )) stats->st_l1hit++;
+ }
+ if (lvl & P(LVL, MISS))
+ if (lvl & P(LVL, L1)) stats->st_l1miss++;
+ } else {
+ /* unparsable data_src? */
+ stats->noparse++;
+ return -1;
+ }
+
+ if (!mi->daddr.map || !mi->iaddr.map) {
+ stats->nomap++;
+ return -1;
+ }
+
+#undef P
+ return err;
+}
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 7f69bf9d789d..27c6bb5abafb 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -2,6 +2,10 @@
#define __PERF_MEM_EVENTS_H
#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <linux/types.h>
+#include "stat.h"
struct perf_mem_event {
bool record;
@@ -33,4 +37,36 @@ int perf_mem__lck_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
int perf_script__meminfo_scnprintf(char *bf, size_t size, struct mem_info *mem_info);
+struct c2c_stats {
+ int nr_entries;
+
+ int locks; /* count of 'lock' transactions */
+ int store; /* count of all stores in trace */
+ int st_uncache; /* stores to uncacheable address */
+ int st_noadrs; /* cacheable store with no address */
+ int st_l1hit; /* count of stores that hit L1D */
+ int st_l1miss; /* count of stores that miss L1D */
+ int load; /* count of all loads in trace */
+ int ld_excl; /* exclusive loads, rmt/lcl DRAM - snp none/miss */
+ int ld_shared; /* shared loads, rmt/lcl DRAM - snp hit */
+ int ld_uncache; /* loads to uncacheable address */
+ int ld_io; /* loads to io address */
+ int ld_miss; /* loads miss */
+ int ld_noadrs; /* cacheable load with no address */
+ int ld_fbhit; /* count of loads hitting Fill Buffer */
+ int ld_l1hit; /* count of loads that hit L1D */
+ int ld_l2hit; /* count of loads that hit L2D */
+ int ld_llchit; /* count of loads that hit LLC */
+ int lcl_hitm; /* count of loads with local HITM */
+ int rmt_hitm; /* count of loads with remote HITM */
+ int rmt_hit; /* count of loads with remote hit clean */
+ int lcl_dram; /* count of loads miss to local DRAM */
+ int rmt_dram; /* count of loads miss to remote DRAM */
+ int nomap; /* count of load/stores with no phys adrs */
+ int noparse; /* count of unparsable data sources */
+};
+
+struct hist_entry;
+int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi);
+
#endif /* __PERF_MEM_EVENTS_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
2016-09-19 13:09 ` [PATCH 05/61] perf tools: Introduce c2c_decode_stats function Jiri Olsa
@ 2016-09-19 17:15 ` Nilay Vaish
2016-09-19 18:04 ` Joe Mario
[not found] ` <CACDz1GupJi3kcDx6zBK68KtpL=Q9hJvUFvHCdtMirMyuuuyMOQ@mail.gmail.com>
1 sibling, 1 reply; 85+ messages in thread
From: Nilay Vaish @ 2016-09-19 17:15 UTC (permalink / raw)
To: Jiri Olsa
Cc: Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On 19 September 2016 at 08:09, Jiri Olsa <jolsa@kernel.org> wrote:
> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
> index 7f69bf9d789d..27c6bb5abafb 100644
> --- a/tools/perf/util/mem-events.h
> +++ b/tools/perf/util/mem-events.h
> @@ -2,6 +2,10 @@
> #define __PERF_MEM_EVENTS_H
>
> #include <stdbool.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <linux/types.h>
> +#include "stat.h"
>
> struct perf_mem_event {
> bool record;
> @@ -33,4 +37,36 @@ int perf_mem__lck_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
>
> int perf_script__meminfo_scnprintf(char *bf, size_t size, struct mem_info *mem_info);
>
> +struct c2c_stats {
> + int nr_entries;
> +
> + int locks; /* count of 'lock' transactions */
> + int store; /* count of all stores in trace */
> + int st_uncache; /* stores to uncacheable address */
> + int st_noadrs; /* cacheable store with no address */
No address! Why would that happen?
--
Nilay
^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
2016-09-19 17:15 ` Nilay Vaish
@ 2016-09-19 18:04 ` Joe Mario
0 siblings, 0 replies; 85+ messages in thread
From: Joe Mario @ 2016-09-19 18:04 UTC (permalink / raw)
To: Nilay Vaish, Jiri Olsa
Cc: Arnaldo Carvalho de Melo, lkml, Don Zickus, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
On 09/19/2016 01:15 PM, Nilay Vaish wrote:
> On 19 September 2016 at 08:09, Jiri Olsa <jolsa@kernel.org> wrote:
>> diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
>> index 7f69bf9d789d..27c6bb5abafb 100644
>> --- a/tools/perf/util/mem-events.h
>> +++ b/tools/perf/util/mem-events.h
>> @@ -2,6 +2,10 @@
>> #define __PERF_MEM_EVENTS_H
>>
>> #include <stdbool.h>
>> +#include <stdint.h>
>> +#include <stdio.h>
>> +#include <linux/types.h>
>> +#include "stat.h"
>>
>> struct perf_mem_event {
>> bool record;
>> @@ -33,4 +37,36 @@ int perf_mem__lck_scnprintf(char *out, size_t sz, struct mem_info *mem_info);
>>
>> int perf_script__meminfo_scnprintf(char *bf, size_t size, struct mem_info *mem_info);
>>
>> +struct c2c_stats {
>> + int nr_entries;
>> +
>> + int locks; /* count of 'lock' transactions */
>> + int store; /* count of all stores in trace */
>> + int st_uncache; /* stores to uncacheable address */
>> + int st_noadrs; /* cacheable store with no address */
>
> No address! Why would that happen?
[Resending without the html]
There are a small number of instructions that will trigger a perf mem event and will have no address associated with them. Three of them include mfence, wrmsr, and rdtsc. I believe there are at least two more.
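To illustrate why the patch keeps separate ld_noadrs/st_noadrs counters, here is a minimal sketch of the same bookkeeping. It is not the code from the patch: struct noadrs_stats, enum sample_op and count_sample() are hypothetical, trimmed-down stand-ins, and the only behaviour assumed is what c2c_decode_stats shows above (count the access, then bail out with -1 when the data address is zero).

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct c2c_stats. */
struct noadrs_stats {
	uint32_t load;      /* all loads seen */
	uint32_t store;     /* all stores seen */
	uint32_t ld_noadrs; /* loads that carried no data address */
	uint32_t st_noadrs; /* stores that carried no data address */
};

/* Hypothetical replacement for the PERF_MEM_OP_* bits. */
enum sample_op { SAMPLE_LOAD, SAMPLE_STORE };

/*
 * Count one sample. Returns 0 when the sample has a usable data
 * address, -1 when it has none (as reported for e.g. mfence).
 */
static int count_sample(struct noadrs_stats *stats, enum sample_op op,
			uint64_t daddr)
{
	if (op == SAMPLE_LOAD)
		stats->load++;
	else
		stats->store++;

	if (!daddr) {
		if (op == SAMPLE_LOAD)
			stats->ld_noadrs++;
		else
			stats->st_noadrs++;
		return -1;
	}

	return 0;
}
```

A caller would skip any further cacheline processing for a sample when count_sample() returns -1, while the totals still account for it.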
>
>
> --
> Nilay
>
^ permalink raw reply [flat|nested] 85+ messages in thread
[parent not found: <CACDz1GupJi3kcDx6zBK68KtpL=Q9hJvUFvHCdtMirMyuuuyMOQ@mail.gmail.com>]
* Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
[not found] ` <CACDz1GupJi3kcDx6zBK68KtpL=Q9hJvUFvHCdtMirMyuuuyMOQ@mail.gmail.com>
@ 2016-09-21 9:18 ` Jiri Olsa
2016-09-21 15:16 ` Don Zickus
0 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-21 9:18 UTC (permalink / raw)
To: Stanislav Ievlev
Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On Wed, Sep 21, 2016 at 09:08:40AM +0000, Stanislav Ievlev wrote:
> Hi, Jiri!
>
> Why are you not using unsigned integer for counters in c2c_stats structure?
hi,
never really thought of that, because that's one of the original
patches I could take almost untouched.. so no real reason ;-)
jirka
>
> On Mon, Sep 19, 2016 at 4:27 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> > Introducing c2c_decode_stats function, which decodes
> > data_src data into new struct c2c_stats.
> >
> > +struct c2c_stats {
> > + int nr_entries;
> > +
> > + int locks; /* count of 'lock' transactions */
> > + int store; /* count of all stores in trace */
> > + int st_uncache; /* stores to uncacheable address */
> > + int st_noadrs; /* cacheable store with no address */
> > + int st_l1hit; /* count of stores that hit L1D */
> > + int st_l1miss; /* count of stores that miss L1D */
> > + int load; /* count of all loads in trace */
> > + int ld_excl; /* exclusive loads, rmt/lcl DRAM -
> > snp none/miss */
> > + int ld_shared; /* shared loads, rmt/lcl DRAM - snp
> > hit */
> > + int ld_uncache; /* loads to uncacheable address */
> > + int ld_io; /* loads to io address */
> > + int ld_miss; /* loads miss */
> > + int ld_noadrs; /* cacheable load with no address */
> > + int ld_fbhit; /* count of loads hitting Fill Buffer
> > */
> > + int ld_l1hit; /* count of loads that hit L1D */
> > + int ld_l2hit; /* count of loads that hit L2D */
> > + int ld_llchit; /* count of loads that hit LLC */
> > + int lcl_hitm; /* count of loads with local HITM */
> > + int rmt_hitm; /* count of loads with remote HITM */
> > + int rmt_hit; /* count of loads with remote hit
> > clean; */
> > + int lcl_dram; /* count of loads miss to local DRAM
> > */
> > + int rmt_dram; /* count of loads miss to remote DRAM
> > */
> > + int nomap; /* count of load/stores with no phys
> > adrs */
> > + int noparse; /* count of unparsable data sources */
> > +};
> >
> >
^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
2016-09-21 9:18 ` Jiri Olsa
@ 2016-09-21 15:16 ` Don Zickus
2016-09-21 15:32 ` Jiri Olsa
0 siblings, 1 reply; 85+ messages in thread
From: Don Zickus @ 2016-09-21 15:16 UTC (permalink / raw)
To: Jiri Olsa
Cc: Stanislav Ievlev, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
Joe Mario, Ingo Molnar, Peter Zijlstra, Namhyung Kim,
David Ahern, Andi Kleen
On Wed, Sep 21, 2016 at 11:18:29AM +0200, Jiri Olsa wrote:
> On Wed, Sep 21, 2016 at 09:08:40AM +0000, Stanislav Ievlev wrote:
> > Hi, Jiri!
> >
> > Why are you not using unsigned integer for counters in c2c_stats structure?
>
> hi,
> never really thought of that, because that's one of the original
> patches I could take almost untouched.. so no real reason ;-)
Hi Jirka,
I can't recall the reason Dick and myself started that way. I think it
makes sense to use u32 here. So I am fine with it. :-)
Cheers,
Don
>
> jirka
>
> >
> > On Mon, Sep 19, 2016 at 4:27 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > > Introducing c2c_decode_stats function, which decodes
> > > data_src data into new struct c2c_stats.
> > >
> > > +struct c2c_stats {
> > > + int nr_entries;
> > > +
> > > + int locks; /* count of 'lock' transactions */
> > > + int store; /* count of all stores in trace */
> > > + int st_uncache; /* stores to uncacheable address */
> > > + int st_noadrs; /* cacheable store with no address */
> > > + int st_l1hit; /* count of stores that hit L1D */
> > > + int st_l1miss; /* count of stores that miss L1D */
> > > + int load; /* count of all loads in trace */
> > > + int ld_excl; /* exclusive loads, rmt/lcl DRAM -
> > > snp none/miss */
> > > + int ld_shared; /* shared loads, rmt/lcl DRAM - snp
> > > hit */
> > > + int ld_uncache; /* loads to uncacheable address */
> > > + int ld_io; /* loads to io address */
> > > + int ld_miss; /* loads miss */
> > > + int ld_noadrs; /* cacheable load with no address */
> > > + int ld_fbhit; /* count of loads hitting Fill Buffer
> > > */
> > > + int ld_l1hit; /* count of loads that hit L1D */
> > > + int ld_l2hit; /* count of loads that hit L2D */
> > > + int ld_llchit; /* count of loads that hit LLC */
> > > + int lcl_hitm; /* count of loads with local HITM */
> > > + int rmt_hitm; /* count of loads with remote HITM */
> > > + int rmt_hit; /* count of loads with remote hit
> > > clean; */
> > > + int lcl_dram; /* count of loads miss to local DRAM
> > > */
> > > + int rmt_dram; /* count of loads miss to remote DRAM
> > > */
> > > + int nomap; /* count of load/stores with no phys
> > > adrs */
> > > + int noparse; /* count of unparsable data sources */
> > > +};
> > >
> > >
^ permalink raw reply [flat|nested] 85+ messages in thread
* Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function
2016-09-21 15:16 ` Don Zickus
@ 2016-09-21 15:32 ` Jiri Olsa
0 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-21 15:32 UTC (permalink / raw)
To: Don Zickus
Cc: Stanislav Ievlev, Jiri Olsa, Arnaldo Carvalho de Melo, lkml,
Joe Mario, Ingo Molnar, Peter Zijlstra, Namhyung Kim,
David Ahern, Andi Kleen
On Wed, Sep 21, 2016 at 11:16:26AM -0400, Don Zickus wrote:
> On Wed, Sep 21, 2016 at 11:18:29AM +0200, Jiri Olsa wrote:
> > On Wed, Sep 21, 2016 at 09:08:40AM +0000, Stanislav Ievlev wrote:
> > > Hi, Jiri!
> > >
> > > Why are you not using unsigned integer for counters in c2c_stats structure?
> >
> > hi,
> > never really thought of that, because that's one of the original
> > patches I could take almost untouched.. so no real reason ;-)
>
> Hi Jirka,
>
> I can't recall the reason Dick and myself started that way. I think it
> makes sense to use u32 here. So I am fine with it. :-)
ok, will change
thanks,
jirka
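For reference, a rough sketch of what the unsigned variant discussed above could look like. This is an assumption about the follow-up, not the actual change: only a few fields are shown, struct c2c_stats_u32 and count_load() are made-up names, and uint32_t from <stdint.h> stands in for the u32 type from <linux/types.h> so the fragment compiles on its own.

```c
#include <stdint.h>

/* Hypothetical excerpt: the same counters, declared unsigned. */
struct c2c_stats_u32 {
	uint32_t nr_entries;
	uint32_t locks; /* count of 'lock' transactions */
	uint32_t store; /* count of all stores in trace */
	uint32_t load;  /* count of all loads in trace */
};

/* Count one load sample; unsigned counters wrap instead of overflowing. */
static void count_load(struct c2c_stats_u32 *stats)
{
	stats->nr_entries++;
	stats->load++;
}
```

The practical difference is that a counter can never go negative, and wrap-around on overflow is well defined in C, unlike signed overflow.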
^ permalink raw reply [flat|nested] 85+ messages in thread
* [PATCH 06/61] perf tools: Introduce c2c_add_stats function
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (4 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 05/61] perf tools: Introduce c2c_decode_stats function Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 07/61] perf tools: Make reset_dimensions global Jiri Olsa
` (54 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Introducing c2c_add_stats helper function to
accumulate c2c_stats.
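A minimal sketch of the intended usage pattern, assuming per-entry stats get folded into a running total. struct mini_stats and mini_add_stats() are hypothetical, reduced versions of struct c2c_stats and c2c_add_stats; the const qualifier on the second argument is an illustrative choice and is not part of this patch.

```c
#include <stdint.h>

/* Hypothetical three-field stand-in for struct c2c_stats. */
struct mini_stats {
	uint32_t nr_entries;
	uint32_t load;
	uint32_t store;
};

/* Add every counter of 'add' into 'stats', field by field. */
static void mini_add_stats(struct mini_stats *stats,
			   const struct mini_stats *add)
{
	stats->nr_entries += add->nr_entries;
	stats->load       += add->load;
	stats->store      += add->store;
}
```

Calling the helper once per entry leaves the grand total in the first argument.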
Original-patch-by: Dick Fowles <rfowles@redhat.com>
Original-patch-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-7garqfmx5izaqysde9jik4iy@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/mem-events.c | 30 ++++++++++++++++++++++++++++++
tools/perf/util/mem-events.h | 1 +
2 files changed, 31 insertions(+)
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 502fcee91973..e50773286ef6 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -366,3 +366,33 @@ int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi)
#undef P
return err;
}
+
+void c2c_add_stats(struct c2c_stats *stats, struct c2c_stats *add)
+{
+ stats->nr_entries += add->nr_entries;
+
+ stats->locks += add->locks;
+ stats->store += add->store;
+ stats->st_uncache += add->st_uncache;
+ stats->st_noadrs += add->st_noadrs;
+ stats->st_l1hit += add->st_l1hit;
+ stats->st_l1miss += add->st_l1miss;
+ stats->load += add->load;
+ stats->ld_excl += add->ld_excl;
+ stats->ld_shared += add->ld_shared;
+ stats->ld_uncache += add->ld_uncache;
+ stats->ld_io += add->ld_io;
+ stats->ld_miss += add->ld_miss;
+ stats->ld_noadrs += add->ld_noadrs;
+ stats->ld_fbhit += add->ld_fbhit;
+ stats->ld_l1hit += add->ld_l1hit;
+ stats->ld_l2hit += add->ld_l2hit;
+ stats->ld_llchit += add->ld_llchit;
+ stats->lcl_hitm += add->lcl_hitm;
+ stats->rmt_hitm += add->rmt_hitm;
+ stats->rmt_hit += add->rmt_hit;
+ stats->lcl_dram += add->lcl_dram;
+ stats->rmt_dram += add->rmt_dram;
+ stats->nomap += add->nomap;
+ stats->noparse += add->noparse;
+}
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 27c6bb5abafb..30b3757ee326 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -68,5 +68,6 @@ struct c2c_stats {
struct hist_entry;
int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi);
+void c2c_add_stats(struct c2c_stats *stats, struct c2c_stats *add);
#endif /* __PERF_MEM_EVENTS_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 07/61] perf tools: Make reset_dimensions global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (5 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 06/61] perf tools: Introduce c2c_add_stats function Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 08/61] perf tools: Make output_field_add and sort_dimension__add global Jiri Olsa
` (53 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-7garqfmx5izaqysde9jik4iy@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/sort.c | 2 +-
tools/perf/util/sort.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 1884d7f9b9d2..9e1f6f75a50f 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2748,7 +2748,7 @@ static int setup_output_list(struct perf_hpp_list *list, char *str)
return ret;
}
-static void reset_dimensions(void)
+void reset_dimensions(void)
{
unsigned int i;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 28c0524c8702..3f743bf2acd4 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -268,4 +268,5 @@ int report_parse_ignore_callees_opt(const struct option *opt, const char *arg, i
bool is_strict_order(const char *order);
int hpp_dimension__add_output(unsigned col);
+void reset_dimensions(void);
#endif /* __PERF_SORT_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 08/61] perf tools: Make output_field_add and sort_dimension__add global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (6 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 07/61] perf tools: Make reset_dimensions global Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 09/61] perf tools: Make several sorting functions global Jiri Olsa
` (52 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-15488tnxcj4rtteksy79y4qu@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/sort.c | 8 ++++----
tools/perf/util/sort.h | 4 ++++
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 9e1f6f75a50f..9f7c1ea9e3ad 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2308,9 +2308,9 @@ int hpp_dimension__add_output(unsigned col)
return __hpp_dimension__add_output(&perf_hpp_list, &hpp_sort_dimensions[col]);
}
-static int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
- struct perf_evlist *evlist,
- int level)
+int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
+ struct perf_evlist *evlist,
+ int level)
{
unsigned int i;
@@ -2685,7 +2685,7 @@ void sort__setup_elide(FILE *output)
}
}
-static int output_field_add(struct perf_hpp_list *list, char *tok)
+int output_field_add(struct perf_hpp_list *list, char *tok)
{
unsigned int i;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 3f743bf2acd4..ac7998048b1e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -269,4 +269,8 @@ bool is_strict_order(const char *order);
int hpp_dimension__add_output(unsigned col);
void reset_dimensions(void);
+int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
+ struct perf_evlist *evlist,
+ int level);
+int output_field_add(struct perf_hpp_list *list, char *tok);
#endif /* __PERF_SORT_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 09/61] perf tools: Make several sorting functions global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (7 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 08/61] perf tools: Make output_field_add and sort_dimension__add global Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 10/61] perf tools: Make several display " Jiri Olsa
` (51 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-4jyvw21cac7yuqsdkzdo5e2w@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/sort.c | 6 +++---
tools/perf/util/sort.h | 6 ++++++
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 9f7c1ea9e3ad..452e15a10dd2 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -867,7 +867,7 @@ struct sort_entry sort_cycles = {
};
/* --sort daddr_sym */
-static int64_t
+int64_t
sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right)
{
uint64_t l = 0, r = 0;
@@ -896,7 +896,7 @@ static int hist_entry__daddr_snprintf(struct hist_entry *he, char *bf,
width);
}
-static int64_t
+int64_t
sort__iaddr_cmp(struct hist_entry *left, struct hist_entry *right)
{
uint64_t l = 0, r = 0;
@@ -1062,7 +1062,7 @@ static int hist_entry__snoop_snprintf(struct hist_entry *he, char *bf,
return repsep_snprintf(bf, size, "%-*s", width, out);
}
-static int64_t
+int64_t
sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right)
{
u64 l, r;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index ac7998048b1e..d4ef567dcd7b 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -273,4 +273,10 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
struct perf_evlist *evlist,
int level);
int output_field_add(struct perf_hpp_list *list, char *tok);
+int64_t
+sort__iaddr_cmp(struct hist_entry *left, struct hist_entry *right);
+int64_t
+sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right);
+int64_t
+sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
#endif /* __PERF_SORT_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 10/61] perf tools: Make several display functions global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (8 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 09/61] perf tools: Make several sorting functions global Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 11/61] perf tools: Make hist_entry__snprintf function global Jiri Olsa
` (50 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-w5tpcitxjvufkndq0x5ehsx3@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/ui/browsers/hists.c | 2 +-
tools/perf/ui/hist.c | 2 +-
tools/perf/util/hist.h | 2 ++
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 35e44b1879e3..77cf7a80e8d6 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1080,7 +1080,7 @@ struct hpp_arg {
bool current_entry;
};
-static int __hpp__slsmg_color_printf(struct perf_hpp *hpp, const char *fmt, ...)
+int __hpp__slsmg_color_printf(struct perf_hpp *hpp, const char *fmt, ...)
{
struct hpp_arg *arg = hpp->ptr;
int ret, len;
diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index b47fafc8ee2a..84ad92ad24be 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -237,7 +237,7 @@ static int hpp__header_fn(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
return scnprintf(hpp->buf, hpp->size, "%*s", len, fmt->name);
}
-static int hpp_color_scnprintf(struct perf_hpp *hpp, const char *fmt, ...)
+int hpp_color_scnprintf(struct perf_hpp *hpp, const char *fmt, ...)
{
va_list args;
ssize_t ssize = hpp->size;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index a002c93fe422..ef9985cba1de 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -484,5 +484,7 @@ static inline struct rb_node *rb_hierarchy_next(struct rb_node *node)
#define HIERARCHY_INDENT 3
bool hist_entry__has_hierarchy_children(struct hist_entry *he, float limit);
+int hpp_color_scnprintf(struct perf_hpp *hpp, const char *fmt, ...);
+int __hpp__slsmg_color_printf(struct perf_hpp *hpp, const char *fmt, ...);
#endif /* __PERF_HIST_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 11/61] perf tools: Make hist_entry__snprintf function global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (9 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 10/61] perf tools: Make several display " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 12/61] perf tools: Make hists__fprintf_headers " Jiri Olsa
` (49 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-uip4x9u74t3dcz8sh4meiy5i@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/ui/stdio/hist.c | 4 ++--
tools/perf/util/hist.h | 2 ++
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 0a32b48eda80..3434d571ddd1 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -373,8 +373,8 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
return 0;
}
-static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
- struct perf_hpp_list *hpp_list)
+int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
+ struct perf_hpp_list *hpp_list)
{
const char *sep = symbol_conf.field_sep;
struct perf_hpp_fmt *fmt;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index ef9985cba1de..aa5ddfa1fa22 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -486,5 +486,7 @@ static inline struct rb_node *rb_hierarchy_next(struct rb_node *node)
bool hist_entry__has_hierarchy_children(struct hist_entry *he, float limit);
int hpp_color_scnprintf(struct perf_hpp *hpp, const char *fmt, ...);
int __hpp__slsmg_color_printf(struct perf_hpp *hpp, const char *fmt, ...);
+int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
+ struct perf_hpp_list *hpp_list);
#endif /* __PERF_HIST_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 12/61] perf tools: Make hists__fprintf_headers function global
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (10 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 11/61] perf tools: Make hist_entry__snprintf function global Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 13/61] perf c2c: Add c2c command Jiri Olsa
` (48 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Will be used from external places in following patches.
Link: http://lkml.kernel.org/n/tip-ydj205bfen9fgflnv39hnrdh@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/ui/stdio/hist.c | 2 +-
tools/perf/util/hist.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 3434d571ddd1..f6d5ac8772f4 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -697,7 +697,7 @@ hists__fprintf_standard_headers(struct hists *hists,
return hpp_list->nr_header_lines + 2;
}
-static int hists__fprintf_headers(struct hists *hists, FILE *fp)
+int hists__fprintf_headers(struct hists *hists, FILE *fp)
{
char bf[1024];
struct perf_hpp dummy_hpp = {
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index aa5ddfa1fa22..0e3493e33175 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -488,5 +488,6 @@ int hpp_color_scnprintf(struct perf_hpp *hpp, const char *fmt, ...);
int __hpp__slsmg_color_printf(struct perf_hpp *hpp, const char *fmt, ...);
int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp,
struct perf_hpp_list *hpp_list);
+int hists__fprintf_headers(struct hists *hists, FILE *fp);
#endif /* __PERF_HIST_H */
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 13/61] perf c2c: Add c2c command
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (11 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 12/61] perf tools: Make hists__fprintf_headers " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 14/61] perf c2c: Add record subcommand Jiri Olsa
` (47 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding the basic wiring for the c2c command. Its implementation
will be added gradually in the following patches.
Link: http://lkml.kernel.org/n/tip-svq2kccqjaaieb6rxhky3oif@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/Build | 1 +
tools/perf/builtin-c2c.c | 23 +++++++++++++++++++++++
tools/perf/builtin.h | 1 +
tools/perf/perf.c | 1 +
4 files changed, 26 insertions(+)
create mode 100644 tools/perf/builtin-c2c.c
diff --git a/tools/perf/Build b/tools/perf/Build
index a43fae7f439a..b12d5d1666e3 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -21,6 +21,7 @@ perf-y += builtin-inject.o
perf-y += builtin-mem.o
perf-y += builtin-data.o
perf-y += builtin-version.o
+perf-y += builtin-c2c.o
perf-$(CONFIG_AUDIT) += builtin-trace.o
perf-$(CONFIG_LIBELF) += builtin-probe.o
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
new file mode 100644
index 000000000000..8252ed0ba5d0
--- /dev/null
+++ b/tools/perf/builtin-c2c.c
@@ -0,0 +1,23 @@
+#include <linux/compiler.h>
+#include <linux/kernel.h>
+#include "util.h"
+#include "debug.h"
+#include "builtin.h"
+#include <subcmd/parse-options.h>
+
+static const char * const c2c_usage[] = {
+ "perf c2c",
+ NULL
+};
+
+int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
+{
+ const struct option c2c_options[] = {
+ OPT_INCR('v', "verbose", &verbose, "be more verbose"),
+ OPT_END()
+ };
+
+ argc = parse_options(argc, argv, c2c_options, c2c_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+ return 0;
+}
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 41c24010ab43..0bcf68e98ccc 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -18,6 +18,7 @@ int cmd_bench(int argc, const char **argv, const char *prefix);
int cmd_buildid_cache(int argc, const char **argv, const char *prefix);
int cmd_buildid_list(int argc, const char **argv, const char *prefix);
int cmd_config(int argc, const char **argv, const char *prefix);
+int cmd_c2c(int argc, const char **argv, const char *prefix);
int cmd_diff(int argc, const char **argv, const char *prefix);
int cmd_evlist(int argc, const char **argv, const char *prefix);
int cmd_help(int argc, const char **argv, const char *prefix);
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 64c06961bfe4..aa23b3347d6b 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -43,6 +43,7 @@ static struct cmd_struct commands[] = {
{ "buildid-cache", cmd_buildid_cache, 0 },
{ "buildid-list", cmd_buildid_list, 0 },
{ "config", cmd_config, 0 },
+ { "c2c", cmd_c2c, 0 },
{ "diff", cmd_diff, 0 },
{ "evlist", cmd_evlist, 0 },
{ "help", cmd_help, 0 },
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 14/61] perf c2c: Add record subcommand
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (12 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 13/61] perf c2c: Add c2c command Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 15/61] perf c2c: Add report subcommand Jiri Olsa
` (46 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding c2c record subcommand. It sets up options related
to HITM cacheline analysis and calls the standard perf
record command.
$ sudo perf c2c record -v -- -a
calling: record -W -d --sample-cpu -e cpu/mem-loads,ldlat=30/P -e cpu/mem-stores/P -a
...
It produces perf.data, which is then processed by
perf c2c report, added in the following patches.
Details are described in the man page, which is
added in one of the following patches.
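A minimal standalone sketch of the argument forwarding described
above. The helper name build_record_argv() is hypothetical and not
part of this patch, and the event strings are copied from the verbose
output shown earlier; the real code takes them from
perf_mem_events__name() and passes the vector to cmd_record():

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (illustration only): build a NULL-terminated
 * argument vector for "perf record" from the fixed c2c options,
 * followed by the user's remaining arguments. Returns NULL on
 * allocation failure; the caller frees the returned array.
 */
static const char **build_record_argv(int argc, const char **argv,
				      bool loads, bool stores, int *out_argc)
{
	/* 10 extra slots cover the fixed options, plus 1 for the NULL. */
	const char **rec_argv = calloc(argc + 10 + 1, sizeof(*rec_argv));
	int i = 0, j;

	if (!rec_argv)
		return NULL;

	rec_argv[i++] = "record";
	if (loads)
		rec_argv[i++] = "-W";	/* added only when loads are recorded */
	rec_argv[i++] = "-d";
	rec_argv[i++] = "--sample-cpu";
	if (loads) {
		rec_argv[i++] = "-e";
		rec_argv[i++] = "cpu/mem-loads,ldlat=30/P";
	}
	if (stores) {
		rec_argv[i++] = "-e";
		rec_argv[i++] = "cpu/mem-stores/P";
	}

	/* Forward whatever the option parser left untouched. */
	for (j = 0; j < argc; j++)
		rec_argv[i++] = argv[j];

	*out_argc = i;
	return rec_argv;
}
```

Calling it with the single leftover argument "-a" and both events
enabled yields the same vector as the "calling:" line above.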
Link: http://lkml.kernel.org/n/tip-hjxkryl43njyhaombycca7z9@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 114 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 8252ed0ba5d0..58924c67f818 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -4,12 +4,116 @@
#include "debug.h"
#include "builtin.h"
#include <subcmd/parse-options.h>
+#include "mem-events.h"
static const char * const c2c_usage[] = {
"perf c2c",
NULL
};
+static int parse_record_events(const struct option *opt __maybe_unused,
+ const char *str, int unset __maybe_unused)
+{
+ bool *event_set = (bool *) opt->value;
+
+ *event_set = true;
+ return perf_mem_events__parse(str);
+}
+
+
+static const char * const __usage_record[] = {
+ "perf c2c record [<options>] [<command>]",
+ "perf c2c record [<options>] -- <command> [<options>]",
+ NULL
+};
+
+static const char * const *record_mem_usage = __usage_record;
+
+static int perf_c2c__record(int argc, const char **argv)
+{
+ int rec_argc, i = 0, j;
+ const char **rec_argv;
+ int ret;
+ bool all_user = false, all_kernel = false;
+ bool event_set = false;
+ struct option options[] = {
+ OPT_CALLBACK('e', "event", &event_set, "event",
+ "event selector. Use 'perf mem record -e list' to list available events",
+ parse_record_events),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose (show counter open errors, etc)"),
+ OPT_BOOLEAN('u', "all-user", &all_user, "collect only user level data"),
+ OPT_BOOLEAN('k', "all-kernel", &all_kernel, "collect only kernel level data"),
+ OPT_UINTEGER('l', "ldlat", &perf_mem_events__loads_ldlat, "setup mem-loads latency"),
+ OPT_END()
+ };
+
+ if (perf_mem_events__init()) {
+ pr_err("failed: memory events not supported\n");
+ return -1;
+ }
+
+ argc = parse_options(argc, argv, options, record_mem_usage,
+ PARSE_OPT_KEEP_UNKNOWN);
+
+ rec_argc = argc + 10; /* max number of arguments */
+ rec_argv = calloc(rec_argc + 1, sizeof(char *));
+ if (!rec_argv)
+ return -1;
+
+ rec_argv[i++] = "record";
+
+ if (!event_set) {
+ perf_mem_events[PERF_MEM_EVENTS__LOAD].record = true;
+ perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+ }
+
+ if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+ rec_argv[i++] = "-W";
+
+ rec_argv[i++] = "-d";
+ rec_argv[i++] = "--sample-cpu";
+
+ for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
+ if (!perf_mem_events[j].record)
+ continue;
+
+ if (!perf_mem_events[j].supported) {
+ pr_err("failed: event '%s' not supported\n",
+ perf_mem_events[j].name);
+ return -1;
+ }
+
+ rec_argv[i++] = "-e";
+ rec_argv[i++] = perf_mem_events__name(j);
+ };
+
+ if (all_user)
+ rec_argv[i++] = "--all-user";
+
+ if (all_kernel)
+ rec_argv[i++] = "--all-kernel";
+
+ for (j = 0; j < argc; j++, i++)
+ rec_argv[i] = argv[j];
+
+ if (verbose > 0) {
+ pr_debug("calling: ");
+
+ j = 0;
+
+ while (rec_argv[j]) {
+ pr_debug("%s ", rec_argv[j]);
+ j++;
+ }
+ pr_debug("\n");
+ }
+
+ ret = cmd_record(i, rec_argv, NULL);
+ free(rec_argv);
+ return ret;
+}
+
int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
{
const struct option c2c_options[] = {
@@ -19,5 +123,15 @@ int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
argc = parse_options(argc, argv, c2c_options, c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
+
+ if (!argc)
+ usage_with_options(c2c_usage, c2c_options);
+
+ if (!strncmp(argv[0], "rec", 3)) {
+ return perf_c2c__record(argc, argv);
+ } else {
+ usage_with_options(c2c_usage, c2c_options);
+ }
+
return 0;
}
--
2.7.4
* [PATCH 15/61] perf c2c: Add report subcommand
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (13 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 14/61] perf c2c: Add record subcommand Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 16/61] perf c2c report: Add dimension support Jiri Olsa
` (45 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Add the c2c report subcommand. It reads perf.data
and displays the shared data analysis.
This patch adds only the basic report wiring. It gets
fully implemented in the following patches.
Link: http://lkml.kernel.org/n/tip-8smklfkveeyv1pahfxv2re44@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 65 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 58924c67f818..3fac3a294bdd 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -5,12 +5,74 @@
#include "builtin.h"
#include <subcmd/parse-options.h>
#include "mem-events.h"
+#include "session.h"
+#include "hist.h"
+#include "tool.h"
+#include "data.h"
+
+struct perf_c2c {
+ struct perf_tool tool;
+};
+
+static struct perf_c2c c2c;
static const char * const c2c_usage[] = {
- "perf c2c",
+ "perf c2c {record|report}",
NULL
};
+static const char * const __usage_report[] = {
+ "perf c2c report",
+ NULL
+};
+
+static const char * const *report_c2c_usage = __usage_report;
+
+static int perf_c2c__report(int argc, const char **argv)
+{
+ struct perf_session *session;
+ struct perf_data_file file = {
+ .mode = PERF_DATA_MODE_READ,
+ };
+ const struct option c2c_options[] = {
+ OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
+ "file", "vmlinux pathname"),
+ OPT_INCR('v', "verbose", &verbose,
+ "be more verbose (show counter open errors, etc)"),
+ OPT_STRING('i', "input", &input_name, "file",
+ "the input file to process"),
+ OPT_END()
+ };
+ int err = 0;
+
+ argc = parse_options(argc, argv, c2c_options, report_c2c_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+ if (!argc)
+ usage_with_options(report_c2c_usage, c2c_options);
+
+ file.path = input_name;
+
+ session = perf_session__new(&file, 0, &c2c.tool);
+ if (session == NULL) {
+ pr_debug("No memory for session\n");
+ goto out;
+ }
+
+ if (symbol__init(&session->header.env) < 0)
+ goto out_session;
+
+ /* No pipe support at the moment. */
+ if (perf_data_file__is_pipe(session->file)) {
+ pr_debug("No pipe support at the moment.\n");
+ goto out_session;
+ }
+
+out_session:
+ perf_session__delete(session);
+out:
+ return err;
+}
+
static int parse_record_events(const struct option *opt __maybe_unused,
const char *str, int unset __maybe_unused)
{
@@ -129,6 +191,8 @@ int cmd_c2c(int argc, const char **argv, const char *prefix __maybe_unused)
if (!strncmp(argv[0], "rec", 3)) {
return perf_c2c__record(argc, argv);
+ } else if (!strncmp(argv[0], "rep", 3)) {
+ return perf_c2c__report(argc, argv);
} else {
usage_with_options(c2c_usage, c2c_options);
}
--
2.7.4
* [PATCH 16/61] perf c2c report: Add dimension support
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (14 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 15/61] perf c2c: Add report subcommand Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 17/61] perf c2c report: Add sort_entry " Jiri Olsa
` (44 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Add the bare bones of dimension support for c2c report.
The main interface functions are:
c2c_hists__init
c2c_hists__reinit
which initialize or reinitialize a 'struct c2c_hists' object
from sort/display entry strings, similar to what the
setup_sorting function does.
We overload the dimension to provide multi-line
header support for sort/display entries.
We also wrap the base 'struct perf_hpp_fmt' object
in 'struct c2c_fmt' to define c2c-specific functions
that deal with multi-line headers and spans.
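A rough standalone sketch of the wrapping idea, using simplified
stand-in types rather than the real perf structures. All names ending
in _sketch, plus 'struct hpp_fmt' and 'struct dimension', are
assumptions made for illustration, and snprintf stands in for
scnprintf:

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the perf types; only the fields used here. */
struct hpp_fmt {
	int (*width)(struct hpp_fmt *fmt);
};

struct dimension {
	const char *name;
	int width;
	const char *header[2];	/* one text per header line, may be NULL */
};

/* The generic format is embedded, so callbacks can recover the wrapper. */
struct c2c_fmt_sketch {
	struct hpp_fmt fmt;
	struct dimension *dim;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int sketch_width(struct hpp_fmt *fmt)
{
	struct c2c_fmt_sketch *c2c_fmt;

	c2c_fmt = container_of(fmt, struct c2c_fmt_sketch, fmt);
	return c2c_fmt->dim->width;
}

/* Print one header line, right-aligned to the dimension's width. */
static int sketch_header(struct hpp_fmt *fmt, char *buf, size_t size, int line)
{
	struct c2c_fmt_sketch *c2c_fmt;
	const char *text;

	c2c_fmt = container_of(fmt, struct c2c_fmt_sketch, fmt);
	text = c2c_fmt->dim->header[line];
	if (!text)
		text = "";

	return snprintf(buf, size, "%*s", fmt->width(fmt), text);
}
```

Generic code only ever sees the embedded 'struct hpp_fmt' pointer; the
callbacks step back to the wrapper to reach the per-dimension header
lines. Span handling is left out of the sketch.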
Link: http://lkml.kernel.org/n/tip-yg8p7bc8p7grxg4eifs2als2@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 239 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 238 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3fac3a294bdd..63c0e2d8d2d8 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -10,8 +10,14 @@
#include "tool.h"
#include "data.h"
+struct c2c_hists {
+ struct hists hists;
+ struct perf_hpp_list list;
+};
+
struct perf_c2c {
- struct perf_tool tool;
+ struct perf_tool tool;
+ struct c2c_hists hists;
};
static struct perf_c2c c2c;
@@ -28,6 +34,231 @@ static const char * const __usage_report[] = {
static const char * const *report_c2c_usage = __usage_report;
+#define C2C_HEADER_MAX 2
+
+struct c2c_header {
+ struct {
+ const char *text;
+ int span;
+ } line[C2C_HEADER_MAX];
+};
+
+struct c2c_dimension {
+ struct c2c_header header;
+ const char *name;
+ int width;
+
+ int64_t (*cmp)(struct perf_hpp_fmt *fmt,
+ struct hist_entry *, struct hist_entry *);
+ int (*entry)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he);
+ int (*color)(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he);
+};
+
+struct c2c_fmt {
+ struct perf_hpp_fmt fmt;
+ struct c2c_dimension *dim;
+};
+
+static int c2c_width(struct perf_hpp_fmt *fmt,
+ struct perf_hpp *hpp __maybe_unused,
+ struct hists *hists __maybe_unused)
+{
+ struct c2c_fmt *c2c_fmt;
+
+ c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ return c2c_fmt->dim->width;
+}
+
+static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hists *hists __maybe_unused, int line, int *span)
+{
+ struct c2c_fmt *c2c_fmt;
+ struct c2c_dimension *dim;
+ int len = c2c_width(fmt, hpp, hists);
+ const char *text;
+
+ c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ dim = c2c_fmt->dim;
+
+ text = dim->header.line[line].text;
+ if (text == NULL)
+ text = "";
+
+ if (*span) {
+ (*span)--;
+ return 0;
+ } else {
+ *span = dim->header.line[line].span;
+ }
+
+ return scnprintf(hpp->buf, hpp->size, "%*s", len, text);
+}
+
+static struct c2c_dimension *dimensions[] = {
+ NULL,
+};
+
+static void fmt_free(struct perf_hpp_fmt *fmt)
+{
+ struct c2c_fmt *c2c_fmt;
+
+ c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ free(c2c_fmt);
+}
+
+static bool fmt_equal(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b)
+{
+ struct c2c_fmt *c2c_a = container_of(a, struct c2c_fmt, fmt);
+ struct c2c_fmt *c2c_b = container_of(b, struct c2c_fmt, fmt);
+
+ return c2c_a->dim == c2c_b->dim;
+}
+
+static struct c2c_dimension *get_dimension(const char *name)
+{
+ unsigned int i;
+
+ for (i = 0; dimensions[i]; i++) {
+ struct c2c_dimension *dim = dimensions[i];
+
+ if (!strcmp(dim->name, name))
+ return dim;
+ };
+
+ return NULL;
+}
+
+static struct c2c_fmt *get_format(const char *name)
+{
+ struct c2c_dimension *dim = get_dimension(name);
+ struct c2c_fmt *c2c_fmt;
+ struct perf_hpp_fmt *fmt;
+
+ if (!dim)
+ return NULL;
+
+ c2c_fmt = zalloc(sizeof(*c2c_fmt));
+ if (!c2c_fmt)
+ return NULL;
+
+ c2c_fmt->dim = dim;
+
+ fmt = &c2c_fmt->fmt;
+ INIT_LIST_HEAD(&fmt->list);
+ INIT_LIST_HEAD(&fmt->sort_list);
+
+ fmt->cmp = dim->cmp;
+ fmt->sort = dim->cmp;
+ fmt->entry = dim->entry;
+ fmt->header = c2c_header;
+ fmt->width = c2c_width;
+ fmt->collapse = dim->cmp;
+ fmt->equal = fmt_equal;
+ fmt->free = fmt_free;
+
+ return c2c_fmt;
+}
+
+static int c2c_hists__init_output(struct perf_hpp_list *hpp_list, char *name)
+{
+ struct c2c_fmt *c2c_fmt = get_format(name);
+
+ if (!c2c_fmt)
+ return -1;
+
+ perf_hpp_list__column_register(hpp_list, &c2c_fmt->fmt);
+ return 0;
+}
+
+static int c2c_hists__init_sort(struct perf_hpp_list *hpp_list, char *name)
+{
+ struct c2c_fmt *c2c_fmt = get_format(name);
+
+ if (!c2c_fmt)
+ return -1;
+
+ perf_hpp_list__register_sort_field(hpp_list, &c2c_fmt->fmt);
+ return 0;
+}
+
+#define PARSE_LIST(_list, _fn) \
+ do { \
+ char *tmp, *tok; \
+ ret = 0; \
+ \
+ if (!_list) \
+ break; \
+ \
+ for (tok = strtok_r((char *)_list, ", ", &tmp); \
+ tok; tok = strtok_r(NULL, ", ", &tmp)) { \
+ ret = _fn(hpp_list, tok); \
+ if (ret == -EINVAL) { \
+ error("Invalid --fields key: `%s'", tok); \
+ break; \
+ } else if (ret == -ESRCH) { \
+ error("Unknown --fields key: `%s'", tok); \
+ break; \
+ } \
+ } \
+ } while (0)
+
+static int hpp_list__parse(struct perf_hpp_list *hpp_list,
+ const char *output_,
+ const char *sort_)
+{
+ char *output = output_ ? strdup(output_) : NULL;
+ char *sort = sort_ ? strdup(sort_) : NULL;
+ int ret;
+
+ PARSE_LIST(output, c2c_hists__init_output);
+ PARSE_LIST(sort, c2c_hists__init_sort);
+
+ /* copy sort keys to output fields */
+ perf_hpp__setup_output_field(hpp_list);
+
+ /*
+ * We dont need other sorting keys other than those
+ * we already specified. It also really slows down
+ * the processing a lot with big number of output
+ * fields, so switching this off for c2c.
+ */
+
+#if 0
+ /* and then copy output fields to sort keys */
+ perf_hpp__append_sort_keys(&hists->list);
+#endif
+
+ free(output);
+ free(sort);
+ return ret;
+}
+
+static int c2c_hists__init(struct c2c_hists *hists,
+ const char *sort)
+{
+ __hists__init(&hists->hists, &hists->list);
+
+ /*
+ * Initialize only with sort fields, we need to resort
+ * later anyway, and that's where we add output fields
+ * as well.
+ */
+ perf_hpp_list__init(&hists->list);
+
+ return hpp_list__parse(&hists->list, NULL, sort);
+}
+
+__maybe_unused
+static int c2c_hists__reinit(struct c2c_hists *c2c_hists,
+ const char *output,
+ const char *sort)
+{
+ perf_hpp__reset_output_field(&c2c_hists->list);
+ return hpp_list__parse(&c2c_hists->list, output, sort);
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -52,6 +283,12 @@ static int perf_c2c__report(int argc, const char **argv)
file.path = input_name;
+ err = c2c_hists__init(&c2c.hists, "dcacheline");
+ if (err) {
+ pr_debug("Failed to initialize hists\n");
+ goto out;
+ }
+
session = perf_session__new(&file, 0, &c2c.tool);
if (session == NULL) {
pr_debug("No memory for session\n");
--
2.7.4
* [PATCH 17/61] perf c2c report: Add sort_entry dimension support
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (15 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 16/61] perf c2c report: Add dimension support Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 18/61] perf c2c report: Fallback to standard dimensions Jiri Olsa
` (43 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Allow reusing 'struct sort_entry' objects
within c2c dimension support.
When a 'struct sort_entry' object meets
the needs of c2c report, we will use it directly
in the following patches.
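A small standalone sketch of the dispatch this enables, with
simplified stand-in types ('struct entry', 'struct sort_entry_sketch'
and 'struct dim_sketch' are assumptions, not perf structures). A
dimension that carries a sort entry forwards to it; one without falls
back to its own cmp callback:

```c
#include <stddef.h>
#include <stdint.h>

struct entry {
	int64_t key;
};

/* Simplified stand-in for 'struct sort_entry'. */
struct sort_entry_sketch {
	int64_t (*se_cmp)(struct entry *a, struct entry *b);
	int64_t (*se_collapse)(struct entry *a, struct entry *b); /* optional */
};

struct dim_sketch {
	struct sort_entry_sketch *se;	/* NULL for a native c2c dimension */
	int64_t (*cmp)(struct entry *a, struct entry *b);
};

static int64_t cmp_key(struct entry *a, struct entry *b)
{
	return a->key - b->key;
}

/* Collapse through the sort entry when there is one, else use dim->cmp. */
static int64_t dim_collapse(struct dim_sketch *dim,
			    struct entry *a, struct entry *b)
{
	if (dim->se) {
		int64_t (*fn)(struct entry *, struct entry *);

		/* Portable spelling of 'se_collapse ?: se_cmp'. */
		fn = dim->se->se_collapse ? dim->se->se_collapse
					  : dim->se->se_cmp;
		return fn(a, b);
	}

	return dim->cmp(a, b);
}
```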
Link: http://lkml.kernel.org/n/tip-a4jraum43uwhhnp91je2jnnk@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 82 ++++++++++++++++++++++++++++++++++++++----------
1 file changed, 65 insertions(+), 17 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 63c0e2d8d2d8..6b58b537bc9d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -9,6 +9,7 @@
#include "hist.h"
#include "tool.h"
#include "data.h"
+#include "sort.h"
struct c2c_hists {
struct hists hists;
@@ -47,6 +48,7 @@ struct c2c_dimension {
struct c2c_header header;
const char *name;
int width;
+ struct sort_entry *se;
int64_t (*cmp)(struct perf_hpp_fmt *fmt,
struct hist_entry *, struct hist_entry *);
@@ -66,34 +68,47 @@ static int c2c_width(struct perf_hpp_fmt *fmt,
struct hists *hists __maybe_unused)
{
struct c2c_fmt *c2c_fmt;
+ struct c2c_dimension *dim;
c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
- return c2c_fmt->dim->width;
+ dim = c2c_fmt->dim;
+
+ return dim->se ? hists__col_len(hists, dim->se->se_width_idx) :
+ c2c_fmt->dim->width;
}
static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
- struct hists *hists __maybe_unused, int line, int *span)
+ struct hists *hists, int line, int *span)
{
+ struct perf_hpp_list *hpp_list = hists->hpp_list;
struct c2c_fmt *c2c_fmt;
struct c2c_dimension *dim;
- int len = c2c_width(fmt, hpp, hists);
- const char *text;
+ const char *text = NULL;
+ int width = c2c_width(fmt, hpp, hists);
c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
dim = c2c_fmt->dim;
- text = dim->header.line[line].text;
- if (text == NULL)
- text = "";
-
- if (*span) {
- (*span)--;
- return 0;
+ if (dim->se) {
+ text = dim->header.line[line].text;
+ /* Use the last line from sort_entry if not defined. */
+ if (!text && (line == hpp_list->nr_header_lines - 1))
+ text = dim->se->se_header;
} else {
- *span = dim->header.line[line].span;
+ text = dim->header.line[line].text;
+
+ if (*span) {
+ (*span)--;
+ return 0;
+ } else {
+ *span = dim->header.line[line].span;
+ }
}
- return scnprintf(hpp->buf, hpp->size, "%*s", len, text);
+ if (text == NULL)
+ text = "";
+
+ return scnprintf(hpp->buf, hpp->size, "%*s", width, text);
}
static struct c2c_dimension *dimensions[] = {
@@ -130,6 +145,39 @@ static struct c2c_dimension *get_dimension(const char *name)
return NULL;
}
+static int c2c_se_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ struct c2c_dimension *dim = c2c_fmt->dim;
+ size_t len = fmt->user_len;
+
+ if (!len)
+ len = hists__col_len(he->hists, dim->se->se_width_idx);
+
+ return dim->se->se_snprintf(he, hpp->buf, hpp->size, len);
+}
+
+static int64_t c2c_se_cmp(struct perf_hpp_fmt *fmt,
+ struct hist_entry *a, struct hist_entry *b)
+{
+ struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ struct c2c_dimension *dim = c2c_fmt->dim;
+
+ return dim->se->se_cmp(a, b);
+}
+
+static int64_t c2c_se_collapse(struct perf_hpp_fmt *fmt,
+ struct hist_entry *a, struct hist_entry *b)
+{
+ struct c2c_fmt *c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
+ struct c2c_dimension *dim = c2c_fmt->dim;
+ int64_t (*collapse_fn)(struct hist_entry *, struct hist_entry *);
+
+ collapse_fn = dim->se->se_collapse ?: dim->se->se_cmp;
+ return collapse_fn(a, b);
+}
+
static struct c2c_fmt *get_format(const char *name)
{
struct c2c_dimension *dim = get_dimension(name);
@@ -149,12 +197,12 @@ static struct c2c_fmt *get_format(const char *name)
INIT_LIST_HEAD(&fmt->list);
INIT_LIST_HEAD(&fmt->sort_list);
- fmt->cmp = dim->cmp;
- fmt->sort = dim->cmp;
- fmt->entry = dim->entry;
+ fmt->cmp = dim->se ? c2c_se_cmp : dim->cmp;
+ fmt->sort = dim->se ? c2c_se_cmp : dim->cmp;
+ fmt->entry = dim->se ? c2c_se_entry : dim->entry;
fmt->header = c2c_header;
fmt->width = c2c_width;
- fmt->collapse = dim->cmp;
+ fmt->collapse = dim->se ? c2c_se_collapse : dim->cmp;
fmt->equal = fmt_equal;
fmt->free = fmt_free;
--
2.7.4
* [PATCH 18/61] perf c2c report: Fallback to standard dimensions
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (16 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 17/61] perf c2c report: Add sort_entry " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 19/61] perf c2c report: Add sample processing Jiri Olsa
` (42 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Fall back to standard dimensions in case we don't
find the dimension among the c2c ones.
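The lookup order can be sketched in isolation as below. Both tables
and their contents are placeholders chosen for the example; in the
patch the second step is delegated to output_field_add() and
sort_dimension__add() instead of a table walk:

```c
#include <stddef.h>
#include <string.h>

enum { NOT_FOUND = -1, FROM_C2C = 1, FROM_STANDARD = 2 };

/* Placeholder name tables, for illustration only. */
static const char * const c2c_names_sketch[] = { "c2c_key_a", "c2c_key_b", NULL };
static const char * const std_names_sketch[] = { "pid", "comm", NULL };

static int name_in(const char * const *table, const char *name)
{
	size_t i;

	for (i = 0; table[i]; i++) {
		if (!strcmp(table[i], name))
			return 1;
	}
	return 0;
}

/* Try the c2c-specific names first, then fall back to the standard ones. */
static int resolve_dimension(const char *name)
{
	if (name_in(c2c_names_sketch, name))
		return FROM_C2C;
	if (name_in(std_names_sketch, name))
		return FROM_STANDARD;
	return NOT_FOUND;
}
```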
Link: http://lkml.kernel.org/n/tip-w3yrcawal0dr1w9pcu4gyymd@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 6b58b537bc9d..a3481f86e2ae 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -213,8 +213,10 @@ static int c2c_hists__init_output(struct perf_hpp_list *hpp_list, char *name)
{
struct c2c_fmt *c2c_fmt = get_format(name);
- if (!c2c_fmt)
- return -1;
+ if (!c2c_fmt) {
+ reset_dimensions();
+ return output_field_add(hpp_list, name);
+ }
perf_hpp_list__column_register(hpp_list, &c2c_fmt->fmt);
return 0;
@@ -224,8 +226,10 @@ static int c2c_hists__init_sort(struct perf_hpp_list *hpp_list, char *name)
{
struct c2c_fmt *c2c_fmt = get_format(name);
- if (!c2c_fmt)
- return -1;
+ if (!c2c_fmt) {
+ reset_dimensions();
+ return sort_dimension__add(hpp_list, name, NULL, 0);
+ }
perf_hpp_list__register_sort_field(hpp_list, &c2c_fmt->fmt);
return 0;
--
2.7.4
* [PATCH 19/61] perf c2c report: Add sample processing
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (17 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 18/61] perf c2c report: Fallback to standard dimensions Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 20/61] perf c2c report: Add cacheline hists processing Jiri Olsa
` (41 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Add basic sample processing with c2c-specific hist_entry
allocation callbacks (via hists__add_entry_ops).
Overload the 'struct hist_entry' object with the new
'struct c2c_hist_entry'. The new hist entry object
will carry c2c-specific stats and nested hists objects.
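A standalone sketch of the overloading scheme, with stand-in types
('struct base_entry' and 'struct wrapped_entry' are assumptions). The
generic code asks for 'size' bytes for its own entry; the callback
allocates room for the wrapper in front of it and hands back a pointer
to the embedded member. The sketch sizes the allocation with
offsetof(), which is a choice made here and not copied from the patch:

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the generic entry that the core code knows about. */
struct base_entry {
	int nr;
};

/* Tool-specific wrapper; the generic entry stays the last member. */
struct wrapped_entry {
	void *priv;
	struct base_entry he;
};

/* 'size' is what the core wants for base_entry plus any trailing data. */
static void *wrapped_zalloc(size_t size)
{
	struct wrapped_entry *w;

	w = calloc(1, offsetof(struct wrapped_entry, he) + size);
	if (!w)
		return NULL;

	return &w->he;
}

static void wrapped_free(void *he)
{
	struct wrapped_entry *w;

	w = container_of(he, struct wrapped_entry, he);
	free(w->priv);
	free(w);
}
```

The core never learns about the wrapper: it receives and returns
base_entry pointers, and only the two callbacks convert between the
two views.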
Link: http://lkml.kernel.org/n/tip-ksr9smz4o1t040h50z28dds2@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 107 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a3481f86e2ae..29fb9573e292 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -16,6 +16,15 @@ struct c2c_hists {
struct perf_hpp_list list;
};
+struct c2c_hist_entry {
+ struct c2c_hists *hists;
+ /*
+ * must be at the end,
+ * because of its callchain dynamic entry
+ */
+ struct hist_entry he;
+};
+
struct perf_c2c {
struct perf_tool tool;
struct c2c_hists hists;
@@ -23,6 +32,86 @@ struct perf_c2c {
static struct perf_c2c c2c;
+static void *c2c_he_zalloc(size_t size)
+{
+ struct c2c_hist_entry *c2c_he;
+
+ c2c_he = zalloc(size + sizeof(*c2c_he));
+ if (!c2c_he)
+ return NULL;
+
+ return &c2c_he->he;
+}
+
+static void c2c_he_free(void *he)
+{
+ struct c2c_hist_entry *c2c_he;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ if (c2c_he->hists) {
+ hists__delete_entries(&c2c_he->hists->hists);
+ free(c2c_he->hists);
+ }
+
+ free(c2c_he);
+}
+
+static struct hist_entry_ops c2c_entry_ops = {
+ .new = c2c_he_zalloc,
+ .free = c2c_he_free,
+};
+
+static int process_sample_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel __maybe_unused,
+ struct machine *machine)
+{
+ struct hists *hists = &c2c.hists.hists;
+ struct hist_entry *he;
+ struct addr_location al;
+ struct mem_info *mi;
+ int ret;
+
+ if (machine__resolve(machine, &al, sample) < 0) {
+ pr_debug("problem processing %d event, skipping it.\n",
+ event->header.type);
+ return -1;
+ }
+
+ mi = sample__resolve_mem(sample, &al);
+ if (mi == NULL)
+ return -ENOMEM;
+
+ he = hists__add_entry_ops(hists, &c2c_entry_ops,
+ &al, NULL, NULL, mi,
+ sample, true);
+ if (he == NULL) {
+ free(mi);
+ return -ENOMEM;
+ }
+
+ hists__inc_nr_samples(hists, he->filtered);
+ ret = hist_entry__append_callchain(he, sample);
+
+ addr_location__put(&al);
+ return ret;
+}
+
+static struct perf_c2c c2c = {
+ .tool = {
+ .sample = process_sample_event,
+ .mmap = perf_event__process_mmap,
+ .mmap2 = perf_event__process_mmap2,
+ .comm = perf_event__process_comm,
+ .exit = perf_event__process_exit,
+ .fork = perf_event__process_fork,
+ .lost = perf_event__process_lost,
+ .ordered_events = true,
+ .ordering_requires_timestamps = true,
+ },
+};
+
static const char * const c2c_usage[] = {
"perf c2c {record|report}",
NULL
@@ -314,6 +403,7 @@ static int c2c_hists__reinit(struct c2c_hists *c2c_hists,
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
+ struct ui_progress prog;
struct perf_data_file file = {
.mode = PERF_DATA_MODE_READ,
};
@@ -330,9 +420,12 @@ static int perf_c2c__report(int argc, const char **argv)
argc = parse_options(argc, argv, c2c_options, report_c2c_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
- if (!argc)
+ if (argc)
usage_with_options(report_c2c_usage, c2c_options);
+ if (!input_name || !strlen(input_name))
+ input_name = "perf.data";
+
file.path = input_name;
err = c2c_hists__init(&c2c.hists, "dcacheline");
@@ -356,6 +449,19 @@ static int perf_c2c__report(int argc, const char **argv)
goto out_session;
}
+ err = perf_session__process_events(session);
+ if (err) {
+ pr_err("failed to process sample\n");
+ goto out_session;
+ }
+
+ ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting...");
+
+ hists__collapse_resort(&c2c.hists.hists, NULL);
+ hists__output_resort(&c2c.hists.hists, &prog);
+
+ ui_progress__finish();
+
out_session:
perf_session__delete(session);
out:
--
2.7.4
* [PATCH 20/61] perf c2c report: Add cacheline hists processing
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (18 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 19/61] perf c2c report: Add sample processing Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 21/61] perf c2c report: Decode c2c_stats for hist entries Jiri Olsa
` (40 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Store cacheline-related entries in a nested hists
object for each cacheline. Nested entries
are sorted by 'offset' within the related cacheline.
We will allow specific sort keys to be configured
for nested cacheline data entries in the following
patches.
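The lazy creation of the nested object can be sketched on its own as
follows. The types and nested_init() are stand-ins invented for the
example. As a choice made for this sketch, the pointer is stored in
the entry only after initialization succeeds, and NULL is returned on
any failure:

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the nested hists object and the cacheline entry. */
struct nested {
	int nr_entries;
};

struct cl_entry {
	struct nested *hists;
};

/* Hypothetical init step; rejects an empty sort string. */
static int nested_init(struct nested *n, const char *sort)
{
	if (!sort || !*sort)
		return -1;

	n->nr_entries = 0;
	return 0;
}

/* Return the entry's nested object, creating it on first use. */
static struct nested *cl_entry__get_nested(struct cl_entry *e, const char *sort)
{
	struct nested *n;

	if (e->hists)
		return e->hists;

	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;

	if (nested_init(n, sort)) {
		free(n);
		return NULL;
	}

	e->hists = n;
	return n;
}
```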
Link: http://lkml.kernel.org/n/tip-37f751rgqamq9miubmr89tj4@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 90 ++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 84 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 29fb9573e292..cd0406ab8b5d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -61,6 +61,32 @@ static struct hist_entry_ops c2c_entry_ops = {
.free = c2c_he_free,
};
+static int c2c_hists__init(struct c2c_hists *hists,
+ const char *sort);
+
+static struct hists*
+he__get_hists(struct hist_entry *he,
+ const char *sort)
+{
+ struct c2c_hist_entry *c2c_he;
+ struct c2c_hists *hists;
+ int ret;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ if (c2c_he->hists)
+ return &c2c_he->hists->hists;
+
+ hists = c2c_he->hists = zalloc(sizeof(*hists));
+ if (!hists)
+ return NULL;
+
+ ret = c2c_hists__init(hists, sort);
+ if (ret)
+ free(hists);
+
+ return &hists->hists;
+}
+
static int process_sample_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
@@ -70,7 +96,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct hists *hists = &c2c.hists.hists;
struct hist_entry *he;
struct addr_location al;
- struct mem_info *mi;
+ struct mem_info *mi, *mi_dup;
int ret;
if (machine__resolve(machine, &al, sample) < 0) {
@@ -83,19 +109,50 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (mi == NULL)
return -ENOMEM;
+ mi_dup = memdup(mi, sizeof(*mi));
+ if (!mi_dup)
+ goto free_mi;
+
he = hists__add_entry_ops(hists, &c2c_entry_ops,
&al, NULL, NULL, mi,
sample, true);
- if (he == NULL) {
- free(mi);
- return -ENOMEM;
- }
+ if (he == NULL)
+ goto free_mi_dup;
hists__inc_nr_samples(hists, he->filtered);
ret = hist_entry__append_callchain(he, sample);
+ if (!ret) {
+ mi = mi_dup;
+
+ mi_dup = memdup(mi, sizeof(*mi));
+ if (!mi_dup)
+ goto free_mi;
+
+ hists = he__get_hists(he, "offset");
+ if (!hists)
+ goto free_mi_dup;
+
+ he = hists__add_entry_ops(hists, &c2c_entry_ops,
+ &al, NULL, NULL, mi,
+ sample, true);
+ if (he == NULL)
+ goto free_mi_dup;
+
+ hists__inc_nr_samples(hists, he->filtered);
+ ret = hist_entry__append_callchain(he, sample);
+ }
+
+out:
addr_location__put(&al);
return ret;
+
+free_mi_dup:
+ free(mi_dup);
+free_mi:
+ free(mi);
+ ret = -ENOMEM;
+ goto out;
}
static struct perf_c2c c2c = {
@@ -400,6 +457,27 @@ static int c2c_hists__reinit(struct c2c_hists *c2c_hists,
return hpp_list__parse(&c2c_hists->list, output, sort);
}
+static int filter_cb(struct hist_entry *he __maybe_unused)
+{
+ return 0;
+}
+
+static int resort_cl_cb(struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ struct c2c_hists *c2c_hists;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ c2c_hists = c2c_he->hists;
+
+ if (c2c_hists) {
+ hists__collapse_resort(&c2c_hists->hists, NULL);
+ hists__output_resort_cb(&c2c_hists->hists, NULL, filter_cb);
+ }
+
+ return 0;
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -458,7 +536,7 @@ static int perf_c2c__report(int argc, const char **argv)
ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting...");
hists__collapse_resort(&c2c.hists.hists, NULL);
- hists__output_resort(&c2c.hists.hists, &prog);
+ hists__output_resort_cb(&c2c.hists.hists, &prog, resort_cl_cb);
ui_progress__finish();
--
2.7.4
* [PATCH 21/61] perf c2c report: Decode c2c_stats for hist entries
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (19 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 20/61] perf c2c report: Add cacheline hists processing Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 22/61] perf c2c report: Add header macros Jiri Olsa
` (39 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Decode and store c2c_stats for each hist entry.
Change the related functions to work with c2c_* objects.
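The decode-once, accumulate-twice flow can be sketched as below. The
struct fields and both helper names are assumptions made for the
example; the real 'struct c2c_stats' layout is not part of this patch:

```c
#include <stdint.h>

/* Hypothetical counters, for illustration only. */
struct stats_sketch {
	uint32_t nr_entries;
	uint32_t loads;
	uint32_t stores;
};

/* Classify a single sample into a zeroed, per-sample stats object. */
static void stats_decode(struct stats_sketch *s, int is_store)
{
	s->nr_entries++;
	if (is_store)
		s->stores++;
	else
		s->loads++;
}

/* Fold per-sample stats into a running total. */
static void stats_add(struct stats_sketch *dst, const struct stats_sketch *src)
{
	dst->nr_entries += src->nr_entries;
	dst->loads      += src->loads;
	dst->stores     += src->stores;
}
```

Each sample is decoded into a local object first and then added to
both the entry's totals and the global totals, so the two always
agree on how the sample was classified.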
Link: http://lkml.kernel.org/n/tip-obz2fu3801wuayz4rntegb1d@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 38 ++++++++++++++++++++++++++------------
1 file changed, 26 insertions(+), 12 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index cd0406ab8b5d..7bf6248dbd75 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -14,10 +14,12 @@
struct c2c_hists {
struct hists hists;
struct perf_hpp_list list;
+ struct c2c_stats stats;
};
struct c2c_hist_entry {
struct c2c_hists *hists;
+ struct c2c_stats stats;
/*
* must be at the end,
* because of its callchain dynamic entry
@@ -64,9 +66,9 @@ static struct hist_entry_ops c2c_entry_ops = {
static int c2c_hists__init(struct c2c_hists *hists,
const char *sort);
-static struct hists*
-he__get_hists(struct hist_entry *he,
- const char *sort)
+static struct c2c_hists*
+he__get_c2c_hists(struct hist_entry *he,
+ const char *sort)
{
struct c2c_hist_entry *c2c_he;
struct c2c_hists *hists;
@@ -74,7 +76,7 @@ he__get_hists(struct hist_entry *he,
c2c_he = container_of(he, struct c2c_hist_entry, he);
if (c2c_he->hists)
- return &c2c_he->hists->hists;
+ return c2c_he->hists;
hists = c2c_he->hists = zalloc(sizeof(*hists));
if (!hists)
@@ -84,7 +86,7 @@ he__get_hists(struct hist_entry *he,
if (ret)
free(hists);
- return &hists->hists;
+ return hists;
}
static int process_sample_event(struct perf_tool *tool __maybe_unused,
@@ -93,7 +95,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel __maybe_unused,
struct machine *machine)
{
- struct hists *hists = &c2c.hists.hists;
+ struct c2c_hists *c2c_hists = &c2c.hists;
+ struct c2c_hist_entry *c2c_he;
+ struct c2c_stats stats = { 0 };
struct hist_entry *he;
struct addr_location al;
struct mem_info *mi, *mi_dup;
@@ -113,13 +117,19 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (!mi_dup)
goto free_mi;
- he = hists__add_entry_ops(hists, &c2c_entry_ops,
+ c2c_decode_stats(&stats, mi);
+
+ he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops,
&al, NULL, NULL, mi,
sample, true);
if (he == NULL)
goto free_mi_dup;
- hists__inc_nr_samples(hists, he->filtered);
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ c2c_add_stats(&c2c_he->stats, &stats);
+ c2c_add_stats(&c2c_hists->stats, &stats);
+
+ hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
ret = hist_entry__append_callchain(he, sample);
if (!ret) {
@@ -129,17 +139,21 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (!mi_dup)
goto free_mi;
- hists = he__get_hists(he, "offset");
- if (!hists)
+ c2c_hists = he__get_c2c_hists(he, "offset");
+ if (!c2c_hists)
goto free_mi_dup;
- he = hists__add_entry_ops(hists, &c2c_entry_ops,
+ he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops,
&al, NULL, NULL, mi,
sample, true);
if (he == NULL)
goto free_mi_dup;
- hists__inc_nr_samples(hists, he->filtered);
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ c2c_add_stats(&c2c_he->stats, &stats);
+ c2c_add_stats(&c2c_hists->stats, &stats);
+
+ hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
ret = hist_entry__append_callchain(he, sample);
}
--
2.7.4
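For readers following the series, the accounting done by patch 21 can be sketched in isolation. This is only an illustration under assumptions: the structures are reduced stand-ins, the two counters are picked arbitrarily, add_stats() is a guess at what c2c_add_stats() does (field-wise addition), and account_sample() is a hypothetical helper that does not exist in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the perf structures. */
struct hist_entry {
	int filtered;
};

struct c2c_stats {
	uint32_t lcl_hitm;
	uint32_t rmt_hitm;
};

struct c2c_hists {
	struct c2c_stats stats;		/* totals for the whole container */
};

struct c2c_hist_entry {
	struct c2c_hists *hists;
	struct c2c_stats stats;		/* totals for this entry only */
	struct hist_entry he;		/* embedded generic entry, kept last */
};

/* Same idea as the kernel's container_of(), without the type check. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Field-wise accumulation, assumed to mirror c2c_add_stats(). */
static void add_stats(struct c2c_stats *sum, const struct c2c_stats *add)
{
	sum->lcl_hitm += add->lcl_hitm;
	sum->rmt_hitm += add->rmt_hitm;
}

/* Account one decoded sample on both the entry and its container. */
static void account_sample(struct c2c_hists *hists, struct hist_entry *he,
			   const struct c2c_stats *sample)
{
	struct c2c_hist_entry *c2c_he;

	c2c_he = container_of(he, struct c2c_hist_entry, he);
	add_stats(&c2c_he->stats, sample);
	add_stats(&hists->stats, sample);
}
```

The point of the double accounting is that each entry keeps its own counters while the owning c2c_hists keeps the sum over all of its entries, which is what the patch does once for the cacheline level and once for the "offset" level.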
* [PATCH 22/61] perf c2c report: Add header macros
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding helper macros to define header objects.
They will be used in the following patches, which add
new dimensions.
The c2c report will support 2-line headers, hence
we only define line[0] and line[1] in the macros.
Link: http://lkml.kernel.org/n/tip-tkgrfvlw0m5awb75fk2sv1wb@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 7bf6248dbd75..c21124e6bb63 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -271,6 +271,46 @@ static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
return scnprintf(hpp->buf, hpp->size, "%*s", width, text);
}
+#define HEADER_LOW(__h) \
+ { \
+ .line[1] = { \
+ .text = __h, \
+ }, \
+ }
+
+#define HEADER_BOTH(__h0, __h1) \
+ { \
+ .line[0] = { \
+ .text = __h0, \
+ }, \
+ .line[1] = { \
+ .text = __h1, \
+ }, \
+ }
+
+#define HEADER_SPAN(__h0, __h1, __s) \
+ { \
+ .line[0] = { \
+ .text = __h0, \
+ .span = __s, \
+ }, \
+ .line[1] = { \
+ .text = __h1, \
+ }, \
+ }
+
+#define HEADER_SPAN_LOW(__h) \
+ { \
+ .line[1] = { \
+ .text = __h, \
+ }, \
+ }
+
+#undef HEADER_LOW
+#undef HEADER_BOTH
+#undef HEADER_SPAN
+#undef HEADER_SPAN_LOW
+
static struct c2c_dimension *dimensions[] = {
NULL,
};
--
2.7.4
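A small standalone sketch of how the header macros from patch 22 behave with designated initializers. The struct layout is an assumption (the patch only shows .line[], .text and .span being set), and header_text() is a hypothetical helper added for illustration.

```c
#include <stddef.h>

/* Hypothetical layout; only .line[], .text and .span appear in the patch. */
struct c2c_header {
	struct {
		const char *text;
		int span;
	} line[2];
};

#define HEADER_LOW(__h)			\
	{				\
		.line[1] = {		\
			.text = __h,	\
		},			\
	}

#define HEADER_SPAN(__h0, __h1, __s)	\
	{				\
		.line[0] = {		\
			.text = __h0,	\
			.span = __s,	\
		},			\
		.line[1] = {		\
			.text = __h1,	\
		},			\
	}

static const struct c2c_header hdr_low = HEADER_LOW("Cacheline");
static const struct c2c_header hdr_span =
	HEADER_SPAN("----- HITM -----", "Rmt", 1);

/* Text for a header line, or "" when the macro left that line unset. */
static const char *header_text(const struct c2c_header *h, int line)
{
	return h->line[line].text ? h->line[line].text : "";
}
```

Members not named in a designated initializer are zero-initialized, so HEADER_LOW() leaves line[0].text as NULL and any printing code has to cope with that, as header_text() does here.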
* [PATCH 23/61] perf c2c report: Add dcacheline dimension key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding dcacheline dimension key support. It
displays the cacheline address as a hex number.
Using a c2c wrapper around the standard 'dcacheline' object
to define its own header and a simple (address only)
cacheline output.
Link: http://lkml.kernel.org/n/tip-j5enppr8e7h27nskqhgq33lu@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c21124e6bb63..060ee1050da9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1,5 +1,6 @@
#include <linux/compiler.h>
#include <linux/kernel.h>
+#include <linux/stringify.h>
#include "util.h"
#include "debug.h"
#include "builtin.h"
@@ -7,6 +8,7 @@
#include "mem-events.h"
#include "session.h"
#include "hist.h"
+#include "sort.h"
#include "tool.h"
#include "data.h"
#include "sort.h"
@@ -271,6 +273,33 @@ static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
return scnprintf(hpp->buf, hpp->size, "%*s", width, text);
}
+static char *hex_str(u64 val)
+{
+ static char buf[20];
+
+ snprintf(buf, 20, "0x%" PRIx64, val);
+ return buf;
+}
+
+static int64_t
+dcacheline_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ return sort__dcacheline_cmp(left, right);
+}
+
+static int dcacheline_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ uint64_t addr = 0;
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ if (he->mem_info)
+ addr = cl_address(he->mem_info->daddr.addr);
+
+ return snprintf(hpp->buf, hpp->size, "%*s", width, hex_str(addr));
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -306,12 +335,21 @@ static int c2c_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
}, \
}
+static struct c2c_dimension dim_dcacheline = {
+ .header = HEADER_LOW("Cacheline"),
+ .name = "dcacheline",
+ .cmp = dcacheline_cmp,
+ .entry = dcacheline_entry,
+ .width = 18,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
#undef HEADER_SPAN_LOW
static struct c2c_dimension *dimensions[] = {
+ &dim_dcacheline,
NULL,
};
--
2.7.4
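On the hex_str() helper in patch 23: it returns a pointer to a static buffer, which is fine for one call per format statement but would break if two results were needed at once. A possible reentrant variant is sketched below; hex_str_r() and format_addr() are hypothetical names, not part of the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* "0x" + 16 hex digits + NUL needs 19 bytes; 20 matches the patch. */
#define HEX_STR_LEN 20

/* Reentrant variant: the caller owns the buffer. */
static char *hex_str_r(char *buf, size_t size, uint64_t val)
{
	snprintf(buf, size, "0x%" PRIx64, val);
	return buf;
}

/* Right-align the hex string in a column of the given width. */
static int format_addr(char *out, size_t size, int width, uint64_t addr)
{
	char hex[HEX_STR_LEN];

	return snprintf(out, size, "%*s", width,
			hex_str_r(hex, sizeof(hex), addr));
}
```

The column width of 18 used by dim_dcacheline is exactly "0x" plus 16 hex digits, so a full 64-bit address fills the column without padding.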
* [PATCH 24/61] perf c2c report: Add offset dimension key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding cacheline offset dimension key support.
It displays the cacheline offset as a hex number.
Link: http://lkml.kernel.org/n/tip-m0424ye98lqveg5nopto8qww@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 060ee1050da9..086e337e9d7d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -300,6 +300,32 @@ static int dcacheline_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
return snprintf(hpp->buf, hpp->size, "%*s", width, hex_str(addr));
}
+static int offset_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ uint64_t addr = 0;
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ if (he->mem_info)
+ addr = cl_offset(he->mem_info->daddr.al_addr);
+
+ return snprintf(hpp->buf, hpp->size, "%*s", width, hex_str(addr));
+}
+
+static int64_t
+offset_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ uint64_t l = 0, r = 0;
+
+ if (left->mem_info)
+ l = cl_offset(left->mem_info->daddr.addr);
+ if (right->mem_info)
+ r = cl_offset(right->mem_info->daddr.addr);
+
+ return (int64_t)(r - l);
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -343,6 +369,14 @@ static struct c2c_dimension dim_dcacheline = {
.width = 18,
};
+static struct c2c_dimension dim_offset = {
+ .header = HEADER_BOTH("Data address", "Offset"),
+ .name = "offset",
+ .cmp = offset_cmp,
+ .entry = offset_entry,
+ .width = 18,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -350,6 +384,7 @@ static struct c2c_dimension dim_dcacheline = {
static struct c2c_dimension *dimensions[] = {
&dim_dcacheline,
+ &dim_offset,
NULL,
};
--
2.7.4
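The split between the dcacheline key (patch 23) and the offset key (patch 24) comes down to masking the data address. A minimal sketch follows, assuming a 64-byte cacheline; the real tool may use a different, runtime-determined size. The functions shadow the cl_address()/cl_offset() names used in the patch, and offset_cmp() here takes raw addresses rather than hist entries.

```c
#include <stdint.h>

/* Assumed line size, for illustration only. */
#define CACHELINE_SIZE 64ULL

/* Start address of the cacheline that contains addr. */
static uint64_t cl_address(uint64_t addr)
{
	return addr & ~(CACHELINE_SIZE - 1);
}

/* Byte offset of addr within its cacheline. */
static uint64_t cl_offset(uint64_t addr)
{
	return addr & (CACHELINE_SIZE - 1);
}

/*
 * Three-way compare by offset, with the same sign as (r - l) in the
 * patch, but without relying on unsigned wrap-around.
 */
static int64_t offset_cmp(uint64_t left, uint64_t right)
{
	uint64_t l = cl_offset(left), r = cl_offset(right);

	return (r > l) - (r < l);
}
```

For any address, cl_address(a) + cl_offset(a) gives back a, so the two keys together identify the exact byte that was accessed.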
* [PATCH 25/61] perf c2c report: Add iaddr dimension key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding iaddr dimension key support. It displays
the code address (as a hex number) responsible for the
accesses.
Using a c2c wrapper around the standard 'symbol_iaddr' object
to define its own header and a simple (address only) code
address output.
Link: http://lkml.kernel.org/n/tip-rhshygbst6kr75kju0muwt5x@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 086e337e9d7d..a97e6d6c3b9b 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -326,6 +326,26 @@ offset_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return (int64_t)(r - l);
}
+static int
+iaddr_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ uint64_t addr = 0;
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ if (he->mem_info)
+ addr = he->mem_info->iaddr.addr;
+
+ return snprintf(hpp->buf, hpp->size, "%*s", width, hex_str(addr));
+}
+
+static int64_t
+iaddr_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ return sort__iaddr_cmp(left, right);
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -377,6 +397,14 @@ static struct c2c_dimension dim_offset = {
.width = 18,
};
+static struct c2c_dimension dim_iaddr = {
+ .header = HEADER_LOW("Code address"),
+ .name = "iaddr",
+ .cmp = iaddr_cmp,
+ .entry = iaddr_entry,
+ .width = 18,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -385,6 +413,7 @@ static struct c2c_dimension dim_offset = {
static struct c2c_dimension *dimensions[] = {
&dim_dcacheline,
&dim_offset,
+ &dim_iaddr,
NULL,
};
--
2.7.4
* [PATCH 26/61] perf c2c report: Add hitm related dimension keys
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding 5 hitm related dimension key wrappers.
First 3 are to be displayed in the main cachelines
overall output:
tot_hitm, lcl_hitm, rmt_hitm
The latter 2 are to be displayed within single
cacheline output:
cl_rmt_hitm, cl_lcl_hitm
They all display bare numbers of remote/local/total
HITMs for cacheline or its related offsets.
Link: http://lkml.kernel.org/n/tip-iju5239xa5heqqben65g1u7e@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a97e6d6c3b9b..a48fcc91e9fd 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -346,6 +346,70 @@ iaddr_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return sort__iaddr_cmp(left, right);
}
+static int
+tot_hitm_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ unsigned int tot_hitm;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ tot_hitm = c2c_he->stats.lcl_hitm + c2c_he->stats.rmt_hitm;
+
+ return snprintf(hpp->buf, hpp->size, "%*u", width, tot_hitm);
+}
+
+static int64_t
+tot_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ struct c2c_hist_entry *c2c_left;
+ struct c2c_hist_entry *c2c_right;
+ unsigned int tot_hitm_left;
+ unsigned int tot_hitm_right;
+
+ c2c_left = container_of(left, struct c2c_hist_entry, he);
+ c2c_right = container_of(right, struct c2c_hist_entry, he);
+
+ tot_hitm_left = c2c_left->stats.lcl_hitm + c2c_left->stats.rmt_hitm;
+ tot_hitm_right = c2c_right->stats.lcl_hitm + c2c_right->stats.rmt_hitm;
+
+ return tot_hitm_left - tot_hitm_right;
+}
+
+#define STAT_FN_ENTRY(__f) \
+static int \
+__f ## _entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp, \
+ struct hist_entry *he) \
+{ \
+ struct c2c_hist_entry *c2c_he; \
+ int width = c2c_width(fmt, hpp, he->hists); \
+ \
+ c2c_he = container_of(he, struct c2c_hist_entry, he); \
+ return snprintf(hpp->buf, hpp->size, "%*u", width, \
+ c2c_he->stats.__f); \
+}
+
+#define STAT_FN_CMP(__f) \
+static int64_t \
+__f ## _cmp(struct perf_hpp_fmt *fmt __maybe_unused, \
+ struct hist_entry *left, struct hist_entry *right) \
+{ \
+ struct c2c_hist_entry *c2c_left, *c2c_right; \
+ \
+ c2c_left = container_of(left, struct c2c_hist_entry, he); \
+ c2c_right = container_of(right, struct c2c_hist_entry, he); \
+ return c2c_left->stats.__f - c2c_right->stats.__f; \
+}
+
+#define STAT_FN(__f) \
+ STAT_FN_ENTRY(__f) \
+ STAT_FN_CMP(__f)
+
+STAT_FN(rmt_hitm)
+STAT_FN(lcl_hitm)
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -405,6 +469,46 @@ static struct c2c_dimension dim_iaddr = {
.width = 18,
};
+static struct c2c_dimension dim_tot_hitm = {
+ .header = HEADER_SPAN("----- LLC Load Hitm -----", "Total", 2),
+ .name = "tot_hitm",
+ .cmp = tot_hitm_cmp,
+ .entry = tot_hitm_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_lcl_hitm = {
+ .header = HEADER_SPAN_LOW("Lcl"),
+ .name = "lcl_hitm",
+ .cmp = lcl_hitm_cmp,
+ .entry = lcl_hitm_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_rmt_hitm = {
+ .header = HEADER_SPAN_LOW("Rmt"),
+ .name = "rmt_hitm",
+ .cmp = rmt_hitm_cmp,
+ .entry = rmt_hitm_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_cl_rmt_hitm = {
+ .header = HEADER_SPAN("----- HITM -----", "Rmt", 1),
+ .name = "cl_rmt_hitm",
+ .cmp = rmt_hitm_cmp,
+ .entry = rmt_hitm_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_cl_lcl_hitm = {
+ .header = HEADER_SPAN_LOW("Lcl"),
+ .name = "cl_lcl_hitm",
+ .cmp = lcl_hitm_cmp,
+ .entry = lcl_hitm_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -414,6 +518,11 @@ static struct c2c_dimension *dimensions[] = {
&dim_dcacheline,
&dim_offset,
&dim_iaddr,
+ &dim_tot_hitm,
+ &dim_lcl_hitm,
+ &dim_rmt_hitm,
+ &dim_cl_lcl_hitm,
+ &dim_cl_rmt_hitm,
NULL,
};
--
2.7.4
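The STAT_FN() pattern in patch 26 generates one accessor pair per counter via token pasting. The sketch below shows the same technique standalone and also illustrates a point that may be worth double-checking in the comparators: tot_hitm_cmp() subtracts two unsigned int values and returns the result as int64_t, and on common platforms with 32-bit int that difference wraps before it is widened, so it is never negative. The struct, the _get() accessors, the 32-bit counter type and naive_cmp() are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical counters; the 32-bit unsigned type is an assumption. */
struct stats {
	uint32_t lcl_hitm;
	uint32_t rmt_hitm;
};

/* Generate <field>_get() and <field>_cmp() for one counter. */
#define STAT_FN(__f)							\
static uint32_t __f ## _get(const struct stats *s)			\
{									\
	return s->__f;							\
}									\
static int64_t __f ## _cmp(const struct stats *l,			\
			   const struct stats *r)			\
{									\
	/* widen before subtracting so the sign survives */		\
	return (int64_t)l->__f - (int64_t)r->__f;			\
}

STAT_FN(lcl_hitm)
STAT_FN(rmt_hitm)

/* What a plain unsigned subtraction yields once widened to int64_t. */
static int64_t naive_cmp(uint32_t l, uint32_t r)
{
	return l - r;
}
```

With 32-bit int, naive_cmp(1, 2) returns 4294967295 rather than -1, while the widened variant orders the two values as expected. Whether this matters for the report depends on how the callers use the sign of the result.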
* [PATCH 27/61] perf c2c report: Add stores related dimension keys
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding 5 stores related dimension key wrappers.
First 3 are to be displayed in the main cachelines
overall output:
stores, stores_l1hit, stores_l1miss
The latter 2 are to be displayed within single
cacheline output:
cl_stores_l1hit, cl_stores_l1miss
They all display bare numbers of stores for
cacheline or its related offsets.
Link: http://lkml.kernel.org/n/tip-qeml8v53v6q3wl5n8vgbf64r@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a48fcc91e9fd..eb8bb158ad8a 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -409,6 +409,9 @@ __f ## _cmp(struct perf_hpp_fmt *fmt __maybe_unused, \
STAT_FN(rmt_hitm)
STAT_FN(lcl_hitm)
+STAT_FN(store)
+STAT_FN(st_l1hit)
+STAT_FN(st_l1miss)
#define HEADER_LOW(__h) \
{ \
@@ -509,6 +512,46 @@ static struct c2c_dimension dim_cl_lcl_hitm = {
.width = 7,
};
+static struct c2c_dimension dim_stores = {
+ .header = HEADER_SPAN("---- Store Reference ----", "Total", 2),
+ .name = "stores",
+ .cmp = store_cmp,
+ .entry = store_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_stores_l1hit = {
+ .header = HEADER_SPAN_LOW("L1Hit"),
+ .name = "stores_l1hit",
+ .cmp = st_l1hit_cmp,
+ .entry = st_l1hit_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_stores_l1miss = {
+ .header = HEADER_SPAN_LOW("L1Miss"),
+ .name = "stores_l1miss",
+ .cmp = st_l1miss_cmp,
+ .entry = st_l1miss_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_cl_stores_l1hit = {
+ .header = HEADER_SPAN("-- Store Refs --", "L1 Hit", 1),
+ .name = "cl_stores_l1hit",
+ .cmp = st_l1hit_cmp,
+ .entry = st_l1hit_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_cl_stores_l1miss = {
+ .header = HEADER_SPAN_LOW("L1 Miss"),
+ .name = "cl_stores_l1miss",
+ .cmp = st_l1miss_cmp,
+ .entry = st_l1miss_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -523,6 +566,11 @@ static struct c2c_dimension *dimensions[] = {
&dim_rmt_hitm,
&dim_cl_lcl_hitm,
&dim_cl_rmt_hitm,
+ &dim_stores,
+ &dim_stores_l1hit,
+ &dim_stores_l1miss,
+ &dim_cl_stores_l1hit,
+ &dim_cl_stores_l1miss,
NULL,
};
--
2.7.4
* [PATCH 28/61] perf c2c report: Add loads related dimension keys
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding 3 loads related dimension key wrappers.
They are to be displayed in the main cachelines
overall output:
ld_fbhit, ld_l1hit, ld_l2hit
They all display bare numbers of loads for
FB (Fill Buffer), L1 and L2 cache.
Link: http://lkml.kernel.org/n/tip-wxrzhy74zl8fvkvgjae3w1ju@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index eb8bb158ad8a..8279033d9d83 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -412,6 +412,9 @@ STAT_FN(lcl_hitm)
STAT_FN(store)
STAT_FN(st_l1hit)
STAT_FN(st_l1miss)
+STAT_FN(ld_fbhit)
+STAT_FN(ld_l1hit)
+STAT_FN(ld_l2hit)
#define HEADER_LOW(__h) \
{ \
@@ -552,6 +555,30 @@ static struct c2c_dimension dim_cl_stores_l1miss = {
.width = 7,
};
+static struct c2c_dimension dim_ld_fbhit = {
+ .header = HEADER_SPAN("----- Core Load Hit -----", "FB", 2),
+ .name = "ld_fbhit",
+ .cmp = ld_fbhit_cmp,
+ .entry = ld_fbhit_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_ld_l1hit = {
+ .header = HEADER_SPAN_LOW("L1"),
+ .name = "ld_l1hit",
+ .cmp = ld_l1hit_cmp,
+ .entry = ld_l1hit_entry,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_ld_l2hit = {
+ .header = HEADER_SPAN_LOW("L2"),
+ .name = "ld_l2hit",
+ .cmp = ld_l2hit_cmp,
+ .entry = ld_l2hit_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -571,6 +598,9 @@ static struct c2c_dimension *dimensions[] = {
&dim_stores_l1miss,
&dim_cl_stores_l1hit,
&dim_cl_stores_l1miss,
+ &dim_ld_fbhit,
+ &dim_ld_l1hit,
+ &dim_ld_l2hit,
NULL,
};
--
2.7.4
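Each of these patches appends to the NULL-terminated dimensions[] array. How such a table might be searched by key name is sketched below; struct dimension is a reduced stand-in and find_dimension() is a hypothetical helper, since the actual lookup code is not part of the patches shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced dimension: just the lookup key and a width. */
struct dimension {
	const char *name;
	int width;
};

static struct dimension dim_ld_fbhit = { .name = "ld_fbhit", .width = 7 };
static struct dimension dim_ld_l1hit = { .name = "ld_l1hit", .width = 7 };
static struct dimension dim_ld_l2hit = { .name = "ld_l2hit", .width = 7 };

/* NULL-terminated, like dimensions[] in the patch. */
static struct dimension *dimensions[] = {
	&dim_ld_fbhit,
	&dim_ld_l1hit,
	&dim_ld_l2hit,
	NULL,
};

/* Linear search by key name; returns NULL for an unknown key. */
static struct dimension *find_dimension(const char *name)
{
	size_t i;

	for (i = 0; dimensions[i]; i++) {
		if (!strcmp(dimensions[i]->name, name))
			return dimensions[i];
	}
	return NULL;
}
```

The NULL sentinel means new keys only need one extra line in the array, which is why every patch in this part of the series ends with the same small hunk.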
* [PATCH 29/61] perf c2c report: Add llc and remote loads related dimension keys
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding 2 LLC load related dimension key wrappers.
They are to be displayed in the main cachelines
overall output:
ld_lclhit, ld_rmthit
They display bare numbers of LLC and remote loads
for cacheline.
Link: http://lkml.kernel.org/n/tip-ahjg0voaufefboemjuj9yefh@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 8279033d9d83..2cb5252c0623 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -415,6 +415,8 @@ STAT_FN(st_l1miss)
STAT_FN(ld_fbhit)
STAT_FN(ld_l1hit)
STAT_FN(ld_l2hit)
+STAT_FN(ld_llchit)
+STAT_FN(rmt_hit)
#define HEADER_LOW(__h) \
{ \
@@ -579,6 +581,22 @@ static struct c2c_dimension dim_ld_l2hit = {
.width = 7,
};
+static struct c2c_dimension dim_ld_llchit = {
+ .header = HEADER_SPAN("-- LLC Load Hit --", "Llc", 1),
+ .name = "ld_lclhit",
+ .cmp = ld_llchit_cmp,
+ .entry = ld_llchit_entry,
+ .width = 8,
+};
+
+static struct c2c_dimension dim_ld_rmthit = {
+ .header = HEADER_SPAN_LOW("Rmt"),
+ .name = "ld_rmthit",
+ .cmp = rmt_hit_cmp,
+ .entry = rmt_hit_entry,
+ .width = 8,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -601,6 +619,8 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_fbhit,
&dim_ld_l1hit,
&dim_ld_l2hit,
+ &dim_ld_llchit,
+ &dim_ld_rmthit,
NULL,
};
--
2.7.4
* [PATCH 30/61] perf c2c report: Add llc load miss dimension key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding LLC load miss dimension key wrapper.
It is to be displayed in the main cachelines
overall output:
ld_llcmiss
It displays the bare number of LLC misses for the cacheline.
Link: http://lkml.kernel.org/n/tip-wojujik7zzen770mxn295mxa@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 2cb5252c0623..e7e7890882c4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -418,6 +418,44 @@ STAT_FN(ld_l2hit)
STAT_FN(ld_llchit)
STAT_FN(rmt_hit)
+static uint64_t llc_miss(struct c2c_stats *stats)
+{
+ uint64_t llcmiss;
+
+ llcmiss = stats->lcl_dram +
+ stats->rmt_dram +
+ stats->rmt_hitm +
+ stats->rmt_hit;
+
+ return llcmiss;
+}
+
+static int
+ld_llcmiss_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ return snprintf(hpp->buf, hpp->size, "%*lu", width,
+ llc_miss(&c2c_he->stats));
+}
+
+static int64_t
+ld_llcmiss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ struct c2c_hist_entry *c2c_left;
+ struct c2c_hist_entry *c2c_right;
+
+ c2c_left = container_of(left, struct c2c_hist_entry, he);
+ c2c_right = container_of(right, struct c2c_hist_entry, he);
+
+ return llc_miss(&c2c_left->stats) - llc_miss(&c2c_right->stats);
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -597,6 +635,14 @@ static struct c2c_dimension dim_ld_rmthit = {
.width = 8,
};
+static struct c2c_dimension dim_ld_llcmiss = {
+ .header = HEADER_BOTH("LLC", "Ld Miss"),
+ .name = "ld_llcmiss",
+ .cmp = ld_llcmiss_cmp,
+ .entry = ld_llcmiss_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -621,6 +667,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_l2hit,
&dim_ld_llchit,
&dim_ld_rmthit,
+ &dim_ld_llcmiss,
NULL,
};
--
2.7.4
* [PATCH 31/61] perf c2c report: Add total record sort key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding total record dimension key wrapper.
It is to be displayed in the main cachelines
overall output:
tot_recs
It displays the sum of all accesses to the cacheline.
Link: http://lkml.kernel.org/n/tip-wojujik7zzen770mxn295mxa@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e7e7890882c4..3f2f348479e3 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -456,6 +456,61 @@ ld_llcmiss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return llc_miss(&c2c_left->stats) - llc_miss(&c2c_right->stats);
}
+static uint64_t total_records(struct c2c_stats *stats)
+{
+ uint64_t lclmiss, ldcnt, total;
+
+ lclmiss = stats->lcl_dram +
+ stats->rmt_dram +
+ stats->rmt_hitm +
+ stats->rmt_hit;
+
+ ldcnt = lclmiss +
+ stats->ld_fbhit +
+ stats->ld_l1hit +
+ stats->ld_l2hit +
+ stats->ld_llchit +
+ stats->lcl_hitm;
+
+ total = ldcnt +
+ stats->st_l1hit +
+ stats->st_l1miss;
+
+ return total;
+}
+
+static int
+tot_recs_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ uint64_t tot_recs;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ tot_recs = total_records(&c2c_he->stats);
+
+ return snprintf(hpp->buf, hpp->size, "%*" PRIu64, width, tot_recs);
+}
+
+static int64_t
+tot_recs_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ struct c2c_hist_entry *c2c_left;
+ struct c2c_hist_entry *c2c_right;
+ uint64_t tot_recs_left;
+ uint64_t tot_recs_right;
+
+ c2c_left = container_of(left, struct c2c_hist_entry, he);
+ c2c_right = container_of(right, struct c2c_hist_entry, he);
+
+ tot_recs_left = total_records(&c2c_left->stats);
+ tot_recs_right = total_records(&c2c_right->stats);
+
+ return tot_recs_left - tot_recs_right;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -643,6 +698,14 @@ static struct c2c_dimension dim_ld_llcmiss = {
.width = 7,
};
+static struct c2c_dimension dim_tot_recs = {
+ .header = HEADER_BOTH("Total", "records"),
+ .name = "tot_recs",
+ .cmp = tot_recs_cmp,
+ .entry = tot_recs_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -668,6 +731,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_llchit,
&dim_ld_rmthit,
&dim_ld_llcmiss,
+ &dim_tot_recs,
NULL,
};
--
2.7.4
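The sums in llc_miss() and total_records() nest: the record total contains a load count, which in turn contains the LLC miss count. A sketch of that layering follows, with each level built on the previous one. The reduced struct, the 32-bit field types and the helper name total_loads() as used here are assumptions for illustration, not a proposal to change the patch.

```c
#include <stdint.h>

/* Hypothetical subset of c2c_stats; field types are assumed. */
struct c2c_stats {
	uint32_t lcl_dram, rmt_dram, rmt_hitm, rmt_hit;
	uint32_t ld_fbhit, ld_l1hit, ld_l2hit, ld_llchit, lcl_hitm;
	uint32_t st_l1hit, st_l1miss;
};

/* Same four terms as llc_miss() in the patch, summed in 64 bits. */
static uint64_t llc_miss(const struct c2c_stats *s)
{
	return (uint64_t)s->lcl_dram + s->rmt_dram + s->rmt_hitm + s->rmt_hit;
}

/* All loads: LLC misses plus the hit counters and local HITMs. */
static uint64_t total_loads(const struct c2c_stats *s)
{
	return llc_miss(s) + s->ld_fbhit + s->ld_l1hit + s->ld_l2hit +
	       s->ld_llchit + s->lcl_hitm;
}

/* All records: loads plus L1 store hits and misses. */
static uint64_t total_records(const struct c2c_stats *s)
{
	return total_loads(s) + s->st_l1hit + s->st_l1miss;
}
```

Written this way, each set of terms appears only once, so a counter added to one level later is picked up by the levels above it automatically.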
* [PATCH 32/61] perf c2c report: Add total loads sort key
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding total loads dimension key wrapper.
It is to be displayed in the main cachelines
overall output:
tot_loads
It displays the sum of all load accesses for the cacheline.
Link: http://lkml.kernel.org/n/tip-czd17qsh5u5z0yc1estz9l2y@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3f2f348479e3..c5ca6daec2d6 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -511,6 +511,57 @@ tot_recs_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return tot_recs_left - tot_recs_right;
}
+static uint64_t total_loads(struct c2c_stats *stats)
+{
+ uint64_t lclmiss, ldcnt;
+
+ lclmiss = stats->lcl_dram +
+ stats->rmt_dram +
+ stats->rmt_hitm +
+ stats->rmt_hit;
+
+ ldcnt = lclmiss +
+ stats->ld_fbhit +
+ stats->ld_l1hit +
+ stats->ld_l2hit +
+ stats->ld_llchit +
+ stats->lcl_hitm;
+
+ return ldcnt;
+}
+
+static int
+tot_loads_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ uint64_t tot_recs;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ tot_recs = total_loads(&c2c_he->stats);
+
+ return snprintf(hpp->buf, hpp->size, "%*" PRIu64, width, tot_recs);
+}
+
+static int64_t
+tot_loads_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ struct c2c_hist_entry *c2c_left;
+ struct c2c_hist_entry *c2c_right;
+ uint64_t tot_recs_left;
+ uint64_t tot_recs_right;
+
+ c2c_left = container_of(left, struct c2c_hist_entry, he);
+ c2c_right = container_of(right, struct c2c_hist_entry, he);
+
+ tot_recs_left = total_loads(&c2c_left->stats);
+ tot_recs_right = total_loads(&c2c_right->stats);
+
+ return tot_recs_left - tot_recs_right;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -706,6 +757,14 @@ static struct c2c_dimension dim_tot_recs = {
.width = 7,
};
+static struct c2c_dimension dim_tot_loads = {
+ .header = HEADER_BOTH("Total", "Loads"),
+ .name = "tot_loads",
+ .cmp = tot_loads_cmp,
+ .entry = tot_loads_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -732,6 +791,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_rmthit,
&dim_ld_llcmiss,
&dim_tot_recs,
+ &dim_tot_loads,
NULL,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 33/61] perf c2c report: Add hitm percent sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (31 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 32/61] perf c2c report: Add total loads " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 34/61] perf c2c report: Add hitm/store percent related sort keys Jiri Olsa
` (27 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding HITM percent dimension key wrapper.
It is to be displayed in the main cachelines
overall output:
percent_hitm
It displays the percentage of HITMs for the cacheline.
It counts only remote HITMs at the moment; a later
patch adds support for local HITMs as well,
based on the sort configuration.
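The percentage is the cacheline's remote HITM count relative to the total remote HITM count, with a guard for an empty total. A minimal sketch of that arithmetic, assuming plain `int` counters (`percent_sketch` is a hypothetical name, not a function from the patch):

```c
/* Share of 'st' in 'tot' as a percentage; 0 when the total is 0. */
static double percent_sketch(int st, int tot)
{
	return tot ? 100.0 * (double) st / (double) tot : 0.0;
}
```

The zero check keeps a data set without any remote HITMs from producing a division by zero.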
Link: http://lkml.kernel.org/n/tip-czd17qsh5u5z0yc1estz9l2y@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 90 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c5ca6daec2d6..82ad66e71401 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -562,6 +562,86 @@ tot_loads_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return tot_recs_left - tot_recs_right;
}
+typedef double (get_percent_cb)(struct c2c_hist_entry *);
+
+static int
+percent_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he, get_percent_cb get_percent)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ double per;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ per = get_percent(c2c_he);
+
+ if (use_browser)
+ return __hpp__slsmg_color_printf(hpp, "%*.2f%%", width - 1, per);
+ else
+ return hpp_color_scnprintf(hpp, "%*.2f%%", width - 1, per);
+}
+
+static double percent_hitm(struct c2c_hist_entry *c2c_he)
+{
+ struct c2c_hists *hists;
+ struct c2c_stats *stats;
+ struct c2c_stats *total;
+ int tot, st;
+ double p;
+
+ hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
+ stats = &c2c_he->stats;
+ total = &hists->stats;
+
+ st = stats->rmt_hitm;
+ tot = total->rmt_hitm;
+
+ p = tot ? (double) st / tot : 0;
+
+ return 100 * p;
+}
+
+static int
+percent_hitm_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ char buf[10];
+ double per;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ per = percent_hitm(c2c_he);
+
+ snprintf(buf, 10, "%.2F%%", per);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+percent_hitm_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ return percent_color(fmt, hpp, he, percent_hitm);
+}
+
+static int64_t
+percent_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ struct c2c_hist_entry *c2c_left;
+ struct c2c_hist_entry *c2c_right;
+ double per_left;
+ double per_right;
+
+ c2c_left = container_of(left, struct c2c_hist_entry, he);
+ c2c_right = container_of(right, struct c2c_hist_entry, he);
+
+ per_left = percent_hitm(c2c_left);
+ per_right = percent_hitm(c2c_right);
+
+ return per_left - per_right;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -765,6 +845,15 @@ static struct c2c_dimension dim_tot_loads = {
.width = 7,
};
+static struct c2c_dimension dim_percent_hitm = {
+ .header = HEADER_LOW("%hitm"),
+ .name = "percent_hitm",
+ .cmp = percent_hitm_cmp,
+ .entry = percent_hitm_entry,
+ .color = percent_hitm_color,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -792,6 +881,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_ld_llcmiss,
&dim_tot_recs,
&dim_tot_loads,
+ &dim_percent_hitm,
NULL,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 34/61] perf c2c report: Add hitm/store percent related sort keys
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (32 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 33/61] perf c2c report: Add hitm percent " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 35/61] perf c2c report: Add dram " Jiri Olsa
` (26 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding hitm/store percent dimension key wrappers.
They are to be displayed in the single cacheline output:
percent_rmt_hitm, percent_lcl_hitm, percent_stores_l1hit, percent_stores_l1miss
They display the percentage of HITMs/stores for a specific
offset in the cacheline.
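One detail worth keeping in mind when reading the comparators in the diff below: they subtract two `double` percentages and return the result as `int64_t`, so a difference smaller than 1.0 converts to 0 and the two entries compare as equal. The sketch below contrasts that with a sign-only comparison. Both function names are hypothetical, and the three-way variant is an illustrative alternative, not what the patch does.

```c
#include <stdint.h>

/* Subtract, then convert to int64_t: fractional differences are lost. */
static int64_t cmp_by_subtraction(double left, double right)
{
	return (int64_t)(left - right);
}

/* Hypothetical alternative: return only the sign of the comparison. */
static int64_t cmp_three_way(double left, double right)
{
	return (left > right) - (left < right);
}
```

With 12.5% against 12.0%, the subtraction variant yields 0 while the three-way variant yields 1.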
Link: http://lkml.kernel.org/n/tip-t365aosxtdut8sgrgn8mfoe4@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 206 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 206 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 82ad66e71401..0613669cd8b4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -642,6 +642,171 @@ percent_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return per_left - per_right;
}
+static struct c2c_stats *he_stats(struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ return &c2c_he->stats;
+}
+
+static struct c2c_stats *total_stats(struct hist_entry *he)
+{
+ struct c2c_hists *hists;
+
+ hists = container_of(he->hists, struct c2c_hists, hists);
+ return &hists->stats;
+}
+
+static double percent(int st, int tot)
+{
+ return tot ? 100. * (double) st / (double) tot : 0;
+}
+
+#define PERCENT(__h, __f) percent(he_stats(__h)->__f, total_stats(__h)->__f)
+
+#define PERCENT_FN(__f) \
+static double percent_ ## __f(struct c2c_hist_entry *c2c_he) \
+{ \
+ struct c2c_hists *hists; \
+ \
+ hists = container_of(c2c_he->he.hists, struct c2c_hists, hists); \
+ return percent(c2c_he->stats.__f, hists->stats.__f); \
+}
+
+PERCENT_FN(rmt_hitm)
+PERCENT_FN(lcl_hitm)
+PERCENT_FN(st_l1hit)
+PERCENT_FN(st_l1miss)
+
+static int
+percent_rmt_hitm_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+ double per = PERCENT(he, rmt_hitm);
+ char buf[10];
+
+ snprintf(buf, 10, "%.2F%%", per);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+percent_rmt_hitm_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ return percent_color(fmt, hpp, he, percent_rmt_hitm);
+}
+
+static int64_t
+percent_rmt_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ double per_left;
+ double per_right;
+
+ per_left = PERCENT(left, rmt_hitm);
+ per_right = PERCENT(right, rmt_hitm);
+
+ return per_left - per_right;
+}
+
+static int
+percent_lcl_hitm_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+ double per = PERCENT(he, lcl_hitm);
+ char buf[10];
+
+ snprintf(buf, 10, "%.2F%%", per);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+percent_lcl_hitm_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ return percent_color(fmt, hpp, he, percent_lcl_hitm);
+}
+
+static int64_t
+percent_lcl_hitm_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ double per_left;
+ double per_right;
+
+ per_left = PERCENT(left, lcl_hitm);
+ per_right = PERCENT(right, lcl_hitm);
+
+ return per_left - per_right;
+}
+
+static int
+percent_stores_l1hit_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+ double per = PERCENT(he, st_l1hit);
+ char buf[10];
+
+ snprintf(buf, 10, "%.2F%%", per);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+percent_stores_l1hit_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ return percent_color(fmt, hpp, he, percent_st_l1hit);
+}
+
+static int64_t
+percent_stores_l1hit_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ double per_left;
+ double per_right;
+
+ per_left = PERCENT(left, st_l1hit);
+ per_right = PERCENT(right, st_l1hit);
+
+ return per_left - per_right;
+}
+
+static int
+percent_stores_l1miss_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+ double per = PERCENT(he, st_l1miss);
+ char buf[10];
+
+ snprintf(buf, 10, "%.2F%%", per);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+percent_stores_l1miss_color(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ return percent_color(fmt, hpp, he, percent_st_l1miss);
+}
+
+static int64_t
+percent_stores_l1miss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ double per_left;
+ double per_right;
+
+ per_left = PERCENT(left, st_l1miss);
+ per_right = PERCENT(right, st_l1miss);
+
+ return per_left - per_right;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -854,6 +1019,42 @@ static struct c2c_dimension dim_percent_hitm = {
.width = 7,
};
+static struct c2c_dimension dim_percent_rmt_hitm = {
+ .header = HEADER_SPAN("----- HITM -----", "Rmt", 1),
+ .name = "percent_rmt_hitm",
+ .cmp = percent_rmt_hitm_cmp,
+ .entry = percent_rmt_hitm_entry,
+ .color = percent_rmt_hitm_color,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_percent_lcl_hitm = {
+ .header = HEADER_SPAN_LOW("Lcl"),
+ .name = "percent_lcl_hitm",
+ .cmp = percent_lcl_hitm_cmp,
+ .entry = percent_lcl_hitm_entry,
+ .color = percent_lcl_hitm_color,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_percent_stores_l1hit = {
+ .header = HEADER_SPAN("-- Store Refs --", "L1 Hit", 1),
+ .name = "percent_stores_l1hit",
+ .cmp = percent_stores_l1hit_cmp,
+ .entry = percent_stores_l1hit_entry,
+ .color = percent_stores_l1hit_color,
+ .width = 7,
+};
+
+static struct c2c_dimension dim_percent_stores_l1miss = {
+ .header = HEADER_SPAN_LOW("L1 Miss"),
+ .name = "percent_stores_l1miss",
+ .cmp = percent_stores_l1miss_cmp,
+ .entry = percent_stores_l1miss_entry,
+ .color = percent_stores_l1miss_color,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -882,6 +1083,10 @@ static struct c2c_dimension *dimensions[] = {
&dim_tot_recs,
&dim_tot_loads,
&dim_percent_hitm,
+ &dim_percent_rmt_hitm,
+ &dim_percent_lcl_hitm,
+ &dim_percent_stores_l1hit,
+ &dim_percent_stores_l1miss,
NULL,
};
@@ -969,6 +1174,7 @@ static struct c2c_fmt *get_format(const char *name)
fmt->cmp = dim->se ? c2c_se_cmp : dim->cmp;
fmt->sort = dim->se ? c2c_se_cmp : dim->cmp;
+ fmt->color = dim->se ? NULL : dim->color;
fmt->entry = dim->se ? c2c_se_entry : dim->entry;
fmt->header = c2c_header;
fmt->width = c2c_width;
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 35/61] perf c2c report: Add dram related sort keys
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (33 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 34/61] perf c2c report: Add hitm/store percent related sort keys Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 36/61] perf c2c report: Add pid sort key Jiri Olsa
` (25 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding dram related dimension key wrappers.
They are to be displayed in the main cachelines
overall output:
dram_lcl, dram_rmt
They display the number of remote/local DRAM accesses for
a specific cacheline.
Link: http://lkml.kernel.org/n/tip-tl3qqi9ehk6g1fla4z7y0ykd@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 0613669cd8b4..55f8b2fece3d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -807,6 +807,9 @@ percent_stores_l1miss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return per_left - per_right;
}
+STAT_FN(lcl_dram)
+STAT_FN(rmt_dram)
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1055,6 +1058,22 @@ static struct c2c_dimension dim_percent_stores_l1miss = {
.width = 7,
};
+static struct c2c_dimension dim_dram_lcl = {
+ .header = HEADER_SPAN("--- Load Dram ----", "Lcl", 1),
+ .name = "dram_lcl",
+ .cmp = lcl_dram_cmp,
+ .entry = lcl_dram_entry,
+ .width = 8,
+};
+
+static struct c2c_dimension dim_dram_rmt = {
+ .header = HEADER_SPAN_LOW("Rmt"),
+ .name = "dram_rmt",
+ .cmp = rmt_dram_cmp,
+ .entry = rmt_dram_entry,
+ .width = 8,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1087,6 +1106,8 @@ static struct c2c_dimension *dimensions[] = {
&dim_percent_lcl_hitm,
&dim_percent_stores_l1hit,
&dim_percent_stores_l1miss,
+ &dim_dram_lcl,
+ &dim_dram_rmt,
NULL,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 36/61] perf c2c report: Add pid sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (34 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 35/61] perf c2c report: Add dram " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 37/61] perf c2c report: Add tid " Jiri Olsa
` (24 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding pid dimension key wrapper.
It is to be displayed in the single cacheline output:
pid
We currently don't have a standalone 'pid' sort/display entry
that outputs just the pid number, hence adding one to the
c2c code.
Link: http://lkml.kernel.org/n/tip-3o23qrspxc99b04ci1swlzr6@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 55f8b2fece3d..e17e01056284 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -810,6 +810,22 @@ percent_stores_l1miss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
STAT_FN(lcl_dram)
STAT_FN(rmt_dram)
+static int
+pid_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ return snprintf(hpp->buf, hpp->size, "%*d", width, he->thread->pid_);
+}
+
+static int64_t
+pid_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left, struct hist_entry *right)
+{
+ return left->thread->pid_ - right->thread->pid_;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1074,6 +1090,14 @@ static struct c2c_dimension dim_dram_rmt = {
.width = 8,
};
+static struct c2c_dimension dim_pid = {
+ .header = HEADER_LOW("Pid"),
+ .name = "pid",
+ .cmp = pid_cmp,
+ .entry = pid_entry,
+ .width = 7,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1108,6 +1132,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_percent_stores_l1miss,
&dim_dram_lcl,
&dim_dram_rmt,
+ &dim_pid,
NULL,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 37/61] perf c2c report: Add tid sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (35 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 36/61] perf c2c report: Add pid sort key Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 38/61] perf c2c report: Add symbol and dso sort keys Jiri Olsa
` (23 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding tid dimension key wrapper.
It is to be displayed in the single cacheline output:
tid
It's a wrapper for global sort_thread sort entry with
c2c specific header.
Link: http://lkml.kernel.org/n/tip-fr0socae5skzvz5qbkl85prn@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e17e01056284..2966a388ce8b 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1098,6 +1098,12 @@ static struct c2c_dimension dim_pid = {
.width = 7,
};
+static struct c2c_dimension dim_tid = {
+ .header = HEADER_LOW("Tid"),
+ .name = "tid",
+ .se = &sort_thread,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1133,6 +1139,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_dram_lcl,
&dim_dram_rmt,
&dim_pid,
+ &dim_tid,
NULL,
};
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 38/61] perf c2c report: Add symbol and dso sort keys
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (36 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 37/61] perf c2c report: Add tid " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 39/61] perf c2c report: Add node sort key Jiri Olsa
` (22 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding symbol and dso dimension key wrappers.
They are to be displayed in the single cacheline output:
symbol, dso
They are wrappers for global sort_sym and sort_dso
sort entries with c2c specific headers.
Link: http://lkml.kernel.org/n/tip-6742e6g0r7n63y5wc4rrgxx5@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 2966a388ce8b..b3dcd590e97a 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1104,6 +1104,17 @@ static struct c2c_dimension dim_tid = {
.se = &sort_thread,
};
+static struct c2c_dimension dim_symbol = {
+ .name = "symbol",
+ .se = &sort_sym,
+};
+
+static struct c2c_dimension dim_dso = {
+ .header = HEADER_BOTH("Shared", "Object"),
+ .name = "dso",
+ .se = &sort_dso,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1140,6 +1151,8 @@ static struct c2c_dimension *dimensions[] = {
&dim_dram_rmt,
&dim_pid,
&dim_tid,
+ &dim_symbol,
+ &dim_dso,
NULL,
};
@@ -1254,12 +1267,17 @@ static int c2c_hists__init_output(struct perf_hpp_list *hpp_list, char *name)
static int c2c_hists__init_sort(struct perf_hpp_list *hpp_list, char *name)
{
struct c2c_fmt *c2c_fmt = get_format(name);
+ struct c2c_dimension *dim;
if (!c2c_fmt) {
reset_dimensions();
return sort_dimension__add(hpp_list, name, NULL, 0);
}
+ dim = c2c_fmt->dim;
+ if (dim == &dim_dso)
+ hpp_list->dso = 1;
+
perf_hpp_list__register_sort_field(hpp_list, &c2c_fmt->fmt);
return 0;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 39/61] perf c2c report: Add node sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (37 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 38/61] perf c2c report: Add symbol and dso sort keys Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 40/61] perf c2c report: Add stats related sort keys Jiri Olsa
` (21 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding node dimension key wrapper.
It is to be displayed in the single cacheline output:
node
It displays the nodes whose CPUs accessed the cacheline.
The node field comes in 3 flavors:
- node IDs separated by ','
- node IDs with stats for each ID, in following format:
Node{cpus %hitms %stores}
- node IDs with list of affected CPUs in following format:
Node{cpu list}
The user can switch the flavor with the -N option (-NN, -NNN).
Switching it in the TUI with the 'n' key will be added later.
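The node output relies on a CPU-to-node lookup built from the NUMA topology. A minimal sketch of that step follows, assuming the topology is given as plain per-node CPU arrays; `node_cpus_sketch` and `build_cpu2node_sketch` are hypothetical names, and the range check is an addition of this sketch rather than something the patch does.

```c
/* Hypothetical description of one NUMA node: the CPUs that belong to it. */
struct node_cpus_sketch {
	const int *cpus;
	int nr;
};

/*
 * Fill cpu2node[] (cpus_cnt entries) from the per-node CPU lists.
 * Returns 0 on success, -1 if a CPU is out of range or listed in two nodes.
 */
static int build_cpu2node_sketch(const struct node_cpus_sketch *nodes,
				 int nodes_cnt, int *cpu2node, int cpus_cnt)
{
	int node, i;

	for (i = 0; i < cpus_cnt; i++)
		cpu2node[i] = -1;	/* not assigned to any node yet */

	for (node = 0; node < nodes_cnt; node++) {
		for (i = 0; i < nodes[node].nr; i++) {
			int cpu = nodes[node].cpus[i];

			/* a CPU may belong to exactly one node */
			if (cpu < 0 || cpu >= cpus_cnt || cpu2node[cpu] != -1)
				return -1;

			cpu2node[cpu] = node;
		}
	}

	return 0;
}
```

Initializing every slot to -1 is what makes a duplicate assignment detectable, which corresponds to the "node/cpu topology bug" warning in the diff below.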
Link: http://lkml.kernel.org/n/tip-6742e6g0r7n63y5wc4rrgxx5@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 219 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 219 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index b3dcd590e97a..6b4224764ae4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1,6 +1,7 @@
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
+#include <asm/bug.h>
#include "util.h"
#include "debug.h"
#include "builtin.h"
@@ -22,6 +23,8 @@ struct c2c_hists {
struct c2c_hist_entry {
struct c2c_hists *hists;
struct c2c_stats stats;
+ unsigned long *cpuset;
+ struct c2c_stats *node_stats;
/*
* must be at the end,
* because of its callchain dynamic entry
@@ -32,6 +35,12 @@ struct c2c_hist_entry {
struct perf_c2c {
struct perf_tool tool;
struct c2c_hists hists;
+
+ unsigned long **nodes;
+ int nodes_cnt;
+ int cpus_cnt;
+ int *cpu2node;
+ int node_info;
};
static struct perf_c2c c2c;
@@ -44,6 +53,14 @@ static void *c2c_he_zalloc(size_t size)
if (!c2c_he)
return NULL;
+ c2c_he->cpuset = bitmap_alloc(c2c.cpus_cnt);
+ if (!c2c_he->cpuset)
+ return NULL;
+
+ c2c_he->node_stats = zalloc(c2c.nodes_cnt * sizeof(*c2c_he->node_stats));
+ if (!c2c_he->node_stats)
+ return NULL;
+
return &c2c_he->he;
}
@@ -57,6 +74,8 @@ static void c2c_he_free(void *he)
free(c2c_he->hists);
}
+ free(c2c_he->cpuset);
+ free(c2c_he->node_stats);
free(c2c_he);
}
@@ -91,6 +110,16 @@ he__get_c2c_hists(struct hist_entry *he,
return hists;
}
+static void c2c_he__set_cpu(struct c2c_hist_entry *c2c_he,
+ struct perf_sample *sample)
+{
+ if (WARN_ONCE(sample->cpu == (unsigned int) -1,
+ "WARNING: no sample cpu value"))
+ return;
+
+ set_bit(sample->cpu, c2c_he->cpuset);
+}
+
static int process_sample_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
@@ -131,10 +160,23 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
c2c_add_stats(&c2c_he->stats, &stats);
c2c_add_stats(&c2c_hists->stats, &stats);
+ c2c_he__set_cpu(c2c_he, sample);
+
hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
ret = hist_entry__append_callchain(he, sample);
if (!ret) {
+ /*
+ * There's already been warning about missing
+ * sample's cpu value. Let's account all to
+ * node 0 in this case, without any further
+ * warning.
+ *
+ * Doing node stats only for single callchain data.
+ */
+ int cpu = sample->cpu == (unsigned int) -1 ? 0 : sample->cpu;
+ int node = c2c.cpu2node[cpu];
+
mi = mi_dup;
mi_dup = memdup(mi, sizeof(*mi));
@@ -154,6 +196,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
c2c_he = container_of(he, struct c2c_hist_entry, he);
c2c_add_stats(&c2c_he->stats, &stats);
c2c_add_stats(&c2c_hists->stats, &stats);
+ c2c_add_stats(&c2c_he->node_stats[node], &stats);
+
+ c2c_he__set_cpu(c2c_he, sample);
hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
ret = hist_entry__append_callchain(he, sample);
@@ -826,6 +871,97 @@ pid_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
return left->thread->pid_ - right->thread->pid_;
}
+static int64_t
+empty_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
+ struct hist_entry *left __maybe_unused,
+ struct hist_entry *right __maybe_unused)
+{
+ return 0;
+}
+
+static int
+node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ bool first = true;
+ int node;
+ int ret = 0;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ for (node = 0; node < c2c.nodes_cnt; node++) {
+ DECLARE_BITMAP(set, c2c.cpus_cnt);
+
+ bitmap_zero(set, c2c.cpus_cnt);
+ bitmap_and(set, c2c_he->cpuset, c2c.nodes[node], c2c.cpus_cnt);
+
+ if (!bitmap_weight(set, c2c.cpus_cnt)) {
+ if (c2c.node_info == 1) {
+ ret = scnprintf(hpp->buf, hpp->size, "%21s", " ");
+ advance_hpp(hpp, ret);
+ }
+ continue;
+ }
+
+ if (!first) {
+ ret = scnprintf(hpp->buf, hpp->size, " ");
+ advance_hpp(hpp, ret);
+ }
+
+ switch (c2c.node_info) {
+ case 0:
+ ret = scnprintf(hpp->buf, hpp->size, "%2d", node);
+ advance_hpp(hpp, ret);
+ break;
+ case 1:
+ {
+ int num = bitmap_weight(set, c2c.cpus_cnt);
+ struct c2c_stats *stats = &c2c_he->node_stats[node];
+
+ ret = scnprintf(hpp->buf, hpp->size, "%2d{%2d ", node, num);
+ advance_hpp(hpp, ret);
+
+
+ if (c2c_he->stats.rmt_hitm > 0) {
+ ret = scnprintf(hpp->buf, hpp->size, "%5.1f%% ",
+ percent(stats->rmt_hitm, c2c_he->stats.rmt_hitm));
+ } else {
+ ret = scnprintf(hpp->buf, hpp->size, "%6s ", "n/a");
+ }
+
+ advance_hpp(hpp, ret);
+
+ if (c2c_he->stats.store > 0) {
+ ret = scnprintf(hpp->buf, hpp->size, "%5.1f%%}",
+ percent(stats->store, c2c_he->stats.store));
+ } else {
+ ret = scnprintf(hpp->buf, hpp->size, "%6s}", "n/a");
+ }
+
+ advance_hpp(hpp, ret);
+ break;
+ }
+ case 2:
+ ret = scnprintf(hpp->buf, hpp->size, "%2d{", node);
+ advance_hpp(hpp, ret);
+
+ ret = bitmap_scnprintf(set, c2c.cpus_cnt, hpp->buf, hpp->size);
+ advance_hpp(hpp, ret);
+
+ ret = scnprintf(hpp->buf, hpp->size, "}");
+ advance_hpp(hpp, ret);
+ break;
+ default:
+ break;
+ }
+
+ first = false;
+ }
+
+ return 0;
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1115,6 +1251,19 @@ static struct c2c_dimension dim_dso = {
.se = &sort_dso,
};
+static struct c2c_header header_node[3] = {
+ HEADER_LOW("Node"),
+ HEADER_LOW("Node{cpus %hitms %stores}"),
+ HEADER_LOW("Node{cpu list}"),
+};
+
+static struct c2c_dimension dim_node = {
+ .name = "node",
+ .cmp = empty_cmp,
+ .entry = node_entry,
+ .width = 4,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1153,6 +1302,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_tid,
&dim_symbol,
&dim_dso,
+ &dim_node,
NULL,
};
@@ -1379,6 +1529,68 @@ static int resort_cl_cb(struct hist_entry *he)
return 0;
}
+static void setup_nodes_header(void)
+{
+ dim_node.header = header_node[c2c.node_info];
+}
+
+static int setup_nodes(struct perf_session *session)
+{
+ struct numa_node *n;
+ unsigned long **nodes;
+ int node, cpu;
+ int *cpu2node;
+
+ if (c2c.node_info > 2)
+ c2c.node_info = 2;
+
+ c2c.nodes_cnt = session->header.env.nr_numa_nodes;
+ c2c.cpus_cnt = session->header.env.nr_cpus_online;
+
+ n = session->header.env.numa_nodes;
+ if (!n)
+ return -EINVAL;
+
+ nodes = zalloc(sizeof(unsigned long *) * c2c.nodes_cnt);
+ if (!nodes)
+ return -ENOMEM;
+
+ c2c.nodes = nodes;
+
+ cpu2node = zalloc(sizeof(int) * c2c.cpus_cnt);
+ if (!cpu2node)
+ return -ENOMEM;
+
+ for (cpu = 0; cpu < c2c.cpus_cnt; cpu++)
+ cpu2node[cpu] = -1;
+
+ c2c.cpu2node = cpu2node;
+
+ for (node = 0; node < c2c.nodes_cnt; node++) {
+ struct cpu_map *map = n[node].map;
+ unsigned long *set;
+
+ set = bitmap_alloc(c2c.cpus_cnt);
+ if (!set)
+ return -ENOMEM;
+
+ for (cpu = 0; cpu < map->nr; cpu++) {
+ set_bit(map->map[cpu], set);
+
+ if (WARN_ONCE(cpu2node[map->map[cpu]] != -1, "node/cpu topology bug"))
+ return -EINVAL;
+
+ cpu2node[map->map[cpu]] = node;
+ }
+
+ nodes[node] = set;
+ }
+
+ setup_nodes_header();
+ return 0;
+}
+
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -1393,6 +1605,8 @@ static int perf_c2c__report(int argc, const char **argv)
"be more verbose (show counter open errors, etc)"),
OPT_STRING('i', "input", &input_name, "file",
"the input file to process"),
+ OPT_INCR('N', "node-info", &c2c.node_info,
+ "show extra node info in report (repeat for more info)"),
OPT_END()
};
int err = 0;
@@ -1418,6 +1632,11 @@ static int perf_c2c__report(int argc, const char **argv)
pr_debug("No memory for session\n");
goto out;
}
+ err = setup_nodes(session);
+ if (err) {
+ pr_err("Failed to set up nodes\n");
+ goto out;
+ }
if (symbol__init(&session->header.env) < 0)
goto out_session;
--
2.7.4
^ permalink raw reply related [flat|nested] 85+ messages in thread
* [PATCH 40/61] perf c2c report: Add stats related sort keys
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (38 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 39/61] perf c2c report: Add node sort key Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 41/61] perf c2c report: Add cpu cnt sort key Jiri Olsa
` (20 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding statistics dimension key wrappers.
They are to be displayed in the single cacheline output:
median, mean_rmt, mean_lcl, mean_load, stddev
They display statistics of the sample weights for cacheline accesses.
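For readers unfamiliar with running statistics, the sketch below shows one common way to keep a mean and variance up to date one sample at a time (Welford's method). It is only an illustration of the idea; `running_stats_sketch` and its helpers are hypothetical names, and nothing here is meant to describe how perf's `init_stats()`/`update_stats()` are implemented.

```c
/* Hypothetical running-statistics accumulator (Welford's method). */
struct running_stats_sketch {
	double n;	/* number of samples seen so far */
	double mean;	/* running mean */
	double m2;	/* sum of squared deviations from the mean */
};

/* Fold one sample (e.g. a sample weight) into the accumulator. */
static void running_stats_update(struct running_stats_sketch *s, double val)
{
	double delta;

	s->n += 1.0;
	delta = val - s->mean;
	s->mean += delta / s->n;
	s->m2 += delta * (val - s->mean);
}

/* Sample variance; 0 until at least two samples were seen. */
static double running_stats_variance(const struct running_stats_sketch *s)
{
	return s->n > 1.0 ? s->m2 / (s->n - 1.0) : 0.0;
}
```

Feeding the weights 2, 4 and 6 gives a mean of 4 and a sample variance of 4; the standard deviation is the square root of the variance.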
Link: http://lkml.kernel.org/n/tip-m1r4uc9lcykf1jhpvwk2gkj8@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 80 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 6b4224764ae4..1990c64f18ff 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -20,11 +20,20 @@ struct c2c_hists {
struct c2c_stats stats;
};
+struct compute_stats {
+ struct stats lcl_hitm;
+ struct stats rmt_hitm;
+ struct stats load;
+};
+
struct c2c_hist_entry {
struct c2c_hists *hists;
struct c2c_stats stats;
unsigned long *cpuset;
struct c2c_stats *node_stats;
+
+ struct compute_stats cstats;
+
/*
* must be at the end,
* because of its callchain dynamic entry
@@ -61,6 +70,10 @@ static void *c2c_he_zalloc(size_t size)
if (!c2c_he->node_stats)
return NULL;
+ init_stats(&c2c_he->cstats.lcl_hitm);
+ init_stats(&c2c_he->cstats.rmt_hitm);
+ init_stats(&c2c_he->cstats.load);
+
return &c2c_he->he;
}
@@ -120,6 +133,20 @@ static void c2c_he__set_cpu(struct c2c_hist_entry *c2c_he,
set_bit(sample->cpu, c2c_he->cpuset);
}
+static void compute_stats(struct c2c_hist_entry *c2c_he,
+ struct c2c_stats *stats,
+ u64 weight)
+{
+ struct compute_stats *cstats = &c2c_he->cstats;
+
+ if (stats->rmt_hitm)
+ update_stats(&cstats->rmt_hitm, weight);
+ else if (stats->lcl_hitm)
+ update_stats(&cstats->lcl_hitm, weight);
+ else if (stats->load)
+ update_stats(&cstats->load, weight);
+}
+
static int process_sample_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_sample *sample,
@@ -198,6 +225,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
c2c_add_stats(&c2c_hists->stats, &stats);
c2c_add_stats(&c2c_he->node_stats[node], &stats);
+ compute_stats(c2c_he, &stats, sample->weight);
+
c2c_he__set_cpu(c2c_he, sample);
hists__inc_nr_samples(&c2c_hists->hists, he->filtered);
@@ -962,6 +991,30 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
return 0;
}
+static int
+mean_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he, double mean)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+ char buf[10];
+
+ snprintf(buf, 10, "%6.0f", mean);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+#define MEAN_ENTRY(__func, __val) \
+static int \
+__func(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp, struct hist_entry *he) \
+{ \
+ struct c2c_hist_entry *c2c_he; \
+ c2c_he = container_of(he, struct c2c_hist_entry, he); \
+ return mean_entry(fmt, hpp, he, avg_stats(&c2c_he->cstats.__val)); \
+}
+
+MEAN_ENTRY(mean_rmt_entry, rmt_hitm);
+MEAN_ENTRY(mean_lcl_entry, lcl_hitm);
+MEAN_ENTRY(mean_load_entry, load);
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1264,6 +1317,30 @@ static struct c2c_dimension dim_node = {
.width = 4,
};
+static struct c2c_dimension dim_mean_rmt = {
+ .header = HEADER_SPAN("---------- cycles ----------", "rmt hitm", 2),
+ .name = "mean_rmt",
+ .cmp = empty_cmp,
+ .entry = mean_rmt_entry,
+ .width = 8,
+};
+
+static struct c2c_dimension dim_mean_lcl = {
+ .header = HEADER_SPAN_LOW("lcl hitm"),
+ .name = "mean_lcl",
+ .cmp = empty_cmp,
+ .entry = mean_lcl_entry,
+ .width = 8,
+};
+
+static struct c2c_dimension dim_mean_load = {
+ .header = HEADER_SPAN_LOW("load"),
+ .name = "mean_load",
+ .cmp = empty_cmp,
+ .entry = mean_load_entry,
+ .width = 8,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1303,6 +1380,9 @@ static struct c2c_dimension *dimensions[] = {
&dim_symbol,
&dim_dso,
&dim_node,
+ &dim_mean_rmt,
+ &dim_mean_lcl,
+ &dim_mean_load,
NULL,
};
--
2.7.4
* [PATCH 41/61] perf c2c report: Add cpu cnt sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (39 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 40/61] perf c2c report: Add stats related sort keys Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 42/61] perf c2c report: Add src line " Jiri Olsa
` (19 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding cpu count dimension key wrapper.
It is to be displayed in the single cacheline output:
cpucnt
It displays the number of distinct cpus that hit the cacheline.
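The count comes from the per-entry cpuset bitmap: every sample sets the bit for its cpu, and the number of set bits is the number of distinct cpus. A small standalone sketch of that bookkeeping follows. cpuset_set() and cpuset_weight() are hypothetical names standing in for the set_bit()/bitmap_weight() helpers the patch uses, and the bit-by-bit loop is chosen for clarity, not speed.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Mark 'cpu' as having touched the cacheline; repeats are harmless. */
static void cpuset_set(unsigned long *set, unsigned int cpu)
{
	set[cpu / BITS_PER_ULONG] |= 1UL << (cpu % BITS_PER_ULONG);
}

/* Count the set bits among the first 'nbits' bits of the bitmap. */
static unsigned int cpuset_weight(const unsigned long *set,
				  unsigned int nbits)
{
	unsigned int i, cnt = 0;

	for (i = 0; i < nbits; i++) {
		if (set[i / BITS_PER_ULONG] & (1UL << (i % BITS_PER_ULONG)))
			cnt++;
	}

	return cnt;
}
```

Setting the same cpu twice leaves the weight unchanged, which is what makes the column a count of distinct cpus and not a count of samples.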
Link: http://lkml.kernel.org/n/tip-ib2kdwam52fby9u2k3ij6lhm@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 1990c64f18ff..a4fea832e677 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1015,6 +1015,20 @@ MEAN_ENTRY(mean_rmt_entry, rmt_hitm);
MEAN_ENTRY(mean_lcl_entry, lcl_hitm);
MEAN_ENTRY(mean_load_entry, load);
+static int
+cpucnt_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ char buf[10];
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ snprintf(buf, 10, "%d", bitmap_weight(c2c_he->cpuset, c2c.cpus_cnt));
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1341,6 +1355,14 @@ static struct c2c_dimension dim_mean_load = {
.width = 8,
};
+static struct c2c_dimension dim_cpucnt = {
+ .header = HEADER_BOTH("cpu", "cnt"),
+ .name = "cpucnt",
+ .cmp = empty_cmp,
+ .entry = cpucnt_entry,
+ .width = 8,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1383,6 +1405,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_mean_rmt,
&dim_mean_lcl,
&dim_mean_load,
+ &dim_cpucnt,
NULL,
};
--
2.7.4
* [PATCH 42/61] perf c2c report: Add src line sort key
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (40 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 41/61] perf c2c report: Add cpu cnt sort key Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 43/61] perf c2c report: Setup number of header lines for hists Jiri Olsa
` (18 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding source line dimension key wrapper.
It is to be displayed in the single cacheline output:
cl_srcline
It displays the source line related to the code address that
accessed the cacheline. It's a wrapper to the global srcline sort
entry.
Link: http://lkml.kernel.org/n/tip-cmnzgm37mjz56ozsg4mnbgxq@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 11 +++++++++++
tools/perf/util/sort.c | 2 +-
tools/perf/util/sort.h | 2 ++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a4fea832e677..c540917a70c4 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -50,6 +50,8 @@ struct perf_c2c {
int cpus_cnt;
int *cpu2node;
int node_info;
+
+ bool show_src;
};
static struct perf_c2c c2c;
@@ -1363,6 +1365,11 @@ static struct c2c_dimension dim_cpucnt = {
.width = 8,
};
+static struct c2c_dimension dim_srcline = {
+ .name = "cl_srcline",
+ .se = &sort_srcline,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1406,6 +1413,7 @@ static struct c2c_dimension *dimensions[] = {
&dim_mean_lcl,
&dim_mean_load,
&dim_cpucnt,
+ &dim_srcline,
NULL,
};
@@ -1613,6 +1621,9 @@ static int c2c_hists__reinit(struct c2c_hists *c2c_hists,
static int filter_cb(struct hist_entry *he __maybe_unused)
{
+ if (c2c.show_src && !he->srcline)
+ he->srcline = hist_entry__get_srcline(he);
+
return 0;
}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 452e15a10dd2..df622f4e301e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -315,7 +315,7 @@ struct sort_entry sort_sym = {
/* --sort srcline */
-static char *hist_entry__get_srcline(struct hist_entry *he)
+char *hist_entry__get_srcline(struct hist_entry *he)
{
struct map *map = he->ms.map;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index d4ef567dcd7b..7aff317fc7c4 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -40,6 +40,7 @@ extern struct sort_entry sort_dso_from;
extern struct sort_entry sort_dso_to;
extern struct sort_entry sort_sym_from;
extern struct sort_entry sort_sym_to;
+extern struct sort_entry sort_srcline;
extern enum sort_type sort__first_dimension;
extern const char default_mem_sort_order[];
@@ -279,4 +280,5 @@ int64_t
sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right);
int64_t
sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
+char *hist_entry__get_srcline(struct hist_entry *he);
#endif /* __PERF_SORT_H */
--
2.7.4
* [PATCH 43/61] perf c2c report: Setup number of header lines for hists
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (41 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 42/61] perf c2c report: Add src line " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 44/61] perf c2c report: Set final resort fields Jiri Olsa
` (17 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Allow setting the number of header lines for c2c hists objects.
Link: http://lkml.kernel.org/n/tip-4ilsf0ulubrd4y96g7tnpwzk@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c540917a70c4..f0983d2b26e3 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -100,11 +100,13 @@ static struct hist_entry_ops c2c_entry_ops = {
};
static int c2c_hists__init(struct c2c_hists *hists,
- const char *sort);
+ const char *sort,
+ int nr_header_lines);
static struct c2c_hists*
he__get_c2c_hists(struct hist_entry *he,
- const char *sort)
+ const char *sort,
+ int nr_header_lines)
{
struct c2c_hist_entry *c2c_he;
struct c2c_hists *hists;
@@ -118,7 +120,7 @@ he__get_c2c_hists(struct hist_entry *he,
if (!hists)
return NULL;
- ret = c2c_hists__init(hists, sort);
+ ret = c2c_hists__init(hists, sort, nr_header_lines);
if (ret)
free(hists);
@@ -212,7 +214,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (!mi_dup)
goto free_mi;
- c2c_hists = he__get_c2c_hists(he, "offset");
+ c2c_hists = he__get_c2c_hists(he, "offset", 2);
if (!c2c_hists)
goto free_mi_dup;
@@ -1596,7 +1598,8 @@ static int hpp_list__parse(struct perf_hpp_list *hpp_list,
}
static int c2c_hists__init(struct c2c_hists *hists,
- const char *sort)
+ const char *sort,
+ int nr_header_lines)
{
__hists__init(&hists->hists, &hists->list);
@@ -1607,6 +1610,9 @@ static int c2c_hists__init(struct c2c_hists *hists,
*/
perf_hpp_list__init(&hists->list);
+ /* Overload number of header lines.*/
+ hists->list.nr_header_lines = nr_header_lines;
+
return hpp_list__parse(&hists->list, NULL, sort);
}
@@ -1735,7 +1741,8 @@ static int perf_c2c__report(int argc, const char **argv)
file.path = input_name;
- err = c2c_hists__init(&c2c.hists, "dcacheline");
+
+ err = c2c_hists__init(&c2c.hists, "dcacheline", 2);
if (err) {
pr_debug("Failed to initialize hists\n");
goto out;
--
2.7.4
* [PATCH 44/61] perf c2c report: Set final resort fields
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (42 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 43/61] perf c2c report: Setup number of header lines for hists Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 45/61] perf c2c report: Add stdio output support Jiri Olsa
` (16 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Set resort/display fields for both the cachelines and
single cacheline displays.
Cachelines are sorted on:
rmt_hitm
This will be made configurable in following patches.
The following fields are displayed for cachelines:
dcacheline
tot_recs
percent_hitm
tot_hitm,lcl_hitm,rmt_hitm
stores,stores_l1hit,stores_l1miss
dram_lcl,dram_rmt
ld_llcmiss
tot_loads
ld_fbhit,ld_l1hit,ld_l2hit
ld_lclhit,ld_rmthit
The single cacheline is sorted by:
offset,rmt_hitm,lcl_hitm
This will be made configurable in following patches.
The following fields are displayed for each cacheline:
percent_rmt_hitm
percent_lcl_hitm
percent_stores_l1hit
percent_stores_l1miss
offset
pid
tid
mean_rmt
mean_lcl
mean_load
cpucnt
symbol
dso
node
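The "offset,rmt_hitm,lcl_hitm" sort string describes a multi-key ordering: later keys only break ties left by earlier ones. A rough standalone sketch of such a comparison follows. struct offset_row and the use of qsort() are hypothetical (perf chains its own sort entries instead), and the directions chosen here, ascending offset then descending HITM counts, are an assumption for illustration.

```c
#include <stdlib.h>

/* Hypothetical flattened view of one row of the single cacheline output. */
struct offset_row {
	unsigned long offset;
	unsigned int rmt_hitm;
	unsigned int lcl_hitm;
};

/* Compare by offset first; fall through to the next key only on a tie. */
static int offset_row_cmp(const void *a, const void *b)
{
	const struct offset_row *l = a;
	const struct offset_row *r = b;

	if (l->offset != r->offset)
		return l->offset < r->offset ? -1 : 1;
	if (l->rmt_hitm != r->rmt_hitm)
		return l->rmt_hitm > r->rmt_hitm ? -1 : 1;
	if (l->lcl_hitm != r->lcl_hitm)
		return l->lcl_hitm > r->lcl_hitm ? -1 : 1;

	return 0;
}

static void sort_offset_rows(struct offset_row *rows, size_t nr)
{
	qsort(rows, nr, sizeof(*rows), offset_row_cmp);
}
```

With this ordering all accesses to the same offset end up adjacent, and within one offset the most contended accessor comes first.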
Link: http://lkml.kernel.org/n/tip-0rclftliywdq9qr2sjbugb6b@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f0983d2b26e3..d7b47c69aa07 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1642,6 +1642,23 @@ static int resort_cl_cb(struct hist_entry *he)
c2c_hists = c2c_he->hists;
if (c2c_hists) {
+ c2c_hists__reinit(c2c_hists,
+ "percent_rmt_hitm,"
+ "percent_lcl_hitm,"
+ "percent_stores_l1hit,"
+ "percent_stores_l1miss,"
+ "offset,"
+ "pid,"
+ "tid,"
+ "mean_rmt,"
+ "mean_lcl,"
+ "mean_load,"
+ "cpucnt,"
+ "symbol,"
+ "dso,"
+ "node",
+ "offset,rmt_hitm,lcl_hitm");
+
hists__collapse_resort(&c2c_hists->hists, NULL);
hists__output_resort_cb(&c2c_hists->hists, NULL, filter_cb);
}
@@ -1774,6 +1791,20 @@ static int perf_c2c__report(int argc, const char **argv)
goto out_session;
}
+ c2c_hists__reinit(&c2c.hists,
+ "dcacheline,"
+ "tot_recs,"
+ "percent_hitm,"
+ "tot_hitm,lcl_hitm,rmt_hitm,"
+ "stores,stores_l1hit,stores_l1miss,"
+ "dram_lcl,dram_rmt,"
+ "ld_llcmiss,"
+ "tot_loads,"
+ "ld_fbhit,ld_l1hit,ld_l2hit,"
+ "ld_lclhit,ld_rmthit",
+ "rmt_hitm"
+ );
+
ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting...");
hists__collapse_resort(&c2c.hists.hists, NULL);
--
2.7.4
* [PATCH 45/61] perf c2c report: Add stdio output support
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (43 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 44/61] perf c2c report: Set final resort fields Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 46/61] perf c2c report: Add main browser Jiri Olsa
` (15 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding the --stdio option output support. The output
tables are written directly to stdout.
$ perf c2c report
=================================================
Shared Data Cache Line Table
=================================================
#
# Total ----- LLC Load Hitm ----- ---- Store Reference ---- --- Load Dram ---- LLC Total ----- Core Load Hit ----- -- LLC Load Hit --
# Cacheline records %hitm Total Lcl Rmt Total L1Hit L1Miss Lcl Rmt Ld Miss Loads FB L1 L2 Llc Rmt
# .................. ....... ....... ....... ....... ....... ....... ....... ....... ........ ........ ....... ....... ....... ....... ....... ........ ........
#
0xffff88000235f840 17 0.00% 0 0 0 17 17 0 0 0 0 0 0 0 0 0 0
...
=================================================
Shared Cache Line Distribution Pareto
=================================================
#
# ----- HITM ----- -- Store Refs -- Data address ---------- cycles ---------- cpu Shared
# Rmt Lcl L1 Hit L1 Miss Offset Pid Tid rmt hitm lcl hitm load cnt Symbol Object Node
# ....... ....... ....... ....... .................. ....... ..................... ........ ........ ........ ........ .................... ................. ....
#
------------------------------------------------------
0 0 17 0 0xffff88000235f840
------------------------------------------------------
0.00% 0.00% 5.88% 0.00% 0x0 11474 11474:kworker/u16:5 0 0 0 1 [k] rmap_walk_file [kernel.kallsyms] 0
0.00% 0.00% 5.88% 0.00% 0x10 11474 11474:kworker/u16:5 0 0 0 1 [k] lock_page_memcg [kernel.kallsyms] 0
0.00% 0.00% 11.76% 0.00% 0x20 11474 11474:kworker/u16:5 0 0 0 1 [k] page_mapping [kernel.kallsyms] 0
0.00% 0.00% 64.71% 0.00% 0x28 11474 11474:kworker/u16:5 0 0 0 1 [k] __test_set_page_writeback [kernel.kallsyms] 0
0.00% 0.00% 11.76% 0.00% 0x30 11474 11474:kworker/u16:5 0 0 0 1 [k] page_mapped [kernel.kallsyms] 0
...
Link: http://lkml.kernel.org/n/tip-eorco9r0oeesjve77pkkg43s@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index d7b47c69aa07..222b1a34c788 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -13,6 +13,7 @@
#include "tool.h"
#include "data.h"
#include "sort.h"
+#include <asm/bug.h>
struct c2c_hists {
struct hists hists;
@@ -1727,6 +1728,85 @@ static int setup_nodes(struct perf_session *session)
return 0;
}
+static void print_cacheline(struct c2c_hists *c2c_hists,
+ struct hist_entry *he_cl,
+ struct perf_hpp_list *hpp_list,
+ FILE *out)
+{
+ char bf[1000];
+ struct perf_hpp hpp = {
+ .buf = bf,
+ .size = 1000,
+ };
+ static bool once;
+
+ if (!once) {
+ hists__fprintf_headers(&c2c_hists->hists, out);
+ once = true;
+ } else {
+ fprintf(out, "\n");
+ }
+
+ fprintf(out, " ------------------------------------------------------\n");
+ hist_entry__snprintf(he_cl, &hpp, hpp_list);
+ fprintf(out, "%s\n", bf);
+ fprintf(out, " ------------------------------------------------------\n");
+
+ hists__fprintf(&c2c_hists->hists, false, 0, 0, 0, out, true);
+}
+
+static void print_pareto(FILE *out)
+{
+ struct perf_hpp_list hpp_list;
+ struct rb_node *nd;
+ int ret;
+
+ perf_hpp_list__init(&hpp_list);
+ ret = hpp_list__parse(&hpp_list,
+ "cl_rmt_hitm,"
+ "cl_lcl_hitm,"
+ "cl_stores_l1hit,"
+ "cl_stores_l1miss,"
+ "dcacheline",
+ NULL);
+
+ if (WARN_ONCE(ret, "failed to setup sort entries\n"))
+ return;
+
+ nd = rb_first(&c2c.hists.hists.entries);
+
+ for (; nd; nd = rb_next(nd)) {
+ struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
+ struct c2c_hist_entry *c2c_he;
+
+ if (he->filtered)
+ continue;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ print_cacheline(c2c_he->hists, he, &hpp_list, out);
+ }
+}
+
+static void perf_c2c__hists_fprintf(FILE *out)
+{
+ setup_pager();
+
+ fprintf(out, "\n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Shared Data Cache Line Table \n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, "#\n");
+
+ hists__fprintf(&c2c.hists.hists, true, 0, 0, 0, out, false);
+
+ fprintf(out, "\n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Shared Cache Line Distribution Pareto \n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, "#\n");
+
+ print_pareto(out);
+}
static int perf_c2c__report(int argc, const char **argv)
{
@@ -1812,6 +1892,9 @@ static int perf_c2c__report(int argc, const char **argv)
ui_progress__finish();
+ use_browser = 0;
+ perf_c2c__hists_fprintf(stdout);
+
out_session:
perf_session__delete(session);
out:
--
2.7.4
* [PATCH 46/61] perf c2c report: Add main browser
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (44 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 45/61] perf c2c report: Add stdio output support Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 47/61] perf c2c report: Add cacheline browser Jiri Olsa
` (14 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding the main cachelines TUI browser. It allows
navigating through cachelines and displaying their
details and callchains (implemented in following
patches).
Link: http://lkml.kernel.org/n/tip-inykbom2f19difvsu1e18avr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 99 +++++++++++++++++++++++++++++++++++++++++-
tools/perf/ui/browsers/hists.c | 2 +-
tools/perf/ui/browsers/hists.h | 1 +
3 files changed, 99 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 222b1a34c788..47d5408aeff8 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -14,6 +14,7 @@
#include "data.h"
#include "sort.h"
#include <asm/bug.h>
+#include "ui/browsers/hists.h"
struct c2c_hists {
struct hists hists;
@@ -53,6 +54,7 @@ struct perf_c2c {
int node_info;
bool show_src;
+ bool use_stdio;
};
static struct perf_c2c c2c;
@@ -1077,6 +1079,8 @@ static struct c2c_dimension dim_dcacheline = {
.width = 18,
};
+static struct c2c_header header_offset_tui = HEADER_LOW("Off");
+
static struct c2c_dimension dim_offset = {
.header = HEADER_BOTH("Data address", "Offset"),
.name = "offset",
@@ -1808,6 +1812,84 @@ static void perf_c2c__hists_fprintf(FILE *out)
print_pareto(out);
}
+static void c2c_browser__update_nr_entries(struct hist_browser *hb)
+{
+ u64 nr_entries = 0;
+ struct rb_node *nd = rb_first(&hb->hists->entries);
+
+ do {
+ struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
+
+ if (!he->filtered)
+ nr_entries++;
+
+ nd = rb_next(nd);
+ } while (nd);
+
+ hb->nr_non_filtered_entries = nr_entries;
+}
+
+static int perf_c2c_browser__title(struct hist_browser *browser,
+ char *bf, size_t size)
+{
+ scnprintf(bf, size,
+ "Shared Data Cache Line Table "
+ "(%lu entries)", browser->nr_non_filtered_entries);
+ return 0;
+}
+
+static struct hist_browser*
+perf_c2c_browser__new(struct hists *hists)
+{
+ struct hist_browser *browser = hist_browser__new(hists);
+
+ if (browser) {
+ browser->title = perf_c2c_browser__title;
+ browser->c2c_filter = true;
+ }
+
+ return browser;
+}
+
+static int perf_c2c__hists_browse(struct hists *hists)
+{
+ struct hist_browser *browser;
+ int key = -1;
+
+ browser = perf_c2c_browser__new(hists);
+ if (browser == NULL)
+ return -1;
+
+ /* reset abort key so that it can get Ctrl-C as a key */
+ SLang_reset_tty();
+ SLang_init_tty(0, 0, 0);
+
+ c2c_browser__update_nr_entries(browser);
+
+ while (1) {
+ key = hist_browser__run(browser, "help");
+
+ switch (key) {
+ case 'q':
+ goto out;
+ default:
+ break;
+ }
+ }
+
+out:
+ hist_browser__delete(browser);
+ return 0;
+}
+
+static void ui_quirks(bool stdio)
+{
+ if (!stdio) {
+ dim_offset.width = 5;
+ dim_offset.header = header_offset_tui;
+ }
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -1824,6 +1906,8 @@ static int perf_c2c__report(int argc, const char **argv)
"the input file to process"),
OPT_INCR('N', "node-info", &c2c.node_info,
"show extra node info in report (repeat for more info)"),
+ OPT_BOOLEAN(0, "stdio", &c2c.use_stdio,
+ "Use the stdio interface"),
OPT_END()
};
int err = 0;
@@ -1833,6 +1917,15 @@ static int perf_c2c__report(int argc, const char **argv)
if (argc)
usage_with_options(report_c2c_usage, c2c_options);
+ if (c2c.use_stdio)
+ use_browser = 0;
+ else
+ use_browser = 1;
+
+ ui_quirks(c2c.use_stdio);
+
+ setup_browser(false);
+
if (!input_name || !strlen(input_name))
input_name = "perf.data";
@@ -1892,8 +1985,10 @@ static int perf_c2c__report(int argc, const char **argv)
ui_progress__finish();
- use_browser = 0;
- perf_c2c__hists_fprintf(stdout);
+ if (c2c.use_stdio)
+ perf_c2c__hists_fprintf(stdout);
+ else
+ perf_c2c__hists_browse(&c2c.hists.hists);
out_session:
perf_session__delete(session);
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 77cf7a80e8d6..83fd2885d78a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -30,7 +30,7 @@ static struct rb_node *hists__filter_entries(struct rb_node *nd,
static bool hist_browser__has_filter(struct hist_browser *hb)
{
- return hists__has_filter(hb->hists) || hb->min_pcnt || symbol_conf.has_filter;
+ return hists__has_filter(hb->hists) || hb->min_pcnt || symbol_conf.has_filter || hb->c2c_filter;
}
static int hist_browser__get_folding(struct hist_browser *browser)
diff --git a/tools/perf/ui/browsers/hists.h b/tools/perf/ui/browsers/hists.h
index 39bd0f28f211..23d6acb84800 100644
--- a/tools/perf/ui/browsers/hists.h
+++ b/tools/perf/ui/browsers/hists.h
@@ -18,6 +18,7 @@ struct hist_browser {
u64 nr_non_filtered_entries;
u64 nr_hierarchy_entries;
u64 nr_callchain_rows;
+ bool c2c_filter;
/* Get title string. */
int (*title)(struct hist_browser *browser,
--
2.7.4
* [PATCH 47/61] perf c2c report: Add cacheline browser
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (45 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 46/61] perf c2c report: Add main browser Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-20 20:10 ` Kim Phillips
2016-09-19 13:09 ` [PATCH 48/61] perf c2c report: Add global stats stdio output Jiri Olsa
` (13 subsequent siblings)
60 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding the single cacheline TUI browser. It is triggered when
you press 'd' in the main browser on a specific cacheline.
It allows navigating through the cacheline's offsets and displaying
callchains (implemented in following patches).
Link: http://lkml.kernel.org/n/tip-fovjwgyusv3rz5qxk3hnahtl@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 81 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 47d5408aeff8..b380cdf0e6aa 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1829,6 +1829,84 @@ static void c2c_browser__update_nr_entries(struct hist_browser *hb)
hb->nr_non_filtered_entries = nr_entries;
}
+struct c2c_cacheline_browser {
+ struct hist_browser hb;
+ struct hist_entry *he;
+};
+
+static int
+perf_c2c_cacheline_browser__title(struct hist_browser *browser,
+ char *bf, size_t size)
+{
+ struct c2c_cacheline_browser *cl_browser;
+ struct hist_entry *he;
+ uint64_t addr = 0;
+
+ cl_browser = container_of(browser, struct c2c_cacheline_browser, hb);
+ he = cl_browser->he;
+
+ if (he->mem_info)
+ addr = cl_address(he->mem_info->daddr.addr);
+
+ scnprintf(bf, size, "Cacheline 0x%lx", addr);
+ return 0;
+}
+
+static struct c2c_cacheline_browser*
+c2c_cacheline_browser__new(struct hists *hists, struct hist_entry *he)
+{
+ struct c2c_cacheline_browser *browser;
+
+ browser = zalloc(sizeof(*browser));
+ if (browser) {
+ hist_browser__init(&browser->hb, hists);
+ browser->hb.c2c_filter = true;
+ browser->hb.title = perf_c2c_cacheline_browser__title;
+ browser->he = he;
+ }
+
+ return browser;
+}
+
+static int perf_c2c__browse_cacheline(struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ struct c2c_hists *c2c_hists;
+ struct c2c_cacheline_browser *cl_browser;
+ struct hist_browser *browser;
+ int key = -1;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ c2c_hists = c2c_he->hists;
+
+ cl_browser = c2c_cacheline_browser__new(&c2c_hists->hists, he);
+ if (cl_browser == NULL)
+ return -1;
+
+ browser = &cl_browser->hb;
+
+ /* reset abort key so that it can get Ctrl-C as a key */
+ SLang_reset_tty();
+ SLang_init_tty(0, 0, 0);
+
+ c2c_browser__update_nr_entries(browser);
+
+ while (1) {
+ key = hist_browser__run(browser, "help");
+
+ switch (key) {
+ case 'q':
+ goto out;
+ default:
+ break;
+ }
+ }
+
+out:
+ free(cl_browser);
+ return 0;
+}
+
static int perf_c2c_browser__title(struct hist_browser *browser,
char *bf, size_t size)
{
@@ -1872,6 +1950,9 @@ static int perf_c2c__hists_browse(struct hists *hists)
switch (key) {
case 'q':
goto out;
+ case 'd':
+ perf_c2c__browse_cacheline(browser->he_selection);
+ break;
default:
break;
}
--
2.7.4
* Re: [PATCH 47/61] perf c2c report: Add cacheline browser
2016-09-19 13:09 ` [PATCH 47/61] perf c2c report: Add cacheline browser Jiri Olsa
@ 2016-09-20 20:10 ` Kim Phillips
2016-09-21 8:21 ` Jiri Olsa
0 siblings, 1 reply; 85+ messages in thread
From: Kim Phillips @ 2016-09-20 20:10 UTC (permalink / raw)
To: Jiri Olsa
Cc: Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On Mon, 19 Sep 2016 15:09:56 +0200
Jiri Olsa <jolsa@kernel.org> wrote:
> + /* reset abort key so that it can get Ctrl-C as a key */
> + SLang_reset_tty();
> + SLang_init_tty(0, 0, 0);
this fails to build on systems without slang:
CC builtin-c2c.o
builtin-c2c.c: In function ‘perf_c2c__browse_cacheline’:
builtin-c2c.c:2211:2: error: implicit declaration of function ‘SLang_reset_tty’ [-Werror=implicit-function-declaration]
SLang_reset_tty();
^
builtin-c2c.c:2211:2: error: nested extern declaration of ‘SLang_reset_tty’ [-Werror=nested-externs]
builtin-c2c.c:2212:2: error: implicit declaration of function ‘SLang_init_tty’ [-Werror=implicit-function-declaration]
SLang_init_tty(0, 0, 0);
^
builtin-c2c.c:2212:2: error: nested extern declaration of ‘SLang_init_tty’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
mv: cannot stat ‘./.builtin-c2c.o.tmp’: No such file or directory
tools/build/Makefile.build:77: recipe for target 'builtin-c2c.o' failed
Thanks,
Kim
* Re: [PATCH 47/61] perf c2c report: Add cacheline browser
2016-09-20 20:10 ` Kim Phillips
@ 2016-09-21 8:21 ` Jiri Olsa
2016-09-21 12:55 ` Jiri Olsa
0 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-21 8:21 UTC (permalink / raw)
To: Kim Phillips
Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On Tue, Sep 20, 2016 at 03:10:07PM -0500, Kim Phillips wrote:
> On Mon, 19 Sep 2016 15:09:56 +0200
> Jiri Olsa <jolsa@kernel.org> wrote:
>
> > + /* reset abort key so that it can get Ctrl-C as a key */
> > + SLang_reset_tty();
> > + SLang_init_tty(0, 0, 0);
>
> this fails to build on systems without slang:
>
> CC builtin-c2c.o
> builtin-c2c.c: In function ‘perf_c2c__browse_cacheline’:
> builtin-c2c.c:2211:2: error: implicit declaration of function ‘SLang_reset_tty’ [-Werror=implicit-function-declaration]
> SLang_reset_tty();
will fix, thanks
jirka
* Re: [PATCH 47/61] perf c2c report: Add cacheline browser
2016-09-21 8:21 ` Jiri Olsa
@ 2016-09-21 12:55 ` Jiri Olsa
2016-09-21 19:35 ` Kim Phillips
0 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-21 12:55 UTC (permalink / raw)
To: Kim Phillips
Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On Wed, Sep 21, 2016 at 10:21:55AM +0200, Jiri Olsa wrote:
> On Tue, Sep 20, 2016 at 03:10:07PM -0500, Kim Phillips wrote:
> > On Mon, 19 Sep 2016 15:09:56 +0200
> > Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > > + /* reset abort key so that it can get Ctrl-C as a key */
> > > + SLang_reset_tty();
> > > + SLang_init_tty(0, 0, 0);
> >
> > this fails to build on systems without slang:
> >
> > CC builtin-c2c.o
> > builtin-c2c.c: In function ‘perf_c2c__browse_cacheline’:
> > builtin-c2c.c:2211:2: error: implicit declaration of function ‘SLang_reset_tty’ [-Werror=implicit-function-declaration]
> > SLang_reset_tty();
>
> will fix, thanks
fixed branch pushed in perf/c2c_v4
jirka
* Re: [PATCH 47/61] perf c2c report: Add cacheline browser
2016-09-21 12:55 ` Jiri Olsa
@ 2016-09-21 19:35 ` Kim Phillips
0 siblings, 0 replies; 85+ messages in thread
From: Kim Phillips @ 2016-09-21 19:35 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Don Zickus, Joe Mario,
Ingo Molnar, Peter Zijlstra, Namhyung Kim, David Ahern,
Andi Kleen
On Wed, 21 Sep 2016 14:55:40 +0200
Jiri Olsa <jolsa@redhat.com> wrote:
> On Wed, Sep 21, 2016 at 10:21:55AM +0200, Jiri Olsa wrote:
> > On Tue, Sep 20, 2016 at 03:10:07PM -0500, Kim Phillips wrote:
> > > this fails to build on systems without slang:
> > >
> > > CC builtin-c2c.o
> > > builtin-c2c.c: In function ‘perf_c2c__browse_cacheline’:
> > > builtin-c2c.c:2211:2: error: implicit declaration of function ‘SLang_reset_tty’ [-Werror=implicit-function-declaration]
> > > SLang_reset_tty();
> >
> > will fix, thanks
>
> fixed branch pushed in perf/c2c_v4
that works much better, thanks.
Kim
* [PATCH 48/61] perf c2c report: Add global stats stdio output
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (46 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 47/61] perf c2c report: Add cacheline browser Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 49/61] perf c2c report: Add shared cachelines " Jiri Olsa
` (12 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Display the global stats table as part of the stdio output
or when the --stats option is specified:
$ perf c2c report --stats
=================================================
Trace Event Information
=================================================
Total records : 41237
Locked Load/Store Operations : 4075
Load Operations : 20526
Loads - uncacheable : 0
Loads - IO : 0
Loads - Miss : 552
Loads - no mapping : 31
Load Fill Buffer Hit : 7333
Load L1D hit : 6398
Load L2D hit : 144
Load LLC hit : 4889
Load Local HITM : 1185
Load Remote HITM : 838
Load Remote HIT : 52
Load Local DRAM : 183
Load Remote DRAM : 106
Load MESI State Exclusive : 289
Load MESI State Shared : 0
Load LLC Misses : 1179
LLC Misses to Local DRAM : 15.5%
LLC Misses to Remote DRAM : 9.0%
LLC Misses to Remote cache (HIT) : 4.4%
LLC Misses to Remote cache (HITM) : 71.1%
Store Operations : 20711
Store - uncacheable : 0
Store - no mapping : 1
Store L1D Hit : 20158
Store L1D Miss : 552
No Page Map Rejects : 7
Unable to parse data source : 0
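The percentage rows in the table above follow from the raw counters: the four LLC-miss sources sum to 1179 (183 + 106 + 52 + 838), and each row is its share of that total. A minimal sketch of the arithmetic, assuming a hypothetical trimmed-down struct (`llc_stats`, `llc_miss_total` and `llc_miss_pct` are illustrative names, not part of the patch) and adding a zero-total guard that the patch itself does not have:

```c
#include <assert.h>

/* Hypothetical stand-in for the four c2c_stats counters used here. */
struct llc_stats {
	int lcl_dram;
	int rmt_dram;
	int rmt_hit;
	int rmt_hitm;
};

/* Sum of all load sources that missed the last-level cache. */
int llc_miss_total(const struct llc_stats *s)
{
	assert(s);
	return s->lcl_dram + s->rmt_dram + s->rmt_hit + s->rmt_hitm;
}

/*
 * Share of 'part' in 'total' as a percentage.
 * Returns 0.0 when total is zero instead of dividing by zero
 * (assumption: an empty sample set should print 0.0%, not nan).
 */
double llc_miss_pct(int part, int total)
{
	return total ? ((double)part / (double)total) * 100. : 0.;
}
```

With the sample counters, `llc_miss_pct(183, 1179)` is roughly 15.5 and `llc_miss_pct(838, 1179)` roughly 71.1, which agrees with the printed rows.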
Original-patch-by: Dick Fowles <rfowles@redhat.com>
Original-patch-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-qkyvao3qsrnwazf0w1jvsh7z@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 56 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index b380cdf0e6aa..aecfe70b2f52 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -55,6 +55,7 @@ struct perf_c2c {
bool show_src;
bool use_stdio;
+ bool stats_only;
};
static struct perf_c2c c2c;
@@ -1732,6 +1733,51 @@ static int setup_nodes(struct perf_session *session)
return 0;
}
+static void print_c2c__display_stats(FILE *out)
+{
+ int llc_misses;
+ struct c2c_stats *stats = &c2c.hists.stats;
+
+ llc_misses = stats->lcl_dram +
+ stats->rmt_dram +
+ stats->rmt_hit +
+ stats->rmt_hitm;
+
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Trace Event Information \n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Total records : %10d\n", stats->nr_entries);
+ fprintf(out, " Locked Load/Store Operations : %10d\n", stats->locks);
+ fprintf(out, " Load Operations : %10d\n", stats->load);
+ fprintf(out, " Loads - uncacheable : %10d\n", stats->ld_uncache);
+ fprintf(out, " Loads - IO : %10d\n", stats->ld_io);
+ fprintf(out, " Loads - Miss : %10d\n", stats->ld_miss);
+ fprintf(out, " Loads - no mapping : %10d\n", stats->ld_noadrs);
+ fprintf(out, " Load Fill Buffer Hit : %10d\n", stats->ld_fbhit);
+ fprintf(out, " Load L1D hit : %10d\n", stats->ld_l1hit);
+ fprintf(out, " Load L2D hit : %10d\n", stats->ld_l2hit);
+ fprintf(out, " Load LLC hit : %10d\n", stats->ld_llchit + stats->lcl_hitm);
+ fprintf(out, " Load Local HITM : %10d\n", stats->lcl_hitm);
+ fprintf(out, " Load Remote HITM : %10d\n", stats->rmt_hitm);
+ fprintf(out, " Load Remote HIT : %10d\n", stats->rmt_hit);
+ fprintf(out, " Load Local DRAM : %10d\n", stats->lcl_dram);
+ fprintf(out, " Load Remote DRAM : %10d\n", stats->rmt_dram);
+ fprintf(out, " Load MESI State Exclusive : %10d\n", stats->ld_excl);
+ fprintf(out, " Load MESI State Shared : %10d\n", stats->ld_shared);
+ fprintf(out, " Load LLC Misses : %10d\n", llc_misses);
+ fprintf(out, " LLC Misses to Local DRAM : %10.1f%%\n", ((double)stats->lcl_dram/(double)llc_misses) * 100.);
+ fprintf(out, " LLC Misses to Remote DRAM : %10.1f%%\n", ((double)stats->rmt_dram/(double)llc_misses) * 100.);
+ fprintf(out, " LLC Misses to Remote cache (HIT) : %10.1f%%\n", ((double)stats->rmt_hit /(double)llc_misses) * 100.);
+ fprintf(out, " LLC Misses to Remote cache (HITM) : %10.1f%%\n", ((double)stats->rmt_hitm/(double)llc_misses) * 100.);
+ fprintf(out, " Store Operations : %10d\n", stats->store);
+ fprintf(out, " Store - uncacheable : %10d\n", stats->st_uncache);
+ fprintf(out, " Store - no mapping : %10d\n", stats->st_noadrs);
+ fprintf(out, " Store L1D Hit : %10d\n", stats->st_l1hit);
+ fprintf(out, " Store L1D Miss : %10d\n", stats->st_l1miss);
+ fprintf(out, " No Page Map Rejects : %10d\n", stats->nomap);
+ fprintf(out, " Unable to parse data source : %10d\n", stats->noparse);
+}
+
static void print_cacheline(struct c2c_hists *c2c_hists,
struct hist_entry *he_cl,
struct perf_hpp_list *hpp_list,
@@ -1795,6 +1841,11 @@ static void perf_c2c__hists_fprintf(FILE *out)
{
setup_pager();
+ print_c2c__display_stats(out);
+
+ if (c2c.stats_only)
+ return;
+
fprintf(out, "\n");
fprintf(out, "=================================================\n");
fprintf(out, " Shared Data Cache Line Table \n");
@@ -1989,6 +2040,8 @@ static int perf_c2c__report(int argc, const char **argv)
"show extra node info in report (repeat for more info)"),
OPT_BOOLEAN(0, "stdio", &c2c.use_stdio,
"Use the stdio interface"),
+ OPT_BOOLEAN(0, "stats", &c2c.stats_only,
+ "Use the stdio interface"),
OPT_END()
};
int err = 0;
@@ -1998,6 +2051,9 @@ static int perf_c2c__report(int argc, const char **argv)
if (argc)
usage_with_options(report_c2c_usage, c2c_options);
+ if (c2c.stats_only)
+ c2c.use_stdio = true;
+
if (c2c.use_stdio)
use_browser = 0;
else
--
2.7.4
* [PATCH 49/61] perf c2c report: Add shared cachelines stats stdio output
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (47 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 48/61] perf c2c report: Add global stats stdio output Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:09 ` [PATCH 50/61] perf c2c report: Add c2c related " Jiri Olsa
` (11 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Display the global shared cachelines stats table as part
of the stdio output or when the --stats option is specified:
$ perf c2c report --stats
...
=================================================
Global Shared Cache Line Event Information
=================================================
Total Shared Cache Lines : 1384
Load HITs on shared lines : 5995
Fill Buffer Hits on shared lines : 1726
L1D hits on shared lines : 1943
L2D hits on shared lines : 0
LLC hits on shared lines : 1360
Locked Access on shared lines : 1993
Store HITs on shared lines : 1504
Store L1D hits on shared lines : 1446
Total Merged records : 3527
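The table above is built by summing the stats of every cacheline that saw at least one HITM. A rough sketch of that aggregation over a plain array, assuming a hypothetical reduced struct (`cl_stats`, `sum_shared` and `merged_records` are illustrative names; the patch itself walks an rb-tree of hist entries):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced per-cacheline counters. */
struct cl_stats {
	int lcl_hitm;
	int rmt_hitm;
	int store;
};

/* A line counts as shared once it has any local or remote HITM. */
static int has_hitms(const struct cl_stats *s)
{
	return s->lcl_hitm || s->rmt_hitm;
}

/*
 * Add the counters of every shared line into 'sum' and
 * return how many shared lines were found.
 */
int sum_shared(const struct cl_stats *lines, size_t n, struct cl_stats *sum)
{
	int shared = 0;
	size_t i;

	assert(lines && sum);
	for (i = 0; i < n; i++) {
		if (!has_hitms(&lines[i]))
			continue;
		shared++;
		sum->lcl_hitm += lines[i].lcl_hitm;
		sum->rmt_hitm += lines[i].rmt_hitm;
		sum->store    += lines[i].store;
	}
	return shared;
}

/* "Total Merged records" row: all HITMs plus all stores on shared lines. */
int merged_records(const struct cl_stats *sum)
{
	return sum->lcl_hitm + sum->rmt_hitm + sum->store;
}
```

The sample figures are consistent with this: 3527 merged records minus 1504 stores leaves 2023 HITMs, which matches the 1185 local plus 838 remote HITMs reported in the global stats table.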
Original-patch-by: Dick Fowles <rfowles@redhat.com>
Original-patch-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-p0gty8ctbdzisrniwqxhqmhq@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 61 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index aecfe70b2f52..e463da572207 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -56,6 +56,10 @@ struct perf_c2c {
bool show_src;
bool use_stdio;
bool stats_only;
+
+ /* HITM shared clines stats */
+ struct c2c_stats hitm_stats;
+ int shared_clines;
};
static struct perf_c2c c2c;
@@ -1733,6 +1737,39 @@ static int setup_nodes(struct perf_session *session)
return 0;
}
+#define HAS_HITMS(__h) ((__h)->stats.lcl_hitm || (__h)->stats.rmt_hitm)
+
+static int resort_hitm_cb(struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ if (HAS_HITMS(c2c_he)) {
+ c2c.shared_clines++;
+ c2c_add_stats(&c2c.hitm_stats, &c2c_he->stats);
+ }
+
+ return 0;
+}
+
+static int hists__iterate_cb(struct hists *hists, hists__resort_cb_t cb)
+{
+ struct rb_node *next = rb_first(&hists->entries);
+ int ret = 0;
+
+ while (next) {
+ struct hist_entry *he;
+
+ he = rb_entry(next, struct hist_entry, rb_node);
+ ret = cb(he);
+ if (ret)
+ break;
+ next = rb_next(&he->rb_node);
+ }
+
+ return ret;
+}
+
static void print_c2c__display_stats(FILE *out)
{
int llc_misses;
@@ -1778,6 +1815,26 @@ static void print_c2c__display_stats(FILE *out)
fprintf(out, " Unable to parse data source : %10d\n", stats->noparse);
}
+static void print_shared_cacheline_info(FILE *out)
+{
+ struct c2c_stats *stats = &c2c.hitm_stats;
+ int hitm_cnt = stats->lcl_hitm + stats->rmt_hitm;
+
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Global Shared Cache Line Event Information \n");
+ fprintf(out, "=================================================\n");
+ fprintf(out, " Total Shared Cache Lines : %10d\n", c2c.shared_clines);
+ fprintf(out, " Load HITs on shared lines : %10d\n", stats->load);
+ fprintf(out, " Fill Buffer Hits on shared lines : %10d\n", stats->ld_fbhit);
+ fprintf(out, " L1D hits on shared lines : %10d\n", stats->ld_l1hit);
+ fprintf(out, " L2D hits on shared lines : %10d\n", stats->ld_l2hit);
+ fprintf(out, " LLC hits on shared lines : %10d\n", stats->ld_llchit + stats->lcl_hitm);
+ fprintf(out, " Locked Access on shared lines : %10d\n", stats->locks);
+ fprintf(out, " Store HITs on shared lines : %10d\n", stats->store);
+ fprintf(out, " Store L1D hits on shared lines : %10d\n", stats->st_l1hit);
+ fprintf(out, " Total Merged records : %10d\n", hitm_cnt + stats->store);
+}
+
static void print_cacheline(struct c2c_hists *c2c_hists,
struct hist_entry *he_cl,
struct perf_hpp_list *hpp_list,
@@ -1842,6 +1899,8 @@ static void perf_c2c__hists_fprintf(FILE *out)
setup_pager();
print_c2c__display_stats(out);
+ fprintf(out, "\n");
+ print_shared_cacheline_info(out);
if (c2c.stats_only)
return;
@@ -2118,7 +2177,8 @@ static int perf_c2c__report(int argc, const char **argv)
ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting...");
hists__collapse_resort(&c2c.hists.hists, NULL);
- hists__output_resort_cb(&c2c.hists.hists, &prog, resort_cl_cb);
+ hists__output_resort_cb(&c2c.hists.hists, &prog, resort_hitm_cb);
+ hists__iterate_cb(&c2c.hists.hists, resort_cl_cb);
ui_progress__finish();
--
2.7.4
* [PATCH 50/61] perf c2c report: Add c2c related stats stdio output
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (48 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 49/61] perf c2c report: Add shared cachelines " Jiri Olsa
@ 2016-09-19 13:09 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 51/61] perf c2c report: Allow to report callchains Jiri Olsa
` (10 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Display c2c related configuration options/setup.
So far this covers only the monitored events:
$ perf c2c report --stats
...
=================================================
c2c details
=================================================
Events : cpu/mem-loads,ldlat=50/pp
: cpu/mem-stores/pp
Link: http://lkml.kernel.org/n/tip-ypz84f3a9fumyttrxurm458z@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e463da572207..f4bdef5004c9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -13,6 +13,8 @@
#include "tool.h"
#include "data.h"
#include "sort.h"
+#include "evlist.h"
+#include "evsel.h"
#include <asm/bug.h>
#include "ui/browsers/hists.h"
@@ -1894,13 +1896,32 @@ static void print_pareto(FILE *out)
}
}
-static void perf_c2c__hists_fprintf(FILE *out)
+static void print_c2c_info(FILE *out, struct perf_session *session)
+{
+ struct perf_evlist *evlist = session->evlist;
+ struct perf_evsel *evsel;
+ bool first = true;
+
+ fprintf(out, "=================================================\n");
+ fprintf(out, " c2c details \n");
+ fprintf(out, "=================================================\n");
+
+ evlist__for_each_entry(evlist, evsel) {
+ fprintf(out, "%-36s: %s\n", first ? " Events" : "",
+ perf_evsel__name(evsel));
+ first = false;
+ }
+}
+
+static void perf_c2c__hists_fprintf(FILE *out, struct perf_session *session)
{
setup_pager();
print_c2c__display_stats(out);
fprintf(out, "\n");
print_shared_cacheline_info(out);
+ fprintf(out, "\n");
+ print_c2c_info(out, session);
if (c2c.stats_only)
return;
@@ -2183,7 +2204,7 @@ static int perf_c2c__report(int argc, const char **argv)
ui_progress__finish();
if (c2c.use_stdio)
- perf_c2c__hists_fprintf(stdout);
+ perf_c2c__hists_fprintf(stdout, session);
else
perf_c2c__hists_browse(&c2c.hists.hists);
--
2.7.4
* [PATCH 51/61] perf c2c report: Allow to report callchains
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (49 preceding siblings ...)
2016-09-19 13:09 ` [PATCH 50/61] perf c2c report: Add c2c related " Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 52/61] perf c2c report: Limit the cachelines table entries Jiri Olsa
` (9 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Add the --call-graph option to properly set up the callchain
code. Add default settings so that callchains are displayed
whenever they are stored in perf.data.
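The default mode is derived from the combined sample type of the recorded events, in a fixed priority order: DWARF, then LBR, then frame pointers. A minimal sketch of that selection, assuming local stand-in bit values (the real PERF_SAMPLE_* flags come from <linux/perf_event.h>; the `SAMPLE_*` and `MODE_*` names and `pick_callchain_mode` are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits for illustration only, not the kernel's values. */
enum {
	SAMPLE_CALLCHAIN    = 1 << 0,
	SAMPLE_BRANCH_STACK = 1 << 1,
	SAMPLE_REGS_USER    = 1 << 2,
	SAMPLE_STACK_USER   = 1 << 3,
};

enum chain_mode { MODE_NONE, MODE_FP, MODE_DWARF, MODE_LBR };

/* Pick the unwind mode the same way setup_callchain() orders its checks. */
enum chain_mode pick_callchain_mode(uint64_t sample_type)
{
	/* DWARF unwinding needs both user registers and a user stack dump. */
	if ((sample_type & SAMPLE_REGS_USER) &&
	    (sample_type & SAMPLE_STACK_USER))
		return MODE_DWARF;
	if (sample_type & SAMPLE_BRANCH_STACK)
		return MODE_LBR;
	if (sample_type & SAMPLE_CALLCHAIN)
		return MODE_FP;
	return MODE_NONE;
}
```

Because the checks are ordered, data recorded with both a branch stack and a frame-pointer callchain resolves to LBR, and user registers without a stack dump do not select DWARF.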
Link: http://lkml.kernel.org/n/tip-inykbom2f19difvsu1e18avr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index f4bdef5004c9..913a6b9b4d45 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -17,6 +17,7 @@
#include "evsel.h"
#include <asm/bug.h>
#include "ui/browsers/hists.h"
+#include "evlist.h"
struct c2c_hists {
struct hists hists;
@@ -181,6 +182,11 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return -1;
}
+ ret = sample__resolve_callchain(sample, &callchain_cursor, NULL,
+ evsel, &al, sysctl_perf_event_max_stack);
+ if (ret)
+ goto out;
+
mi = sample__resolve_mem(sample, &al);
if (mi == NULL)
return -ENOMEM;
@@ -2102,6 +2108,58 @@ static void ui_quirks(bool stdio)
}
}
+#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
+
+const char callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
+ CALLCHAIN_REPORT_HELP
+ "\n\t\t\t\tDefault: " CALLCHAIN_DEFAULT_OPT;
+
+static int
+parse_callchain_opt(const struct option *opt, const char *arg, int unset)
+{
+ struct callchain_param *callchain = opt->value;
+
+ callchain->enabled = !unset;
+ /*
+ * --no-call-graph
+ */
+ if (unset) {
+ symbol_conf.use_callchain = false;
+ callchain->mode = CHAIN_NONE;
+ return 0;
+ }
+
+ return parse_callchain_report_opt(arg);
+}
+
+static int setup_callchain(struct perf_evlist *evlist)
+{
+ u64 sample_type = perf_evlist__combined_sample_type(evlist);
+ enum perf_call_graph_mode mode = CALLCHAIN_NONE;
+
+ if ((sample_type & PERF_SAMPLE_REGS_USER) &&
+ (sample_type & PERF_SAMPLE_STACK_USER))
+ mode = CALLCHAIN_DWARF;
+ else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+ mode = CALLCHAIN_LBR;
+ else if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ mode = CALLCHAIN_FP;
+
+ if (!callchain_param.enabled &&
+ callchain_param.mode != CHAIN_NONE &&
+ mode != CALLCHAIN_NONE) {
+ symbol_conf.use_callchain = true;
+ if (callchain_register_param(&callchain_param) < 0) {
+ ui__error("Can't register callchain params.\n");
+ return -EINVAL;
+ }
+ }
+
+ callchain_param.record_mode = mode;
+ callchain_param.min_percent = 0;
+ return 0;
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -2109,6 +2167,7 @@ static int perf_c2c__report(int argc, const char **argv)
struct perf_data_file file = {
.mode = PERF_DATA_MODE_READ,
};
+ char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
const struct option c2c_options[] = {
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
@@ -2122,6 +2181,10 @@ static int perf_c2c__report(int argc, const char **argv)
"Use the stdio interface"),
OPT_BOOLEAN(0, "stats", &c2c.stats_only,
"Use the stdio interface"),
+ OPT_CALLBACK_DEFAULT('g', "call-graph", &callchain_param,
+ "print_type,threshold[,print_limit],order,sort_key[,branch],value",
+ callchain_help, &parse_callchain_opt,
+ callchain_default_opt),
OPT_END()
};
int err = 0;
@@ -2166,6 +2229,10 @@ static int perf_c2c__report(int argc, const char **argv)
goto out;
}
+ err = setup_callchain(session->evlist);
+ if (err)
+ goto out_session;
+
if (symbol__init(&session->header.env) < 0)
goto out_session;
--
2.7.4
* [PATCH 52/61] perf c2c report: Limit the cachelines table entries
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (50 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 51/61] perf c2c report: Allow to report callchains Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 53/61] perf c2c report: Add support to choose local HITMs Jiri Olsa
` (8 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Limit the number of entries in the cachelines table.
By default a cacheline is displayed only if it holds at
least 0.0005 (0.05%) of all remote HITMs.
Also display only cachelines with remote HITM or store data.
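The limit is a ratio test against the total remote HITM count. A small sketch of that check, assuming a hypothetical helper name (`line_is_displayed`) and plain integer counters in place of the hist entry stats:

```c
#include <assert.h>
#include <stdbool.h>

/* Same threshold as the patch: a fraction, i.e. 0.05% of all HITMs. */
#define DISPLAY_LINE_LIMIT 0.0005

/*
 * Return true when a cacheline holds enough of the total HITMs
 * to be shown. With no HITMs at all, nothing is displayed.
 */
bool line_is_displayed(int line_hitm, int total_hitm)
{
	assert(line_hitm >= 0 && total_hitm >= 0);
	if (!total_hitm)
		return false;
	return (double)line_hitm / total_hitm >= DISPLAY_LINE_LIMIT;
}
```

With a total of 838 remote HITMs, a line with a single remote HITM already passes (1/838 is about 0.0012), so the limit only starts to cut lines once the total exceeds 2000.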
Link: http://lkml.kernel.org/n/tip-inykbom2f19difvsu1e18avr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 36 ++++++++++++++++++++++++++++++++++--
tools/perf/util/hist.c | 1 +
tools/perf/util/hist.h | 1 +
3 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 913a6b9b4d45..571be80c6d18 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1643,11 +1643,42 @@ static int c2c_hists__reinit(struct c2c_hists *c2c_hists,
return hpp_list__parse(&c2c_hists->list, output, sort);
}
-static int filter_cb(struct hist_entry *he __maybe_unused)
+#define DISPLAY_LINE_LIMIT 0.0005
+
+static bool he__display(struct hist_entry *he, struct c2c_stats *stats)
+{
+ struct c2c_hist_entry *c2c_he;
+ double ld_dist;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ if (stats->rmt_hitm) {
+ ld_dist = ((double)c2c_he->stats.rmt_hitm / stats->rmt_hitm);
+ if (ld_dist < DISPLAY_LINE_LIMIT)
+ he->filtered = HIST_FILTER__C2C;
+ } else {
+ he->filtered = HIST_FILTER__C2C;
+ }
+
+ return he->filtered == 0;
+}
+
+static inline int valid_hitm_or_store(struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+ return c2c_he->stats.rmt_hitm || c2c_he->stats.store;
+}
+
+static int filter_cb(struct hist_entry *he)
{
if (c2c.show_src && !he->srcline)
he->srcline = hist_entry__get_srcline(he);
+ if (!valid_hitm_or_store(he))
+ he->filtered = HIST_FILTER__C2C;
+
return 0;
}
@@ -1655,11 +1686,12 @@ static int resort_cl_cb(struct hist_entry *he)
{
struct c2c_hist_entry *c2c_he;
struct c2c_hists *c2c_hists;
+ bool display = he__display(he, &c2c.hitm_stats);
c2c_he = container_of(he, struct c2c_hist_entry, he);
c2c_hists = c2c_he->hists;
- if (c2c_hists) {
+ if (display && c2c_hists) {
c2c_hists__reinit(c2c_hists,
"percent_rmt_hitm,"
"percent_lcl_hitm,"
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 37a08f20730a..020efa9d3d74 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1193,6 +1193,7 @@ static void hist_entry__check_and_remove_filter(struct hist_entry *he,
case HIST_FILTER__GUEST:
case HIST_FILTER__HOST:
case HIST_FILTER__SOCKET:
+ case HIST_FILTER__C2C:
default:
return;
}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 0e3493e33175..ff6298693227 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -22,6 +22,7 @@ enum hist_filter {
HIST_FILTER__GUEST,
HIST_FILTER__HOST,
HIST_FILTER__SOCKET,
+ HIST_FILTER__C2C,
};
enum hist_column {
--
2.7.4
* [PATCH 53/61] perf c2c report: Add support to choose local HITMs
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (51 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 52/61] perf c2c report: Limit the cachelines table entries Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 54/61] perf c2c report: Allow to set cacheline sort fields Jiri Olsa
` (7 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Currently we sort and limit the displayed data based on
the remote HITMs count. Add support for switching to
local HITMs via the --display option:
--display ... lcl,rmt
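The option takes one of two strings and falls back to remote HITMs when it is not given. A standalone sketch of that parsing, assuming a hypothetical function name (`parse_display`) and using the standard ternary instead of the GNU `?:` shorthand the patch uses:

```c
#include <assert.h>
#include <string.h>

enum { DISPLAY_LCL, DISPLAY_RMT };

/*
 * Map the --display argument to a display mode.
 * A NULL argument (option not given) selects remote HITMs.
 * Returns 0 on success, -1 for an unknown string.
 */
int parse_display(const char *str, int *display)
{
	const char *s = str ? str : "rmt";

	assert(display);
	if (!strcmp(s, "rmt"))
		*display = DISPLAY_RMT;
	else if (!strcmp(s, "lcl"))
		*display = DISPLAY_LCL;
	else
		return -1;

	return 0;
}
```

On the command line this corresponds to `perf c2c report --display lcl` or `perf c2c report -d rmt`.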
Link: http://lkml.kernel.org/n/tip-inykbom2f19difvsu1e18avr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 117 ++++++++++++++++++++++++++++++++++++++---------
1 file changed, 96 insertions(+), 21 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 571be80c6d18..3541c94fff02 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -63,6 +63,13 @@ struct perf_c2c {
/* HITM shared clines stats */
struct c2c_stats hitm_stats;
int shared_clines;
+
+ int display;
+};
+
+enum {
+ DISPLAY_LCL,
+ DISPLAY_RMT,
};
static struct perf_c2c c2c;
@@ -680,15 +687,24 @@ static double percent_hitm(struct c2c_hist_entry *c2c_he)
struct c2c_hists *hists;
struct c2c_stats *stats;
struct c2c_stats *total;
- int tot, st;
+ int tot = 0, st = 0;
double p;
hists = container_of(c2c_he->he.hists, struct c2c_hists, hists);
stats = &c2c_he->stats;
total = &hists->stats;
- st = stats->rmt_hitm;
- tot = total->rmt_hitm;
+ switch (c2c.display) {
+ case DISPLAY_RMT:
+ st = stats->rmt_hitm;
+ tot = total->rmt_hitm;
+ break;
+ case DISPLAY_LCL:
+ st = stats->lcl_hitm;
+ tot = total->lcl_hitm;
+ default:
+ break;
+ }
p = tot ? (double) st / tot : 0;
@@ -971,14 +987,26 @@ node_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
ret = scnprintf(hpp->buf, hpp->size, "%2d{%2d ", node, num);
advance_hpp(hpp, ret);
+ #define DISPLAY_HITM(__h) \
+ if (c2c_he->stats.__h> 0) { \
+ ret = scnprintf(hpp->buf, hpp->size, "%5.1f%% ", \
+ percent(stats->__h, c2c_he->stats.__h));\
+ } else { \
+ ret = scnprintf(hpp->buf, hpp->size, "%6s ", "n/a"); \
+ }
- if (c2c_he->stats.rmt_hitm > 0) {
- ret = scnprintf(hpp->buf, hpp->size, "%5.1f%% ",
- percent(stats->rmt_hitm, c2c_he->stats.rmt_hitm));
- } else {
- ret = scnprintf(hpp->buf, hpp->size, "%6s ", "n/a");
+ switch (c2c.display) {
+ case DISPLAY_RMT:
+ DISPLAY_HITM(rmt_hitm);
+ break;
+ case DISPLAY_LCL:
+ DISPLAY_HITM(lcl_hitm);
+ default:
+ break;
}
+ #undef DISPLAY_HITM
+
advance_hpp(hpp, ret);
if (c2c_he->stats.store > 0) {
@@ -1254,8 +1282,12 @@ static struct c2c_dimension dim_tot_loads = {
.width = 7,
};
+static struct c2c_header percent_hitm_header[] = {
+ [DISPLAY_LCL] = HEADER_BOTH("Lcl", "Hitm"),
+ [DISPLAY_RMT] = HEADER_BOTH("Rmt", "Hitm"),
+};
+
static struct c2c_dimension dim_percent_hitm = {
- .header = HEADER_LOW("%hitm"),
.name = "percent_hitm",
.cmp = percent_hitm_cmp,
.entry = percent_hitm_entry,
@@ -1652,23 +1684,39 @@ static bool he__display(struct hist_entry *he, struct c2c_stats *stats)
c2c_he = container_of(he, struct c2c_hist_entry, he);
- if (stats->rmt_hitm) {
- ld_dist = ((double)c2c_he->stats.rmt_hitm / stats->rmt_hitm);
- if (ld_dist < DISPLAY_LINE_LIMIT)
- he->filtered = HIST_FILTER__C2C;
- } else {
- he->filtered = HIST_FILTER__C2C;
+#define FILTER_HITM(__h) \
+ if (stats->__h) { \
+ ld_dist = ((double)c2c_he->stats.__h / stats->__h); \
+ if (ld_dist < DISPLAY_LINE_LIMIT) \
+ he->filtered = HIST_FILTER__C2C; \
+ } else { \
+ he->filtered = HIST_FILTER__C2C; \
}
+ switch (c2c.display) {
+ case DISPLAY_LCL:
+ FILTER_HITM(lcl_hitm);
+ break;
+ case DISPLAY_RMT:
+ FILTER_HITM(rmt_hitm);
+ default:
+ break;
+ };
+
+#undef FILTER_HITM
+
return he->filtered == 0;
}
static inline int valid_hitm_or_store(struct hist_entry *he)
{
struct c2c_hist_entry *c2c_he;
+ bool has_hitm;
c2c_he = container_of(he, struct c2c_hist_entry, he);
- return c2c_he->stats.rmt_hitm || c2c_he->stats.store;
+ has_hitm = c2c.display == DISPLAY_LCL ?
+ c2c_he->stats.lcl_hitm : c2c_he->stats.rmt_hitm;
+ return has_hitm || c2c_he->stats.store;
}
static int filter_cb(struct hist_entry *he)
@@ -1949,6 +1997,8 @@ static void print_c2c_info(FILE *out, struct perf_session *session)
perf_evsel__name(evsel));
first = false;
}
+ fprintf(out, " Cachelines sort on : %s HITMs\n",
+ c2c.display == DISPLAY_LCL ? "Local" : "Remote");
}
static void perf_c2c__hists_fprintf(FILE *out, struct perf_session *session)
@@ -2080,8 +2130,10 @@ static int perf_c2c_browser__title(struct hist_browser *browser,
char *bf, size_t size)
{
scnprintf(bf, size,
- "Shared Data Cache Line Table "
- "(%lu entries)", browser->nr_non_filtered_entries);
+ "Shared Data Cache Line Table "
+ "(%lu entries, sorted on %s HITMs)",
+ browser->nr_non_filtered_entries,
+ c2c.display == DISPLAY_LCL ? "local" : "remote");
return 0;
}
@@ -2138,6 +2190,8 @@ static void ui_quirks(bool stdio)
dim_offset.width = 5;
dim_offset.header = header_offset_tui;
}
+
+ dim_percent_hitm.header = percent_hitm_header[c2c.display];
}
#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
@@ -2192,6 +2246,22 @@ static int setup_callchain(struct perf_evlist *evlist)
return 0;
}
+static int setup_display(const char *str)
+{
+ const char *display = str ?: "rmt";
+
+ if (!strcmp(display, "rmt"))
+ c2c.display = DISPLAY_RMT;
+ else if (!strcmp(display, "lcl"))
+ c2c.display = DISPLAY_LCL;
+ else {
+ pr_err("failed: unknown display type: %s\n", str);
+ return -1;
+ }
+
+ return 0;
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -2200,6 +2270,7 @@ static int perf_c2c__report(int argc, const char **argv)
.mode = PERF_DATA_MODE_READ,
};
char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
+ const char *display = NULL;
const struct option c2c_options[] = {
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
@@ -2217,6 +2288,7 @@ static int perf_c2c__report(int argc, const char **argv)
"print_type,threshold[,print_limit],order,sort_key[,branch],value",
callchain_help, &parse_callchain_opt,
callchain_default_opt),
+ OPT_STRING('d', "display", &display, NULL, "lcl,rmt"),
OPT_END()
};
int err = 0;
@@ -2234,8 +2306,6 @@ static int perf_c2c__report(int argc, const char **argv)
else
use_browser = 1;
- ui_quirks(c2c.use_stdio);
-
setup_browser(false);
if (!input_name || !strlen(input_name))
@@ -2243,6 +2313,11 @@ static int perf_c2c__report(int argc, const char **argv)
file.path = input_name;
+ err = setup_display(display);
+ if (err)
+ goto out;
+
+ ui_quirks(c2c.use_stdio);
err = c2c_hists__init(&c2c.hists, "dcacheline", 2);
if (err) {
@@ -2291,7 +2366,7 @@ static int perf_c2c__report(int argc, const char **argv)
"tot_loads,"
"ld_fbhit,ld_l1hit,ld_l2hit,"
"ld_lclhit,ld_rmthit",
- "rmt_hitm"
+ c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
);
ui_progress__init(&prog, c2c.hists.hists.nr_entries, "Sorting...");
--
2.7.4
* [PATCH 54/61] perf c2c report: Allow to set cacheline sort fields
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (52 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 53/61] perf c2c report: Add support to choose local HITMs Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 55/61] perf c2c report: Recalc width of global sort entries Jiri Olsa
` (6 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Allow the user to configure how the data of a single
cacheline are sorted after being sorted by offset.
Add the 'c' option to specify the sort fields for a single cacheline:
-c, --coalesce <coalesce fields>
coalesce fields: pid,tid,iaddr,dso
The following fields can be combined:
pid - process pid
tid - process tid
iaddr - code address
dso - shared object
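The field list is a comma-separated string that gets split into tokens. A self-contained sketch of such a parser, assuming hypothetical names (`COALESCE_*`, `parse_coalesce`) and assuming that an unknown field is rejected, which is not visible in the part of the patch quoted here:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bit flags, one per field the commit message lists. */
enum {
	COALESCE_PID   = 1 << 0,
	COALESCE_TID   = 1 << 1,
	COALESCE_IADDR = 1 << 2,
	COALESCE_DSO   = 1 << 3,
};

/*
 * Parse "pid,tid,iaddr,dso"-style input into a bit mask.
 * Returns 0 on success, -1 on allocation failure or unknown field.
 */
int parse_coalesce(const char *fields, unsigned int *mask)
{
	char *buf, *tok, *tmp;
	int ret = 0;

	assert(fields && mask);
	buf = strdup(fields);	/* strtok_r() modifies its input */
	if (!buf)
		return -1;

	*mask = 0;
	for (tok = strtok_r(buf, ",", &tmp); tok;
	     tok = strtok_r(NULL, ",", &tmp)) {
		if (!strcmp(tok, "pid"))
			*mask |= COALESCE_PID;
		else if (!strcmp(tok, "tid"))
			*mask |= COALESCE_TID;
		else if (!strcmp(tok, "iaddr"))
			*mask |= COALESCE_IADDR;
		else if (!strcmp(tok, "dso"))
			*mask |= COALESCE_DSO;
		else {
			ret = -1;	/* assumption: reject unknown fields */
			break;
		}
	}

	free(buf);
	return ret;
}
```

The default from the patch, "pid,tid,iaddr", would set the first three flags; a usage such as `perf c2c report -c tid,iaddr` would set two.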
Link: http://lkml.kernel.org/n/tip-aka8z31umxoq2gqr5mjd81zr@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 119 ++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 102 insertions(+), 17 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3541c94fff02..ff8a66ee7092 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -46,6 +46,8 @@ struct c2c_hist_entry {
struct hist_entry he;
};
+static char const *coalesce_default = "pid,tid,iaddr";
+
struct perf_c2c {
struct perf_tool tool;
struct c2c_hists hists;
@@ -65,6 +67,11 @@ struct perf_c2c {
int shared_clines;
int display;
+
+ const char *coalesce;
+ char *cl_sort;
+ char *cl_resort;
+ char *cl_output;
};
enum {
@@ -237,7 +244,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (!mi_dup)
goto free_mi;
- c2c_hists = he__get_c2c_hists(he, "offset", 2);
+ c2c_hists = he__get_c2c_hists(he, c2c.cl_sort, 2);
if (!c2c_hists)
goto free_mi_dup;
@@ -1740,22 +1747,7 @@ static int resort_cl_cb(struct hist_entry *he)
c2c_hists = c2c_he->hists;
if (display && c2c_hists) {
- c2c_hists__reinit(c2c_hists,
- "percent_rmt_hitm,"
- "percent_lcl_hitm,"
- "percent_stores_l1hit,"
- "percent_stores_l1miss,"
- "offset,"
- "pid,"
- "tid,"
- "mean_rmt,"
- "mean_lcl,"
- "mean_load,"
- "cpucnt,"
- "symbol,"
- "dso,"
- "node",
- "offset,rmt_hitm,lcl_hitm");
+ c2c_hists__reinit(c2c_hists, c2c.cl_output, c2c.cl_resort);
hists__collapse_resort(&c2c_hists->hists, NULL);
hists__output_resort_cb(&c2c_hists->hists, NULL, filter_cb);
@@ -1999,6 +1991,7 @@ static void print_c2c_info(FILE *out, struct perf_session *session)
}
fprintf(out, " Cachelines sort on : %s HITMs\n",
c2c.display == DISPLAY_LCL ? "Local" : "Remote");
+ fprintf(out, " Cacheline data grouping : %s\n", c2c.cl_sort);
}
static void perf_c2c__hists_fprintf(FILE *out, struct perf_session *session)
@@ -2262,6 +2255,89 @@ static int setup_display(const char *str)
return 0;
}
+#define for_each_token(__tok, __buf, __sep, __tmp) \
+ for (__tok = strtok_r(__buf, __sep, &__tmp); __tok; \
+ __tok = strtok_r(NULL, __sep, &__tmp))
+
+static int build_cl_output(char *cl_sort)
+{
+ char *tok, *tmp, *buf = strdup(cl_sort);
+ bool add_pid = false;
+ bool add_tid = false;
+ bool add_iaddr = false;
+ bool add_sym = false;
+ bool add_dso = false;
+ bool add_src = false;
+
+ if (!buf)
+ return -ENOMEM;
+
+ for_each_token(tok, buf, ",", tmp) {
+ if (!strcmp(tok, "tid")) {
+ add_tid = true;
+ } else if (!strcmp(tok, "pid")) {
+ add_pid = true;
+ } else if (!strcmp(tok, "iaddr")) {
+ add_iaddr = true;
+ add_sym = true;
+ add_dso = true;
+ add_src = true;
+ } else if (!strcmp(tok, "dso")) {
+ add_dso = true;
+ } else if (strcmp(tok, "offset")) {
+ pr_err("unrecognized sort token: %s\n", tok);
+ return -EINVAL;
+ }
+ }
+
+ if (asprintf(&c2c.cl_output,
+ "%s%s%s%s%s%s%s%s%s",
+ "percent_rmt_hitm,"
+ "percent_lcl_hitm,"
+ "percent_stores_l1hit,"
+ "percent_stores_l1miss,"
+ "offset,",
+ add_pid ? "pid," : "",
+ add_tid ? "tid," : "",
+ add_iaddr ? "iaddr," : "",
+ "mean_rmt,"
+ "mean_lcl,"
+ "mean_load,"
+ "cpucnt,",
+ add_sym ? "symbol," : "",
+ add_dso ? "dso," : "",
+ add_src ? "cl_srcline," : "",
+ "node") < 0)
+ return -ENOMEM;
+
+ c2c.show_src = add_src;
+
+ free(buf);
+ return 0;
+}
+
+static int setup_coalesce(const char *coalesce)
+{
+ const char *c = coalesce ?: coalesce_default;
+
+ if (asprintf(&c2c.cl_sort, "offset,%s", c) < 0)
+ return -ENOMEM;
+
+ if (build_cl_output(c2c.cl_sort))
+ return -1;
+
+ if (asprintf(&c2c.cl_resort, "offset,%s",
+ c2c.display == DISPLAY_RMT ?
+ "rmt_hitm,lcl_hitm" :
+ "lcl_hitm,rmt_hitm") < 0)
+ return -ENOMEM;
+
+ pr_debug("coalesce sort fields: %s\n", c2c.cl_sort);
+ pr_debug("coalesce resort fields: %s\n", c2c.cl_resort);
+ pr_debug("coalesce output fields: %s\n", c2c.cl_output);
+ return 0;
+}
+
static int perf_c2c__report(int argc, const char **argv)
{
struct perf_session *session;
@@ -2271,6 +2347,7 @@ static int perf_c2c__report(int argc, const char **argv)
};
char callchain_default_opt[] = CALLCHAIN_DEFAULT_OPT;
const char *display = NULL;
+ const char *coalesce = NULL;
const struct option c2c_options[] = {
OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
"file", "vmlinux pathname"),
@@ -2289,6 +2366,8 @@ static int perf_c2c__report(int argc, const char **argv)
callchain_help, &parse_callchain_opt,
callchain_default_opt),
OPT_STRING('d', "display", &display, NULL, "lcl,rmt"),
+ OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
+ "coalesce fields: pid,tid,iaddr,dso"),
OPT_END()
};
int err = 0;
@@ -2317,6 +2396,12 @@ static int perf_c2c__report(int argc, const char **argv)
if (err)
goto out;
+ err = setup_coalesce(coalesce);
+ if (err) {
+ pr_debug("Failed to initialize hists\n");
+ goto out;
+ }
+
ui_quirks(c2c.use_stdio);
err = c2c_hists__init(&c2c.hists, "dcacheline", 2);
--
2.7.4
* [PATCH 55/61] perf c2c report: Recalc width of global sort entries
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (53 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 54/61] perf c2c report: Allow to set cacheline sort fields Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 56/61] perf c2c report: Add cacheline index entry Jiri Olsa
` (5 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Use resort callbacks to compute the column widths.
Only the global sort entries are computed; c2c entries
have a fixed width.
Link: http://lkml.kernel.org/n/tip-zyayvq2u3dzyf3y7i9jza0lw@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index ff8a66ee7092..c93a766190b1 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1726,11 +1726,21 @@ static inline int valid_hitm_or_store(struct hist_entry *he)
return has_hitm || c2c_he->stats.store;
}
+static void calc_width(struct hist_entry *he)
+{
+ struct c2c_hists *c2c_hists;
+
+ c2c_hists = container_of(he->hists, struct c2c_hists, hists);
+ hists__calc_col_len(&c2c_hists->hists, he);
+}
+
static int filter_cb(struct hist_entry *he)
{
if (c2c.show_src && !he->srcline)
he->srcline = hist_entry__get_srcline(he);
+ calc_width(he);
+
if (!valid_hitm_or_store(he))
he->filtered = HIST_FILTER__C2C;
@@ -1746,6 +1756,8 @@ static int resort_cl_cb(struct hist_entry *he)
c2c_he = container_of(he, struct c2c_hist_entry, he);
c2c_hists = c2c_he->hists;
+ calc_width(he);
+
if (display && c2c_hists) {
c2c_hists__reinit(c2c_hists, c2c.cl_output, c2c.cl_resort);
--
2.7.4
* [PATCH 56/61] perf c2c report: Add cacheline index entry
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (54 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 55/61] perf c2c report: Recalc width of global sort entries Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 57/61] perf c2c report: Add support to manage symbol name length Jiri Olsa
` (4 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
It's convenient to have an index for each cacheline to
help discussions about results over the phone.
Add new 'Index' and 'Num' fields in main and single
cacheline tables.
$ perf c2c report
=================================================
Shared Data Cache Line Table
=================================================
#
# Total Lcl ----- LLC Load Hitm -----
# Index Cacheline records Hitm Total Lcl Rmt ...
# ..... .................. ....... ....... ....... ....... .......
#
0 0xffff880036233b40 1 11.11% 1 1 0
1 0xffff88009ccb2900 1 11.11% 1 1 0
2 0xffff8800b5b3bc40 7 11.11% 1 1 0
...
=================================================
Shared Cache Line Distribution Pareto
=================================================
#
# ----- HITM ----- -- Store Refs -- Data address
# Num Rmt Lcl L1 Hit L1 Miss Offset Pid ...
# ..... ....... ....... ....... ....... .................. .......
#
-------------------------------------------------------------
0 0 1 0 0 0xffff880036233b40
-------------------------------------------------------------
0.00% 100.00% 0.00% 0.00% 0x30 0
-------------------------------------------------------------
1 0 1 0 0 0xffff88009ccb2900
-------------------------------------------------------------
0.00% 100.00% 0.00% 0.00% 0x28 549
...
Link: http://lkml.kernel.org/n/tip-4dhfagaz57tvrfjbg8nd2h4u@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 64 +++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 61 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index c93a766190b1..eb78a73b9230 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -36,6 +36,7 @@ struct c2c_hist_entry {
struct c2c_stats stats;
unsigned long *cpuset;
struct c2c_stats *node_stats;
+ unsigned int cacheline_idx;
struct compute_stats cstats;
@@ -1084,6 +1085,29 @@ cpucnt_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
}
+static int
+cl_idx_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct c2c_hist_entry *c2c_he;
+ int width = c2c_width(fmt, hpp, he->hists);
+ char buf[10];
+
+ c2c_he = container_of(he, struct c2c_hist_entry, he);
+
+ snprintf(buf, 10, "%u", c2c_he->cacheline_idx);
+ return snprintf(hpp->buf, hpp->size, "%*s", width, buf);
+}
+
+static int
+cl_idx_empty_entry(struct perf_hpp_fmt *fmt __maybe_unused, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ int width = c2c_width(fmt, hpp, he->hists);
+
+ return snprintf(hpp->buf, hpp->size, "%*s", width, "");
+}
+
#define HEADER_LOW(__h) \
{ \
.line[1] = { \
@@ -1429,6 +1453,30 @@ static struct c2c_dimension dim_srcline = {
.se = &sort_srcline,
};
+static struct c2c_dimension dim_dcacheline_idx = {
+ .header = HEADER_LOW("Index"),
+ .name = "cl_idx",
+ .cmp = empty_cmp,
+ .entry = cl_idx_entry,
+ .width = 5,
+};
+
+static struct c2c_dimension dim_dcacheline_num = {
+ .header = HEADER_LOW("Num"),
+ .name = "cl_num",
+ .cmp = empty_cmp,
+ .entry = cl_idx_entry,
+ .width = 5,
+};
+
+static struct c2c_dimension dim_dcacheline_num_empty = {
+ .header = HEADER_LOW("Num"),
+ .name = "cl_num_empty",
+ .cmp = empty_cmp,
+ .entry = cl_idx_empty_entry,
+ .width = 5,
+};
+
#undef HEADER_LOW
#undef HEADER_BOTH
#undef HEADER_SPAN
@@ -1473,6 +1521,9 @@ static struct c2c_dimension *dimensions[] = {
&dim_mean_load,
&dim_cpucnt,
&dim_srcline,
+ &dim_dcacheline_idx,
+ &dim_dcacheline_num,
+ &dim_dcacheline_num_empty,
NULL,
};
@@ -1759,6 +1810,10 @@ static int resort_cl_cb(struct hist_entry *he)
calc_width(he);
if (display && c2c_hists) {
+ static unsigned int idx;
+
+ c2c_he->cacheline_idx = idx++;
+
c2c_hists__reinit(c2c_hists, c2c.cl_output, c2c.cl_resort);
hists__collapse_resort(&c2c_hists->hists, NULL);
@@ -1946,10 +2001,10 @@ static void print_cacheline(struct c2c_hists *c2c_hists,
fprintf(out, "\n");
}
- fprintf(out, " ------------------------------------------------------\n");
+ fprintf(out, " -------------------------------------------------------------\n");
hist_entry__snprintf(he_cl, &hpp, hpp_list);
fprintf(out, "%s\n", bf);
- fprintf(out, " ------------------------------------------------------\n");
+ fprintf(out, " -------------------------------------------------------------\n");
hists__fprintf(&c2c_hists->hists, false, 0, 0, 0, out, true);
}
@@ -1962,6 +2017,7 @@ static void print_pareto(FILE *out)
perf_hpp_list__init(&hpp_list);
ret = hpp_list__parse(&hpp_list,
+ "cl_num,"
"cl_rmt_hitm,"
"cl_lcl_hitm,"
"cl_stores_l1hit,"
@@ -2303,7 +2359,8 @@ static int build_cl_output(char *cl_sort)
}
if (asprintf(&c2c.cl_output,
- "%s%s%s%s%s%s%s%s%s",
+ "%s%s%s%s%s%s%s%s%s%s",
+ c2c.use_stdio ? "cl_num_empty," : "",
"percent_rmt_hitm,"
"percent_lcl_hitm,"
"percent_stores_l1hit,"
@@ -2453,6 +2510,7 @@ static int perf_c2c__report(int argc, const char **argv)
}
c2c_hists__reinit(&c2c.hists,
+ "cl_idx,"
"dcacheline,"
"tot_recs,"
"percent_hitm,"
--
2.7.4
* [PATCH 57/61] perf c2c report: Add support to manage symbol name length
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (55 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 56/61] perf c2c report: Add cacheline index entry Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 58/61] perf c2c report: Iterate node display in browser Jiri Olsa
` (3 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Symbol and source line entries can get very long and inconvenient
to display. Add support to display only part of such strings, with
the possibility to switch to full length by using the --full-symbols
option or the 's' key in the TUI browser.
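The clamping idea can be sketched on its own as below. This is a minimal
illustration under stated assumptions: display_width() and format_symbol()
are hypothetical helpers, not functions from the patch; only the
SYMBOL_WIDTH value of 30 and the "full" toggle come from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SYMBOL_WIDTH 30	/* compact column width, as in the patch */

/* Width to use for a column whose widest entry is full_width chars. */
static int display_width(int full_width, bool symbol_full)
{
	if (!symbol_full && full_width > SYMBOL_WIDTH)
		return SYMBOL_WIDTH;
	return full_width;
}

/*
 * Format name into buf, left-aligned and truncated to the display width.
 * Returns the snprintf() result.
 */
static int format_symbol(char *buf, size_t size, const char *name,
			 bool symbol_full)
{
	int width = display_width((int)strlen(name), symbol_full);

	assert(buf && name);
	/* "%-*.*s": pad to width on the right, print at most width chars */
	return snprintf(buf, size, "%-*.*s", width, width, name);
}
```

With symbol_full false a long name is cut to 30 characters; with it true
the name is printed whole.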
Link: http://lkml.kernel.org/n/tip-yxf5hfteyfaoi8xrgczqtyha@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index eb78a73b9230..1adb7fb4866c 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -62,6 +62,7 @@ struct perf_c2c {
bool show_src;
bool use_stdio;
bool stats_only;
+ bool symbol_full;
/* HITM shared clines stats */
struct c2c_stats hitm_stats;
@@ -334,6 +335,21 @@ struct c2c_fmt {
struct c2c_dimension *dim;
};
+#define SYMBOL_WIDTH 30
+
+static struct c2c_dimension dim_symbol;
+static struct c2c_dimension dim_srcline;
+
+static int symbol_width(struct hists *hists, struct sort_entry *se)
+{
+ int width = hists__col_len(hists, se->se_width_idx);
+
+ if (!c2c.symbol_full)
+ width = MIN(width, SYMBOL_WIDTH);
+
+ return width;
+}
+
static int c2c_width(struct perf_hpp_fmt *fmt,
struct perf_hpp *hpp __maybe_unused,
struct hists *hists __maybe_unused)
@@ -344,6 +360,9 @@ static int c2c_width(struct perf_hpp_fmt *fmt,
c2c_fmt = container_of(fmt, struct c2c_fmt, fmt);
dim = c2c_fmt->dim;
+ if (dim == &dim_symbol || dim == &dim_srcline)
+ return symbol_width(hists, dim->se);
+
return dim->se ? hists__col_len(hists, dim->se->se_width_idx) :
c2c_fmt->dim->width;
}
@@ -1564,9 +1583,13 @@ static int c2c_se_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
struct c2c_dimension *dim = c2c_fmt->dim;
size_t len = fmt->user_len;
- if (!len)
+ if (!len) {
len = hists__col_len(he->hists, dim->se->se_width_idx);
+ if (dim == &dim_symbol || dim == &dim_srcline)
+ len = symbol_width(he->hists, dim->se);
+ }
+
return dim->se->se_snprintf(he, hpp->buf, hpp->size, len);
}
@@ -2156,6 +2179,9 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
struct hist_browser *browser;
int key = -1;
+ /* Display compact version first. */
+ c2c.symbol_full = false;
+
c2c_he = container_of(he, struct c2c_hist_entry, he);
c2c_hists = c2c_he->hists;
@@ -2175,6 +2201,9 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
key = hist_browser__run(browser, "help");
switch (key) {
+ case 's':
+ c2c.symbol_full = !c2c.symbol_full;
+ break;
case 'q':
goto out;
default:
@@ -2430,6 +2459,8 @@ static int perf_c2c__report(int argc, const char **argv)
"Use the stdio interface"),
OPT_BOOLEAN(0, "stats", &c2c.stats_only,
"Use the stdio interface"),
+ OPT_BOOLEAN(0, "full-symbols", &c2c.symbol_full,
+ "Display full length of symbols"),
OPT_CALLBACK_DEFAULT('g', "call-graph", &callchain_param,
"print_type,threshold[,print_limit],order,sort_key[,branch],value",
callchain_help, &parse_callchain_opt,
--
2.7.4
* [PATCH 58/61] perf c2c report: Iterate node display in browser
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (56 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 57/61] perf c2c report: Add support to manage symbol name length Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 59/61] perf c2c report: Add help windows Jiri Olsa
` (2 subsequent siblings)
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding TUI support to switch between Node entry versions
in real time with 'n' key.
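The key handler cycles through the node display modes with modular
arithmetic. A minimal sketch of that cycling follows; the enum names and
the mapping of values 0, 1 and 2 to particular flavors are assumptions
made for illustration, the patch itself only uses a plain integer and
"% 3".

```c
#include <assert.h>

/* Hypothetical names for the three 'Node' column flavors. */
enum node_mode {
	NODE_IDS = 0,	/* node IDs only */
	NODE_STATS,	/* node IDs with per-node stats */
	NODE_CPUS,	/* node IDs with CPU lists */
	NODE_MODE_MAX,
};

/* Advance to the next mode, wrapping back to the first after the last. */
static int next_node_mode(int mode)
{
	assert(mode >= 0 && mode < NODE_MODE_MAX);
	return (mode + 1) % NODE_MODE_MAX;
}
```

Pressing the key three times therefore returns to the mode the browser
started in.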
Link: http://lkml.kernel.org/n/tip-xqbw4h4dxig54wff7fd14lao@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 1adb7fb4866c..0902aba4cf19 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2204,6 +2204,10 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
case 's':
c2c.symbol_full = !c2c.symbol_full;
break;
+ case 'n':
+ c2c.node_info = (c2c.node_info + 1) % 3;
+ setup_nodes_header();
+ break;
case 'q':
goto out;
default:
--
2.7.4
* [PATCH 59/61] perf c2c report: Add help windows
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (57 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 58/61] perf c2c report: Iterate node display in browser Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 60/61] perf c2c: Add man page and credits Jiri Olsa
2016-09-19 13:10 ` [PATCH 61/61] perf tools: Fix width computation for srcline sort entry Jiri Olsa
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding help windows to display key/action mappings
for both browsers.
Link: http://lkml.kernel.org/n/tip-zni4apopx6a9eyxsosm1ebh1@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/builtin-c2c.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 0902aba4cf19..e1e74ed27075 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2178,6 +2178,11 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
struct c2c_cacheline_browser *cl_browser;
struct hist_browser *browser;
int key = -1;
+ const char help[] =
+ " ENTER Toggle callchains (if present) \n"
+ " n Toggle Node details info \n"
+ " s Toggle full length of symbol and source line columns \n"
+ " q Return back to cacheline list \n";
/* Display compact version first. */
c2c.symbol_full = false;
@@ -2198,7 +2203,7 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
c2c_browser__update_nr_entries(browser);
while (1) {
- key = hist_browser__run(browser, "help");
+ key = hist_browser__run(browser, "? - help");
switch (key) {
case 's':
@@ -2210,6 +2215,9 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
break;
case 'q':
goto out;
+ case '?':
+ ui_browser__help_window(&browser->b, help);
+ break;
default:
break;
}
@@ -2248,6 +2256,10 @@ static int perf_c2c__hists_browse(struct hists *hists)
{
struct hist_browser *browser;
int key = -1;
+ const char help[] =
+ " d Display cacheline details \n"
+ " ENTER Toggle callchains (if present) \n"
+ " q Quit \n";
browser = perf_c2c_browser__new(hists);
if (browser == NULL)
@@ -2260,7 +2272,7 @@ static int perf_c2c__hists_browse(struct hists *hists)
c2c_browser__update_nr_entries(browser);
while (1) {
- key = hist_browser__run(browser, "help");
+ key = hist_browser__run(browser, "? - help");
switch (key) {
case 'q':
@@ -2268,6 +2280,9 @@ static int perf_c2c__hists_browse(struct hists *hists)
case 'd':
perf_c2c__browse_cacheline(browser->he_selection);
break;
+ case '?':
+ ui_browser__help_window(&browser->b, help);
+ break;
default:
break;
}
--
2.7.4
* [PATCH 60/61] perf c2c: Add man page and credits
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (58 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 59/61] perf c2c report: Add help windows Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 13:10 ` [PATCH 61/61] perf tools: Fix width computation for srcline sort entry Jiri Olsa
60 siblings, 0 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding man page for c2c command and credits
to builtin-c2c.c file.
Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/Documentation/perf-c2c.txt | 276 ++++++++++++++++++++++++++++++++++
tools/perf/builtin-c2c.c | 11 ++
2 files changed, 287 insertions(+)
create mode 100644 tools/perf/Documentation/perf-c2c.txt
diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
new file mode 100644
index 000000000000..ba2f4de399c3
--- /dev/null
+++ b/tools/perf/Documentation/perf-c2c.txt
@@ -0,0 +1,276 @@
+perf-c2c(1)
+===========
+
+NAME
+----
+perf-c2c - Shared Data C2C/HITM Analyzer.
+
+SYNOPSIS
+--------
+[verse]
+'perf c2c record' [<options>] <command>
+'perf c2c record' [<options>] -- [<record command options>] <command>
+'perf c2c report' [<options>]
+
+DESCRIPTION
+-----------
+C2C stands for Cache To Cache.
+
+The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows
+you to track down the cacheline contentions.
+
+The tool is based on x86's load latency and precise store facility events
+provided by Intel CPUs. These events provide:
+ - memory address of the access
+ - type of the access (load and store details)
+ - latency (in cycles) of the load access
+
+The c2c tool provides means to record this data and report back access details
+for cachelines with highest contention - highest number of HITM accesses.
+
+The basic workflow with this tool follows the standard record/report phase.
+User uses the record command to record events data and report command to
+display it.
+
+
+RECORD OPTIONS
+--------------
+-e::
+--event=::
+ Select the PMU event. Use 'perf mem record -e list'
+ to list available events.
+
+-v::
+--verbose::
+ Be more verbose (show counter open errors, etc).
+
+-l::
+--ldlat::
+ Configure mem-loads latency.
+
+-k::
+--all-kernel::
+ Configure all used events to run in kernel space.
+
+-u::
+--all-user::
+ Configure all used events to run in user space.
+
+REPORT OPTIONS
+--------------
+-k::
+--vmlinux=<file>::
+ vmlinux pathname
+
+-v::
+--verbose::
+ Be more verbose (show counter open errors, etc).
+
+-i::
+--input::
+ Specify the input file to process.
+
+-N::
+--node-info::
+ Show extra node info in report (see NODE INFO section)
+
+-c::
+--coalesce::
+ Specify sorting fields for single cacheline display.
+ Following fields are available: tid,pid,iaddr,dso
+ (see COALESCE)
+
+-g::
+--call-graph::
+ Setup callchains parameters.
+ Please refer to perf-report man page for details.
+
+--stdio::
+ Force the stdio output (see STDIO OUTPUT)
+
+--stats::
+ Display only statistic tables and force stdio mode.
+
+--full-symbols::
+ Display full length of symbols.
+
+C2C RECORD
+----------
+The perf c2c record command sets up options related to HITM cacheline analysis
+and calls the standard perf record command.
+
+Following perf record options are configured by default:
+(check perf record man page for details)
+
+ -W,-d,--sample-cpu
+
+Unless specified otherwise with '-e' option, following events are monitored by
+default:
+
+ cpu/mem-loads,ldlat=30/P
+ cpu/mem-stores/P
+
+User can pass any 'perf record' option behind '--' mark, like (to enable
+callchains and system wide monitoring):
+
+ $ perf c2c record -- -g -a
+
+Please check RECORD OPTIONS section for specific c2c record options.
+
+C2C REPORT
+----------
+The perf c2c report command displays shared data analysis. It comes in two
+display modes: stdio and tui (default).
+
+The report command workflow is following:
+ - sort all the data based on the cacheline address
+ - store access details for each cacheline
+ - sort all cachelines based on user settings
+ - display data
+
+In general perf c2c report output consists of 2 basic views:
+ 1) most expensive cachelines list
+ 2) offsets details for each cacheline
+
+For each cacheline in the 1) list we display following data:
+(Both stdio and TUI modes follow the same fields output)
+
+ Index
+ - zero based index to identify the cacheline
+
+ Cacheline
+ - cacheline address (hex number)
+
+ Total records
+ - sum of all cachelines accesses
+
+ Rmt/Lcl Hitm
+ - cacheline percentage of all Remote/Local HITM accesses
+
+ LLC Load Hitm - Total, Lcl, Rmt
+ - count of Total/Local/Remote load HITMs
+
+ Store Reference - Total, L1Hit, L1Miss
+ Total - all store accesses
+ L1Hit - store accesses that hit L1
+ L1Miss - store accesses that missed L1
+
+ Load Dram
+ - count of local and remote DRAM accesses
+
+ LLC Ld Miss
+ - count of all accesses that missed LLC
+
+ Total Loads
+ - sum of all load accesses
+
+ Core Load Hit - FB, L1, L2
+ - count of load hits in FB (Fill Buffer), L1 and L2 cache
+
+ LLC Load Hit - Llc, Rmt
+ - count of LLC and Remote load hits
+
+For each offset in the 2) list we display following data:
+
+ HITM - Rmt, Lcl
+ - % of Remote/Local HITM accesses for given offset within cacheline
+
+ Store Refs - L1 Hit, L1 Miss
+ - % of store accesses that hit/missed L1 for given offset within cacheline
+
+ Data address - Offset
+ - offset address
+
+ Pid
+ - pid of the process responsible for the accesses
+
+ Tid
+ - tid of the process responsible for the accesses
+
+ Code address
+ - code address responsible for the accesses
+
+ cycles - rmt hitm, lcl hitm, load
+ - sum of cycles for given accesses - Remote/Local HITM and generic load
+
+ cpu cnt
+ - number of cpus that participated on the access
+
+ Symbol
+ - code symbol related to the 'Code address' value
+
+ Shared Object
+ - shared object name related to the 'Code address' value
+
+ Source:Line
+ - source information related to the 'Code address' value
+
+ Node
+ - nodes participating on the access (see NODE INFO section)
+
+NODE INFO
+---------
+The 'Node' field displays nodes that access the given cacheline
+offset. Its output comes in 3 flavors:
+ - node IDs separated by ','
+ - node IDs with stats for each ID, in following format:
+ Node{cpus %hitms %stores}
+ - node IDs with list of affected CPUs in following format:
+ Node{cpu list}
+
+User can switch between above flavors with -N option or
+use 'n' key to interactively switch in TUI mode.
+
+COALESCE
+--------
+User can specify how to sort offsets for cacheline.
+
+The following fields are available and govern the final
+set of output fields for the cacheline offsets output:
+
+ tid - coalesced by process TIDs
+ pid - coalesced by process PIDs
+ iaddr - coalesced by code address, following fields are displayed:
+ Code address, Code symbol, Shared Object, Source line
+ dso - coalesced by shared object
+
+By default the coalescing is setup with 'pid,tid,iaddr'.
+
+STDIO OUTPUT
+------------
+The stdio output displays data on standard output.
+
+Following tables are displayed:
+ Trace Event Information
+ - overall statistics of memory accesses
+
+ Global Shared Cache Line Event Information
+ - overall statistics on shared cachelines
+
+ Shared Data Cache Line Table
+ - list of most expensive cachelines
+
+ Shared Cache Line Distribution Pareto
+ - list of all accessed offsets for each cacheline
+
+TUI OUTPUT
+----------
+The TUI output provides interactive interface to navigate
+through cachelines list and to display offset details.
+
+For details please refer to the help window by pressing '?' key.
+
+CREDITS
+-------
+Although Don Zickus, Dick Fowles and Joe Mario worked together
+to get this implemented, we got lots of early help from Arnaldo
+Carvalho de Melo, Stephane Eranian, Jiri Olsa and Andi Kleen.
+
+C2C BLOG
+--------
+Check Joe's blog on c2c tool for detailed use case explanation:
+ https://joemario.github.io/blog/2016/09/01/c2c-blog/
+
+SEE ALSO
+--------
+linkperf:perf-record[1], linkperf:perf-mem[1]
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e1e74ed27075..61d6abb3713d 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1,3 +1,14 @@
+/*
+ * This is a rewrite of the original c2c tool introduced here:
+ * http://lwn.net/Articles/588866/
+ *
+ * The original tool was changed to fit in current perf state.
+ *
+ * Original authors:
+ * Don Zickus <dzickus@redhat.com>
+ * Dick Fowles <fowles@inreach.com>
+ * Joe Mario <jmario@redhat.com>
+ */
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
--
2.7.4
* [PATCH 61/61] perf tools: Fix width computation for srcline sort entry
2016-09-19 13:09 [PATCHv3 00/61] perf c2c: Add new tool to analyze cacheline contention on NUMA systems Jiri Olsa
` (59 preceding siblings ...)
2016-09-19 13:10 ` [PATCH 60/61] perf c2c: Add man page and credits Jiri Olsa
@ 2016-09-19 13:10 ` Jiri Olsa
2016-09-19 14:33 ` Arnaldo Carvalho de Melo
2016-09-20 21:43 ` [tip:perf/core] perf hists: " tip-bot for Jiri Olsa
60 siblings, 2 replies; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 13:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Adding header size to width computation for srcline sort entry,
because it's possible to get empty data with ':0' which set width
of 2 which is lower than width needed to display column header.
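The fix amounts to taking the larger of the data length and the header
length. A minimal standalone sketch, assuming a hypothetical helper
srcline_col_len() and assuming the header text is "Source:Line" (the
column name used in the c2c documentation):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Column width for a srcline entry: never narrower than the header,
 * so an empty entry such as ":0" cannot shrink the column below it.
 */
static size_t srcline_col_len(const char *srcline, const char *header)
{
	size_t data_len, header_len;

	assert(srcline && header);
	data_len = strlen(srcline);
	header_len = strlen(header);

	return data_len > header_len ? data_len : header_len;
}
```

For the ":0" case this yields the header length rather than 2, while
longer source locations still widen the column as before.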
Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
tools/perf/util/hist.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 020efa9d3d74..e1be4132054d 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -177,8 +177,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
- if (h->srcline)
- hists__new_col_len(hists, HISTC_SRCLINE, strlen(h->srcline));
+ if (h->srcline) {
+ len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
+ hists__new_col_len(hists, HISTC_SRCLINE, len);
+ }
if (h->srcfile)
hists__new_col_len(hists, HISTC_SRCFILE, strlen(h->srcfile));
--
2.7.4
* Re: [PATCH 61/61] perf tools: Fix width computation for srcline sort entry
2016-09-19 13:10 ` [PATCH 61/61] perf tools: Fix width computation for srcline sort entry Jiri Olsa
@ 2016-09-19 14:33 ` Arnaldo Carvalho de Melo
2016-09-19 14:49 ` Jiri Olsa
2016-09-20 21:43 ` [tip:perf/core] perf hists: " tip-bot for Jiri Olsa
1 sibling, 1 reply; 85+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-19 14:33 UTC (permalink / raw)
To: Jiri Olsa
Cc: lkml, Don Zickus, Joe Mario, Ingo Molnar, Peter Zijlstra,
Namhyung Kim, David Ahern, Andi Kleen
Em Mon, Sep 19, 2016 at 03:10:10PM +0200, Jiri Olsa escreveu:
> Adding header size to width computation for srcline sort entry,
> because it's possible to get empty data with ':0', which sets a width
> of 2, lower than the width needed to display the column header.
Thanks, cherry-picked, looking at the larger patchset.
- Arnaldo
> Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> tools/perf/util/hist.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 020efa9d3d74..e1be4132054d 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -177,8 +177,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
> hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
>
> - if (h->srcline)
> - hists__new_col_len(hists, HISTC_SRCLINE, strlen(h->srcline));
> + if (h->srcline) {
> + len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
> + hists__new_col_len(hists, HISTC_SRCLINE, len);
> + }
>
> if (h->srcfile)
> hists__new_col_len(hists, HISTC_SRCFILE, strlen(h->srcfile));
> --
> 2.7.4
* Re: [PATCH 61/61] perf tools: Fix width computation for srcline sort entry
2016-09-19 14:33 ` Arnaldo Carvalho de Melo
@ 2016-09-19 14:49 ` Jiri Olsa
2016-09-19 14:57 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 85+ messages in thread
From: Jiri Olsa @ 2016-09-19 14:49 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, lkml, Don Zickus, Joe Mario, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
On Mon, Sep 19, 2016 at 11:33:19AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Sep 19, 2016 at 03:10:10PM +0200, Jiri Olsa escreveu:
> > Adding header size to width computation for srcline sort entry,
> > because it's possible to get empty data with ':0', which sets a width
> > of 2, lower than the width needed to display the column header.
>
> Thanks, cherry-picked, looking at the larger patchset.
oops, it's on this side of the patchset because it needs
sort_srcline to be global, which is done within the patchset..
I can separate that for v4 if it's needed ;-)
thanks,
jirka
>
> - Arnaldo
>
> > Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> > tools/perf/util/hist.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index 020efa9d3d74..e1be4132054d 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -177,8 +177,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> > hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
> > hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
> >
> > - if (h->srcline)
> > - hists__new_col_len(hists, HISTC_SRCLINE, strlen(h->srcline));
> > + if (h->srcline) {
> > + len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
> > + hists__new_col_len(hists, HISTC_SRCLINE, len);
> > + }
> >
> > if (h->srcfile)
> > hists__new_col_len(hists, HISTC_SRCFILE, strlen(h->srcfile));
> > --
> > 2.7.4
* Re: [PATCH 61/61] perf tools: Fix width computation for srcline sort entry
2016-09-19 14:49 ` Jiri Olsa
@ 2016-09-19 14:57 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 85+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-19 14:57 UTC (permalink / raw)
To: Jiri Olsa
Cc: Jiri Olsa, lkml, Don Zickus, Joe Mario, Ingo Molnar,
Peter Zijlstra, Namhyung Kim, David Ahern, Andi Kleen
Em Mon, Sep 19, 2016 at 04:49:34PM +0200, Jiri Olsa escreveu:
> On Mon, Sep 19, 2016 at 11:33:19AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Sep 19, 2016 at 03:10:10PM +0200, Jiri Olsa escreveu:
> > > Adding header size to width computation for srcline sort entry,
> > > because it's possible to get empty data with ':0', which sets a width
> > > of 2, lower than the width needed to display the column header.
> >
> > Thanks, cherry-picked, looking at the larger patchset.
>
> oops, it's on this side of the patchset because it needs
> sort_srcline to be global, which is done within the patchset..
>
> I can separate that for v4 if it's needed ;-)
I noticed, I'll fix it up :-)
- Arnaldo
> thanks,
> jirka
>
> >
> > - Arnaldo
> >
> > > Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > > tools/perf/util/hist.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > > index 020efa9d3d74..e1be4132054d 100644
> > > --- a/tools/perf/util/hist.c
> > > +++ b/tools/perf/util/hist.c
> > > @@ -177,8 +177,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> > > hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
> > > hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
> > >
> > > - if (h->srcline)
> > > - hists__new_col_len(hists, HISTC_SRCLINE, strlen(h->srcline));
> > > + if (h->srcline) {
> > > + len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
> > > + hists__new_col_len(hists, HISTC_SRCLINE, len);
> > > + }
> > >
> > > if (h->srcfile)
> > > hists__new_col_len(hists, HISTC_SRCFILE, strlen(h->srcfile));
> > > --
> > > 2.7.4
* [tip:perf/core] perf hists: Fix width computation for srcline sort entry
2016-09-19 13:10 ` [PATCH 61/61] perf tools: Fix width computation for srcline sort entry Jiri Olsa
2016-09-19 14:33 ` Arnaldo Carvalho de Melo
@ 2016-09-20 21:43 ` tip-bot for Jiri Olsa
1 sibling, 0 replies; 85+ messages in thread
From: tip-bot for Jiri Olsa @ 2016-09-20 21:43 UTC (permalink / raw)
To: linux-tip-commits
Cc: dzickus, linux-kernel, hpa, jolsa, tglx, acme, jmario, dsahern,
andi, namhyung, a.p.zijlstra, mingo
Commit-ID: f666ac0dab5afaf6ebed2c361251581bfccc4003
Gitweb: http://git.kernel.org/tip/f666ac0dab5afaf6ebed2c361251581bfccc4003
Author: Jiri Olsa <jolsa@kernel.org>
AuthorDate: Mon, 19 Sep 2016 15:10:10 +0200
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 20 Sep 2016 12:28:28 -0300
perf hists: Fix width computation for srcline sort entry
Adding header size to width computation for srcline sort entry,
because it's possible to get empty data with ':0', which sets a width
of 2, lower than the width needed to display the column header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474290610-23241-62-git-send-email-jolsa@kernel.org
[ Added declaration to sort.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/hist.c | 6 ++++--
tools/perf/util/sort.h | 1 +
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 37a08f2..b02992e 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -177,8 +177,10 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
- if (h->srcline)
- hists__new_col_len(hists, HISTC_SRCLINE, strlen(h->srcline));
+ if (h->srcline) {
+ len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
+ hists__new_col_len(hists, HISTC_SRCLINE, len);
+ }
if (h->srcfile)
hists__new_col_len(hists, HISTC_SRCFILE, strlen(h->srcfile));
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 28c0524..9505483 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -40,6 +40,7 @@ extern struct sort_entry sort_dso_from;
extern struct sort_entry sort_dso_to;
extern struct sort_entry sort_sym_from;
extern struct sort_entry sort_sym_to;
+extern struct sort_entry sort_srcline;
extern enum sort_type sort__first_dimension;
extern const char default_mem_sort_order[];
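
The sort.h hunk above makes `sort_srcline` visible to hist.c through an `extern` declaration. A minimal single-file sketch of that declaration/definition split follows; `sort_entry_sketch`, `sort_srcline_sketch` and `srcline_header_width` are hypothetical names, the header text is assumed, and the thread does not show how the real object is defined in sort.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sort entry; only the header text is modeled. */
struct sort_entry_sketch {
	const char *se_header;
};

/* Header side (sort.h in the commit): a declaration only, no storage. */
extern struct sort_entry_sketch sort_srcline_sketch;

/* Defining side: exactly one definition with external linkage. */
struct sort_entry_sketch sort_srcline_sketch = {
	.se_header = "Source:Line",	/* assumed header text */
};

/* Any file that sees the declaration can read the header width. */
size_t srcline_header_width(void)
{
	assert(sort_srcline_sketch.se_header != NULL);
	return strlen(sort_srcline_sketch.se_header);
}
```

In a real tree the declaration and the definition would live in separate files; they are kept together here only so the sketch compiles on its own.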