* [RFC 1/3] perf tools: Fix weight sort key behavior
@ 2021-11-05 22:56 Namhyung Kim
2021-11-05 22:56 ` [RFC 2/3] perf tools: Fix ins_lat " Namhyung Kim
2021-11-05 22:56 ` [RFC 3/3] perf tools: Fix p_stage_cyc " Namhyung Kim
0 siblings, 2 replies; 6+ messages in thread
From: Namhyung Kim @ 2021-11-05 22:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Andi Kleen, Ian Rogers,
Stephane Eranian, Kan Liang, Athira Rajeev
Currently, the weight field in the perf sample has latency information
for some instructions like in memory access. And perf tool has weight
and local_weight sort keys to display the info.
But it's somewhat confusing what it shows exactly. In my understanding,
the local_weight shows a weight in a single sample, and (global) weight
shows a sum of the weights in the hist_entry.
For example,
$ perf mem record -t load dd if=/dev/zero of=/dev/null bs=4k count=1M
$ perf report --stdio -n -s +local_weight
...
#
# Overhead Samples Command Shared Object Symbol Local Weight
# ........ ............ ....... ................ ................................... ............
#
21.23% 313 dd [kernel.vmlinux] [k] lockref_get_not_zero 32
12.43% 183 dd [kernel.vmlinux] [k] lockref_get_not_zero 35
11.97% 159 dd [kernel.vmlinux] [k] lockref_get_not_zero 36
10.40% 141 dd [kernel.vmlinux] [k] lockref_put_return 32
7.63% 113 dd [kernel.vmlinux] [k] lockref_get_not_zero 33
6.37% 92 dd [kernel.vmlinux] [k] lockref_get_not_zero 34
6.15% 90 dd [kernel.vmlinux] [k] lockref_put_return 33
...
So let's look at the lockref_get_not_zero symbols. The top entry shows
that 313 samples were captured with local_weight 32, so the total weight
should be 313 x 32 = 10016. But it's not the case like below:
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
...
#
# Overhead Samples Command Shared Object Local Weight Weight
# ........ ............ ....... ................ ............ ............
#
1.36% 4 dd [kernel.vmlinux] 36 144
0.47% 4 dd [kernel.vmlinux] 37 148
0.42% 4 dd [kernel.vmlinux] 32 128
0.40% 4 dd [kernel.vmlinux] 34 136
0.35% 4 dd [kernel.vmlinux] 36 144
0.34% 4 dd [kernel.vmlinux] 35 140
0.30% 4 dd [kernel.vmlinux] 36 144
0.30% 4 dd [kernel.vmlinux] 34 136
0.30% 4 dd [kernel.vmlinux] 32 128
0.30% 4 dd [kernel.vmlinux] 32 128
...
With 'weight' sort key, it's divided to 4 samples even with same info
(comm,dso,sym and local_weight). I don't think this is what we want.
I found this is because of the way it aggregate the weight value.
Since it's not a period, we should not add them in the he->stat.
Otherwise, two 32 weight entries will create a 64 weight entry.
After that, new 32 weight samples don't have a matching entry so it'd
create a new entry and make it a 64 weight entry again and again.
Later, they will be merged into 128 weight entries during the
hists__collapse_resort() with 4 samples, multiple times like above.
Let's keep the weight and display differently. For local_weight, it
can show the weight as is, and for (global) weight it can display the
number multiplied by the number of samples.
With this change, I can see the expected numbers.
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero
...
#
# Overhead Samples Command Shared Object Local Weight Weight
# ........ ............ ....... ................ ............ ............
#
21.23% 313 dd [kernel.vmlinux] 32 10016
12.43% 183 dd [kernel.vmlinux] 35 6405
11.97% 159 dd [kernel.vmlinux] 36 5724
7.63% 113 dd [kernel.vmlinux] 33 3729
6.37% 92 dd [kernel.vmlinux] 34 3128
4.17% 59 dd [kernel.vmlinux] 37 2183
0.08% 1 dd [kernel.vmlinux] 269 269
0.08% 1 dd [kernel.vmlinux] 38 38
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/hist.c | 14 +++++---------
tools/perf/util/sort.c | 24 +++++++-----------------
tools/perf/util/sort.h | 2 +-
3 files changed, 13 insertions(+), 27 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 65fe65ba03c2..4e9bd7b589b1 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -290,11 +290,9 @@ static long hist_time(unsigned long htime)
}
static void he_stat__add_period(struct he_stat *he_stat, u64 period,
- u64 weight, u64 ins_lat, u64 p_stage_cyc)
+ u64 ins_lat, u64 p_stage_cyc)
{
-
he_stat->period += period;
- he_stat->weight += weight;
he_stat->nr_events += 1;
he_stat->ins_lat += ins_lat;
he_stat->p_stage_cyc += p_stage_cyc;
@@ -308,9 +306,8 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
dest->period_guest_sys += src->period_guest_sys;
dest->period_guest_us += src->period_guest_us;
dest->nr_events += src->nr_events;
- dest->weight += src->weight;
dest->ins_lat += src->ins_lat;
- dest->p_stage_cyc += src->p_stage_cyc;
+ dest->p_stage_cyc += src->p_stage_cyc;
}
static void he_stat__decay(struct he_stat *he_stat)
@@ -598,7 +595,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
struct hist_entry *he;
int64_t cmp;
u64 period = entry->stat.period;
- u64 weight = entry->stat.weight;
u64 ins_lat = entry->stat.ins_lat;
u64 p_stage_cyc = entry->stat.p_stage_cyc;
bool leftmost = true;
@@ -619,11 +615,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) {
if (sample_self) {
- he_stat__add_period(&he->stat, period, weight, ins_lat, p_stage_cyc);
+ he_stat__add_period(&he->stat, period, ins_lat, p_stage_cyc);
hist_entry__add_callchain_period(he, period);
}
if (symbol_conf.cumulate_callchain)
- he_stat__add_period(he->stat_acc, period, weight, ins_lat, p_stage_cyc);
+ he_stat__add_period(he->stat_acc, period, ins_lat, p_stage_cyc);
/*
* This mem info was allocated from sample__resolve_mem
@@ -733,7 +729,6 @@ __hists__add_entry(struct hists *hists,
.stat = {
.nr_events = 1,
.period = sample->period,
- .weight = sample->weight,
.ins_lat = sample->ins_lat,
.p_stage_cyc = sample->p_stage_cyc,
},
@@ -748,6 +743,7 @@ __hists__add_entry(struct hists *hists,
.raw_size = sample->raw_size,
.ops = ops,
.time = hist_time(sample->time),
+ .weight = sample->weight,
}, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 568a88c001c6..903f34fff27e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1325,45 +1325,35 @@ struct sort_entry sort_mispredict = {
.se_width_idx = HISTC_MISPREDICT,
};
-static u64 he_weight(struct hist_entry *he)
-{
- return he->stat.nr_events ? he->stat.weight / he->stat.nr_events : 0;
-}
-
static int64_t
-sort__local_weight_cmp(struct hist_entry *left, struct hist_entry *right)
+sort__weight_cmp(struct hist_entry *left, struct hist_entry *right)
{
- return he_weight(left) - he_weight(right);
+ return left->weight - right->weight;
}
static int hist_entry__local_weight_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
- return repsep_snprintf(bf, size, "%-*llu", width, he_weight(he));
+ return repsep_snprintf(bf, size, "%-*llu", width, he->weight);
}
struct sort_entry sort_local_weight = {
.se_header = "Local Weight",
- .se_cmp = sort__local_weight_cmp,
+ .se_cmp = sort__weight_cmp,
.se_snprintf = hist_entry__local_weight_snprintf,
.se_width_idx = HISTC_LOCAL_WEIGHT,
};
-static int64_t
-sort__global_weight_cmp(struct hist_entry *left, struct hist_entry *right)
-{
- return left->stat.weight - right->stat.weight;
-}
-
static int hist_entry__global_weight_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
- return repsep_snprintf(bf, size, "%-*llu", width, he->stat.weight);
+ return repsep_snprintf(bf, size, "%-*llu", width,
+ he->weight * he->stat.nr_events);
}
struct sort_entry sort_global_weight = {
.se_header = "Weight",
- .se_cmp = sort__global_weight_cmp,
+ .se_cmp = sort__weight_cmp,
.se_snprintf = hist_entry__global_weight_snprintf,
.se_width_idx = HISTC_GLOBAL_WEIGHT,
};
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index b67c469aba79..e18b79916f63 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -49,7 +49,6 @@ struct he_stat {
u64 period_us;
u64 period_guest_sys;
u64 period_guest_us;
- u64 weight;
u64 ins_lat;
u64 p_stage_cyc;
u32 nr_events;
@@ -109,6 +108,7 @@ struct hist_entry {
s32 socket;
s32 cpu;
u64 code_page_size;
+ u64 weight;
u8 cpumode;
u8 depth;
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [RFC 2/3] perf tools: Fix ins_lat sort key behavior
2021-11-05 22:56 [RFC 1/3] perf tools: Fix weight sort key behavior Namhyung Kim
@ 2021-11-05 22:56 ` Namhyung Kim
2021-11-05 22:56 ` [RFC 3/3] perf tools: Fix p_stage_cyc " Namhyung Kim
1 sibling, 0 replies; 6+ messages in thread
From: Namhyung Kim @ 2021-11-05 22:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Andi Kleen, Ian Rogers,
Stephane Eranian, Kan Liang, Athira Rajeev
Like weight and local_weight, the ins_lat (for instruction latency)
and local_ins_lat should be handled the same way.
But I couldn't test it actually, so only build tested.
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/hist.c | 11 ++++-------
tools/perf/util/sort.c | 24 +++++++-----------------
tools/perf/util/sort.h | 2 +-
3 files changed, 12 insertions(+), 25 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4e9bd7b589b1..54fe97dd191c 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -290,11 +290,10 @@ static long hist_time(unsigned long htime)
}
static void he_stat__add_period(struct he_stat *he_stat, u64 period,
- u64 ins_lat, u64 p_stage_cyc)
+ u64 p_stage_cyc)
{
he_stat->period += period;
he_stat->nr_events += 1;
- he_stat->ins_lat += ins_lat;
he_stat->p_stage_cyc += p_stage_cyc;
}
@@ -306,7 +305,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
dest->period_guest_sys += src->period_guest_sys;
dest->period_guest_us += src->period_guest_us;
dest->nr_events += src->nr_events;
- dest->ins_lat += src->ins_lat;
dest->p_stage_cyc += src->p_stage_cyc;
}
@@ -595,7 +593,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
struct hist_entry *he;
int64_t cmp;
u64 period = entry->stat.period;
- u64 ins_lat = entry->stat.ins_lat;
u64 p_stage_cyc = entry->stat.p_stage_cyc;
bool leftmost = true;
@@ -615,11 +612,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) {
if (sample_self) {
- he_stat__add_period(&he->stat, period, ins_lat, p_stage_cyc);
+ he_stat__add_period(&he->stat, period, p_stage_cyc);
hist_entry__add_callchain_period(he, period);
}
if (symbol_conf.cumulate_callchain)
- he_stat__add_period(he->stat_acc, period, ins_lat, p_stage_cyc);
+ he_stat__add_period(he->stat_acc, period, p_stage_cyc);
/*
* This mem info was allocated from sample__resolve_mem
@@ -729,7 +726,6 @@ __hists__add_entry(struct hists *hists,
.stat = {
.nr_events = 1,
.period = sample->period,
- .ins_lat = sample->ins_lat,
.p_stage_cyc = sample->p_stage_cyc,
},
.parent = sym_parent,
@@ -744,6 +740,7 @@ __hists__add_entry(struct hists *hists,
.ops = ops,
.time = hist_time(sample->time),
.weight = sample->weight,
+ .ins_lat = sample->ins_lat,
}, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 903f34fff27e..adc0584695d6 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1358,45 +1358,35 @@ struct sort_entry sort_global_weight = {
.se_width_idx = HISTC_GLOBAL_WEIGHT,
};
-static u64 he_ins_lat(struct hist_entry *he)
-{
- return he->stat.nr_events ? he->stat.ins_lat / he->stat.nr_events : 0;
-}
-
static int64_t
-sort__local_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right)
+sort__ins_lat_cmp(struct hist_entry *left, struct hist_entry *right)
{
- return he_ins_lat(left) - he_ins_lat(right);
+ return left->ins_lat - right->ins_lat;
}
static int hist_entry__local_ins_lat_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
- return repsep_snprintf(bf, size, "%-*u", width, he_ins_lat(he));
+ return repsep_snprintf(bf, size, "%-*u", width, he->ins_lat);
}
struct sort_entry sort_local_ins_lat = {
.se_header = "Local INSTR Latency",
- .se_cmp = sort__local_ins_lat_cmp,
+ .se_cmp = sort__ins_lat_cmp,
.se_snprintf = hist_entry__local_ins_lat_snprintf,
.se_width_idx = HISTC_LOCAL_INS_LAT,
};
-static int64_t
-sort__global_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right)
-{
- return left->stat.ins_lat - right->stat.ins_lat;
-}
-
static int hist_entry__global_ins_lat_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
- return repsep_snprintf(bf, size, "%-*u", width, he->stat.ins_lat);
+ return repsep_snprintf(bf, size, "%-*u", width,
+ he->ins_lat * he->stat.nr_events);
}
struct sort_entry sort_global_ins_lat = {
.se_header = "INSTR Latency",
- .se_cmp = sort__global_ins_lat_cmp,
+ .se_cmp = sort__ins_lat_cmp,
.se_snprintf = hist_entry__global_ins_lat_snprintf,
.se_width_idx = HISTC_GLOBAL_INS_LAT,
};
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index e18b79916f63..22ae7c6ae398 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -49,7 +49,6 @@ struct he_stat {
u64 period_us;
u64 period_guest_sys;
u64 period_guest_us;
- u64 ins_lat;
u64 p_stage_cyc;
u32 nr_events;
};
@@ -109,6 +108,7 @@ struct hist_entry {
s32 cpu;
u64 code_page_size;
u64 weight;
+ u64 ins_lat;
u8 cpumode;
u8 depth;
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [RFC 3/3] perf tools: Fix p_stage_cyc sort key behavior
2021-11-05 22:56 [RFC 1/3] perf tools: Fix weight sort key behavior Namhyung Kim
2021-11-05 22:56 ` [RFC 2/3] perf tools: Fix ins_lat " Namhyung Kim
@ 2021-11-05 22:56 ` Namhyung Kim
2021-11-16 16:29 ` Athira Rajeev
1 sibling, 1 reply; 6+ messages in thread
From: Namhyung Kim @ 2021-11-05 22:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ingo Molnar, Peter Zijlstra, LKML, Andi Kleen, Ian Rogers,
Stephane Eranian, Kan Liang, Athira Rajeev
Like weight and local_weight, the p_stage_cyc (for pipeline stage
cycles) should be handled the same way. Not sure it also needs
the local and global variants.
But I couldn't test it actually because I don't have the machine.
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/hist.c | 12 ++++--------
tools/perf/util/sort.c | 4 ++--
tools/perf/util/sort.h | 2 +-
3 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 54fe97dd191c..b776465e04ef 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -289,12 +289,10 @@ static long hist_time(unsigned long htime)
return htime;
}
-static void he_stat__add_period(struct he_stat *he_stat, u64 period,
- u64 p_stage_cyc)
+static void he_stat__add_period(struct he_stat *he_stat, u64 period)
{
he_stat->period += period;
he_stat->nr_events += 1;
- he_stat->p_stage_cyc += p_stage_cyc;
}
static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
@@ -305,7 +303,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
dest->period_guest_sys += src->period_guest_sys;
dest->period_guest_us += src->period_guest_us;
dest->nr_events += src->nr_events;
- dest->p_stage_cyc += src->p_stage_cyc;
}
static void he_stat__decay(struct he_stat *he_stat)
@@ -593,7 +590,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
struct hist_entry *he;
int64_t cmp;
u64 period = entry->stat.period;
- u64 p_stage_cyc = entry->stat.p_stage_cyc;
bool leftmost = true;
p = &hists->entries_in->rb_root.rb_node;
@@ -612,11 +608,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) {
if (sample_self) {
- he_stat__add_period(&he->stat, period, p_stage_cyc);
+ he_stat__add_period(&he->stat, period);
hist_entry__add_callchain_period(he, period);
}
if (symbol_conf.cumulate_callchain)
- he_stat__add_period(he->stat_acc, period, p_stage_cyc);
+ he_stat__add_period(he->stat_acc, period);
/*
* This mem info was allocated from sample__resolve_mem
@@ -726,7 +722,6 @@ __hists__add_entry(struct hists *hists,
.stat = {
.nr_events = 1,
.period = sample->period,
- .p_stage_cyc = sample->p_stage_cyc,
},
.parent = sym_parent,
.filtered = symbol__parent_filter(sym_parent) | al->filtered,
@@ -741,6 +736,7 @@ __hists__add_entry(struct hists *hists,
.time = hist_time(sample->time),
.weight = sample->weight,
.ins_lat = sample->ins_lat,
+ .p_stage_cyc = sample->p_stage_cyc,
}, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0)
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index adc0584695d6..a111065b484e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1394,13 +1394,13 @@ struct sort_entry sort_global_ins_lat = {
static int64_t
sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
{
- return left->stat.p_stage_cyc - right->stat.p_stage_cyc;
+ return left->p_stage_cyc - right->p_stage_cyc;
}
static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
{
- return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc);
+ return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc);
}
struct sort_entry sort_p_stage_cyc = {
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 22ae7c6ae398..7b7145501933 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -49,7 +49,6 @@ struct he_stat {
u64 period_us;
u64 period_guest_sys;
u64 period_guest_us;
- u64 p_stage_cyc;
u32 nr_events;
};
@@ -109,6 +108,7 @@ struct hist_entry {
u64 code_page_size;
u64 weight;
u64 ins_lat;
+ u64 p_stage_cyc;
u8 cpumode;
u8 depth;
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RFC 3/3] perf tools: Fix p_stage_cyc sort key behavior
2021-11-05 22:56 ` [RFC 3/3] perf tools: Fix p_stage_cyc " Namhyung Kim
@ 2021-11-16 16:29 ` Athira Rajeev
2021-11-16 17:50 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 6+ messages in thread
From: Athira Rajeev @ 2021-11-16 16:29 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra,
LKML, Andi Kleen, Ian Rogers, Stephane Eranian, Kan Liang
> On 06-Nov-2021, at 4:26 AM, Namhyung Kim <namhyung@kernel.org> wrote:
>
> Like weight and local_weight, the p_stage_cyc (for pipeline stage
> cycles) should be handled the same way. Not sure it also needs
> the local and global variants.
Hi Namhyung,
Thanks for the fixes. I could test the fix for "weight" and "ins_lat" in powerpc.
Also it makes sense to have global variant for p_stage_cyc as well. Thanks for pointing that. I will post fix to have both the variants for the p_stage_cyc.
Thanks
Athira
>
> But I couldn't test it actually because I don't have the machine.
>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/util/hist.c | 12 ++++--------
> tools/perf/util/sort.c | 4 ++--
> tools/perf/util/sort.h | 2 +-
> 3 files changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 54fe97dd191c..b776465e04ef 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -289,12 +289,10 @@ static long hist_time(unsigned long htime)
> return htime;
> }
>
> -static void he_stat__add_period(struct he_stat *he_stat, u64 period,
> - u64 p_stage_cyc)
> +static void he_stat__add_period(struct he_stat *he_stat, u64 period)
> {
> he_stat->period += period;
> he_stat->nr_events += 1;
> - he_stat->p_stage_cyc += p_stage_cyc;
> }
>
> static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> @@ -305,7 +303,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> dest->period_guest_sys += src->period_guest_sys;
> dest->period_guest_us += src->period_guest_us;
> dest->nr_events += src->nr_events;
> - dest->p_stage_cyc += src->p_stage_cyc;
> }
>
> static void he_stat__decay(struct he_stat *he_stat)
> @@ -593,7 +590,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
> struct hist_entry *he;
> int64_t cmp;
> u64 period = entry->stat.period;
> - u64 p_stage_cyc = entry->stat.p_stage_cyc;
> bool leftmost = true;
>
> p = &hists->entries_in->rb_root.rb_node;
> @@ -612,11 +608,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
>
> if (!cmp) {
> if (sample_self) {
> - he_stat__add_period(&he->stat, period, p_stage_cyc);
> + he_stat__add_period(&he->stat, period);
> hist_entry__add_callchain_period(he, period);
> }
> if (symbol_conf.cumulate_callchain)
> - he_stat__add_period(he->stat_acc, period, p_stage_cyc);
> + he_stat__add_period(he->stat_acc, period);
>
> /*
> * This mem info was allocated from sample__resolve_mem
> @@ -726,7 +722,6 @@ __hists__add_entry(struct hists *hists,
> .stat = {
> .nr_events = 1,
> .period = sample->period,
> - .p_stage_cyc = sample->p_stage_cyc,
> },
> .parent = sym_parent,
> .filtered = symbol__parent_filter(sym_parent) | al->filtered,
> @@ -741,6 +736,7 @@ __hists__add_entry(struct hists *hists,
> .time = hist_time(sample->time),
> .weight = sample->weight,
> .ins_lat = sample->ins_lat,
> + .p_stage_cyc = sample->p_stage_cyc,
> }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
>
> if (!hists->has_callchains && he && he->callchain_size != 0)
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index adc0584695d6..a111065b484e 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1394,13 +1394,13 @@ struct sort_entry sort_global_ins_lat = {
> static int64_t
> sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
> {
> - return left->stat.p_stage_cyc - right->stat.p_stage_cyc;
> + return left->p_stage_cyc - right->p_stage_cyc;
> }
>
> static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
> size_t size, unsigned int width)
> {
> - return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc);
> + return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc);
> }
>
> struct sort_entry sort_p_stage_cyc = {
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 22ae7c6ae398..7b7145501933 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -49,7 +49,6 @@ struct he_stat {
> u64 period_us;
> u64 period_guest_sys;
> u64 period_guest_us;
> - u64 p_stage_cyc;
> u32 nr_events;
> };
>
> @@ -109,6 +108,7 @@ struct hist_entry {
> u64 code_page_size;
> u64 weight;
> u64 ins_lat;
> + u64 p_stage_cyc;
> u8 cpumode;
> u8 depth;
>
> --
> 2.34.0.rc0.344.g81b53c2807-goog
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC 3/3] perf tools: Fix p_stage_cyc sort key behavior
2021-11-16 16:29 ` Athira Rajeev
@ 2021-11-16 17:50 ` Arnaldo Carvalho de Melo
[not found] ` <BECFD6E5-8137-4FBD-BF4A-D8709B67650F@linux.vnet.ibm.com>
0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-11-16 17:50 UTC (permalink / raw)
To: Athira Rajeev
Cc: Namhyung Kim, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML,
Andi Kleen, Ian Rogers, Stephane Eranian, Kan Liang
Em Tue, Nov 16, 2021 at 09:59:42PM +0530, Athira Rajeev escreveu:
>
>
> > On 06-Nov-2021, at 4:26 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Like weight and local_weight, the p_stage_cyc (for pipeline stage
> > cycles) should be handled the same way. Not sure it also needs
> > the local and global variants.
>
> Hi Namhyung,
>
> Thanks for the fixes. I could test the fix for "weight" and "ins_lat" in powerpc.
> Also it makes sense to have global variant for p_stage_cyc as well. Thanks for pointing that. I will post fix to have both the variants for the p_stage_cyc.
>
> Thanks
> Athira
So I'm going to wait for Namhyung's patch with some extra touches? Or is
thie one below good to go and you can send the other fixes in a followup
patch?
Please advise.
- Arnaldo
> >
> > But I couldn't test it actually because I don't have the machine.
> >
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> > tools/perf/util/hist.c | 12 ++++--------
> > tools/perf/util/sort.c | 4 ++--
> > tools/perf/util/sort.h | 2 +-
> > 3 files changed, 7 insertions(+), 11 deletions(-)
> >
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index 54fe97dd191c..b776465e04ef 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -289,12 +289,10 @@ static long hist_time(unsigned long htime)
> > return htime;
> > }
> >
> > -static void he_stat__add_period(struct he_stat *he_stat, u64 period,
> > - u64 p_stage_cyc)
> > +static void he_stat__add_period(struct he_stat *he_stat, u64 period)
> > {
> > he_stat->period += period;
> > he_stat->nr_events += 1;
> > - he_stat->p_stage_cyc += p_stage_cyc;
> > }
> >
> > static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> > @@ -305,7 +303,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> > dest->period_guest_sys += src->period_guest_sys;
> > dest->period_guest_us += src->period_guest_us;
> > dest->nr_events += src->nr_events;
> > - dest->p_stage_cyc += src->p_stage_cyc;
> > }
> >
> > static void he_stat__decay(struct he_stat *he_stat)
> > @@ -593,7 +590,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
> > struct hist_entry *he;
> > int64_t cmp;
> > u64 period = entry->stat.period;
> > - u64 p_stage_cyc = entry->stat.p_stage_cyc;
> > bool leftmost = true;
> >
> > p = &hists->entries_in->rb_root.rb_node;
> > @@ -612,11 +608,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
> >
> > if (!cmp) {
> > if (sample_self) {
> > - he_stat__add_period(&he->stat, period, p_stage_cyc);
> > + he_stat__add_period(&he->stat, period);
> > hist_entry__add_callchain_period(he, period);
> > }
> > if (symbol_conf.cumulate_callchain)
> > - he_stat__add_period(he->stat_acc, period, p_stage_cyc);
> > + he_stat__add_period(he->stat_acc, period);
> >
> > /*
> > * This mem info was allocated from sample__resolve_mem
> > @@ -726,7 +722,6 @@ __hists__add_entry(struct hists *hists,
> > .stat = {
> > .nr_events = 1,
> > .period = sample->period,
> > - .p_stage_cyc = sample->p_stage_cyc,
> > },
> > .parent = sym_parent,
> > .filtered = symbol__parent_filter(sym_parent) | al->filtered,
> > @@ -741,6 +736,7 @@ __hists__add_entry(struct hists *hists,
> > .time = hist_time(sample->time),
> > .weight = sample->weight,
> > .ins_lat = sample->ins_lat,
> > + .p_stage_cyc = sample->p_stage_cyc,
> > }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
> >
> > if (!hists->has_callchains && he && he->callchain_size != 0)
> > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > index adc0584695d6..a111065b484e 100644
> > --- a/tools/perf/util/sort.c
> > +++ b/tools/perf/util/sort.c
> > @@ -1394,13 +1394,13 @@ struct sort_entry sort_global_ins_lat = {
> > static int64_t
> > sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
> > {
> > - return left->stat.p_stage_cyc - right->stat.p_stage_cyc;
> > + return left->p_stage_cyc - right->p_stage_cyc;
> > }
> >
> > static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
> > size_t size, unsigned int width)
> > {
> > - return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc);
> > + return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc);
> > }
> >
> > struct sort_entry sort_p_stage_cyc = {
> > diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> > index 22ae7c6ae398..7b7145501933 100644
> > --- a/tools/perf/util/sort.h
> > +++ b/tools/perf/util/sort.h
> > @@ -49,7 +49,6 @@ struct he_stat {
> > u64 period_us;
> > u64 period_guest_sys;
> > u64 period_guest_us;
> > - u64 p_stage_cyc;
> > u32 nr_events;
> > };
> >
> > @@ -109,6 +108,7 @@ struct hist_entry {
> > u64 code_page_size;
> > u64 weight;
> > u64 ins_lat;
> > + u64 p_stage_cyc;
> > u8 cpumode;
> > u8 depth;
> >
> > --
> > 2.34.0.rc0.344.g81b53c2807-goog
> >
--
- Arnaldo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC 3/3] perf tools: Fix p_stage_cyc sort key behavior
[not found] ` <BECFD6E5-8137-4FBD-BF4A-D8709B67650F@linux.vnet.ibm.com>
@ 2021-11-17 13:13 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-11-17 13:13 UTC (permalink / raw)
To: Athira Rajeev
Cc: Namhyung Kim, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML,
Andi Kleen, Ian Rogers, Stephane Eranian, Kan Liang
Em Wed, Nov 17, 2021 at 05:33:43PM +0530, Athira Rajeev escreveu:
>
>
> > On 16-Nov-2021, at 11:20 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > Em Tue, Nov 16, 2021 at 09:59:42PM +0530, Athira Rajeev escreveu:
> >>
> >>
> >>> On 06-Nov-2021, at 4:26 AM, Namhyung Kim <namhyung@kernel.org> wrote:
> >>>
> >>> Like weight and local_weight, the p_stage_cyc (for pipeline stage
> >>> cycles) should be handled the same way. Not sure it also needs
> >>> the local and global variants.
> >>
> >> Hi Namhyung,
> >>
> >> Thanks for the fixes. I could test the fix for "weight" and "ins_lat" in powerpc.
> >> Also it makes sense to have global variant for p_stage_cyc as well. Thanks for pointing that. I will post fix to have both the variants for the p_stage_cyc.
> >>
> >> Thanks
> >> Athira
> >
> > So I'm going to wait for Namhyung's patch with some extra touches? Or is
> > thie one below good to go and you can send the other fixes in a followup
> > patch?
> >
> > Please advise.
> >
> > - Arnaldo
>
> Hi Arnaldo,
>
> Yes, this patch series looks good to me.
>
> For this patch series,
> Reviewed-and-Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Thanks, applied.
- Arnaldo
> I will send the other fixes ( for adding both variants for p_stage_cyc ) in a separate follow up patch.
>
> Thanks,
> Athira
> >
> >>>
> >>> But I couldn't test it actually because I don't have the machine.
> >>>
> >>> Cc: Kan Liang <kan.liang@linux.intel.com>
> >>> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> >>> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> >>> ---
> >>> tools/perf/util/hist.c | 12 ++++--------
> >>> tools/perf/util/sort.c | 4 ++--
> >>> tools/perf/util/sort.h | 2 +-
> >>> 3 files changed, 7 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> >>> index 54fe97dd191c..b776465e04ef 100644
> >>> --- a/tools/perf/util/hist.c
> >>> +++ b/tools/perf/util/hist.c
> >>> @@ -289,12 +289,10 @@ static long hist_time(unsigned long htime)
> >>> return htime;
> >>> }
> >>>
> >>> -static void he_stat__add_period(struct he_stat *he_stat, u64 period,
> >>> - u64 p_stage_cyc)
> >>> +static void he_stat__add_period(struct he_stat *he_stat, u64 period)
> >>> {
> >>> he_stat->period += period;
> >>> he_stat->nr_events += 1;
> >>> - he_stat->p_stage_cyc += p_stage_cyc;
> >>> }
> >>>
> >>> static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> >>> @@ -305,7 +303,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
> >>> dest->period_guest_sys += src->period_guest_sys;
> >>> dest->period_guest_us += src->period_guest_us;
> >>> dest->nr_events += src->nr_events;
> >>> - dest->p_stage_cyc += src->p_stage_cyc;
> >>> }
> >>>
> >>> static void he_stat__decay(struct he_stat *he_stat)
> >>> @@ -593,7 +590,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
> >>> struct hist_entry *he;
> >>> int64_t cmp;
> >>> u64 period = entry->stat.period;
> >>> - u64 p_stage_cyc = entry->stat.p_stage_cyc;
> >>> bool leftmost = true;
> >>>
> >>> p = &hists->entries_in->rb_root.rb_node;
> >>> @@ -612,11 +608,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
> >>>
> >>> if (!cmp) {
> >>> if (sample_self) {
> >>> - he_stat__add_period(&he->stat, period, p_stage_cyc);
> >>> + he_stat__add_period(&he->stat, period);
> >>> hist_entry__add_callchain_period(he, period);
> >>> }
> >>> if (symbol_conf.cumulate_callchain)
> >>> - he_stat__add_period(he->stat_acc, period, p_stage_cyc);
> >>> + he_stat__add_period(he->stat_acc, period);
> >>>
> >>> /*
> >>> * This mem info was allocated from sample__resolve_mem
> >>> @@ -726,7 +722,6 @@ __hists__add_entry(struct hists *hists,
> >>> .stat = {
> >>> .nr_events = 1,
> >>> .period = sample->period,
> >>> - .p_stage_cyc = sample->p_stage_cyc,
> >>> },
> >>> .parent = sym_parent,
> >>> .filtered = symbol__parent_filter(sym_parent) | al->filtered,
> >>> @@ -741,6 +736,7 @@ __hists__add_entry(struct hists *hists,
> >>> .time = hist_time(sample->time),
> >>> .weight = sample->weight,
> >>> .ins_lat = sample->ins_lat,
> >>> + .p_stage_cyc = sample->p_stage_cyc,
> >>> }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
> >>>
> >>> if (!hists->has_callchains && he && he->callchain_size != 0)
> >>> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> >>> index adc0584695d6..a111065b484e 100644
> >>> --- a/tools/perf/util/sort.c
> >>> +++ b/tools/perf/util/sort.c
> >>> @@ -1394,13 +1394,13 @@ struct sort_entry sort_global_ins_lat = {
> >>> static int64_t
> >>> sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
> >>> {
> >>> - return left->stat.p_stage_cyc - right->stat.p_stage_cyc;
> >>> + return left->p_stage_cyc - right->p_stage_cyc;
> >>> }
> >>>
> >>> static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
> >>> size_t size, unsigned int width)
> >>> {
> >>> - return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc);
> >>> + return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc);
> >>> }
> >>>
> >>> struct sort_entry sort_p_stage_cyc = {
> >>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> >>> index 22ae7c6ae398..7b7145501933 100644
> >>> --- a/tools/perf/util/sort.h
> >>> +++ b/tools/perf/util/sort.h
> >>> @@ -49,7 +49,6 @@ struct he_stat {
> >>> u64 period_us;
> >>> u64 period_guest_sys;
> >>> u64 period_guest_us;
> >>> - u64 p_stage_cyc;
> >>> u32 nr_events;
> >>> };
> >>>
> >>> @@ -109,6 +108,7 @@ struct hist_entry {
> >>> u64 code_page_size;
> >>> u64 weight;
> >>> u64 ins_lat;
> >>> + u64 p_stage_cyc;
> >>> u8 cpumode;
> >>> u8 depth;
> >>>
> >>> --
> >>> 2.34.0.rc0.344.g81b53c2807-goog
> >>>
> >
> > --
> >
> > - Arnaldo
>
--
- Arnaldo
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-11-17 13:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-05 22:56 [RFC 1/3] perf tools: Fix weight sort key behavior Namhyung Kim
2021-11-05 22:56 ` [RFC 2/3] perf tools: Fix ins_lat " Namhyung Kim
2021-11-05 22:56 ` [RFC 3/3] perf tools: Fix p_stage_cyc " Namhyung Kim
2021-11-16 16:29 ` Athira Rajeev
2021-11-16 17:50 ` Arnaldo Carvalho de Melo
[not found] ` <BECFD6E5-8137-4FBD-BF4A-D8709B67650F@linux.vnet.ibm.com>
2021-11-17 13:13 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).