* [PATCH v1 0/8] perf c2c: Refine the organization of metrics
@ 2020-10-14  5:09 Leo Yan
  2020-10-14  5:09 ` [PATCH v1 1/8] perf c2c: Display the total numbers continuously Leo Yan
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

This patch set refines the organization of the metrics output.

The current memory metrics in the perf c2c tool are not organized in a
directive way, so users need to take time to dig into every statistics
item.  With a "summary and breakdown" approach the output is easier to
review: it first gives the summary values, and the later items break
them down into more detailed statistics.

For this reason, this patch set reorganizes the metrics; it only changes
the "Shared Data Cache Line Table".  The table first displays the
summary values for total records, total loads and total stores, and then
breaks these values down into smaller ones, ordered from the nearest
memory node ("Core Load Hit") to the farther nodes ("LLC Load Hit",
"RMT Load Hit", "Load Dram").

  "LLC Load Hit" = "LclHit" + "LclHitm"

  "RMT Load Hit" = "RmtHit" + "RmtHitm" \
                                         ->  LLC Load Miss
  "Load Dram"    = "Lcl" + "Rmt"        /

Another main reason for this patch set is to extend "perf c2c" to
support Arm SPE memory events.  Arm SPE doesn't contain a 'HITM' tag in
its default trace data, so to analyze cache false sharing in that case
we need to rely on LLC metrics plus multi-threading info.  This patch
set therefore makes the LLC related metrics easier to read in the
"Shared Data Cache Line Table"; sorting cache lines by LLC metrics will
be sent out as a separate patch set.

Before:

=================================================
           Shared Data Cache Line Table          
=================================================
#
#        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----  ---- Store Reference ----  --- Load Dram ----      LLC    Total  ----- Core Load Hit -----  -- LLC Load Hit --
# Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt    Total    L1Hit   L1Miss       Lcl       Rmt  Ld Miss    Loads       FB       L1       L2       Llc       Rmt
# .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  ........  ........  .......  .......  .......  .......  .......  ........  ........
#
      0      0x55acdcc92100     0    8197    40716   52.18%     3170     3170        0    24466    24437       29         0         0        0    16250     3349     5909        0      3822         0
      1      0x55acdcc920c0     0       1     4621   31.01%     1884     1884        0        0        0        0         0         0        0     4621      739        0        0      1998         0
      2      0x55acdcc92080     0       1     4475   16.69%     1014     1014        0        0        0        0         0         0        0     4475     2405        0        0      1056         0


After:

=================================================
           Shared Data Cache Line Table          
=================================================
#
#        ----------- Cacheline ----------      Tot  ------- Load Hitm -------    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
# Index             Address  Node  PA cnt     Hitm    Total  LclHitm  RmtHitm  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
# .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
#
      0      0x55acdcc92100     0    8197   52.18%     3170     3170        0    40716    16250    24466    24437       29     3349     5909        0      3822     3170         0        0         0         0
      1      0x55acdcc920c0     0       1   31.01%     1884     1884        0     4621     4621        0        0        0      739        0        0      1998     1884         0        0         0         0
      2      0x55acdcc92080     0       1   16.69%     1014     1014        0     4475     4475        0        0        0     2405        0        0      1056     1014         0        0         0         0
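
The "summary and breakdown" layout can be checked by hand against the
sample rows above.  The following is only a minimal sketch: the struct
and the function names are hypothetical and are not taken from
builtin-c2c.c; the field names simply follow the column names used in
this series.

```c
#include <stdint.h>

/* Hypothetical mirror of one row of the "After" table; not perf's own struct. */
struct c2c_row {
	uint64_t tot_recs, tot_loads, tot_stores;	/* summary columns */
	uint64_t st_l1hit, st_l1miss;			/* Stores */
	uint64_t ld_fbhit, ld_l1hit, ld_l2hit;		/* Core Load Hit */
	uint64_t ld_lclhit, lcl_hitm;			/* LLC Load Hit */
	uint64_t ld_rmthit, rmt_hitm;			/* RMT Load Hit */
	uint64_t dram_lcl, dram_rmt;			/* Load Dram */
};

/* Sum of all load breakdown columns, from the nearest node to the farthest. */
uint64_t load_breakdown_sum(const struct c2c_row *r)
{
	return r->ld_fbhit + r->ld_l1hit + r->ld_l2hit +
	       r->ld_lclhit + r->lcl_hitm +
	       r->ld_rmthit + r->rmt_hitm +
	       r->dram_lcl + r->dram_rmt;
}

/* Sum of the store breakdown columns. */
uint64_t store_breakdown_sum(const struct c2c_row *r)
{
	return r->st_l1hit + r->st_l1miss;
}

/* Values copied from row 0 of the "After" table above. */
const struct c2c_row sample_row0 = {
	.tot_recs = 40716, .tot_loads = 16250, .tot_stores = 24466,
	.st_l1hit = 24437, .st_l1miss = 29,
	.ld_fbhit = 3349, .ld_l1hit = 5909, .ld_l2hit = 0,
	.ld_lclhit = 3822, .lcl_hitm = 3170,
	.ld_rmthit = 0, .rmt_hitm = 0,
	.dram_lcl = 0, .dram_rmt = 0,
};
```

For the three sample rows the breakdown columns add up to the summary
columns (for row 0: 3349 + 5909 + 3822 + 3170 = 16250 loads).  This is
an observation about the sample data only; the sketch does not claim
that the sums match for every trace.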


Leo Yan (8):
  perf c2c: Display the total numbers continuously
  perf c2c: Display "Total Stores" as a standalone metrics
  perf c2c: Organize metrics based on memory hierarchy
  perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
  perf c2c: Use more explicit headers for HITM
  perf c2c: Change header for LLC local hit
  perf c2c: Correct LLC load hit metrics
  perf c2c: Add metrics "RMT Load Hit"

 tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
 1 file changed, 18 insertions(+), 65 deletions(-)

-- 
2.17.1



* [PATCH v1 1/8] perf c2c: Display the total numbers continuously
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 2/8] perf c2c: Display "Total Stores" as a standalone metrics Leo Yan
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

To view the statistics in "breakdown" mode, it's good to show the
summary numbers for the total records, all stores and all loads first;
the subsequent columns can then break them down into more detailed
items.

To achieve this, this patch displays the summary numbers for
records/stores/loads contiguously and places them before the breakdown
items, which allows users to easily read the summarized statistics.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5938b100eaf4..e602b7891ce9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2846,13 +2846,13 @@ static int perf_c2c__report(int argc, const char **argv)
 			"dcacheline,"
 			"dcacheline_node,"
 			"dcacheline_count,"
-			"tot_recs,"
 			"percent_hitm,"
 			"tot_hitm,lcl_hitm,rmt_hitm,"
+			"tot_recs,"
+			"tot_loads,"
 			"stores,stores_l1hit,stores_l1miss,"
 			"dram_lcl,dram_rmt,"
 			"ld_llcmiss,"
-			"tot_loads,"
 			"ld_fbhit,ld_l1hit,ld_l2hit,"
 			"ld_lclhit,ld_rmthit",
 			c2c.display == DISPLAY_TOT ? "tot_hitm" :
-- 
2.17.1



* [PATCH v1 2/8] perf c2c: Display "Total Stores" as a standalone metrics
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
  2020-10-14  5:09 ` [PATCH v1 1/8] perf c2c: Display the total numbers continuously Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 3/8] perf c2c: Organize metrics based on memory hierarchy Leo Yan
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

The total stores number is displayed under the metric "Store Reference".
To output it in the same format as total records and all loads, extract
the total stores number as a standalone metric "Total Stores".

After this patch, the tool shows the summary numbers ("Total records",
"Total Loads", "Total Stores") in a unified form.

Before:

  #        ----------- Cacheline ----------      Tot  ----- LLC Load Hitm -----    Total    Total  ---- Store Reference ----  --- Load Dram ----      LLC  ----- Core Load Hit -----  -- LLC Load Hit --
  # Index             Address  Node  PA cnt     Hitm    Total      Lcl      Rmt  records    Loads    Total    L1Hit   L1Miss       Lcl       Rmt  Ld Miss       FB       L1       L2       Llc       Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  ........  .......  .......  .......  .......  ........  ........
  #
        0      0x55f07d580100     0    1499   85.89%      481      481        0     7243     3879     3364     2599      765         0         0        0      548     2615       66       169         0
        1      0x55f07d580080     0       1   13.93%       78       78        0      664      664        0        0        0         0         0        0      187      361       27        11         0
        2      0x55f07d5800c0     0       1    0.18%        1        1        0      405      405        0        0        0         0         0        0      131        0       10       263         0

After:

  #        ----------- Cacheline ----------      Tot  ----- LLC Load Hitm -----    Total    Total    Total  ---- Stores ----  --- Load Dram ----      LLC  ----- Core Load Hit -----  -- LLC Load Hit --
  # Index             Address  Node  PA cnt     Hitm    Total      Lcl      Rmt  records    Loads   Stores    L1Hit   L1Miss       Lcl       Rmt  Ld Miss       FB       L1       L2       Llc       Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  ........  .......  .......  .......  .......  ........  ........
  #
        0      0x55f07d580100     0    1499   85.89%      481      481        0     7243     3879     3364     2599      765         0         0        0      548     2615       66       169         0
        1      0x55f07d580080     0       1   13.93%       78       78        0      664      664        0        0        0         0         0        0      187      361       27        11         0
        2      0x55f07d5800c0     0       1    0.18%        1        1        0      405      405        0        0        0         0         0        0      131        0       10       263         0

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index e602b7891ce9..a2ad24799aea 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1367,16 +1367,16 @@ static struct c2c_dimension dim_cl_lcl_hitm = {
 	.width		= 7,
 };
 
-static struct c2c_dimension dim_stores = {
-	.header		= HEADER_SPAN("---- Store Reference ----", "Total", 2),
-	.name		= "stores",
+static struct c2c_dimension dim_tot_stores = {
+	.header		= HEADER_BOTH("Total", "Stores"),
+	.name		= "tot_stores",
 	.cmp		= store_cmp,
 	.entry		= store_entry,
 	.width		= 7,
 };
 
 static struct c2c_dimension dim_stores_l1hit = {
-	.header		= HEADER_SPAN_LOW("L1Hit"),
+	.header		= HEADER_SPAN("---- Stores ----", "L1Hit", 1),
 	.name		= "stores_l1hit",
 	.cmp		= st_l1hit_cmp,
 	.entry		= st_l1hit_entry,
@@ -1648,7 +1648,7 @@ static struct c2c_dimension *dimensions[] = {
 	&dim_rmt_hitm,
 	&dim_cl_lcl_hitm,
 	&dim_cl_rmt_hitm,
-	&dim_stores,
+	&dim_tot_stores,
 	&dim_stores_l1hit,
 	&dim_stores_l1miss,
 	&dim_cl_stores_l1hit,
@@ -2850,7 +2850,8 @@ static int perf_c2c__report(int argc, const char **argv)
 			"tot_hitm,lcl_hitm,rmt_hitm,"
 			"tot_recs,"
 			"tot_loads,"
-			"stores,stores_l1hit,stores_l1miss,"
+			"tot_stores,"
+			"stores_l1hit,stores_l1miss,"
 			"dram_lcl,dram_rmt,"
 			"ld_llcmiss,"
 			"ld_fbhit,ld_l1hit,ld_l2hit,"
-- 
2.17.1



* [PATCH v1 3/8] perf c2c: Organize metrics based on memory hierarchy
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
  2020-10-14  5:09 ` [PATCH v1 1/8] perf c2c: Display the total numbers continuously Leo Yan
  2020-10-14  5:09 ` [PATCH v1 2/8] perf c2c: Display "Total Stores" as a standalone metrics Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 4/8] perf c2c: Change header from "LLC Load Hitm" to "Load Hitm" Leo Yan
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

The metrics are not organized based on the memory hierarchy, i.e. the
tool doesn't order the metrics from the near memory nodes (e.g. L1/L2
cache) to the far ones (e.g. L3 cache and DRAM).

To output the metrics in a friendlier form, this patch reorders them
based on the memory hierarchy:

  "Core Load Hit" => "LLC Load Hit" => "LLC Ld Miss" => "Load Dram"

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index a2ad24799aea..404d4739b8c1 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2852,10 +2852,10 @@ static int perf_c2c__report(int argc, const char **argv)
 			"tot_loads,"
 			"tot_stores,"
 			"stores_l1hit,stores_l1miss,"
-			"dram_lcl,dram_rmt,"
-			"ld_llcmiss,"
 			"ld_fbhit,ld_l1hit,ld_l2hit,"
-			"ld_lclhit,ld_rmthit",
+			"ld_lclhit,ld_rmthit,"
+			"ld_llcmiss,"
+			"dram_lcl,dram_rmt",
 			c2c.display == DISPLAY_TOT ? "tot_hitm" :
 			c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
 			);
-- 
2.17.1



* [PATCH v1 4/8] perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (2 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 3/8] perf c2c: Organize metrics based on memory hierarchy Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 5/8] perf c2c: Use more explicit headers for HITM Leo Yan
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

The metric "LLC Load Hitm" contains two items: one is "local Hitm" and
the other is "remote Hitm".

"local Hitm" means an L3 hit that was serviced by another processor
core with a cross-core snoop where a modified copy was found; there is
no doubt that "local Hitm" belongs to LLC access.

But "remote Hitm", based on the code in util/mem-events, is the event
for a remote cache hit that was serviced by another processor core with
a modified copy.  Thus the remote Hitm is a hit in a remote cache, and
it is actually an LLC load miss.

The current display format gives users the impression that "local Hitm"
and "remote Hitm" both belong to the LLC load, which is not the case as
described above.

This patch changes the header from "LLC Load Hitm" to "Load Hitm", which
avoids giving the wrong impression that all Hitm belong to the LLC.
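
The distinction can be illustrated with a small classification sketch.
The enum and both helpers are hypothetical and exist only for this
illustration; they are not the encoding that perf or util/mem-events
uses, and they model nothing beyond the six load sources named in this
series.

```c
#include <stdbool.h>

/* Hypothetical labels for where a load was served; not perf's encoding. */
enum load_source {
	SRC_LOCAL_LLC_HIT,	/* hit in the local LLC */
	SRC_LOCAL_HITM,		/* local LLC hit, modified copy found by snoop */
	SRC_REMOTE_HIT,		/* hit in a remote cache */
	SRC_REMOTE_HITM,	/* remote cache hit with a modified copy */
	SRC_LOCAL_DRAM,		/* served from local DRAM */
	SRC_REMOTE_DRAM,	/* served from remote DRAM */
};

/* True when the load found a modified copy, whether local or remote. */
bool is_hitm(enum load_source src)
{
	return src == SRC_LOCAL_HITM || src == SRC_REMOTE_HITM;
}

/* True when the load was not satisfied by the local LLC. */
bool is_llc_miss(enum load_source src)
{
	switch (src) {
	case SRC_LOCAL_LLC_HIT:
	case SRC_LOCAL_HITM:
		return false;
	default:
		return true;
	}
}
```

In this sketch both HITM kinds satisfy is_hitm(), but only the remote
one also satisfies is_llc_miss(), which is why a header that puts all
Hitm under "LLC" is misleading.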

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 404d4739b8c1..fa7a1c55b989 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1328,7 +1328,7 @@ static struct c2c_dimension dim_iaddr = {
 };
 
 static struct c2c_dimension dim_tot_hitm = {
-	.header		= HEADER_SPAN("----- LLC Load Hitm -----", "Total", 2),
+	.header		= HEADER_SPAN("------- Load Hitm -------", "Total", 2),
 	.name		= "tot_hitm",
 	.cmp		= tot_hitm_cmp,
 	.entry		= tot_hitm_entry,
-- 
2.17.1



* [PATCH v1 5/8] perf c2c: Use more explicit headers for HITM
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (3 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 4/8] perf c2c: Change header from "LLC Load Hitm" to "Load Hitm" Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 6/8] perf c2c: Change header for LLC local hit Leo Yan
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

Local and remote HITM use the headers 'Lcl' and 'Rmt' respectively.  If
we want to extend the tool to display these two dimensions under any
one metric, users cannot understand their semantics based only on the
header string 'Lcl' or 'Rmt'.

To express the meaning of the HITM items explicitly, this patch changes
the header strings to "LclHitm" and "RmtHitm"; the strings are more
readable, and this allows metrics to be extended to use the HITM items.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index fa7a1c55b989..3d5aa21020f2 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1336,7 +1336,7 @@ static struct c2c_dimension dim_tot_hitm = {
 };
 
 static struct c2c_dimension dim_lcl_hitm = {
-	.header		= HEADER_SPAN_LOW("Lcl"),
+	.header		= HEADER_SPAN_LOW("LclHitm"),
 	.name		= "lcl_hitm",
 	.cmp		= lcl_hitm_cmp,
 	.entry		= lcl_hitm_entry,
@@ -1344,7 +1344,7 @@ static struct c2c_dimension dim_lcl_hitm = {
 };
 
 static struct c2c_dimension dim_rmt_hitm = {
-	.header		= HEADER_SPAN_LOW("Rmt"),
+	.header		= HEADER_SPAN_LOW("RmtHitm"),
 	.name		= "rmt_hitm",
 	.cmp		= rmt_hitm_cmp,
 	.entry		= rmt_hitm_entry,
@@ -1486,7 +1486,7 @@ static struct c2c_dimension dim_percent_hitm = {
 };
 
 static struct c2c_dimension dim_percent_rmt_hitm = {
-	.header		= HEADER_SPAN("----- HITM -----", "Rmt", 1),
+	.header		= HEADER_SPAN("----- HITM -----", "RmtHitm", 1),
 	.name		= "percent_rmt_hitm",
 	.cmp		= percent_rmt_hitm_cmp,
 	.entry		= percent_rmt_hitm_entry,
@@ -1495,7 +1495,7 @@ static struct c2c_dimension dim_percent_rmt_hitm = {
 };
 
 static struct c2c_dimension dim_percent_lcl_hitm = {
-	.header		= HEADER_SPAN_LOW("Lcl"),
+	.header		= HEADER_SPAN_LOW("LclHitm"),
 	.name		= "percent_lcl_hitm",
 	.cmp		= percent_lcl_hitm_cmp,
 	.entry		= percent_lcl_hitm_entry,
-- 
2.17.1



* [PATCH v1 6/8] perf c2c: Change header for LLC local hit
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (4 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 5/8] perf c2c: Use more explicit headers for HITM Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 7/8] perf c2c: Correct LLC load hit metrics Leo Yan
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

Replace the header string "Llc" with "LclHit", which expresses more
explicitly that the event type is an LLC local hit.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 3d5aa21020f2..2292261b40a2 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1432,7 +1432,7 @@ static struct c2c_dimension dim_ld_l2hit = {
 };
 
 static struct c2c_dimension dim_ld_llchit = {
-	.header		= HEADER_SPAN("-- LLC Load Hit --", "Llc", 1),
+	.header		= HEADER_SPAN("-- LLC Load Hit --", "LclHit", 1),
 	.name		= "ld_lclhit",
 	.cmp		= ld_llchit_cmp,
 	.entry		= ld_llchit_entry,
-- 
2.17.1



* [PATCH v1 7/8] perf c2c: Correct LLC load hit metrics
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (5 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 6/8] perf c2c: Change header for LLC local hit Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14  5:09 ` [PATCH v1 8/8] perf c2c: Add metrics "RMT Load Hit" Leo Yan
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

"rmt_hit" is accounted into two metrics: one is accounted into the
metric "LLC Ld Miss" (see the function llc_miss(), which calculates
"llcmiss"), and the other is the metric "LLC Load Hit".  Taken
literally, it is contradictory that "rmt_hit" is accounted for both
"LLC Ld Miss" (LLC miss) and "LLC Load Hit" (LLC hit).

This easily introduces confusion: "LLC Load Hit" gives the impression
that all items belonging to it are LLC hits; in fact "rmt_hit" is an
LLC miss and a remote cache hit.

To give the metric "LLC Load Hit" clear semantics, "rmt_hit" is moved
out of it and "LLC Load Hit" is changed to contain two items:

  LLC Load Hit = LLC's hit ("ld_llchit") + LLC's hitm ("lcl_hitm")

For output alignment, also adjust the header for "LLC Load Hit".

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 2292261b40a2..61fb939a4e70 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -1432,7 +1432,7 @@ static struct c2c_dimension dim_ld_l2hit = {
 };
 
 static struct c2c_dimension dim_ld_llchit = {
-	.header		= HEADER_SPAN("-- LLC Load Hit --", "LclHit", 1),
+	.header		= HEADER_SPAN("- LLC Load Hit --", "LclHit", 1),
 	.name		= "ld_lclhit",
 	.cmp		= ld_llchit_cmp,
 	.entry		= ld_llchit_entry,
@@ -2853,7 +2853,7 @@ static int perf_c2c__report(int argc, const char **argv)
 			"tot_stores,"
 			"stores_l1hit,stores_l1miss,"
 			"ld_fbhit,ld_l1hit,ld_l2hit,"
-			"ld_lclhit,ld_rmthit,"
+			"ld_lclhit,lcl_hitm,"
 			"ld_llcmiss,"
 			"dram_lcl,dram_rmt",
 			c2c.display == DISPLAY_TOT ? "tot_hitm" :
-- 
2.17.1



* [PATCH v1 8/8] perf c2c: Add metrics "RMT Load Hit"
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (6 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 7/8] perf c2c: Correct LLC load hit metrics Leo Yan
@ 2020-10-14  5:09 ` Leo Yan
  2020-10-14 14:03 ` [PATCH v1 0/8] perf c2c: Refine the organization of metrics Jiri Olsa
  2020-10-14 18:38 ` Joe Mario
  9 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-14  5:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Joe Mario, Al Grant,
	James Clark, linux-kernel
  Cc: Leo Yan

The metrics "LLC Ld Miss" and "Load Dram" overlap with each other in
the items they account:

  "LLC Ld Miss" = "lcl_dram" + "rmt_dram" + "rmt_hit" + "rmt_hitm"
  "Load Dram"   = "lcl_dram" + "rmt_dram"

Furthermore, the metric "LLC Ld Miss" is not directive for showing
statistics because it contains only a summary value and cannot give
breakdown details.

For this reason, add a new metric "RMT Load Hit" which presents the
remote cache hits; it contains two items:

  "RMT Load Hit" = remote hit ("rmt_hit") + remote hitm ("rmt_hitm")

As a result, the metric "LLC Ld Miss" is exactly divided into the two
metrics "RMT Load Hit" and "Load Dram".  It's not necessary to keep the
metric "LLC Ld Miss", so remove it.
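
The split can be written out as a short sketch.  The struct below is a
hypothetical subset of the counters, and rmt_load_hit() and load_dram()
are names made up for this illustration; only the sum in llc_ld_miss()
mirrors the llc_miss() helper that this patch removes.

```c
#include <stdint.h>

/* Hypothetical subset of the counters; field names follow the text above. */
struct load_stats {
	uint64_t lcl_dram;
	uint64_t rmt_dram;
	uint64_t rmt_hit;
	uint64_t rmt_hitm;
};

/* Same sum as the removed llc_miss() helper. */
uint64_t llc_ld_miss(const struct load_stats *s)
{
	return s->lcl_dram + s->rmt_dram + s->rmt_hitm + s->rmt_hit;
}

/* "RMT Load Hit": loads served by a remote cache. */
uint64_t rmt_load_hit(const struct load_stats *s)
{
	return s->rmt_hit + s->rmt_hitm;
}

/* "Load Dram": loads served by local or remote DRAM. */
uint64_t load_dram(const struct load_stats *s)
{
	return s->lcl_dram + s->rmt_dram;
}
```

For any counter values, llc_ld_miss() equals rmt_load_hit() plus
load_dram(), so dropping the summary column loses no information.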

Before:

  #        ----------- Cacheline ----------      Tot  ------- Load Hitm -------    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --      LLC  --- Load Dram ----
  # Index             Address  Node  PA cnt     Hitm    Total  LclHitm  RmtHitm  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm  Ld Miss       Lcl       Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  .......  ........  ........
  #
        0      0x55f07d580100     0    1499   85.89%      481      481        0     7243     3879     3364     2599      765      548     2615       66       169      481        0         0         0
        1      0x55f07d580080     0       1   13.93%       78       78        0      664      664        0        0        0      187      361       27        11       78        0         0         0
        2      0x55f07d5800c0     0       1    0.18%        1        1        0      405      405        0        0        0      131        0       10       263        1        0         0         0

After:

  #        ----------- Cacheline ----------      Tot  ------- Load Hitm -------    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
  # Index             Address  Node  PA cnt     Hitm    Total  LclHitm  RmtHitm  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
  # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
  #
        0      0x55f07d580100     0    1499   85.89%      481      481        0     7243     3879     3364     2599      765      548     2615       66       169      481         0        0         0         0
        1      0x55f07d580080     0       1   13.93%       78       78        0      664      664        0        0        0      187      361       27        11       78         0        0         0         0
        2      0x55f07d5800c0     0       1    0.18%        1        1        0      405      405        0        0        0      131        0       10       263        1         0        0         0         0

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c | 52 ++--------------------------------------
 1 file changed, 2 insertions(+), 50 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 61fb939a4e70..9c2183957c50 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -652,45 +652,6 @@ STAT_FN(ld_l2hit)
 STAT_FN(ld_llchit)
 STAT_FN(rmt_hit)
 
-static uint64_t llc_miss(struct c2c_stats *stats)
-{
-	uint64_t llcmiss;
-
-	llcmiss = stats->lcl_dram +
-		  stats->rmt_dram +
-		  stats->rmt_hitm +
-		  stats->rmt_hit;
-
-	return llcmiss;
-}
-
-static int
-ld_llcmiss_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
-		 struct hist_entry *he)
-{
-	struct c2c_hist_entry *c2c_he;
-	int width = c2c_width(fmt, hpp, he->hists);
-
-	c2c_he = container_of(he, struct c2c_hist_entry, he);
-
-	return scnprintf(hpp->buf, hpp->size, "%*lu", width,
-			 llc_miss(&c2c_he->stats));
-}
-
-static int64_t
-ld_llcmiss_cmp(struct perf_hpp_fmt *fmt __maybe_unused,
-	       struct hist_entry *left, struct hist_entry *right)
-{
-	struct c2c_hist_entry *c2c_left;
-	struct c2c_hist_entry *c2c_right;
-
-	c2c_left  = container_of(left, struct c2c_hist_entry, he);
-	c2c_right = container_of(right, struct c2c_hist_entry, he);
-
-	return (uint64_t) llc_miss(&c2c_left->stats) -
-	       (uint64_t) llc_miss(&c2c_right->stats);
-}
-
 static uint64_t total_records(struct c2c_stats *stats)
 {
 	uint64_t lclmiss, ldcnt, total;
@@ -1440,21 +1401,13 @@ static struct c2c_dimension dim_ld_llchit = {
 };
 
 static struct c2c_dimension dim_ld_rmthit = {
-	.header		= HEADER_SPAN_LOW("Rmt"),
+	.header		= HEADER_SPAN("- RMT Load Hit --", "RmtHit", 1),
 	.name		= "ld_rmthit",
 	.cmp		= rmt_hit_cmp,
 	.entry		= rmt_hit_entry,
 	.width		= 8,
 };
 
-static struct c2c_dimension dim_ld_llcmiss = {
-	.header		= HEADER_BOTH("LLC", "Ld Miss"),
-	.name		= "ld_llcmiss",
-	.cmp		= ld_llcmiss_cmp,
-	.entry		= ld_llcmiss_entry,
-	.width		= 7,
-};
-
 static struct c2c_dimension dim_tot_recs = {
 	.header		= HEADER_BOTH("Total", "records"),
 	.name		= "tot_recs",
@@ -1658,7 +1611,6 @@ static struct c2c_dimension *dimensions[] = {
 	&dim_ld_l2hit,
 	&dim_ld_llchit,
 	&dim_ld_rmthit,
-	&dim_ld_llcmiss,
 	&dim_tot_recs,
 	&dim_tot_loads,
 	&dim_percent_hitm,
@@ -2854,7 +2806,7 @@ static int perf_c2c__report(int argc, const char **argv)
 			"stores_l1hit,stores_l1miss,"
 			"ld_fbhit,ld_l1hit,ld_l2hit,"
 			"ld_lclhit,lcl_hitm,"
-			"ld_llcmiss,"
+			"ld_rmthit,rmt_hitm,"
 			"dram_lcl,dram_rmt",
 			c2c.display == DISPLAY_TOT ? "tot_hitm" :
 			c2c.display == DISPLAY_LCL ? "lcl_hitm" : "rmt_hitm"
-- 
2.17.1



* Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (7 preceding siblings ...)
  2020-10-14  5:09 ` [PATCH v1 8/8] perf c2c: Add metrics "RMT Load Hit" Leo Yan
@ 2020-10-14 14:03 ` Jiri Olsa
  2020-10-14 18:38 ` Joe Mario
  9 siblings, 0 replies; 12+ messages in thread
From: Jiri Olsa @ 2020-10-14 14:03 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Andi Kleen,
	David Ahern, Don Zickus, Joe Mario, Al Grant, James Clark,
	linux-kernel

On Wed, Oct 14, 2020 at 06:09:13AM +0100, Leo Yan wrote:
> This patch set refines the organization of the metrics output.
> 
> The current memory metrics in the perf c2c tool are not organized in a
> directive way, so users need to take time to dig into every statistics
> item.  With a "summary and breakdown" approach the output is easier to
> review: it first gives the summary values, and the later items break
> them down into more detailed statistics.
> 
> For this reason, this patch set reorganizes the metrics; it only changes
> the "Shared Data Cache Line Table".  The table first displays the
> summary values for total records, total loads and total stores, and then
> breaks these values down into smaller ones, ordered from the nearest
> memory node ("Core Load Hit") to the farther nodes ("LLC Load Hit",
> "RMT Load Hit", "Load Dram").
> 
>   "LLC Load Hit" = "LclHit" + "LclHitm"
> 
>   "RMT Load Hit" = "RmtHit" + "RmtHitm" \
>                                          ->  LLC Load Miss
>   "Load Dram"    = "Lcl" + "Rmt"        /
> 
> Another main reason for this patch set is to extend "perf c2c"
> to support Arm SPE memory events, but Arm SPE doesn't contain the 'HITM' tag
> in its default trace data; in this case, to analyze cache false
> sharing issues, we need to rely on LLC metrics + multi-threading info.
> So this patch set makes it easier to show LLC related metrics in the
> "Shared Data Cache Line Table"; sorting cache lines with LLC metrics
> will be sent out in a separate patch set.
> 
> Before:
> 
> =================================================
>            Shared Data Cache Line Table          
> =================================================
> #
> #        ----------- Cacheline ----------    Total      Tot  ----- LLC Load Hitm -----  ---- Store Reference ----  --- Load Dram ----      LLC    Total  ----- Core Load Hit -----  -- LLC Load Hit --
> # Index             Address  Node  PA cnt  records     Hitm    Total      Lcl      Rmt    Total    L1Hit   L1Miss       Lcl       Rmt  Ld Miss    Loads       FB       L1       L2       Llc       Rmt
> # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  ........  ........  .......  .......  .......  .......  .......  ........  ........
> #
>       0      0x55acdcc92100     0    8197    40716   52.18%     3170     3170        0    24466    24437       29         0         0        0    16250     3349     5909        0      3822         0
>       1      0x55acdcc920c0     0       1     4621   31.01%     1884     1884        0        0        0        0         0         0        0     4621      739        0        0      1998         0
>       2      0x55acdcc92080     0       1     4475   16.69%     1014     1014        0        0        0        0         0         0        0     4475     2405        0        0      1056         0
> 
> 
> After:
> 
> =================================================
>            Shared Data Cache Line Table          
> =================================================
> #
> #        ----------- Cacheline ----------      Tot  ------- Load Hitm -------    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
> # Index             Address  Node  PA cnt     Hitm    Total  LclHitm  RmtHitm  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
> # .....  ..................  ....  ......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
> #
>       0      0x55acdcc92100     0    8197   52.18%     3170     3170        0    40716    16250    24466    24437       29     3349     5909        0      3822     3170         0        0         0         0
>       1      0x55acdcc920c0     0       1   31.01%     1884     1884        0     4621     4621        0        0        0      739        0        0      1998     1884         0        0         0         0
>       2      0x55acdcc92080     0       1   16.69%     1014     1014        0     4475     4475        0        0        0     2405        0        0      1056     1014         0        0         0         0
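
The "summary and breakdown" relations quoted above can be checked against the sample rows. The following is a minimal sketch assuming a hypothetical struct whose fields mirror the column headers of the "After" table; it is not the data structure used in builtin-c2c.c. The load identity happens to hold for the three sample rows shown and is not claimed to hold for every possible record mix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cacheline counters named after the table columns. */
struct cl_row {
	uint32_t tot_recs, tot_loads, tot_stores;
	uint32_t st_l1hit, st_l1miss;	/* Stores: L1Hit, L1Miss */
	uint32_t ld_fb, ld_l1, ld_l2;	/* Core Load Hit: FB, L1, L2 */
	uint32_t lcl_hit, lcl_hitm;	/* LLC Load Hit: LclHit, LclHitm */
	uint32_t rmt_hit, rmt_hitm;	/* RMT Load Hit: RmtHit, RmtHitm */
	uint32_t dram_lcl, dram_rmt;	/* Load Dram: Lcl, Rmt */
};

static uint32_t core_load_hit(const struct cl_row *r)
{
	return r->ld_fb + r->ld_l1 + r->ld_l2;
}

/* "LLC Load Hit" = "LclHit" + "LclHitm" */
static uint32_t llc_load_hit(const struct cl_row *r)
{
	return r->lcl_hit + r->lcl_hitm;
}

/* "RMT Load Hit" = "RmtHit" + "RmtHitm" */
static uint32_t rmt_load_hit(const struct cl_row *r)
{
	return r->rmt_hit + r->rmt_hitm;
}

/* "Load Dram" = "Lcl" + "Rmt" */
static uint32_t load_dram(const struct cl_row *r)
{
	return r->dram_lcl + r->dram_rmt;
}

/* LLC Load Miss = RMT Load Hit + Load Dram */
static uint32_t llc_load_miss(const struct cl_row *r)
{
	return rmt_load_hit(r) + load_dram(r);
}

/* Returns 1 if the breakdown columns add up to the summary columns. */
static int breakdown_is_consistent(const struct cl_row *r)
{
	assert(r != NULL);
	return r->tot_recs == r->tot_loads + r->tot_stores &&
	       r->tot_stores == r->st_l1hit + r->st_l1miss &&
	       r->tot_loads == core_load_hit(r) + llc_load_hit(r) +
			       llc_load_miss(r);
}

/* Index 0 (0x55acdcc92100) from the "After" table. */
static const struct cl_row sample_row0 = {
	.tot_recs = 40716, .tot_loads = 16250, .tot_stores = 24466,
	.st_l1hit = 24437, .st_l1miss = 29,
	.ld_fb = 3349, .ld_l1 = 5909, .ld_l2 = 0,
	.lcl_hit = 3822, .lcl_hitm = 3170,
	.rmt_hit = 0, .rmt_hitm = 0,
	.dram_lcl = 0, .dram_rmt = 0,
};
```

For row 0, llc_load_hit() gives 3822 + 3170 = 6992 and llc_load_miss() gives 0, so the 16250 total loads split into 9258 core hits plus 6992 LLC hits.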

I haven't used the tool for some time, so it's fine with me,
but there might be some people already used to seeing certain
columns in place and I don't want to make them angry unless
there's a really good reason for that ;-)

Joe, could you please check on these changes?

thanks,
jirka

> 
> 
> Leo Yan (8):
>   perf c2c: Display the total numbers continuously
>   perf c2c: Display "Total Stores" as a standalone metrics
>   perf c2c: Organize metrics based on memory hierarchy
>   perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
>   perf c2c: Use more explicit headers for HITM
>   perf c2c: Change header for LLC local hit
>   perf c2c: Correct LLC load hit metrics
>   perf c2c: Add metrics "RMT Load Hit"
> 
>  tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
>  1 file changed, 18 insertions(+), 65 deletions(-)
> 
> -- 
> 2.17.1
> 



* Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics
  2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
                   ` (8 preceding siblings ...)
  2020-10-14 14:03 ` [PATCH v1 0/8] perf c2c: Refine the organization of metrics Jiri Olsa
@ 2020-10-14 18:38 ` Joe Mario
  2020-10-15 15:04   ` Leo Yan
  9 siblings, 1 reply; 12+ messages in thread
From: Joe Mario @ 2020-10-14 18:38 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Al Grant, James Clark,
	linux-kernel



On 10/14/20 1:09 AM, Leo Yan wrote:
> This patch set is to refine metrics output organization.
> 
> If we review the current memory metrics in the perf c2c tool, it doesn't
> organize the metrics in a directed way; thus the user needs to take
> time to dig into every statistics item.  On the other hand, if we use the
> "summary and breakdown" approach, the output will be easier for
> users to review, e.g. the output can first give the
> summary values, and then the later items break them down into more
> detailed statistics.
> 
> For this reason, this patch set reorganizes the metrics, and it only
> changes the "Shared Data Cache Line Table": it first displays the
> summary values for total records, total loads, total stores; then it
> breaks these summary values into smaller values, ordered from the
> nearest memory node ("Core Load Hit") to farther nodes
> ("LLC Load Hit", "RMT Load Hit", "Load Dram").
> 
>   "LLC Load Hit" = "LclHit" + "LclHitm"
> 
>   "RMT Load Hit" = "RmtHit" + "RmtHitm" \
>                                          ->  LLC Load Miss
>   "Load Dram"    = "Lcl" + "Rmt"        /
> 
> Another main reason for this patch set is to extend "perf c2c"
> to support Arm SPE memory events, but Arm SPE doesn't contain the 'HITM' tag
> in its default trace data; in this case, to analyze cache false
> sharing issues, we need to rely on LLC metrics + multi-threading info.
> So this patch set makes it easier to show LLC related metrics in the
> "Shared Data Cache Line Table"; sorting cache lines with LLC metrics
> will be sent out in a separate patch set.
> 
> <SNIP>
> 
> Leo Yan (8):
>   perf c2c: Display the total numbers continuously
>   perf c2c: Display "Total Stores" as a standalone metrics
>   perf c2c: Organize metrics based on memory hierarchy
>   perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
>   perf c2c: Use more explicit headers for HITM
>   perf c2c: Change header for LLC local hit
>   perf c2c: Correct LLC load hit metrics
>   perf c2c: Add metrics "RMT Load Hit"
> 
>  tools/perf/builtin-c2c.c | 83 +++++++++-------------------------------
>  1 file changed, 18 insertions(+), 65 deletions(-)

Hi Leo:
I ran your patches through some perf c2c tests and it all looks good.  
I agree the new format of the "Shared Data Cache Line Table" makes more sense now.  And it still holds together nicely when sorted on local HitMs (-d lcl).

Thank you for doing this.
Joe

Tested-by: Joe Mario <jmario@redhat.com>



* Re: [PATCH v1 0/8] perf c2c: Refine the organization of metrics
  2020-10-14 18:38 ` Joe Mario
@ 2020-10-15 15:04   ` Leo Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2020-10-15 15:04 UTC (permalink / raw)
  To: Joe Mario
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, David Ahern, Don Zickus, Al Grant, James Clark,
	linux-kernel

Hi Joe,

On Wed, Oct 14, 2020 at 02:38:19PM -0400, Joe Mario wrote:

[...]

> > This patch set is to refine metrics output organization.

[...]

> Hi Leo:
> I ran your patches through some perf c2c tests and it all looks good.  
> I agree the new format of the "Shared Data Cache Line Table" makes more sense now.  And it still holds together nicely when sorted on local HitMs (-d lcl).
> 
> Thank you for doing this.
> Joe
> 
> Tested-by: Joe Mario <jmario@redhat.com>

Thank you for the quick response and testing.

I share Jiri's view that we should respect the existing
usages and habits of the tool; I was also a bit concerned that my changes
might introduce inconvenience for others.  But it's great to receive
your agreement on the changes!

I have respun the patch set as v2 [1], adding your test tag and
updated documentation; furthermore, I sent out another patch set that
enhances perf c2c with sorting on LLC load hit; you are welcome to
review and comment on it [2].

Thanks,
Leo

[1] https://lore.kernel.org/patchwork/cover/1321499/
[2] https://lore.kernel.org/patchwork/cover/1321514/


Thread overview: 12+ messages
2020-10-14  5:09 [PATCH v1 0/8] perf c2c: Refine the organization of metrics Leo Yan
2020-10-14  5:09 ` [PATCH v1 1/8] perf c2c: Display the total numbers continuously Leo Yan
2020-10-14  5:09 ` [PATCH v1 2/8] perf c2c: Display "Total Stores" as a standalone metrics Leo Yan
2020-10-14  5:09 ` [PATCH v1 3/8] perf c2c: Organize metrics based on memory hierarchy Leo Yan
2020-10-14  5:09 ` [PATCH v1 4/8] perf c2c: Change header from "LLC Load Hitm" to "Load Hitm" Leo Yan
2020-10-14  5:09 ` [PATCH v1 5/8] perf c2c: Use more explicit headers for HITM Leo Yan
2020-10-14  5:09 ` [PATCH v1 6/8] perf c2c: Change header for LLC local hit Leo Yan
2020-10-14  5:09 ` [PATCH v1 7/8] perf c2c: Correct LLC load hit metrics Leo Yan
2020-10-14  5:09 ` [PATCH v1 8/8] perf c2c: Add metrics "RMT Load Hit" Leo Yan
2020-10-14 14:03 ` [PATCH v1 0/8] perf c2c: Refine the organization of metrics Jiri Olsa
2020-10-14 18:38 ` Joe Mario
2020-10-15 15:04   ` Leo Yan
