All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/8] perf c2c: Sort cacheline with LLC load
@ 2020-10-15 14:50 Leo Yan
  2020-10-15 14:50 ` [PATCH v1 1/8] perf mem: Add structure field c2c_stats::tot_llchit Leo Yan
                   ` (9 more replies)
  0 siblings, 10 replies; 19+ messages in thread
From: Leo Yan @ 2020-10-15 14:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Kan Liang, Andi Kleen, Ian Rogers, Joe Mario, David Ahern,
	Don Zickus, Al Grant, James Clark, linux-kernel
  Cc: Leo Yan

If the memory event doesn't contain HITM tag (like Arm SPE), it cannot
rely on HITM display to report cache false sharing.  Alternatively, we
can use the LLC access and multi-threads info to locate the potential
false sharing's data address, and if we connect with source code and
analyze the multi-threads' execution timing, if can conclude load and
store the same cache line at the meantime, thus this can be helpful for
resolve the cache false sharing issue.

This patch set is to enable the display with sorting on LLC load
accesses; it adds dimensions for total LLC hit and LLC load accesses,
and these dimensions are used for shared cache line table and pareto.

This patch set is dependend on the patch set "perf c2c: Refine the
organization of metrics" [1].

[1] https://lore.kernel.org/patchwork/cover/1321499/

With this patch set, we can get display 'llc' as follows:

  # perf c2c report -d llc --coalesce tid,pid,iaddr,dso --stdio

  [...]

  =================================================
             Shared Data Cache Line Table
  =================================================
  #
  #        ----------- Cacheline ----------  LLC Hit   LLC Hit    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
  # Index             Address  Node  PA cnt      Pct     Total  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
  # .....  ..................  ....  ......  .......  ........  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
  #
        0      0x563b01e83100     0    1401   65.32%       648     7011     3738     3273     2582      691      515     2516       59       143      505         0        0         0         0
        1      0x563b01e830c0     0       1   26.51%       263      400      400        0        0        0      130        3        4       262        1         0        0         0         0
        2      0x563b01e83080     0       1    7.76%        77      650      650        0        0        0      180      348       45        14       63         0        0         0         0
        3  0xffff88c3d74e82c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0
        4  0xffffa587c11e38c0   N/A       0    0.10%         1        2        1        1        1        0        0        0        0         1        0         0        0         0         0
        5  0xffffffffbd5e6fc0     0       1    0.10%         1        1        1        0        0        0        0        0        0         0        1         0        0         0         0
        6      0x7f90a4d6c2c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0

  =================================================
        Shared Cache Line Distribution Pareto
  =================================================
  #
  #        ---- LLC LD ----  -- Store Refs --  --------- Data address ---------                                                   ---------- cycles ----------    Total       cpu                                  Shared
  #   Num   LclHit  LclHitm   L1 Hit  L1 Miss              Offset  Node  PA cnt      Pid                 Tid        Code address  rmt hitm  lcl hitm      load  records       cnt               Symbol             Object                  Source:Line  Node
  # .....  .......  .......  .......  .......  ..................  ....  ......  .......  ..................  ..................  ........  ........  ........  .......  ........  ...................  .................  ...........................  ....
  #
    -------------------------------------------------------------
        0      143      505     2582      691      0x563b01e83100
    -------------------------------------------------------------
            96.50%    7.72%   46.79%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c16         0      1949      1331     1876         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
             0.00%   35.05%    0.00%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c1d         0      2651       975      748         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
             0.00%   30.89%    0.00%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c1d         0      1425      1003      762         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
             2.10%    7.52%   49.19%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c16         0      1585      1053     2037         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
             0.00%    0.00%    2.52%   44.86%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c28         0         0         0      375         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
             0.00%    0.00%    1.51%   55.14%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c28         0         0         0      420         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
             1.40%   12.87%    0.00%    0.00%                0x20     0       1    14100    14104:reader_thd      0x563b01c81c73         0       166        99      417         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0
             0.00%    5.94%    0.00%    0.00%                0x20     0       1    14100    14105:reader_thd      0x563b01c81c73         0       144        85      376         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0

  [...]


Leo Yan (8):
  perf mem: Add structure field c2c_stats::tot_llchit
  perf c2c: Add dimensions for total LLC hit
  perf c2c: Add dimensions for LLC load hit
  perf c2c: Change to general naming for macros
  perf c2c: Rename for shared cache line stats
  perf c2c: Refactor hist entry validation
  perf c2c: Add option '-d llc' for sorting with LLC load
  perf c2c: Update documentation for display option 'llc'

 tools/perf/Documentation/perf-c2c.txt |  18 +-
 tools/perf/builtin-c2c.c              | 333 +++++++++++++++++++++-----
 tools/perf/util/mem-events.c          |   3 +
 tools/perf/util/mem-events.h          |   1 +
 4 files changed, 286 insertions(+), 69 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-10-22  8:44 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-15 14:50 [PATCH v1 0/8] perf c2c: Sort cacheline with LLC load Leo Yan
2020-10-15 14:50 ` [PATCH v1 1/8] perf mem: Add structure field c2c_stats::tot_llchit Leo Yan
2020-10-15 14:50 ` [PATCH v1 2/8] perf c2c: Add dimensions for total LLC hit Leo Yan
2020-10-15 14:50 ` [PATCH v1 3/8] perf c2c: Add dimensions for LLC load hit Leo Yan
2020-10-15 14:50 ` [PATCH v1 4/8] perf c2c: Change to general naming for macros Leo Yan
2020-10-15 14:50 ` [PATCH v1 5/8] perf c2c: Rename for shared cache line stats Leo Yan
2020-10-15 14:50 ` [PATCH v1 6/8] perf c2c: Refactor hist entry validation Leo Yan
2020-10-15 14:50 ` [PATCH v1 7/8] perf c2c: Add option '-d llc' for sorting with LLC load Leo Yan
2020-10-20  7:25   ` Jiri Olsa
2020-10-20  8:08     ` Leo Yan
2020-10-22  8:43       ` Jiri Olsa
2020-10-20  7:26   ` Jiri Olsa
2020-10-20  8:14     ` Leo Yan
2020-10-15 14:50 ` [PATCH v1 8/8] perf c2c: Update documentation for display option 'llc' Leo Yan
2020-10-20  7:26   ` Jiri Olsa
2020-10-15 15:05 ` [PATCH v1 0/8] perf c2c: Sort cacheline with LLC load Arnaldo Carvalho de Melo
2020-10-15 15:14   ` Leo Yan
2020-10-20  8:13 ` Namhyung Kim
2020-10-20  8:18   ` Leo Yan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.