* [PATCH v2 0/6] Corrections to cpu map event encoding
@ 2022-06-14 14:33 Ian Rogers
  2022-06-14 14:33 ` [PATCH v2 1/6] perf cpumap: Const map for max Ian Rogers
                   ` (7 more replies)
  0 siblings, 8 replies; 32+ messages in thread
From: Ian Rogers @ 2022-06-14 14:33 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	James Clark, Kees Cook, Gustavo A. R. Silva, Adrian Hunter,
	Riccardo Mancini, German Gomez, Colin Ian King, Song Liu,
	Dave Marchevsky, Athira Rajeev, Alexey Bayduraev, Leo Yan,
	linux-perf-users, linux-kernel
  Cc: Stephane Eranian, Ian Rogers

A mask encoding of a cpu map is laid out as:
  u16 nr
  u16 long_size
  unsigned long mask[];
However, on 64-bit builds the mask is 8-byte aligned, so there is a
4-byte pad after long_size. This means 32-bit and 64-bit builds see the
mask at different offsets. On top of this, the structure is stored in
the byte array data[] of:
  u16 type
  char data[]
This means the mask's struct does not have the 4- or 8-byte alignment it
requires, but is offset by 2. Consequently the long reads and writes are
misaligned, which is undefined behavior.
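To make the two problems above concrete, here is a minimal sketch. The
struct and function names are hypothetical, and fixed-width types stand
in for "unsigned long" so that both views can be inspected from a single
build. It assumes a typical LP64 target where uint64_t is 8-byte
aligned, and an 8-byte event header as seen in the dumps in this letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mask layout as a 32-bit build sees it:
 * unsigned long is 4 bytes, so no padding is needed before mask. */
struct mask_as_32bit {
	uint16_t nr;
	uint16_t long_size;
	uint32_t mask[];
};

/* The same layout as a typical 64-bit (LP64) build sees it: the 8-byte
 * alignment of mask inserts 4 bytes of padding after long_size. */
struct mask_as_64bit {
	uint16_t nr;
	uint16_t long_size;
	uint64_t mask[];
};

/* Byte offset of mask[] within each view of the struct. */
size_t mask_offset_32bit(void)
{
	return offsetof(struct mask_as_32bit, mask);
}

size_t mask_offset_64bit(void)
{
	return offsetof(struct mask_as_64bit, mask);
}

/* Offset of the mask struct from the start of the event: an 8-byte
 * event header (assumed) followed by the u16 type field. */
size_t mask_struct_offset_in_event(void)
{
	size_t off = 8 + sizeof(uint16_t);

	assert(off % 4 != 0); /* neither 4- nor 8-byte aligned */
	return off;
}
```

On such a target the two offsets differ (4 versus 8), and the struct
itself starts 2 bytes past a 4-byte boundary.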

These changes first do minor clean up: adding const, reducing the
visibility of functions and computing the mask size in constant time
with the max function. They then add 32 and 64-bit mask encoding
variants, packed to match the current alignment. Taking the address of
a member of a packed struct yields an unaligned pointer, so function
arguments are altered to take the packed struct instead. To compact the
mask encoding further and drop the padding, the 4-byte variant is
preferred. Finally, a new range encoding is added that reduces the
size of the common case of a range of CPUs to a single u64.
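As a rough illustration of the packed variants, the sketch below uses
hypothetical struct and function names (they are not the names used in
the patches) and the GCC/Clang packed attribute. It assumes an 8-byte
event header and that the event size is rounded up to a multiple of 8,
both of which are consistent with the dumps in this letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed 4-byte-word variant: mask follows long_size
 * directly, with no padding. */
struct cpu_map_mask32 {
	uint16_t type;
	uint16_t nr;
	uint16_t long_size;
	uint32_t mask[];
} __attribute__((packed));

/* Hypothetical packed 8-byte-word variant: the explicit pad keeps the
 * mask where existing 64-bit builds already place it. */
struct cpu_map_mask64 {
	uint16_t type;
	uint16_t nr;
	uint16_t long_size;
	uint8_t  pad[4];
	uint64_t mask[];
} __attribute__((packed));

#define EVENT_HEADER_SIZE 8u /* assumed size of the event header */

/* Round n up to the next multiple of 8. */
static size_t align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Total event size for nr_cpus CPUs using 4-byte mask words. */
size_t event_size_mask32(unsigned int nr_cpus)
{
	size_t words = (nr_cpus + 31) / 32;

	assert(nr_cpus > 0);
	return align8(EVENT_HEADER_SIZE +
		      offsetof(struct cpu_map_mask32, mask) +
		      words * sizeof(uint32_t));
}

/* Total event size for nr_cpus CPUs using 8-byte mask words. */
size_t event_size_mask64(unsigned int nr_cpus)
{
	size_t words = (nr_cpus + 63) / 64;

	assert(nr_cpus > 0);
	return align8(EVENT_HEADER_SIZE +
		      offsetof(struct cpu_map_mask64, mask) +
		      words * sizeof(uint64_t));
}
```

Under these assumptions, 72 CPUs need 40 bytes with 8-byte words and 32
bytes with 4-byte words.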

On a 72 CPU (hyperthread) machine the original encoding of all CPUs is:
0x9a98 [0x28]: event: 74
.
. ... raw event: size 40 bytes
.  0000:  4a 00 00 00 00 00 28 00 01 00 02 00 08 00 00 00  J.....(.........
.  0010:  00 00 ff ff ff ff ff ff ff ff ff 00 00 00 00 00  ................
.  0020:  00 00 00 00 00 00 00 00                          ........        

0 0 0x9a98 [0x28]: PERF_RECORD_CPU_MAP

Using the 4-byte encoding it is:
0x9a98@pipe [0x20]: event: 74
.
. ... raw event: size 32 bytes
.  0000:  4a 00 00 00 00 00 20 00 01 00 03 00 04 00 ff ff  J..... .........
.  0010:  ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00  ................

0 0 0x9a98 [0x20]: PERF_RECORD_CPU_MAP
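The 4-byte dump above can be checked by hand. The sketch below is a
hypothetical helper, not code from the patches. It takes the bytes that
follow the u16 type field (nr, long_size, then the mask words) and
counts the set bits. It assumes little-endian data and reads byte by
byte, so it never performs an unaligned load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values one byte at a time. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Count the CPUs set in a mask payload made of 4-byte words.
 * payload points at the nr field, i.e. just past the u16 type.
 * Returns -1 if the payload is truncated or does not use 4-byte words.
 */
int count_mask32_cpus(const uint8_t *payload, size_t len)
{
	uint16_t nr, long_size, i;
	int count = 0;

	assert(payload != NULL);
	if (len < 4)
		return -1;

	nr = get_le16(payload);
	long_size = get_le16(payload + 2);
	if (long_size != 4 || len < 4 + (size_t)nr * 4)
		return -1;

	for (i = 0; i < nr; i++) {
		uint32_t word = get_le32(payload + 4 + (size_t)i * 4);

		/* Clear the lowest set bit until none remain. */
		while (word) {
			word &= word - 1;
			count++;
		}
	}
	return count;
}
```

Fed the bytes at offsets 0x0a to 0x19 of the dump (nr = 3,
long_size = 4, nine 0xff bytes and three zero bytes), this yields 72.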

Finally, with the range encoding it is:
0x9ab8@pipe [0x10]: event: 74
.
. ... raw event: size 16 bytes
.  0000:  4a 00 00 00 00 00 10 00 02 00 00 00 00 00 47 00  J.............G.

0 0 0x9ab8 [0x10]: PERF_RECORD_CPU_MAP
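The range dump can be decoded in the same way. The field names and
layout below (a one-byte flag, one byte of padding, then u16 start and
end CPUs) are assumptions inferred from the dump rather than taken from
the patch, and the helper names are hypothetical. Little-endian data is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed decoded form of the range payload. */
struct cpu_range {
	uint8_t  any_cpu;
	uint16_t start_cpu;
	uint16_t end_cpu;
};

/* Read a little-endian u16 one byte at a time. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode the 6 bytes that follow the u16 type field.
 * Returns 0 on success, -1 on a truncated or inverted range.
 */
int decode_cpu_range(const uint8_t *payload, size_t len,
		     struct cpu_range *out)
{
	assert(payload != NULL && out != NULL);
	if (len < 6)
		return -1;

	out->any_cpu = payload[0];
	/* payload[1] is assumed to be padding */
	out->start_cpu = get_le16(payload + 2);
	out->end_cpu = get_le16(payload + 4);

	return out->end_cpu < out->start_cpu ? -1 : 0;
}

/* Number of CPUs covered by an inclusive range. */
unsigned int cpu_range_count(const struct cpu_range *r)
{
	return (unsigned int)r->end_cpu - r->start_cpu + 1;
}
```

For the bytes at offsets 0x0a to 0x0f of the dump, this gives start 0
and end 71 (0x47), so 72 CPUs in 6 payload bytes plus the 2-byte type.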

v2. Fixes a bug in the size computation of the update header
    introduced by the last patch (Add range data encoding) and caught
    by address sanitizer.

Ian Rogers (6):
  perf cpumap: Const map for max
  perf cpumap: Synthetic events and const/static
  perf cpumap: Compute mask size in constant time
  perf cpumap: Fix alignment for masks in event encoding
  perf events: Prefer union over variable length array
  perf cpumap: Add range data encoding

 tools/lib/perf/cpumap.c              |   2 +-
 tools/lib/perf/include/perf/cpumap.h |   2 +-
 tools/lib/perf/include/perf/event.h  |  61 ++++++++-
 tools/perf/tests/cpumap.c            |  71 ++++++++---
 tools/perf/tests/event_update.c      |  14 +--
 tools/perf/util/cpumap.c             | 111 +++++++++++++---
 tools/perf/util/cpumap.h             |   4 +-
 tools/perf/util/event.h              |   4 -
 tools/perf/util/header.c             |  24 ++--
 tools/perf/util/session.c            |  35 +++---
 tools/perf/util/synthetic-events.c   | 182 +++++++++++++--------------
 tools/perf/util/synthetic-events.h   |   2 +-
 12 files changed, 327 insertions(+), 185 deletions(-)

-- 
2.36.1.476.g0c4daa206d-goog



end of thread, other threads:[~2022-09-08 18:53 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
2022-06-14 14:33 [PATCH v2 0/6] Corrections to cpu map event encoding Ian Rogers
2022-06-14 14:33 ` [PATCH v2 1/6] perf cpumap: Const map for max Ian Rogers
2022-06-14 14:33 ` [PATCH v2 2/6] perf cpumap: Synthetic events and const/static Ian Rogers
2022-06-14 14:33 ` [PATCH v2 3/6] perf cpumap: Compute mask size in constant time Ian Rogers
2022-06-14 14:33 ` [PATCH v2 4/6] perf cpumap: Fix alignment for masks in event encoding Ian Rogers
2022-06-14 22:44   ` Namhyung Kim
2022-06-14 23:51     ` Ian Rogers
2022-06-29  9:18   ` Jiri Olsa
2022-06-29 16:05     ` Ian Rogers
2022-08-18 21:50   ` Arnaldo Carvalho de Melo
2022-08-18 22:49     ` Ian Rogers
2022-08-19 15:58     ` Arnaldo Carvalho de Melo
2022-08-19 17:09       ` Ian Rogers
2022-08-19 17:28         ` Arnaldo Carvalho de Melo
2022-08-26 12:57   ` Alexander Gordeev
2022-08-26 16:20     ` Ian Rogers
2022-06-14 14:33 ` [PATCH v2 5/6] perf events: Prefer union over variable length array Ian Rogers
2022-06-14 14:33 ` [PATCH v2 6/6] perf cpumap: Add range data encoding Ian Rogers
2022-06-29  9:31   ` Jiri Olsa
2022-06-29 16:19     ` Ian Rogers
2022-07-31 12:39   ` Jiri Olsa
2022-08-04 19:30     ` Ian Rogers
2022-09-07 22:41       ` Ian Rogers
2022-09-07 23:47         ` Arnaldo Carvalho de Melo
2022-09-08 18:52         ` Arnaldo Carvalho de Melo
2022-07-15 17:01 ` [PATCH v2 0/6] Corrections to cpu map event encoding Ian Rogers
2022-07-29  2:01   ` Ian Rogers
2022-07-29 11:35     ` Jiri Olsa
2022-07-29 14:28       ` Ian Rogers
2022-07-31 12:37         ` Jiri Olsa
2022-08-04 20:23 ` Jiri Olsa
     [not found]   ` <CAP-5=fX-Ex1uv0hxCwDkkAyFV6VQNPRB5uSPpCDNgqu5ZV=bCA@mail.gmail.com>
2022-08-16 19:51     ` Arnaldo Carvalho de Melo
