linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow
@ 2020-09-29 13:39 Leo Yan
  2020-09-29 13:39 ` [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro Leo Yan
                   ` (14 more replies)
  0 siblings, 15 replies; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

The prominent issue for the SPE trace decoding and dumping is the packet
header and payload values are hard coded with numbers and it's not
readable and difficult to maintain; and has other minor issues, e.g. the
packet length (header + payload) calculation is not correct for some
packet types, and the dumping flow misses to support specific sub
classes for operation packet, etc.

So this patch set is to refactor the Arm SPE decoding SPE with:
- Patches 01, 02 are minor cleans up;
- Patches 03, 04 are used to fix and polish the packet and payload
  length calculation;
- Patch 05 is to add a helper to wrap up printing strings, this can
  avoid bunch of duplicate code lines;
- Patches 06 ~ 12 are used to refactor decoding for different types
  packet one by one (address packet, context packet, counter packet,
  event packet, operation packet);
- Patch 13 is coming from Andre to dump memory tagging;
- Patch 14 is coming from Wei Li to add decoding for ARMv8.3
  extension, in this version it has been improved to use defined
  macros, also is improved for failure handling and commit log.

This patch set is cleanly applied on the top of perf/core branch
with commit a55b7bb1c146 ("perf test: Fix msan uninitialized use."),
and the patches have been verified on Hisilicon D06 platform and I
manually inspected the dumping result.

Changes from v1:
- Heavily rewrote the patch 05 for refactoring printing strings; this
  is fundamental change, so adjusted the sequence for patches and moved
  the printing string patch ahead from patch 10 (v1) to patch 05;
- Changed to use GENMASK_ULL() for bits mask;
- Added Andre's patch 13 for dumping memory tagging;
- Refined patch 12 for adding sub classes for Operation packet, merged
  some commit log from Andre's patch, which allows commit log and code
  to be more clear; Added "Co-developed-by: Andre Przywara" tag to
  reflect this.


Andre Przywara (1):
  perf arm_spe: Decode memory tagging properties

Leo Yan (12):
  perf arm-spe: Include bitops.h for BIT() macro
  perf arm-spe: Fix a typo in comment
  perf arm-spe: Refactor payload length calculation
  perf arm-spe: Fix packet length handling
  perf arm-spe: Refactor printing string to buffer
  perf arm-spe: Refactor packet header parsing
  perf arm-spe: Refactor address packet handling
  perf arm-spe: Refactor context packet handling
  perf arm-spe: Refactor counter packet handling
  perf arm-spe: Refactor event type handling
  perf arm-spe: Refactor operation packet handling
  perf arm-spe: Add more sub classes for operation packet

Wei Li (1):
  perf arm-spe: Add support for ARMv8.3-SPE

 .../util/arm-spe-decoder/arm-spe-decoder.c    |  54 +-
 .../util/arm-spe-decoder/arm-spe-decoder.h    |  17 -
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 567 +++++++++++-------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 124 +++-
 4 files changed, 478 insertions(+), 284 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 13:44   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 02/14] perf arm-spe: Fix a typo in comment Leo Yan
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Include header linux/bitops.h, directly use its BIT() macro and remove
the self defined macros.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c     | 5 +----
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 3 +--
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 93e063f22be5..cc18a1e8c212 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -12,6 +12,7 @@
 #include <string.h>
 #include <stdint.h>
 #include <stdlib.h>
+#include <linux/bitops.h>
 #include <linux/compiler.h>
 #include <linux/zalloc.h>
 
@@ -21,10 +22,6 @@
 
 #include "arm-spe-decoder.h"
 
-#ifndef BIT
-#define BIT(n)		(1UL << (n))
-#endif
-
 static u64 arm_spe_calc_ip(int index, u64 payload)
 {
 	u8 *addr = (u8 *)&payload;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index b94001b756c7..46ddb53a6457 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -8,11 +8,10 @@
 #include <string.h>
 #include <endian.h>
 #include <byteswap.h>
+#include <linux/bitops.h>
 
 #include "arm-spe-pkt-decoder.h"
 
-#define BIT(n)		(1ULL << (n))
-
 #define NS_FLAG		BIT(63)
 #define EL_FLAG		(BIT(62) | BIT(61))
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 02/14] perf arm-spe: Fix a typo in comment
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
  2020-09-29 13:39 ` [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 13:44   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation Leo Yan
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Fix a typo: s/iff/if.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 46ddb53a6457..7c7b5eb09fba 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -142,7 +142,7 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
 
 	/* we use index to identify Events with a less number of
 	 * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
-	 * LLC-REFILL, and REMOTE-ACCESS events are identified iff
+	 * LLC-REFILL, and REMOTE-ACCESS events are identified if
 	 * index > 1.
 	 */
 	packet->index = ret - 1;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
  2020-09-29 13:39 ` [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro Leo Yan
  2020-09-29 13:39 ` [PATCH v2 02/14] perf arm-spe: Fix a typo in comment Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 13:44   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 04/14] perf arm-spe: Fix packet length handling Leo Yan
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Defines macro for payload length calculation instead of static function.

Currently the event packet's 'index' is assigned as payload length, but
the flow is not directive: it firstly gets the packet length (includes
header length and payload length) and then reduces header length from
packet length, so finally get the payload length; to simplify the code,
this patch directly assigns payload length to event packet's index.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 26 ++++++++-----------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  4 +++
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 7c7b5eb09fba..5a8696031e16 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -69,22 +69,20 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
 	return arm_spe_packet_name[type];
 }
 
-/* return ARM SPE payload size from its encoding,
- * which is in bits 5:4 of the byte.
- * 00 : byte
- * 01 : halfword (2)
- * 10 : word (4)
- * 11 : doubleword (8)
+/*
+ * Return ARM SPE payload size from header bits 5:4
+ *   00 : byte
+ *   01 : halfword (2)
+ *   10 : word (4)
+ *   11 : doubleword (8)
  */
-static int payloadlen(unsigned char byte)
-{
-	return 1 << ((byte & 0x30) >> 4);
-}
+#define PAYLOAD_LEN(val)	\
+	(1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))
 
 static int arm_spe_get_payload(const unsigned char *buf, size_t len,
 			       struct arm_spe_pkt *packet)
 {
-	size_t payload_len = payloadlen(buf[0]);
+	size_t payload_len = PAYLOAD_LEN(buf[0]);
 
 	if (len < 1 + payload_len)
 		return ARM_SPE_NEED_MORE_BYTES;
@@ -136,8 +134,6 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
 static int arm_spe_get_events(const unsigned char *buf, size_t len,
 			      struct arm_spe_pkt *packet)
 {
-	int ret = arm_spe_get_payload(buf, len, packet);
-
 	packet->type = ARM_SPE_EVENTS;
 
 	/* we use index to identify Events with a less number of
@@ -145,9 +141,9 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
 	 * LLC-REFILL, and REMOTE-ACCESS events are identified if
 	 * index > 1.
 	 */
-	packet->index = ret - 1;
+	packet->index = PAYLOAD_LEN(buf[0]);
 
-	return ret;
+	return arm_spe_get_payload(buf, len, packet);
 }
 
 static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 4c870521b8eb..f2d0af39a58c 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -9,6 +9,7 @@
 
 #include <stddef.h>
 #include <stdint.h>
+#include <linux/bits.h>
 
 #define ARM_SPE_PKT_DESC_MAX		256
 
@@ -36,6 +37,9 @@ struct arm_spe_pkt {
 	uint64_t		payload;
 };
 
+#define SPE_HEADER_SZ_SHIFT		(4)
+#define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
+
 #define SPE_ADDR_PKT_HDR_INDEX_INS		(0x0)
 #define SPE_ADDR_PKT_HDR_INDEX_BRANCH		(0x1)
 #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT	(0x2)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 04/14] perf arm-spe: Fix packet length handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (2 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 13:45   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer Leo Yan
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

When process address packet and counter packet, if the packet contains
extended header, it misses to account the extra one byte for header
length calculation, thus returns the wrong packet length.

To correct the packet length calculation, one possible fixing is simply
to plus extra 1 for extended header, but will spread some duplicate code
in the flows for processing address packet and counter packet.
Alternatively, we can refine the function arm_spe_get_payload() to not
only support short header and allow it to support extended header, and
rely on it for the packet length calculation.

So this patch refactors function arm_spe_get_payload() with a new
argument 'ext_hdr' for support extended header; the packet processing
flows can invoke this function to unify the packet length calculation.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 34 +++++++------------
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 5a8696031e16..4f0aeb62e97b 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -80,14 +80,15 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
 	(1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))
 
 static int arm_spe_get_payload(const unsigned char *buf, size_t len,
+			       unsigned char ext_hdr,
 			       struct arm_spe_pkt *packet)
 {
-	size_t payload_len = PAYLOAD_LEN(buf[0]);
+	size_t payload_len = PAYLOAD_LEN(buf[ext_hdr]);
 
-	if (len < 1 + payload_len)
+	if (len < 1 + ext_hdr + payload_len)
 		return ARM_SPE_NEED_MORE_BYTES;
 
-	buf++;
+	buf += 1 + ext_hdr;
 
 	switch (payload_len) {
 	case 1: packet->payload = *(uint8_t *)buf; break;
@@ -97,7 +98,7 @@ static int arm_spe_get_payload(const unsigned char *buf, size_t len,
 	default: return ARM_SPE_BAD_PACKET;
 	}
 
-	return 1 + payload_len;
+	return 1 + ext_hdr + payload_len;
 }
 
 static int arm_spe_get_pad(struct arm_spe_pkt *packet)
@@ -128,7 +129,7 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
 				 struct arm_spe_pkt *packet)
 {
 	packet->type = ARM_SPE_TIMESTAMP;
-	return arm_spe_get_payload(buf, len, packet);
+	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
 static int arm_spe_get_events(const unsigned char *buf, size_t len,
@@ -143,14 +144,14 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
 	 */
 	packet->index = PAYLOAD_LEN(buf[0]);
 
-	return arm_spe_get_payload(buf, len, packet);
+	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
 static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
 				   struct arm_spe_pkt *packet)
 {
 	packet->type = ARM_SPE_DATA_SOURCE;
-	return arm_spe_get_payload(buf, len, packet);
+	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
 static int arm_spe_get_context(const unsigned char *buf, size_t len,
@@ -158,8 +159,7 @@ static int arm_spe_get_context(const unsigned char *buf, size_t len,
 {
 	packet->type = ARM_SPE_CONTEXT;
 	packet->index = buf[0] & 0x3;
-
-	return arm_spe_get_payload(buf, len, packet);
+	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
 static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
@@ -167,41 +167,31 @@ static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
 {
 	packet->type = ARM_SPE_OP_TYPE;
 	packet->index = buf[0] & 0x3;
-	return arm_spe_get_payload(buf, len, packet);
+	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
 static int arm_spe_get_counter(const unsigned char *buf, size_t len,
 			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
 {
-	if (len < 2)
-		return ARM_SPE_NEED_MORE_BYTES;
-
 	packet->type = ARM_SPE_COUNTER;
 	if (ext_hdr)
 		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
 	else
 		packet->index = buf[0] & 0x7;
 
-	packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
-
-	return 1 + ext_hdr + 2;
+	return arm_spe_get_payload(buf, len, ext_hdr, packet);
 }
 
 static int arm_spe_get_addr(const unsigned char *buf, size_t len,
 			    const unsigned char ext_hdr, struct arm_spe_pkt *packet)
 {
-	if (len < 8)
-		return ARM_SPE_NEED_MORE_BYTES;
-
 	packet->type = ARM_SPE_ADDRESS;
 	if (ext_hdr)
 		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
 	else
 		packet->index = buf[0] & 0x7;
 
-	memcpy_le64(&packet->payload, buf + 1, 8);
-
-	return 1 + ext_hdr + 8;
+	return arm_spe_get_payload(buf, len, ext_hdr, packet);
 }
 
 static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (3 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 04/14] perf arm-spe: Fix packet length handling Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 13:46   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing Leo Yan
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

When outputs strings to the decoding buffer with function snprintf(),
SPE decoder needs to detects if any error returns from snprintf() and if
so needs to directly bail out.  If snprintf() returns success, it needs
to update buffer pointer and reduce the buffer length so can continue to
output the next string into the consequent memory space.

This complex logics are spreading in the function arm_spe_pkt_desc() so
there has many duplicate codes for handling error detecting, increment
buffer pointer and decrement buffer size.

To avoid the duplicate code, this patch introduces a new helper function
arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
the caller arm_spe_pkt_desc() will call it and simply check the returns
value.

This patch also moves the variable 'blen' as the function's local
variable, this allows to remove the unnecessary braces and improve the
readability.

Suggested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 247 ++++++++++--------
 1 file changed, 135 insertions(+), 112 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 4f0aeb62e97b..96b717a19163 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -9,6 +9,7 @@
 #include <endian.h>
 #include <byteswap.h>
 #include <linux/bitops.h>
+#include <stdarg.h>
 
 #include "arm-spe-pkt-decoder.h"
 
@@ -256,192 +257,214 @@ int arm_spe_get_packet(const unsigned char *buf, size_t len,
 	return ret;
 }
 
+static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
+				const char *fmt, ...)
+{
+	va_list ap;
+	int ret;
+
+	va_start(ap, fmt);
+	ret = vsnprintf(*buf_p, *blen, fmt, ap);
+	va_end(ap);
+
+	if (ret < 0)
+		return ret;
+
+	*buf_p += ret;
+	*blen -= ret;
+	return ret;
+}
+
 int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 		     size_t buf_len)
 {
 	int ret, ns, el, idx = packet->index;
 	unsigned long long payload = packet->payload;
 	const char *name = arm_spe_pkt_name(packet->type);
+	size_t blen = buf_len;
 
 	switch (packet->type) {
 	case ARM_SPE_BAD:
 	case ARM_SPE_PAD:
 	case ARM_SPE_END:
-		return snprintf(buf, buf_len, "%s", name);
-	case ARM_SPE_EVENTS: {
-		size_t blen = buf_len;
-
-		ret = 0;
-		ret = snprintf(buf, buf_len, "EV");
-		buf += ret;
-		blen -= ret;
+		return arm_spe_pkt_snprintf(&buf, &blen, name);
+	case ARM_SPE_EVENTS:
+		ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
+		if (ret < 0)
+			return ret;
+
 		if (payload & 0x1) {
-			ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x2) {
-			ret = snprintf(buf, buf_len, " RETIRED");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x4) {
-			ret = snprintf(buf, buf_len, " L1D-ACCESS");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x8) {
-			ret = snprintf(buf, buf_len, " L1D-REFILL");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x10) {
-			ret = snprintf(buf, buf_len, " TLB-ACCESS");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x20) {
-			ret = snprintf(buf, buf_len, " TLB-REFILL");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x40) {
-			ret = snprintf(buf, buf_len, " NOT-TAKEN");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
+			if (ret < 0)
+				return ret;
 		}
 		if (payload & 0x80) {
-			ret = snprintf(buf, buf_len, " MISPRED");
-			buf += ret;
-			blen -= ret;
+			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
+			if (ret < 0)
+				return ret;
 		}
 		if (idx > 1) {
 			if (payload & 0x100) {
-				ret = snprintf(buf, buf_len, " LLC-ACCESS");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
+				if (ret < 0)
+					return ret;
 			}
 			if (payload & 0x200) {
-				ret = snprintf(buf, buf_len, " LLC-REFILL");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
+				if (ret < 0)
+					return ret;
 			}
 			if (payload & 0x400) {
-				ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
+				if (ret < 0)
+					return ret;
 			}
 		}
-		if (ret < 0)
-			return ret;
-		blen -= ret;
 		return buf_len - blen;
-	}
+
 	case ARM_SPE_OP_TYPE:
 		switch (idx) {
-		case 0:	return snprintf(buf, buf_len, "%s", payload & 0x1 ?
-					"COND-SELECT" : "INSN-OTHER");
-		case 1:	{
-			size_t blen = buf_len;
-
-			if (payload & 0x1)
-				ret = snprintf(buf, buf_len, "ST");
-			else
-				ret = snprintf(buf, buf_len, "LD");
-			buf += ret;
-			blen -= ret;
+		case 0:
+			return arm_spe_pkt_snprintf(&buf, &blen,
+					payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
+		case 1:
+			ret = arm_spe_pkt_snprintf(&buf, &blen,
+						   payload & 0x1 ? "ST" : "LD");
+			if (ret < 0)
+				return ret;
+
 			if (payload & 0x2) {
 				if (payload & 0x4) {
-					ret = snprintf(buf, buf_len, " AT");
-					buf += ret;
-					blen -= ret;
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
+					if (ret < 0)
+						return ret;
 				}
 				if (payload & 0x8) {
-					ret = snprintf(buf, buf_len, " EXCL");
-					buf += ret;
-					blen -= ret;
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
+					if (ret < 0)
+						return ret;
 				}
 				if (payload & 0x10) {
-					ret = snprintf(buf, buf_len, " AR");
-					buf += ret;
-					blen -= ret;
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
+					if (ret < 0)
+						return ret;
 				}
 			} else if (payload & 0x4) {
-				ret = snprintf(buf, buf_len, " SIMD-FP");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
+				if (ret < 0)
+					return ret;
 			}
+
+			return buf_len - blen;
+
+		case 2:
+			ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
 			if (ret < 0)
 				return ret;
-			blen -= ret;
-			return buf_len - blen;
-		}
-		case 2:	{
-			size_t blen = buf_len;
 
-			ret = snprintf(buf, buf_len, "B");
-			buf += ret;
-			blen -= ret;
 			if (payload & 0x1) {
-				ret = snprintf(buf, buf_len, " COND");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
+				if (ret < 0)
+					return ret;
 			}
 			if (payload & 0x2) {
-				ret = snprintf(buf, buf_len, " IND");
-				buf += ret;
-				blen -= ret;
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
+				if (ret < 0)
+					return ret;
 			}
-			if (ret < 0)
-				return ret;
-			blen -= ret;
+
 			return buf_len - blen;
-			}
-		default: return 0;
+
+		default:
+			return 0;
 		}
 	case ARM_SPE_DATA_SOURCE:
 	case ARM_SPE_TIMESTAMP:
-		return snprintf(buf, buf_len, "%s %lld", name, payload);
+		return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
 	case ARM_SPE_ADDRESS:
 		switch (idx) {
 		case 0:
 		case 1: ns = !!(packet->payload & NS_FLAG);
 			el = (packet->payload & EL_FLAG) >> 61;
 			payload &= ~(0xffULL << 56);
-			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
+			return arm_spe_pkt_snprintf(&buf, &blen,
+					"%s 0x%llx el%d ns=%d",
 				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
-		case 2:	return snprintf(buf, buf_len, "VA 0x%llx", payload);
+		case 2:
+			return arm_spe_pkt_snprintf(&buf, &blen,
+						    "VA 0x%llx", payload);
 		case 3:	ns = !!(packet->payload & NS_FLAG);
 			payload &= ~(0xffULL << 56);
-			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
-					payload, ns);
-		default: return 0;
+			return arm_spe_pkt_snprintf(&buf, &blen,
+						    "PA 0x%llx ns=%d", payload, ns);
+		default:
+			return 0;
 		}
 	case ARM_SPE_CONTEXT:
-		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
-				(unsigned long)payload, idx + 1);
-	case ARM_SPE_COUNTER: {
-		size_t blen = buf_len;
-
-		ret = snprintf(buf, buf_len, "%s %d ", name,
-			       (unsigned short)payload);
-		buf += ret;
-		blen -= ret;
-		switch (idx) {
-		case 0:	ret = snprintf(buf, buf_len, "TOT"); break;
-		case 1:	ret = snprintf(buf, buf_len, "ISSUE"); break;
-		case 2:	ret = snprintf(buf, buf_len, "XLAT"); break;
-		default: ret = 0;
-		}
+		return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
+					    name, (unsigned long)payload, idx + 1);
+	case ARM_SPE_COUNTER:
+		ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
+					   (unsigned short)payload);
 		if (ret < 0)
 			return ret;
-		blen -= ret;
+
+		switch (idx) {
+		case 0:
+			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
+			if (ret < 0)
+				return ret;
+			break;
+		case 1:
+			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
+			if (ret < 0)
+				return ret;
+			break;
+		case 2:
+			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
+			if (ret < 0)
+				return ret;
+			break;
+		default:
+			break;
+		}
+
 		return buf_len - blen;
-	}
+
 	default:
 		break;
 	}
 
-	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
-			name, payload, packet->index);
+	return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%llx (%d)",
+				    name, payload, packet->index);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (4 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-08 19:49   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 07/14] perf arm-spe: Refactor address packet handling Leo Yan
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

The packet header parsing uses the hard coded values and it uses nested
if-else statements.

To improve the readability, this patch refactors the macros for packet
header format so it removes the hard coded values.  Furthermore, based
on the new mask macros it reduces the nested if-else statements and
changes to use the flat conditions checking, this is directive and can
easily map to the descriptions in ARMv8-a architecture reference manual
(ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
protocol packet headers'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 92 +++++++++----------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 21 +++++
 2 files changed, 62 insertions(+), 51 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 96b717a19163..e738bd04f209 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -16,28 +16,6 @@
 #define NS_FLAG		BIT(63)
 #define EL_FLAG		(BIT(62) | BIT(61))
 
-#define SPE_HEADER0_PAD			0x0
-#define SPE_HEADER0_END			0x1
-#define SPE_HEADER0_ADDRESS		0x30 /* address packet (short) */
-#define SPE_HEADER0_ADDRESS_MASK	0x38
-#define SPE_HEADER0_COUNTER		0x18 /* counter packet (short) */
-#define SPE_HEADER0_COUNTER_MASK	0x38
-#define SPE_HEADER0_TIMESTAMP		0x71
-#define SPE_HEADER0_TIMESTAMP		0x71
-#define SPE_HEADER0_EVENTS		0x2
-#define SPE_HEADER0_EVENTS_MASK		0xf
-#define SPE_HEADER0_SOURCE		0x3
-#define SPE_HEADER0_SOURCE_MASK		0xf
-#define SPE_HEADER0_CONTEXT		0x24
-#define SPE_HEADER0_CONTEXT_MASK	0x3c
-#define SPE_HEADER0_OP_TYPE		0x8
-#define SPE_HEADER0_OP_TYPE_MASK	0x3c
-#define SPE_HEADER1_ALIGNMENT		0x0
-#define SPE_HEADER1_ADDRESS		0xb0 /* address packet (extended) */
-#define SPE_HEADER1_ADDRESS_MASK	0xf8
-#define SPE_HEADER1_COUNTER		0x98 /* counter packet (extended) */
-#define SPE_HEADER1_COUNTER_MASK	0xf8
-
 #if __BYTE_ORDER == __BIG_ENDIAN
 #define le16_to_cpu bswap_16
 #define le32_to_cpu bswap_32
@@ -198,46 +176,58 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
 static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
 				 struct arm_spe_pkt *packet)
 {
-	unsigned int byte;
+	unsigned int hdr;
+	unsigned char ext_hdr = 0;
 
 	memset(packet, 0, sizeof(struct arm_spe_pkt));
 
 	if (!len)
 		return ARM_SPE_NEED_MORE_BYTES;
 
-	byte = buf[0];
-	if (byte == SPE_HEADER0_PAD)
+	hdr = buf[0];
+
+	if (hdr == SPE_HEADER0_PAD)
 		return arm_spe_get_pad(packet);
-	else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
+
+	if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
 		return arm_spe_get_end(packet);
-	else if (byte & 0xc0 /* 0y11xxxxxx */) {
-		if (byte & 0x80) {
-			if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
-				return arm_spe_get_addr(buf, len, 0, packet);
-			if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
-				return arm_spe_get_counter(buf, len, 0, packet);
-		} else
-			if (byte == SPE_HEADER0_TIMESTAMP)
-				return arm_spe_get_timestamp(buf, len, packet);
-			else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
-				return arm_spe_get_events(buf, len, packet);
-			else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
-				return arm_spe_get_data_source(buf, len, packet);
-			else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
-				return arm_spe_get_context(buf, len, packet);
-			else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
-				return arm_spe_get_op_type(buf, len, packet);
-	} else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
-		/* 16-bit header */
-		byte = buf[1];
-		if (byte == SPE_HEADER1_ALIGNMENT)
+
+	if (hdr == SPE_HEADER0_TIMESTAMP)
+		return arm_spe_get_timestamp(buf, len, packet);
+
+	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_EVENTS)
+		return arm_spe_get_events(buf, len, packet);
+
+	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_SOURCE)
+		return arm_spe_get_data_source(buf, len, packet);
+
+	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_CONTEXT)
+		return arm_spe_get_context(buf, len, packet);
+
+	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_OPERATION)
+		return arm_spe_get_op_type(buf, len, packet);
+
+	if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_EXTENDED) {
+		/* 16-bit extended format header */
+		ext_hdr = 1;
+
+		hdr = buf[1];
+		if (hdr == SPE_HEADER1_ALIGNMENT)
 			return arm_spe_get_alignment(buf, len, packet);
-		else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
-			return arm_spe_get_addr(buf, len, 1, packet);
-		else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
-			return arm_spe_get_counter(buf, len, 1, packet);
 	}
 
+	/*
+	 * The short format header's byte 0 or the extended format header's
+	 * byte 1 has been assigned to 'hdr', which uses the same encoding for
+	 * address packet and counter packet, so don't need to distinguish if
+	 * it's short format or extended format and handle in once.
+	 */
+	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_ADDRESS)
+		return arm_spe_get_addr(buf, len, ext_hdr, packet);
+
+	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_COUNTER)
+		return arm_spe_get_counter(buf, len, ext_hdr, packet);
+
 	return ARM_SPE_BAD_PACKET;
 }
 
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index f2d0af39a58c..a30fe3c5ab67 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -37,6 +37,27 @@ struct arm_spe_pkt {
 	uint64_t		payload;
 };
 
+/* Short header (HEADER0) and extended header (HEADER1) */
+#define SPE_HEADER0_PAD			0x0
+#define SPE_HEADER0_END			0x1
+#define SPE_HEADER0_TIMESTAMP		0x71
+/* Mask for event & data source */
+#define SPE_HEADER0_MASK1		(GENMASK_ULL(7, 6) | GENMASK_ULL(3, 0))
+#define SPE_HEADER0_EVENTS		0x42
+#define SPE_HEADER0_SOURCE		0x43
+/* Mask for context & operation */
+#define SPE_HEADER0_MASK2		GENMASK_ULL(7, 2)
+#define SPE_HEADER0_CONTEXT		0x64
+#define SPE_HEADER0_OPERATION		0x48
+/* Mask for extended format */
+#define SPE_HEADER0_MASK3		GENMASK_ULL(7, 5)
+#define SPE_HEADER0_EXTENDED		0x20
+/* Mask for address & counter */
+#define SPE_HEADER0_MASK4		GENMASK_ULL(7, 3)
+#define SPE_HEADER0_ADDRESS		0xb0
+#define SPE_HEADER0_COUNTER		0x98
+#define SPE_HEADER1_ALIGNMENT		0x0
+
 #define SPE_HEADER_SZ_SHIFT		(4)
 #define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 07/14] perf arm-spe: Refactor address packet handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (5 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-19  9:01   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 08/14] perf arm-spe: Refactor context " Leo Yan
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

This patch is to refactor address packet handling, it defines macros for
address packet's header and payload, these macros are used by decoder
and the dump flow.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../util/arm-spe-decoder/arm-spe-decoder.c    | 33 ++++++++++---------
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 25 +++++++-------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 27 ++++++++++-----
 3 files changed, 49 insertions(+), 36 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index cc18a1e8c212..9d3de163d47c 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -24,36 +24,37 @@
 
 static u64 arm_spe_calc_ip(int index, u64 payload)
 {
-	u8 *addr = (u8 *)&payload;
-	int ns, el;
+	u64 ns, el;
 
 	/* Instruction virtual address or Branch target address */
 	if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
 	    index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
-		ns = addr[7] & SPE_ADDR_PKT_NS;
-		el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
+		ns = payload & SPE_ADDR_PKT_INST_VA_NS;
+		el = (payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
+			>> SPE_ADDR_PKT_INST_VA_EL_SHIFT;
+
+		/* Clean highest byte */
+		payload &= SPE_ADDR_PKT_ADDR_MASK;
 
 		/* Fill highest byte for EL1 or EL2 (VHE) mode */
-		if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
-			addr[7] = 0xff;
-		/* Clean highest byte for other cases */
-		else
-			addr[7] = 0x0;
+		if (ns && (el == SPE_ADDR_PKT_INST_VA_EL1 ||
+			   el == SPE_ADDR_PKT_INST_VA_EL2))
+			payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
 
 	/* Data access virtual address */
 	} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
 
+		/* Clean tags */
+		payload &= SPE_ADDR_PKT_ADDR_MASK;
+
 		/* Fill highest byte if bits [48..55] is 0xff */
-		if (addr[6] == 0xff)
-			addr[7] = 0xff;
-		/* Otherwise, cleanup tags */
-		else
-			addr[7] = 0x0;
+		if ((payload >> 48) == 0xffULL)
+			payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
 
 	/* Data access physical address */
 	} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
-		/* Cleanup byte 7 */
-		addr[7] = 0x0;
+		/* Clean highest byte */
+		payload &= SPE_ADDR_PKT_ADDR_MASK;
 	} else {
 		pr_err("unsupported address packet index: 0x%x\n", index);
 	}
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index e738bd04f209..b51a2207e4a0 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -13,9 +13,6 @@
 
 #include "arm-spe-pkt-decoder.h"
 
-#define NS_FLAG		BIT(63)
-#define EL_FLAG		(BIT(62) | BIT(61))
-
 #if __BYTE_ORDER == __BIG_ENDIAN
 #define le16_to_cpu bswap_16
 #define le32_to_cpu bswap_32
@@ -166,9 +163,10 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
 {
 	packet->type = ARM_SPE_ADDRESS;
 	if (ext_hdr)
-		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+		packet->index = (((buf[0] & SPE_ADDR_PKT_HDR_EXT_INDEX_MASK) << 3) |
+				  (buf[1] & SPE_ADDR_PKT_HDR_INDEX_MASK));
 	else
-		packet->index = buf[0] & 0x7;
+		packet->index = buf[0] & SPE_ADDR_PKT_HDR_INDEX_MASK;
 
 	return arm_spe_get_payload(buf, len, ext_hdr, packet);
 }
@@ -403,18 +401,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 		return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
 	case ARM_SPE_ADDRESS:
 		switch (idx) {
-		case 0:
-		case 1: ns = !!(packet->payload & NS_FLAG);
-			el = (packet->payload & EL_FLAG) >> 61;
-			payload &= ~(0xffULL << 56);
+		case SPE_ADDR_PKT_HDR_INDEX_INS:
+		case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
+			ns = !!(packet->payload & SPE_ADDR_PKT_INST_VA_NS);
+			el = (packet->payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
+				>> SPE_ADDR_PKT_INST_VA_EL_SHIFT;
+			payload &= SPE_ADDR_PKT_ADDR_MASK;
 			return arm_spe_pkt_snprintf(&buf, &blen,
 					"%s 0x%llx el%d ns=%d",
 				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
-		case 2:
+		case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
 			return arm_spe_pkt_snprintf(&buf, &blen,
 						    "VA 0x%llx", payload);
-		case 3:	ns = !!(packet->payload & NS_FLAG);
-			payload &= ~(0xffULL << 56);
+		case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
+			ns = !!(packet->payload & SPE_ADDR_PKT_INST_VA_NS);
+			payload &= SPE_ADDR_PKT_ADDR_MASK;
 			return arm_spe_pkt_snprintf(&buf, &blen,
 						    "PA 0x%llx ns=%d", payload, ns);
 		default:
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index a30fe3c5ab67..88d2231c76da 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -61,19 +61,30 @@ struct arm_spe_pkt {
 #define SPE_HEADER_SZ_SHIFT		(4)
 #define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
 
+/* Address packet header */
+#define SPE_ADDR_PKT_HDR_INDEX_MASK		GENMASK_ULL(2, 0)
 #define SPE_ADDR_PKT_HDR_INDEX_INS		(0x0)
 #define SPE_ADDR_PKT_HDR_INDEX_BRANCH		(0x1)
 #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT	(0x2)
 #define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS	(0x3)
 
-#define SPE_ADDR_PKT_NS				BIT(7)
-#define SPE_ADDR_PKT_CH				BIT(6)
-#define SPE_ADDR_PKT_EL_OFFSET			(5)
-#define SPE_ADDR_PKT_EL_MASK			(0x3 << SPE_ADDR_PKT_EL_OFFSET)
-#define SPE_ADDR_PKT_EL0			(0)
-#define SPE_ADDR_PKT_EL1			(1)
-#define SPE_ADDR_PKT_EL2			(2)
-#define SPE_ADDR_PKT_EL3			(3)
+#define SPE_ADDR_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
+
+/* Address packet payload for data access physical address */
+#define SPE_ADDR_PKT_ADDR_BYTE7_SHIFT		(56)
+#define SPE_ADDR_PKT_ADDR_MASK			GENMASK_ULL(55, 0)
+
+#define SPE_ADDR_PKT_DATA_PA_NS			BIT(63)
+#define SPE_ADDR_PKT_DATA_PA_CH			BIT(62)
+
+/* Address packet payload for instrcution virtual address */
+#define SPE_ADDR_PKT_INST_VA_NS			BIT(63)
+#define SPE_ADDR_PKT_INST_VA_EL_SHIFT		(61)
+#define SPE_ADDR_PKT_INST_VA_EL_MASK		GENMASK_ULL(62, 61)
+#define SPE_ADDR_PKT_INST_VA_EL0		(0)
+#define SPE_ADDR_PKT_INST_VA_EL1		(1)
+#define SPE_ADDR_PKT_INST_VA_EL2		(2)
+#define SPE_ADDR_PKT_INST_VA_EL3		(3)
 
 const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 08/14] perf arm-spe: Refactor context packet handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (6 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 07/14] perf arm-spe: Refactor address packet handling Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-20 21:53   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 09/14] perf arm-spe: Refactor counter " Leo Yan
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Minor refactoring to use macro for index mask.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index b51a2207e4a0..00a2cd1af422 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -134,7 +134,7 @@ static int arm_spe_get_context(const unsigned char *buf, size_t len,
 			       struct arm_spe_pkt *packet)
 {
 	packet->type = ARM_SPE_CONTEXT;
-	packet->index = buf[0] & 0x3;
+	packet->index = buf[0] & SPE_CTX_PKT_HDR_INDEX_MASK;
 	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 88d2231c76da..62db4ff91832 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -86,6 +86,9 @@ struct arm_spe_pkt {
 #define SPE_ADDR_PKT_INST_VA_EL2		(2)
 #define SPE_ADDR_PKT_INST_VA_EL3		(3)
 
+/* Context packet header */
+#define SPE_CTX_PKT_HDR_INDEX_MASK		GENMASK_ULL(1, 0)
+
 const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
 
 int arm_spe_get_packet(const unsigned char *buf, size_t len,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 09/14] perf arm-spe: Refactor counter packet handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (7 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 08/14] perf arm-spe: Refactor context " Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-20 21:53   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 10/14] perf arm-spe: Refactor event type handling Leo Yan
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

This patch defines macros for counter packet header, and uses macro to
replace hard code values for packet parsing.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../util/arm-spe-decoder/arm-spe-pkt-decoder.c  | 17 ++++++++++-------
 .../util/arm-spe-decoder/arm-spe-pkt-decoder.h  |  9 +++++++++
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 00a2cd1af422..ed0f4c74dfc5 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -150,10 +150,13 @@ static int arm_spe_get_counter(const unsigned char *buf, size_t len,
 			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
 {
 	packet->type = ARM_SPE_COUNTER;
-	if (ext_hdr)
-		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
-	else
-		packet->index = buf[0] & 0x7;
+	if (ext_hdr) {
+		packet->index  = (buf[1] & SPE_CNT_PKT_HDR_INDEX_MASK);
+		packet->index |= ((buf[0] & SPE_CNT_PKT_HDR_EXT_INDEX_MASK)
+			<< SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT);
+	} else {
+		packet->index = buf[0] & SPE_CNT_PKT_HDR_INDEX_MASK;
+	}
 
 	return arm_spe_get_payload(buf, len, ext_hdr, packet);
 }
@@ -431,17 +434,17 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 			return ret;
 
 		switch (idx) {
-		case 0:
+		case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
 			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
 			if (ret < 0)
 				return ret;
 			break;
-		case 1:
+		case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
 			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
 			if (ret < 0)
 				return ret;
 			break;
-		case 2:
+		case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
 			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
 			if (ret < 0)
 				return ret;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 62db4ff91832..18667a63f5ba 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -89,6 +89,15 @@ struct arm_spe_pkt {
 /* Context packet header */
 #define SPE_CTX_PKT_HDR_INDEX_MASK		GENMASK_ULL(1, 0)
 
+/* Counter packet header */
+#define SPE_CNT_PKT_HDR_INDEX_MASK		GENMASK_ULL(2, 0)
+#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT		(0x0)
+#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT		(0x1)
+#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT		(0x2)
+
+#define SPE_CNT_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
+#define SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT		(3)
+
 const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
 
 int arm_spe_get_packet(const unsigned char *buf, size_t len,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 10/14] perf arm-spe: Refactor event type handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (8 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 09/14] perf arm-spe: Refactor counter " Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-20 21:54   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 11/14] perf arm-spe: Refactor operation packet handling Leo Yan
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Use macros instead of the enum values for event types, this is more
directive and without bit shifting when parse packet.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../util/arm-spe-decoder/arm-spe-decoder.c    | 16 +++++++-------
 .../util/arm-spe-decoder/arm-spe-decoder.h    | 17 --------------
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 22 +++++++++----------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 16 ++++++++++++++
 4 files changed, 35 insertions(+), 36 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 9d3de163d47c..ac66e7f42a58 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 		case ARM_SPE_OP_TYPE:
 			break;
 		case ARM_SPE_EVENTS:
-			if (payload & BIT(EV_L1D_REFILL))
+			if (payload & SPE_EVT_PKT_L1D_REFILL)
 				decoder->record.type |= ARM_SPE_L1D_MISS;
 
-			if (payload & BIT(EV_L1D_ACCESS))
+			if (payload & SPE_EVT_PKT_L1D_ACCESS)
 				decoder->record.type |= ARM_SPE_L1D_ACCESS;
 
-			if (payload & BIT(EV_TLB_WALK))
+			if (payload & SPE_EVT_PKT_TLB_WALK)
 				decoder->record.type |= ARM_SPE_TLB_MISS;
 
-			if (payload & BIT(EV_TLB_ACCESS))
+			if (payload & SPE_EVT_PKT_TLB_ACCESS)
 				decoder->record.type |= ARM_SPE_TLB_ACCESS;
 
 			if ((idx == 2 || idx == 4 || idx == 8) &&
-			    (payload & BIT(EV_LLC_MISS)))
+			    (payload & SPE_EVT_PKT_LLC_MISS))
 				decoder->record.type |= ARM_SPE_LLC_MISS;
 
 			if ((idx == 2 || idx == 4 || idx == 8) &&
-			    (payload & BIT(EV_LLC_ACCESS)))
+			    (payload & SPE_EVT_PKT_LLC_ACCESS))
 				decoder->record.type |= ARM_SPE_LLC_ACCESS;
 
 			if ((idx == 2 || idx == 4 || idx == 8) &&
-			    (payload & BIT(EV_REMOTE_ACCESS)))
+			    (payload & SPE_EVT_PKT_REMOTE_ACCESS))
 				decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
 
-			if (payload & BIT(EV_MISPRED))
+			if (payload & SPE_EVT_PKT_MISPREDICTED)
 				decoder->record.type |= ARM_SPE_BRANCH_MISS;
 
 			break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index a5111a8d4360..24727b8ca7ff 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -13,23 +13,6 @@
 
 #include "arm-spe-pkt-decoder.h"
 
-enum arm_spe_events {
-	EV_EXCEPTION_GEN	= 0,
-	EV_RETIRED		= 1,
-	EV_L1D_ACCESS		= 2,
-	EV_L1D_REFILL		= 3,
-	EV_TLB_ACCESS		= 4,
-	EV_TLB_WALK		= 5,
-	EV_NOT_TAKEN		= 6,
-	EV_MISPRED		= 7,
-	EV_LLC_ACCESS		= 8,
-	EV_LLC_MISS		= 9,
-	EV_REMOTE_ACCESS	= 10,
-	EV_ALIGNMENT		= 11,
-	EV_PARTIAL_PREDICATE	= 17,
-	EV_EMPTY_PREDICATE	= 18,
-};
-
 enum arm_spe_sample_type {
 	ARM_SPE_L1D_ACCESS	= 1 << 0,
 	ARM_SPE_L1D_MISS	= 1 << 1,
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index ed0f4c74dfc5..b8f343320abf 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 		if (ret < 0)
 			return ret;
 
-		if (payload & 0x1) {
+		if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x2) {
+		if (payload & SPE_EVT_PKT_ARCH_RETIRED) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x4) {
+		if (payload & SPE_EVT_PKT_L1D_ACCESS) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x8) {
+		if (payload & SPE_EVT_PKT_L1D_REFILL) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x10) {
+		if (payload & SPE_EVT_PKT_TLB_ACCESS) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x20) {
+		if (payload & SPE_EVT_PKT_TLB_WALK) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x40) {
+		if (payload & SPE_EVT_PKT_NOT_TAKEN) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
 			if (ret < 0)
 				return ret;
 		}
-		if (payload & 0x80) {
+		if (payload & SPE_EVT_PKT_MISPREDICTED) {
 			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
 			if (ret < 0)
 				return ret;
 		}
 		if (idx > 1) {
-			if (payload & 0x100) {
+			if (payload & SPE_EVT_PKT_LLC_ACCESS) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
 				if (ret < 0)
 					return ret;
 			}
-			if (payload & 0x200) {
+			if (payload & SPE_EVT_PKT_LLC_MISS) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
 				if (ret < 0)
 					return ret;
 			}
-			if (payload & 0x400) {
+			if (payload & SPE_EVT_PKT_REMOTE_ACCESS) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
 				if (ret < 0)
 					return ret;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 18667a63f5ba..e9a88cf685bb 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -98,6 +98,22 @@ struct arm_spe_pkt {
 #define SPE_CNT_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
 #define SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT		(3)
 
+/* Event packet payload */
+#define SPE_EVT_PKT_SVE_EMPTY_PREDICATE		BIT(18)
+#define SPE_EVT_PKT_SVE_PARTIAL_PREDICATE	BIT(17)
+#define SPE_EVT_PKT_ALIGNMENT			BIT(11)
+#define SPE_EVT_PKT_REMOTE_ACCESS		BIT(10)
+#define SPE_EVT_PKT_LLC_MISS			BIT(9)
+#define SPE_EVT_PKT_LLC_ACCESS			BIT(8)
+#define SPE_EVT_PKT_MISPREDICTED		BIT(7)
+#define SPE_EVT_PKT_NOT_TAKEN			BIT(6)
+#define SPE_EVT_PKT_TLB_WALK			BIT(5)
+#define SPE_EVT_PKT_TLB_ACCESS			BIT(4)
+#define SPE_EVT_PKT_L1D_REFILL			BIT(3)
+#define SPE_EVT_PKT_L1D_ACCESS			BIT(2)
+#define SPE_EVT_PKT_ARCH_RETIRED		BIT(1)
+#define SPE_EVT_PKT_GEN_EXCEPTION		BIT(0)
+
 const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
 
 int arm_spe_get_packet(const unsigned char *buf, size_t len,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 11/14] perf arm-spe: Refactor operation packet handling
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (9 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 10/14] perf arm-spe: Refactor event type handling Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-09-29 13:39 ` [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet Leo Yan
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

Defines macros for operation packet header and formats (support sub
classes for 'other', 'branch', 'load and store', etc).  Uses these
macros for operation packet decoding and dumping.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 31 +++++++++-------
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 36 +++++++++++++++++++
 2 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index b8f343320abf..a848c784f4cf 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -142,7 +142,7 @@ static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
 			       struct arm_spe_pkt *packet)
 {
 	packet->type = ARM_SPE_OP_TYPE;
-	packet->index = buf[0] & 0x3;
+	packet->index = buf[0] & SPE_OP_PKT_HDR_CLASS_MASK;
 	return arm_spe_get_payload(buf, len, 0, packet);
 }
 
@@ -345,32 +345,36 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 
 	case ARM_SPE_OP_TYPE:
 		switch (idx) {
-		case 0:
+		case SPE_OP_PKT_HDR_CLASS_OTHER:
 			return arm_spe_pkt_snprintf(&buf, &blen,
-					payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
-		case 1:
+					payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
+					"COND-SELECT" : "INSN-OTHER");
+		case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
 			ret = arm_spe_pkt_snprintf(&buf, &blen,
-						   payload & 0x1 ? "ST" : "LD");
+						   payload & SPE_OP_PKT_LDST ?
+						   "ST" : "LD");
 			if (ret < 0)
 				return ret;
 
-			if (payload & 0x2) {
-				if (payload & 0x4) {
+			if ((payload & SPE_OP_PKT_LDST_SUBCLASS_ATOMIC_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_ATOMIC) {
+				if (payload & SPE_OP_PKT_AT) {
 					ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
 					if (ret < 0)
 						return ret;
 				}
-				if (payload & 0x8) {
+				if (payload & SPE_OP_PKT_EXCL) {
 					ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
 					if (ret < 0)
 						return ret;
 				}
-				if (payload & 0x10) {
+				if (payload & SPE_OP_PKT_AR) {
 					ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
 					if (ret < 0)
 						return ret;
 				}
-			} else if (payload & 0x4) {
+			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
 				if (ret < 0)
 					return ret;
@@ -378,17 +382,18 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 
 			return buf_len - blen;
 
-		case 2:
+		case SPE_OP_PKT_HDR_CLASS_BR_ERET:
 			ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
 			if (ret < 0)
 				return ret;
 
-			if (payload & 0x1) {
+			if (payload & SPE_OP_PKT_BRANCH_SUBCLASS_COND) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
 				if (ret < 0)
 					return ret;
 			}
-			if (payload & 0x2) {
+			if ((payload & SPE_OP_PKT_BRANCH_SUBCLASS_MASK) ==
+					SPE_OP_PKT_BRANCH_SUBCLASS_INDIRECT) {
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
 				if (ret < 0)
 					return ret;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index e9a88cf685bb..2c4086bf3149 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -114,6 +114,42 @@ struct arm_spe_pkt {
 #define SPE_EVT_PKT_ARCH_RETIRED		BIT(1)
 #define SPE_EVT_PKT_GEN_EXCEPTION		BIT(0)
 
+/* Operation packet header */
+#define SPE_OP_PKT_HDR_CLASS_MASK		GENMASK_ULL(2, 0)
+#define SPE_OP_PKT_HDR_CLASS_OTHER		(0x0)
+#define SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC	(0x1)
+#define SPE_OP_PKT_HDR_CLASS_BR_ERET		(0x2)
+
+#define SPE_OP_PKT_OTHER_SUBCLASS_MASK		GENMASK_ULL(7, 1)
+#define SPE_OP_PKT_OTHER_SUBCLASS_OTHER_OP	(0x0)
+#define SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK	(BIT(7) | BIT(3) | BIT(0))
+#define SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP	(0x8)
+
+#define SPE_OP_PKT_OTHER_SUBCLASS_COND		BIT(0)
+
+#define SPE_OP_PKT_BRANCH_SUBCLASS_MASK		GENMASK_ULL(7, 1)
+#define SPE_OP_PKT_BRANCH_SUBCLASS_DIRECT	(0x0)
+#define SPE_OP_PKT_BRANCH_SUBCLASS_INDIRECT	(0x2)
+
+#define SPE_OP_PKT_BRANCH_SUBCLASS_COND		BIT(0)
+
+#define SPE_OP_PKT_LDST_SUBCLASS_MASK		GENMASK_ULL(7, 1)
+#define SPE_OP_PKT_LDST_SUBCLASS_GP_REG		(0x0)
+#define SPE_OP_PKT_LDST_SUBCLASS_SIMD_FP	(0x4)
+#define SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG	(0x10)
+#define SPE_OP_PKT_LDST_SUBCLASS_NV_SYSREG	(0x30)
+
+#define SPE_OP_PKT_LDST_SUBCLASS_ATOMIC_MASK	(GENMASK_ULL(7, 5) | GENMASK_ULL(1, 1))
+#define SPE_OP_PKT_LDST_SUBCLASS_ATOMIC		(0x2)
+
+#define SPE_OP_PKT_LDST_SUBCLASS_SVE_MASK	(GENMASK_ULL(3, 3) | GENMASK_ULL(1, 1))
+#define SPE_OP_PKT_LDST_SUBCLASS_SVE		(0x8)
+
+#define SPE_OP_PKT_AR				BIT(4)
+#define SPE_OP_PKT_EXCL				BIT(3)
+#define SPE_OP_PKT_AT				BIT(2)
+#define SPE_OP_PKT_LDST				BIT(0)
+
 const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
 
 int arm_spe_get_packet(const unsigned char *buf, size_t len,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (10 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 11/14] perf arm-spe: Refactor operation packet handling Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-20 21:54   ` André Przywara
  2020-09-29 13:39 ` [PATCH v2 13/14] perf arm_spe: Decode memory tagging properties Leo Yan
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

For the operation type packet payload with load/store class, it misses
to support these sub classes:

  - A load/store targeting the general-purpose registers;
  - A load/store targeting unspecified registers;
  - The ARMv8.4 nested virtualisation extension can redirect system
    register accesses to a memory page controlled by the hypervisor.
    The SPE profiling feature in newer implementations can tag those
    memory accesses accordingly.

Add the bit pattern describing load/store sub classes, so that the perf
tool can decode it properly.

Inspired by Andre Przywara, refined the commit log and code for more
clear description.

Co-developed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../util/arm-spe-decoder/arm-spe-pkt-decoder.c    | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index a848c784f4cf..57a2d5494838 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -378,6 +378,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
 				if (ret < 0)
 					return ret;
+			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_GP_REG) {
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " GP-REG");
+				if (ret < 0)
+					return ret;
+			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG) {
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " UNSPEC-REG");
+				if (ret < 0)
+					return ret;
+			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_NV_SYSREG) {
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " NV-SYSREG");
+				if (ret < 0)
+					return ret;
 			}
 
 			return buf_len - blen;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 13/14] perf arm_spe: Decode memory tagging properties
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (11 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-09-29 13:39 ` [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE Leo Yan
  2020-10-13 14:53 ` [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Arnaldo Carvalho de Melo
  14 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

From: Andre Przywara <andre.przywara@arm.com>

When SPE records a physical address, it can additionally tag the event
with information from the Memory Tagging architecture extension.

Decode the two additional fields in the SPE event payload.

[leoy: Refined patch to use predefined macros]

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 7 ++++++-
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h | 2 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 57a2d5494838..05a4c74399d7 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -270,6 +270,7 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 		     size_t buf_len)
 {
 	int ret, ns, el, idx = packet->index;
+	int ch, pat;
 	unsigned long long payload = packet->payload;
 	const char *name = arm_spe_pkt_name(packet->type);
 	size_t blen = buf_len;
@@ -438,9 +439,13 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 						    "VA 0x%llx", payload);
 		case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
 			ns = !!(packet->payload & SPE_ADDR_PKT_INST_VA_NS);
+			ch = !!(packet->payload & SPE_ADDR_PKT_DATA_PA_CH);
+			pat = (packet->payload & SPE_ADDR_PKT_DATA_PA_PAT_MASK)
+				>> SPE_ADDR_PKT_DATA_PA_PAT_SHIFT;
 			payload &= SPE_ADDR_PKT_ADDR_MASK;
 			return arm_spe_pkt_snprintf(&buf, &blen,
-						    "PA 0x%llx ns=%d", payload, ns);
+						    "PA 0x%llx ns=%d ch=%d, pat=%x",
+						    payload, ns, ch, pat);
 		default:
 			return 0;
 		}
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 2c4086bf3149..1847cad517db 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -76,6 +76,8 @@ struct arm_spe_pkt {
 
 #define SPE_ADDR_PKT_DATA_PA_NS			BIT(63)
 #define SPE_ADDR_PKT_DATA_PA_CH			BIT(62)
+#define SPE_ADDR_PKT_DATA_PA_PAT_SHIFT          (56)
+#define SPE_ADDR_PKT_DATA_PA_PAT_MASK           GENMASK_ULL(59, 56)
 
 /* Address packet payload for instrcution virtual address */
 #define SPE_ADDR_PKT_INST_VA_NS			BIT(63)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (12 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 13/14] perf arm_spe: Decode memory tagging properties Leo Yan
@ 2020-09-29 13:39 ` Leo Yan
  2020-10-20 21:54   ` André Przywara
  2020-10-13 14:53 ` [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Arnaldo Carvalho de Melo
  14 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-09-29 13:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Andre Przywara, Dave Martin, linux-kernel,
	Al Grant
  Cc: Leo Yan

From: Wei Li <liwei391@huawei.com>

This patch is to support Armv8.3 extension for SPE, it adds alignment
field in the Events packet and it supports the Scalable Vector Extension
(SVE) for Operation packet and Events packet with two additions:

  - The vector length for SVE operations in the Operation Type packet;
  - The incomplete predicate and empty predicate fields in the Events
    packet.

Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
 .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
 2 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index 05a4c74399d7..3ec381fddfcb 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 					return ret;
 			}
 		}
+		if (idx > 2) {
+			if (payload & SPE_EVT_PKT_ALIGNMENT) {
+				ret = snprintf(buf, buf_len, " ALIGNMENT");
+				if (ret < 0)
+					return ret;
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
+				ret = snprintf(buf, buf_len, " SVE-PARTIAL-PRED");
+				if (ret < 0)
+					return ret;
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
+				ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
+				if (ret < 0)
+					return ret;
+				buf += ret;
+				blen -= ret;
+			}
+		}
+
 		return buf_len - blen;
 
 	case ARM_SPE_OP_TYPE:
 		switch (idx) {
 		case SPE_OP_PKT_HDR_CLASS_OTHER:
-			return arm_spe_pkt_snprintf(&buf, &blen,
-					payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
-					"COND-SELECT" : "INSN-OTHER");
+			if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
+					SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
+
+				ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
+				if (ret < 0)
+					return ret;
+
+				/* Effective vector length: step is 32 bits */
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
+					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
+						SPE_OP_PKT_SVE_EVL_SHIFT));
+				if (ret < 0)
+					return ret;
+
+				if (payload & SPE_OP_PKT_SVE_FP) {
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " FP");
+					if (ret < 0)
+						return ret;
+				}
+				if (payload & SPE_OP_PKT_SVE_PRED) {
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
+					if (ret < 0)
+						return ret;
+				}
+			} else {
+				ret = arm_spe_pkt_snprintf(&buf, &blen, "OTHER");
+				if (ret < 0)
+					return ret;
+
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " %s",
+					       payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
+					       "COND-SELECT" : "UNCOND");
+				if (ret < 0)
+					return ret;
+			}
+
+			return buf_len - blen;
+
 		case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
 			ret = arm_spe_pkt_snprintf(&buf, &blen,
 						   payload & SPE_OP_PKT_LDST ?
@@ -394,6 +453,25 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
 				ret = arm_spe_pkt_snprintf(&buf, &blen, " NV-SYSREG");
 				if (ret < 0)
 					return ret;
+			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_SVE_MASK) ==
+					SPE_OP_PKT_LDST_SUBCLASS_SVE) {
+				/* Effective vector length: step is 32 bits */
+				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
+					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
+						SPE_OP_PKT_SVE_EVL_SHIFT));
+				if (ret < 0)
+					return ret;
+
+				if (payload & SPE_OP_PKT_SVE_PRED) {
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
+					if (ret < 0)
+						return ret;
+				}
+				if (payload & SPE_OP_PKT_SVE_SG) {
+					ret = arm_spe_pkt_snprintf(&buf, &blen, " SG");
+					if (ret < 0)
+						return ret;
+				}
 			}
 
 			return buf_len - blen;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
index 1847cad517db..80266bfebec2 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
@@ -147,6 +147,12 @@ struct arm_spe_pkt {
 #define SPE_OP_PKT_LDST_SUBCLASS_SVE_MASK	(GENMASK_ULL(3, 3) | GENMASK_ULL(1, 1))
 #define SPE_OP_PKT_LDST_SUBCLASS_SVE		(0x8)
 
+#define SPE_OP_PKT_SVE_SG			BIT(7)
+#define SPE_OP_PKT_SVE_EVL_MASK			GENMASK_ULL(6, 4)
+#define SPE_OP_PKT_SVE_EVL_SHIFT		(4)
+#define SPE_OP_PKT_SVE_PRED			BIT(2)
+#define SPE_OP_PKT_SVE_FP			BIT(1)
+
 #define SPE_OP_PKT_AR				BIT(4)
 #define SPE_OP_PKT_EXCL				BIT(3)
 #define SPE_OP_PKT_AT				BIT(2)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro
  2020-09-29 13:39 ` [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro Leo Yan
@ 2020-10-08 13:44   ` André Przywara
  0 siblings, 0 replies; 43+ messages in thread
From: André Przywara @ 2020-10-08 13:44 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:
> Include header linux/bitops.h, directly use its BIT() macro and remove
> the self defined macros.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Thanks,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-decoder.c     | 5 +----
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 3 +--
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 93e063f22be5..cc18a1e8c212 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -12,6 +12,7 @@
>  #include <string.h>
>  #include <stdint.h>
>  #include <stdlib.h>
> +#include <linux/bitops.h>
>  #include <linux/compiler.h>
>  #include <linux/zalloc.h>
>  
> @@ -21,10 +22,6 @@
>  
>  #include "arm-spe-decoder.h"
>  
> -#ifndef BIT
> -#define BIT(n)		(1UL << (n))
> -#endif
> -
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
>  	u8 *addr = (u8 *)&payload;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index b94001b756c7..46ddb53a6457 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -8,11 +8,10 @@
>  #include <string.h>
>  #include <endian.h>
>  #include <byteswap.h>
> +#include <linux/bitops.h>
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define BIT(n)		(1ULL << (n))
> -
>  #define NS_FLAG		BIT(63)
>  #define EL_FLAG		(BIT(62) | BIT(61))
>  
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 02/14] perf arm-spe: Fix a typo in comment
  2020-09-29 13:39 ` [PATCH v2 02/14] perf arm-spe: Fix a typo in comment Leo Yan
@ 2020-10-08 13:44   ` André Przywara
  0 siblings, 0 replies; 43+ messages in thread
From: André Przywara @ 2020-10-08 13:44 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:
> Fix a typo: s/iff/if.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 46ddb53a6457..7c7b5eb09fba 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -142,7 +142,7 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
>  
>  	/* we use index to identify Events with a less number of
>  	 * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
> -	 * LLC-REFILL, and REMOTE-ACCESS events are identified iff
> +	 * LLC-REFILL, and REMOTE-ACCESS events are identified if
>  	 * index > 1.
>  	 */
>  	packet->index = ret - 1;
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation
  2020-09-29 13:39 ` [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation Leo Yan
@ 2020-10-08 13:44   ` André Przywara
  2020-10-12  0:21     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-08 13:44 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> Defines macro for payload length calculation instead of static function.

What is the reason for that? I thought the kernel's direction is more
the other way: replacing macros with static functions ("Don't write CPP,
write C")? Ideally the compiler would generate the same code.

> Currently the event packet's 'index' is assigned as payload length, but
> the flow is not directive: it firstly gets the packet length (includes
> header length and payload length) and then reduces header length from
> packet length, so finally get the payload length; to simplify the code,
> this patch directly assigns payload length to event packet's index.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 26 ++++++++-----------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  4 +++
>  2 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 7c7b5eb09fba..5a8696031e16 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -69,22 +69,20 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>  	return arm_spe_packet_name[type];
>  }
>  
> -/* return ARM SPE payload size from its encoding,
> - * which is in bits 5:4 of the byte.
> - * 00 : byte
> - * 01 : halfword (2)
> - * 10 : word (4)
> - * 11 : doubleword (8)
> +/*
> + * Return ARM SPE payload size from header bits 5:4
> + *   00 : byte
> + *   01 : halfword (2)
> + *   10 : word (4)
> + *   11 : doubleword (8)
>   */
> -static int payloadlen(unsigned char byte)
> -{
> -	return 1 << ((byte & 0x30) >> 4);
> -}
> +#define PAYLOAD_LEN(val)	\
> +	(1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))

This change of the expression is good (although it should be 1U), but
please keep it a function. The return type should be unsigned, I guess.

The rest looks fine.

Cheers,
Andre

>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
>  			       struct arm_spe_pkt *packet)
>  {
> -	size_t payload_len = payloadlen(buf[0]);
> +	size_t payload_len = PAYLOAD_LEN(buf[0]);
>  
>  	if (len < 1 + payload_len)
>  		return ARM_SPE_NEED_MORE_BYTES;
> @@ -136,8 +134,6 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
>  static int arm_spe_get_events(const unsigned char *buf, size_t len,
>  			      struct arm_spe_pkt *packet)
>  {
> -	int ret = arm_spe_get_payload(buf, len, packet);
> -
>  	packet->type = ARM_SPE_EVENTS;
>  
>  	/* we use index to identify Events with a less number of
> @@ -145,9 +141,9 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
>  	 * LLC-REFILL, and REMOTE-ACCESS events are identified if
>  	 * index > 1.
>  	 */
> -	packet->index = ret - 1;
> +	packet->index = PAYLOAD_LEN(buf[0]);
>  
> -	return ret;
> +	return arm_spe_get_payload(buf, len, packet);
>  }
>  
>  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 4c870521b8eb..f2d0af39a58c 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -9,6 +9,7 @@
>  
>  #include <stddef.h>
>  #include <stdint.h>
> +#include <linux/bits.h>
>  
>  #define ARM_SPE_PKT_DESC_MAX		256
>  
> @@ -36,6 +37,9 @@ struct arm_spe_pkt {
>  	uint64_t		payload;
>  };
>  
> +#define SPE_HEADER_SZ_SHIFT		(4)
> +#define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
> +
>  #define SPE_ADDR_PKT_HDR_INDEX_INS		(0x0)
>  #define SPE_ADDR_PKT_HDR_INDEX_BRANCH		(0x1)
>  #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT	(0x2)
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 04/14] perf arm-spe: Fix packet length handling
  2020-09-29 13:39 ` [PATCH v2 04/14] perf arm-spe: Fix packet length handling Leo Yan
@ 2020-10-08 13:45   ` André Przywara
  0 siblings, 0 replies; 43+ messages in thread
From: André Przywara @ 2020-10-08 13:45 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:
> When process address packet and counter packet, if the packet contains

       processing

> extended header, it misses to account the extra one byte for header
> length calculation, thus returns the wrong packet length.
> 
> To correct the packet length calculation, one possible fixing is simply
> to plus extra 1 for extended header, but will spread some duplicate code
> in the flows for processing address packet and counter packet.
> Alternatively, we can refine the function arm_spe_get_payload() to not
> only support short header and allow it to support extended header, and
> rely on it for the packet length calculation.
> 
> So this patch refactors function arm_spe_get_payload() with a new
> argument 'ext_hdr' for support extended header; the packet processing
> flows can invoke this function to unify the packet length calculation.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

That looks alright to me.

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 34 +++++++------------
>  1 file changed, 12 insertions(+), 22 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 5a8696031e16..4f0aeb62e97b 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -80,14 +80,15 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
>  	(1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))
>  
>  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
> +			       unsigned char ext_hdr,
>  			       struct arm_spe_pkt *packet)
>  {
> -	size_t payload_len = PAYLOAD_LEN(buf[0]);
> +	size_t payload_len = PAYLOAD_LEN(buf[ext_hdr]);
>  
> -	if (len < 1 + payload_len)
> +	if (len < 1 + ext_hdr + payload_len)
>  		return ARM_SPE_NEED_MORE_BYTES;
>  
> -	buf++;
> +	buf += 1 + ext_hdr;
>  
>  	switch (payload_len) {
>  	case 1: packet->payload = *(uint8_t *)buf; break;
> @@ -97,7 +98,7 @@ static int arm_spe_get_payload(const unsigned char *buf, size_t len,
>  	default: return ARM_SPE_BAD_PACKET;
>  	}
>  
> -	return 1 + payload_len;
> +	return 1 + ext_hdr + payload_len;
>  }
>  
>  static int arm_spe_get_pad(struct arm_spe_pkt *packet)
> @@ -128,7 +129,7 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
>  				 struct arm_spe_pkt *packet)
>  {
>  	packet->type = ARM_SPE_TIMESTAMP;
> -	return arm_spe_get_payload(buf, len, packet);
> +	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_events(const unsigned char *buf, size_t len,
> @@ -143,14 +144,14 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
>  	 */
>  	packet->index = PAYLOAD_LEN(buf[0]);
>  
> -	return arm_spe_get_payload(buf, len, packet);
> +	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
>  				   struct arm_spe_pkt *packet)
>  {
>  	packet->type = ARM_SPE_DATA_SOURCE;
> -	return arm_spe_get_payload(buf, len, packet);
> +	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_context(const unsigned char *buf, size_t len,
> @@ -158,8 +159,7 @@ static int arm_spe_get_context(const unsigned char *buf, size_t len,
>  {
>  	packet->type = ARM_SPE_CONTEXT;
>  	packet->index = buf[0] & 0x3;
> -
> -	return arm_spe_get_payload(buf, len, packet);
> +	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
> @@ -167,41 +167,31 @@ static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
>  {
>  	packet->type = ARM_SPE_OP_TYPE;
>  	packet->index = buf[0] & 0x3;
> -	return arm_spe_get_payload(buf, len, packet);
> +	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
>  static int arm_spe_get_counter(const unsigned char *buf, size_t len,
>  			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
>  {
> -	if (len < 2)
> -		return ARM_SPE_NEED_MORE_BYTES;
> -
>  	packet->type = ARM_SPE_COUNTER;
>  	if (ext_hdr)
>  		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
>  	else
>  		packet->index = buf[0] & 0x7;
>  
> -	packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
> -
> -	return 1 + ext_hdr + 2;
> +	return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
>  
>  static int arm_spe_get_addr(const unsigned char *buf, size_t len,
>  			    const unsigned char ext_hdr, struct arm_spe_pkt *packet)
>  {
> -	if (len < 8)
> -		return ARM_SPE_NEED_MORE_BYTES;
> -
>  	packet->type = ARM_SPE_ADDRESS;
>  	if (ext_hdr)
>  		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
>  	else
>  		packet->index = buf[0] & 0x7;
>  
> -	memcpy_le64(&packet->payload, buf + 1, 8);
> -
> -	return 1 + ext_hdr + 8;
> +	return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
>  
>  static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer
  2020-09-29 13:39 ` [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer Leo Yan
@ 2020-10-08 13:46   ` André Przywara
  2020-10-12  0:29     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-08 13:46 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> When outputs strings to the decoding buffer with function snprintf(),
> SPE decoder needs to detects if any error returns from snprintf() and if
> so needs to directly bail out.  If snprintf() returns success, it needs
> to update buffer pointer and reduce the buffer length so can continue to
> output the next string into the consequent memory space.
> 
> This complex logics are spreading in the function arm_spe_pkt_desc() so
> there has many duplicate codes for handling error detecting, increment
> buffer pointer and decrement buffer size.
> 
> To avoid the duplicate code, this patch introduces a new helper function
> arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
> the caller arm_spe_pkt_desc() will call it and simply check the returns
> value.
> 
> This patch also moves the variable 'blen' as the function's local
> variable, this allows to remove the unnecessary braces and improve the
> readability.

Ah, many thanks for tackling this, I wondered about those code
duplications and missing error handling already.

So I tried some sed pattern on the "old" part of the patch, to get to
the new format and spot any differences apart from mechanically changing
"snprintf(buf, buf_len," to "arm_spe_pkt_snprintf(&buf, &blen,".
See below.

> 
> Suggested-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Very nice cleanup! If you can address this one '"%s", name' comment
below, then:

Reviewed-by: Andre Przywara <andre.przywara@arm.com>


> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 247 ++++++++++--------
>  1 file changed, 135 insertions(+), 112 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 4f0aeb62e97b..96b717a19163 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -9,6 +9,7 @@
>  #include <endian.h>
>  #include <byteswap.h>
>  #include <linux/bitops.h>
> +#include <stdarg.h>
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> @@ -256,192 +257,214 @@ int arm_spe_get_packet(const unsigned char *buf, size_t len,
>  	return ret;
>  }
>  
> +static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
> +				const char *fmt, ...)
> +{
> +	va_list ap;
> +	int ret;
> +
> +	va_start(ap, fmt);
> +	ret = vsnprintf(*buf_p, *blen, fmt, ap);
> +	va_end(ap);
> +
> +	if (ret < 0)
> +		return ret;
> +
> +	*buf_p += ret;
> +	*blen -= ret;
> +	return ret;
> +}
> +
>  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  		     size_t buf_len)
>  {
>  	int ret, ns, el, idx = packet->index;
>  	unsigned long long payload = packet->payload;
>  	const char *name = arm_spe_pkt_name(packet->type);
> +	size_t blen = buf_len;
>  
>  	switch (packet->type) {
>  	case ARM_SPE_BAD:
>  	case ARM_SPE_PAD:
>  	case ARM_SPE_END:
> -		return snprintf(buf, buf_len, "%s", name);
> -	case ARM_SPE_EVENTS: {
> -		size_t blen = buf_len;
> -
> -		ret = 0;
> -		ret = snprintf(buf, buf_len, "EV");
> -		buf += ret;
> -		blen -= ret;
> +		return arm_spe_pkt_snprintf(&buf, &blen, name);

Isn't it considered safer to use '..., "%s", name)', because that would
handle percent signs in "name" correctly?
Not really an issue in the current code, but good style, I believe.

> +	case ARM_SPE_EVENTS:
> +		ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> +		if (ret < 0)
> +			return ret;
> +
>  		if (payload & 0x1) {
> -			ret = snprintf(buf, buf_len, " EXCEPTION-GEN");

So this was actually a bug before, right? Because "buf" does no longer
have buf_len bytes available at this point, it should have been blen
here already. Which gets fixed with all your changes!

Thanks!
Andre

> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x2) {
> -			ret = snprintf(buf, buf_len, " RETIRED");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x4) {
> -			ret = snprintf(buf, buf_len, " L1D-ACCESS");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x8) {
> -			ret = snprintf(buf, buf_len, " L1D-REFILL");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x10) {
> -			ret = snprintf(buf, buf_len, " TLB-ACCESS");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x20) {
> -			ret = snprintf(buf, buf_len, " TLB-REFILL");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x40) {
> -			ret = snprintf(buf, buf_len, " NOT-TAKEN");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (payload & 0x80) {
> -			ret = snprintf(buf, buf_len, " MISPRED");
> -			buf += ret;
> -			blen -= ret;
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> +			if (ret < 0)
> +				return ret;
>  		}
>  		if (idx > 1) {
>  			if (payload & 0x100) {
> -				ret = snprintf(buf, buf_len, " LLC-ACCESS");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> +				if (ret < 0)
> +					return ret;
>  			}
>  			if (payload & 0x200) {
> -				ret = snprintf(buf, buf_len, " LLC-REFILL");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> +				if (ret < 0)
> +					return ret;
>  			}
>  			if (payload & 0x400) {
> -				ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> +				if (ret < 0)
> +					return ret;
>  			}
>  		}
> -		if (ret < 0)
> -			return ret;
> -		blen -= ret;
>  		return buf_len - blen;
> -	}
> +
>  	case ARM_SPE_OP_TYPE:
>  		switch (idx) {
> -		case 0:	return snprintf(buf, buf_len, "%s", payload & 0x1 ?
> -					"COND-SELECT" : "INSN-OTHER");
> -		case 1:	{
> -			size_t blen = buf_len;
> -
> -			if (payload & 0x1)
> -				ret = snprintf(buf, buf_len, "ST");
> -			else
> -				ret = snprintf(buf, buf_len, "LD");
> -			buf += ret;
> -			blen -= ret;
> +		case 0:
> +			return arm_spe_pkt_snprintf(&buf, &blen,
> +					payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> +		case 1:
> +			ret = arm_spe_pkt_snprintf(&buf, &blen,
> +						   payload & 0x1 ? "ST" : "LD");
> +			if (ret < 0)
> +				return ret;
> +
>  			if (payload & 0x2) {
>  				if (payload & 0x4) {
> -					ret = snprintf(buf, buf_len, " AT");
> -					buf += ret;
> -					blen -= ret;
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
> +					if (ret < 0)
> +						return ret;
>  				}
>  				if (payload & 0x8) {
> -					ret = snprintf(buf, buf_len, " EXCL");
> -					buf += ret;
> -					blen -= ret;
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
> +					if (ret < 0)
> +						return ret;
>  				}
>  				if (payload & 0x10) {
> -					ret = snprintf(buf, buf_len, " AR");
> -					buf += ret;
> -					blen -= ret;
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
> +					if (ret < 0)
> +						return ret;
>  				}
>  			} else if (payload & 0x4) {
> -				ret = snprintf(buf, buf_len, " SIMD-FP");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
> +				if (ret < 0)
> +					return ret;
>  			}
> +
> +			return buf_len - blen;
> +
> +		case 2:
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
>  			if (ret < 0)
>  				return ret;
> -			blen -= ret;
> -			return buf_len - blen;
> -		}
> -		case 2:	{
> -			size_t blen = buf_len;
>  
> -			ret = snprintf(buf, buf_len, "B");
> -			buf += ret;
> -			blen -= ret;
>  			if (payload & 0x1) {
> -				ret = snprintf(buf, buf_len, " COND");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
> +				if (ret < 0)
> +					return ret;
>  			}
>  			if (payload & 0x2) {
> -				ret = snprintf(buf, buf_len, " IND");
> -				buf += ret;
> -				blen -= ret;
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
> +				if (ret < 0)
> +					return ret;
>  			}
> -			if (ret < 0)
> -				return ret;
> -			blen -= ret;
> +
>  			return buf_len - blen;
> -			}
> -		default: return 0;
> +
> +		default:
> +			return 0;
>  		}
>  	case ARM_SPE_DATA_SOURCE:
>  	case ARM_SPE_TIMESTAMP:
> -		return snprintf(buf, buf_len, "%s %lld", name, payload);
> +		return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
>  	case ARM_SPE_ADDRESS:
>  		switch (idx) {
>  		case 0:
>  		case 1: ns = !!(packet->payload & NS_FLAG);
>  			el = (packet->payload & EL_FLAG) >> 61;
>  			payload &= ~(0xffULL << 56);
> -			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
> +			return arm_spe_pkt_snprintf(&buf, &blen,
> +					"%s 0x%llx el%d ns=%d",
>  				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
> -		case 2:	return snprintf(buf, buf_len, "VA 0x%llx", payload);
> +		case 2:
> +			return arm_spe_pkt_snprintf(&buf, &blen,
> +						    "VA 0x%llx", payload);
>  		case 3:	ns = !!(packet->payload & NS_FLAG);
>  			payload &= ~(0xffULL << 56);
> -			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
> -					payload, ns);
> -		default: return 0;
> +			return arm_spe_pkt_snprintf(&buf, &blen,
> +						    "PA 0x%llx ns=%d", payload, ns);
> +		default:
> +			return 0;
>  		}
>  	case ARM_SPE_CONTEXT:
> -		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
> -				(unsigned long)payload, idx + 1);
> -	case ARM_SPE_COUNTER: {
> -		size_t blen = buf_len;
> -
> -		ret = snprintf(buf, buf_len, "%s %d ", name,
> -			       (unsigned short)payload);
> -		buf += ret;
> -		blen -= ret;
> -		switch (idx) {
> -		case 0:	ret = snprintf(buf, buf_len, "TOT"); break;
> -		case 1:	ret = snprintf(buf, buf_len, "ISSUE"); break;
> -		case 2:	ret = snprintf(buf, buf_len, "XLAT"); break;
> -		default: ret = 0;
> -		}
> +		return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
> +					    name, (unsigned long)payload, idx + 1);
> +	case ARM_SPE_COUNTER:
> +		ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
> +					   (unsigned short)payload);
>  		if (ret < 0)
>  			return ret;
> -		blen -= ret;
> +
> +		switch (idx) {
> +		case 0:
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
> +			if (ret < 0)
> +				return ret;
> +			break;
> +		case 1:
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
> +			if (ret < 0)
> +				return ret;
> +			break;
> +		case 2:
> +			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
> +			if (ret < 0)
> +				return ret;
> +			break;
> +		default:
> +			break;
> +		}
> +
>  		return buf_len - blen;
> -	}
> +
>  	default:
>  		break;
>  	}
>  
> -	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
> -			name, payload, packet->index);
> +	return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%llx (%d)",
> +				    name, payload, packet->index);
>  }
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing
  2020-09-29 13:39 ` [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing Leo Yan
@ 2020-10-08 19:49   ` André Przywara
  2020-10-12  1:00     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-08 19:49 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> The packet header parsing uses the hard coded values and it uses nested
> if-else statements.
> 
> To improve the readability, this patch refactors the macros for packet
> header format so it removes the hard coded values.  Furthermore, based
> on the new mask macros it reduces the nested if-else statements and
> changes to use the flat conditions checking, this is directive and can
> easily map to the descriptions in ARMv8-a architecture reference manual
> (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
> protocol packet headers'.

Yeah, that's so much better, thank you!

I checked all the bits and comparisons against the ARM ARM.

Two minor things below ...

> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 92 +++++++++----------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 21 +++++
>  2 files changed, 62 insertions(+), 51 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 96b717a19163..e738bd04f209 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -16,28 +16,6 @@
>  #define NS_FLAG		BIT(63)
>  #define EL_FLAG		(BIT(62) | BIT(61))
>  
> -#define SPE_HEADER0_PAD			0x0
> -#define SPE_HEADER0_END			0x1
> -#define SPE_HEADER0_ADDRESS		0x30 /* address packet (short) */
> -#define SPE_HEADER0_ADDRESS_MASK	0x38
> -#define SPE_HEADER0_COUNTER		0x18 /* counter packet (short) */
> -#define SPE_HEADER0_COUNTER_MASK	0x38
> -#define SPE_HEADER0_TIMESTAMP		0x71
> -#define SPE_HEADER0_TIMESTAMP		0x71
> -#define SPE_HEADER0_EVENTS		0x2
> -#define SPE_HEADER0_EVENTS_MASK		0xf
> -#define SPE_HEADER0_SOURCE		0x3
> -#define SPE_HEADER0_SOURCE_MASK		0xf
> -#define SPE_HEADER0_CONTEXT		0x24
> -#define SPE_HEADER0_CONTEXT_MASK	0x3c
> -#define SPE_HEADER0_OP_TYPE		0x8
> -#define SPE_HEADER0_OP_TYPE_MASK	0x3c
> -#define SPE_HEADER1_ALIGNMENT		0x0
> -#define SPE_HEADER1_ADDRESS		0xb0 /* address packet (extended) */
> -#define SPE_HEADER1_ADDRESS_MASK	0xf8
> -#define SPE_HEADER1_COUNTER		0x98 /* counter packet (extended) */
> -#define SPE_HEADER1_COUNTER_MASK	0xf8
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -198,46 +176,58 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
>  static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
>  				 struct arm_spe_pkt *packet)
>  {
> -	unsigned int byte;
> +	unsigned int hdr;
> +	unsigned char ext_hdr = 0;
>  
>  	memset(packet, 0, sizeof(struct arm_spe_pkt));
>  
>  	if (!len)
>  		return ARM_SPE_NEED_MORE_BYTES;
>  
> -	byte = buf[0];
> -	if (byte == SPE_HEADER0_PAD)
> +	hdr = buf[0];
> +
> +	if (hdr == SPE_HEADER0_PAD)
>  		return arm_spe_get_pad(packet);
> -	else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> +
> +	if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
>  		return arm_spe_get_end(packet);
> -	else if (byte & 0xc0 /* 0y11xxxxxx */) {
> -		if (byte & 0x80) {
> -			if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
> -				return arm_spe_get_addr(buf, len, 0, packet);
> -			if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
> -				return arm_spe_get_counter(buf, len, 0, packet);
> -		} else
> -			if (byte == SPE_HEADER0_TIMESTAMP)
> -				return arm_spe_get_timestamp(buf, len, packet);
> -			else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
> -				return arm_spe_get_events(buf, len, packet);
> -			else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
> -				return arm_spe_get_data_source(buf, len, packet);
> -			else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
> -				return arm_spe_get_context(buf, len, packet);
> -			else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
> -				return arm_spe_get_op_type(buf, len, packet);
> -	} else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
> -		/* 16-bit header */
> -		byte = buf[1];
> -		if (byte == SPE_HEADER1_ALIGNMENT)
> +
> +	if (hdr == SPE_HEADER0_TIMESTAMP)
> +		return arm_spe_get_timestamp(buf, len, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_EVENTS)
> +		return arm_spe_get_events(buf, len, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_SOURCE)
> +		return arm_spe_get_data_source(buf, len, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_CONTEXT)
> +		return arm_spe_get_context(buf, len, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_OPERATION)
> +		return arm_spe_get_op_type(buf, len, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_EXTENDED) {

Is there any reason you are using MASK3 here, and not MASK2? The ARM ARM
seems to suggest that bits [7:2] make up the mask for the extended
header type, as the actual subtype is handled in the next byte.

> +		/* 16-bit extended format header */
> +		ext_hdr = 1;
> +
> +		hdr = buf[1];
> +		if (hdr == SPE_HEADER1_ALIGNMENT)
>  			return arm_spe_get_alignment(buf, len, packet);
> -		else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
> -			return arm_spe_get_addr(buf, len, 1, packet);
> -		else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
> -			return arm_spe_get_counter(buf, len, 1, packet);
>  	}
>  
> +	/*
> +	 * The short format header's byte 0 or the extended format header's
> +	 * byte 1 has been assigned to 'hdr', which uses the same encoding for
> +	 * address packet and counter packet, so don't need to distinguish if
> +	 * it's short format or extended format and handle in once.
> +	 */
> +	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_ADDRESS)
> +		return arm_spe_get_addr(buf, len, ext_hdr, packet);
> +
> +	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_COUNTER)
> +		return arm_spe_get_counter(buf, len, ext_hdr, packet);
> +
>  	return ARM_SPE_BAD_PACKET;
>  }
>  
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index f2d0af39a58c..a30fe3c5ab67 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -37,6 +37,27 @@ struct arm_spe_pkt {
>  	uint64_t		payload;
>  };
>  
> +/* Short header (HEADER0) and extended header (HEADER1) */
> +#define SPE_HEADER0_PAD			0x0
> +#define SPE_HEADER0_END			0x1
> +#define SPE_HEADER0_TIMESTAMP		0x71
> +/* Mask for event & data source */
> +#define SPE_HEADER0_MASK1		(GENMASK_ULL(7, 6) | GENMASK_ULL(3, 0))
> +#define SPE_HEADER0_EVENTS		0x42
> +#define SPE_HEADER0_SOURCE		0x43
> +/* Mask for context & operation */
> +#define SPE_HEADER0_MASK2		GENMASK_ULL(7, 2)
> +#define SPE_HEADER0_CONTEXT		0x64
> +#define SPE_HEADER0_OPERATION		0x48

Just a nit, but should the name be ..._OP_TYPE instead?

Cheers,
Andre

> +/* Mask for extended format */
> +#define SPE_HEADER0_MASK3		GENMASK_ULL(7, 5)
> +#define SPE_HEADER0_EXTENDED		0x20
> +/* Mask for address & counter */
> +#define SPE_HEADER0_MASK4		GENMASK_ULL(7, 3)
> +#define SPE_HEADER0_ADDRESS		0xb0
> +#define SPE_HEADER0_COUNTER		0x98
> +#define SPE_HEADER1_ALIGNMENT		0x0
> +
>  #define SPE_HEADER_SZ_SHIFT		(4)
>  #define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
>  
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation
  2020-10-08 13:44   ` André Przywara
@ 2020-10-12  0:21     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-12  0:21 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

Hi Andre,

On Thu, Oct 08, 2020 at 02:44:59PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi Leo,
> 
> > Defines macro for payload length calculation instead of static function.
> 
> What is the reason for that? I thought the kernel's direction is more
> the other way: replacing macros with static functions ("Don't write CPP,
> write C")? Ideally the compiler would generate the same code.

Okay, I didn't note this before.  Will change back to use static
function.

> > Currently the event packet's 'index' is assigned as payload length, but
> > the flow is not directive: it firstly gets the packet length (includes
> > header length and payload length) and then reduces header length from
> > packet length, so finally get the payload length; to simplify the code,
> > this patch directly assigns payload length to event packet's index.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 26 ++++++++-----------
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  4 +++
> >  2 files changed, 15 insertions(+), 15 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 7c7b5eb09fba..5a8696031e16 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -69,22 +69,20 @@ const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
> >  	return arm_spe_packet_name[type];
> >  }
> >  
> > -/* return ARM SPE payload size from its encoding,
> > - * which is in bits 5:4 of the byte.
> > - * 00 : byte
> > - * 01 : halfword (2)
> > - * 10 : word (4)
> > - * 11 : doubleword (8)
> > +/*
> > + * Return ARM SPE payload size from header bits 5:4
> > + *   00 : byte
> > + *   01 : halfword (2)
> > + *   10 : word (4)
> > + *   11 : doubleword (8)
> >   */
> > -static int payloadlen(unsigned char byte)
> > -{
> > -	return 1 << ((byte & 0x30) >> 4);
> > -}
> > +#define PAYLOAD_LEN(val)	\
> > +	(1 << (((val) & SPE_HEADER_SZ_MASK) >> SPE_HEADER_SZ_SHIFT))
> 
> This change of the expression is good (although it should be 1U), but
> please keep it a function. The return type should be unsigned, I guess.

Will do.

> The rest looks fine.

Thanks a lot for reviewing!

Leo

> Cheers,
> Andre
> 
> >  
> >  static int arm_spe_get_payload(const unsigned char *buf, size_t len,
> >  			       struct arm_spe_pkt *packet)
> >  {
> > -	size_t payload_len = payloadlen(buf[0]);
> > +	size_t payload_len = PAYLOAD_LEN(buf[0]);
> >  
> >  	if (len < 1 + payload_len)
> >  		return ARM_SPE_NEED_MORE_BYTES;
> > @@ -136,8 +134,6 @@ static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
> >  static int arm_spe_get_events(const unsigned char *buf, size_t len,
> >  			      struct arm_spe_pkt *packet)
> >  {
> > -	int ret = arm_spe_get_payload(buf, len, packet);
> > -
> >  	packet->type = ARM_SPE_EVENTS;
> >  
> >  	/* we use index to identify Events with a less number of
> > @@ -145,9 +141,9 @@ static int arm_spe_get_events(const unsigned char *buf, size_t len,
> >  	 * LLC-REFILL, and REMOTE-ACCESS events are identified if
> >  	 * index > 1.
> >  	 */
> > -	packet->index = ret - 1;
> > +	packet->index = PAYLOAD_LEN(buf[0]);
> >  
> > -	return ret;
> > +	return arm_spe_get_payload(buf, len, packet);
> >  }
> >  
> >  static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > index 4c870521b8eb..f2d0af39a58c 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > @@ -9,6 +9,7 @@
> >  
> >  #include <stddef.h>
> >  #include <stdint.h>
> > +#include <linux/bits.h>
> >  
> >  #define ARM_SPE_PKT_DESC_MAX		256
> >  
> > @@ -36,6 +37,9 @@ struct arm_spe_pkt {
> >  	uint64_t		payload;
> >  };
> >  
> > +#define SPE_HEADER_SZ_SHIFT		(4)
> > +#define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
> > +
> >  #define SPE_ADDR_PKT_HDR_INDEX_INS		(0x0)
> >  #define SPE_ADDR_PKT_HDR_INDEX_BRANCH		(0x1)
> >  #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT	(0x2)
> > 
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer
  2020-10-08 13:46   ` André Przywara
@ 2020-10-12  0:29     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-12  0:29 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Thu, Oct 08, 2020 at 02:46:06PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi,
> 
> > When outputs strings to the decoding buffer with function snprintf(),
> > SPE decoder needs to detects if any error returns from snprintf() and if
> > so needs to directly bail out.  If snprintf() returns success, it needs
> > to update buffer pointer and reduce the buffer length so can continue to
> > output the next string into the consequent memory space.
> > 
> > This complex logics are spreading in the function arm_spe_pkt_desc() so
> > there has many duplicate codes for handling error detecting, increment
> > buffer pointer and decrement buffer size.
> > 
> > To avoid the duplicate code, this patch introduces a new helper function
> > arm_spe_pkt_snprintf() which is used to wrap up the complex logics, and
> > the caller arm_spe_pkt_desc() will call it and simply check the returns
> > value.
> > 
> > This patch also moves the variable 'blen' as the function's local
> > variable, this allows to remove the unnecessary braces and improve the
> > readability.
> 
> Ah, many thanks for tackling this, I wondered about those code
> duplications and missing error handling already.
> 
> So I tried some sed pattern on the "old" part of the patch, to get to
> the new format and spot any differences apart from mechanically changing
> "snprintf(buf, buf_len," to "arm_spe_pkt_snprintf(&buf, &blen,".
> See below.
> 
> > 
> > Suggested-by: Dave Martin <Dave.Martin@arm.com>
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> 
> Very nice cleanup! If you can address this one '"%s", name' comment
> below, then:
> 
> Reviewed-by: Andre Przywara <andre.przywara@arm.com>
> 
> 
> > ---
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 247 ++++++++++--------
> >  1 file changed, 135 insertions(+), 112 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 4f0aeb62e97b..96b717a19163 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -9,6 +9,7 @@
> >  #include <endian.h>
> >  #include <byteswap.h>
> >  #include <linux/bitops.h>
> > +#include <stdarg.h>
> >  
> >  #include "arm-spe-pkt-decoder.h"
> >  
> > @@ -256,192 +257,214 @@ int arm_spe_get_packet(const unsigned char *buf, size_t len,
> >  	return ret;
> >  }
> >  
> > +static int arm_spe_pkt_snprintf(char **buf_p, size_t *blen,
> > +				const char *fmt, ...)
> > +{
> > +	va_list ap;
> > +	int ret;
> > +
> > +	va_start(ap, fmt);
> > +	ret = vsnprintf(*buf_p, *blen, fmt, ap);
> > +	va_end(ap);
> > +
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	*buf_p += ret;
> > +	*blen -= ret;
> > +	return ret;
> > +}
> > +
> >  int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >  		     size_t buf_len)
> >  {
> >  	int ret, ns, el, idx = packet->index;
> >  	unsigned long long payload = packet->payload;
> >  	const char *name = arm_spe_pkt_name(packet->type);
> > +	size_t blen = buf_len;
> >  
> >  	switch (packet->type) {
> >  	case ARM_SPE_BAD:
> >  	case ARM_SPE_PAD:
> >  	case ARM_SPE_END:
> > -		return snprintf(buf, buf_len, "%s", name);
> > -	case ARM_SPE_EVENTS: {
> > -		size_t blen = buf_len;
> > -
> > -		ret = 0;
> > -		ret = snprintf(buf, buf_len, "EV");
> > -		buf += ret;
> > -		blen -= ret;
> > +		return arm_spe_pkt_snprintf(&buf, &blen, name);
> 
> Isn't it considered safer to use '..., "%s", name)', because that would
> handle percent signs in "name" correctly?
> Not really an issue in the current code, but good style, I believe.

Good point.  '..., "%s", name)' format is more reliable, will change
to use it.

> > +	case ARM_SPE_EVENTS:
> > +		ret = arm_spe_pkt_snprintf(&buf, &blen, "EV");
> > +		if (ret < 0)
> > +			return ret;
> > +
> >  		if (payload & 0x1) {
> > -			ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
> 
> So this was actually a bug before, right? Because "buf" does no longer
> have buf_len bytes available at this point, it should have been blen
> here already. Which gets fixed with all your changes!

Exactly, I will update the commit log so can give more explicit info.

Thanks a lot,
Leo

> Thanks!
> Andre
> 
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x2) {
> > -			ret = snprintf(buf, buf_len, " RETIRED");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x4) {
> > -			ret = snprintf(buf, buf_len, " L1D-ACCESS");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x8) {
> > -			ret = snprintf(buf, buf_len, " L1D-REFILL");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x10) {
> > -			ret = snprintf(buf, buf_len, " TLB-ACCESS");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x20) {
> > -			ret = snprintf(buf, buf_len, " TLB-REFILL");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x40) {
> > -			ret = snprintf(buf, buf_len, " NOT-TAKEN");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (payload & 0x80) {
> > -			ret = snprintf(buf, buf_len, " MISPRED");
> > -			buf += ret;
> > -			blen -= ret;
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> > +			if (ret < 0)
> > +				return ret;
> >  		}
> >  		if (idx > 1) {
> >  			if (payload & 0x100) {
> > -				ret = snprintf(buf, buf_len, " LLC-ACCESS");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> >  			if (payload & 0x200) {
> > -				ret = snprintf(buf, buf_len, " LLC-REFILL");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> >  			if (payload & 0x400) {
> > -				ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> >  		}
> > -		if (ret < 0)
> > -			return ret;
> > -		blen -= ret;
> >  		return buf_len - blen;
> > -	}
> > +
> >  	case ARM_SPE_OP_TYPE:
> >  		switch (idx) {
> > -		case 0:	return snprintf(buf, buf_len, "%s", payload & 0x1 ?
> > -					"COND-SELECT" : "INSN-OTHER");
> > -		case 1:	{
> > -			size_t blen = buf_len;
> > -
> > -			if (payload & 0x1)
> > -				ret = snprintf(buf, buf_len, "ST");
> > -			else
> > -				ret = snprintf(buf, buf_len, "LD");
> > -			buf += ret;
> > -			blen -= ret;
> > +		case 0:
> > +			return arm_spe_pkt_snprintf(&buf, &blen,
> > +					payload & 0x1 ? "COND-SELECT" : "INSN-OTHER");
> > +		case 1:
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen,
> > +						   payload & 0x1 ? "ST" : "LD");
> > +			if (ret < 0)
> > +				return ret;
> > +
> >  			if (payload & 0x2) {
> >  				if (payload & 0x4) {
> > -					ret = snprintf(buf, buf_len, " AT");
> > -					buf += ret;
> > -					blen -= ret;
> > +					ret = arm_spe_pkt_snprintf(&buf, &blen, " AT");
> > +					if (ret < 0)
> > +						return ret;
> >  				}
> >  				if (payload & 0x8) {
> > -					ret = snprintf(buf, buf_len, " EXCL");
> > -					buf += ret;
> > -					blen -= ret;
> > +					ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCL");
> > +					if (ret < 0)
> > +						return ret;
> >  				}
> >  				if (payload & 0x10) {
> > -					ret = snprintf(buf, buf_len, " AR");
> > -					buf += ret;
> > -					blen -= ret;
> > +					ret = arm_spe_pkt_snprintf(&buf, &blen, " AR");
> > +					if (ret < 0)
> > +						return ret;
> >  				}
> >  			} else if (payload & 0x4) {
> > -				ret = snprintf(buf, buf_len, " SIMD-FP");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> > +
> > +			return buf_len - blen;
> > +
> > +		case 2:
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, "B");
> >  			if (ret < 0)
> >  				return ret;
> > -			blen -= ret;
> > -			return buf_len - blen;
> > -		}
> > -		case 2:	{
> > -			size_t blen = buf_len;
> >  
> > -			ret = snprintf(buf, buf_len, "B");
> > -			buf += ret;
> > -			blen -= ret;
> >  			if (payload & 0x1) {
> > -				ret = snprintf(buf, buf_len, " COND");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " COND");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> >  			if (payload & 0x2) {
> > -				ret = snprintf(buf, buf_len, " IND");
> > -				buf += ret;
> > -				blen -= ret;
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " IND");
> > +				if (ret < 0)
> > +					return ret;
> >  			}
> > -			if (ret < 0)
> > -				return ret;
> > -			blen -= ret;
> > +
> >  			return buf_len - blen;
> > -			}
> > -		default: return 0;
> > +
> > +		default:
> > +			return 0;
> >  		}
> >  	case ARM_SPE_DATA_SOURCE:
> >  	case ARM_SPE_TIMESTAMP:
> > -		return snprintf(buf, buf_len, "%s %lld", name, payload);
> > +		return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
> >  	case ARM_SPE_ADDRESS:
> >  		switch (idx) {
> >  		case 0:
> >  		case 1: ns = !!(packet->payload & NS_FLAG);
> >  			el = (packet->payload & EL_FLAG) >> 61;
> >  			payload &= ~(0xffULL << 56);
> > -			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
> > +			return arm_spe_pkt_snprintf(&buf, &blen,
> > +					"%s 0x%llx el%d ns=%d",
> >  				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
> > -		case 2:	return snprintf(buf, buf_len, "VA 0x%llx", payload);
> > +		case 2:
> > +			return arm_spe_pkt_snprintf(&buf, &blen,
> > +						    "VA 0x%llx", payload);
> >  		case 3:	ns = !!(packet->payload & NS_FLAG);
> >  			payload &= ~(0xffULL << 56);
> > -			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
> > -					payload, ns);
> > -		default: return 0;
> > +			return arm_spe_pkt_snprintf(&buf, &blen,
> > +						    "PA 0x%llx ns=%d", payload, ns);
> > +		default:
> > +			return 0;
> >  		}
> >  	case ARM_SPE_CONTEXT:
> > -		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
> > -				(unsigned long)payload, idx + 1);
> > -	case ARM_SPE_COUNTER: {
> > -		size_t blen = buf_len;
> > -
> > -		ret = snprintf(buf, buf_len, "%s %d ", name,
> > -			       (unsigned short)payload);
> > -		buf += ret;
> > -		blen -= ret;
> > -		switch (idx) {
> > -		case 0:	ret = snprintf(buf, buf_len, "TOT"); break;
> > -		case 1:	ret = snprintf(buf, buf_len, "ISSUE"); break;
> > -		case 2:	ret = snprintf(buf, buf_len, "XLAT"); break;
> > -		default: ret = 0;
> > -		}
> > +		return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%lx el%d",
> > +					    name, (unsigned long)payload, idx + 1);
> > +	case ARM_SPE_COUNTER:
> > +		ret = arm_spe_pkt_snprintf(&buf, &blen, "%s %d ", name,
> > +					   (unsigned short)payload);
> >  		if (ret < 0)
> >  			return ret;
> > -		blen -= ret;
> > +
> > +		switch (idx) {
> > +		case 0:
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
> > +			if (ret < 0)
> > +				return ret;
> > +			break;
> > +		case 1:
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
> > +			if (ret < 0)
> > +				return ret;
> > +			break;
> > +		case 2:
> > +			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
> > +			if (ret < 0)
> > +				return ret;
> > +			break;
> > +		default:
> > +			break;
> > +		}
> > +
> >  		return buf_len - blen;
> > -	}
> > +
> >  	default:
> >  		break;
> >  	}
> >  
> > -	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
> > -			name, payload, packet->index);
> > +	return arm_spe_pkt_snprintf(&buf, &blen, "%s 0x%llx (%d)",
> > +				    name, payload, packet->index);
> >  }
> > 
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing
  2020-10-08 19:49   ` André Przywara
@ 2020-10-12  1:00     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-12  1:00 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Thu, Oct 08, 2020 at 08:49:11PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi Leo,
> 
> > The packet header parsing uses the hard coded values and it uses nested
> > if-else statements.
> > 
> > To improve the readability, this patch refactors the macros for packet
> > header format so it removes the hard coded values.  Furthermore, based
> > on the new mask macros it reduces the nested if-else statements and
> > changes to use the flat conditions checking, this is directive and can
> > easily map to the descriptions in ARMv8-a architecture reference manual
> > (ARM DDI 0487E.a), chapter 'D10.1.5 Statistical Profiling Extension
> > protocol packet headers'.
> 
> Yeah, that's so much better, thank you!

Welcome :)

> I checked all the bits and comparisons against the ARM ARM.
> 
> Two minor things below ...
> 
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 92 +++++++++----------
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 21 +++++
> >  2 files changed, 62 insertions(+), 51 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 96b717a19163..e738bd04f209 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -16,28 +16,6 @@
> >  #define NS_FLAG		BIT(63)
> >  #define EL_FLAG		(BIT(62) | BIT(61))
> >  
> > -#define SPE_HEADER0_PAD			0x0
> > -#define SPE_HEADER0_END			0x1
> > -#define SPE_HEADER0_ADDRESS		0x30 /* address packet (short) */
> > -#define SPE_HEADER0_ADDRESS_MASK	0x38
> > -#define SPE_HEADER0_COUNTER		0x18 /* counter packet (short) */
> > -#define SPE_HEADER0_COUNTER_MASK	0x38
> > -#define SPE_HEADER0_TIMESTAMP		0x71
> > -#define SPE_HEADER0_TIMESTAMP		0x71
> > -#define SPE_HEADER0_EVENTS		0x2
> > -#define SPE_HEADER0_EVENTS_MASK		0xf
> > -#define SPE_HEADER0_SOURCE		0x3
> > -#define SPE_HEADER0_SOURCE_MASK		0xf
> > -#define SPE_HEADER0_CONTEXT		0x24
> > -#define SPE_HEADER0_CONTEXT_MASK	0x3c
> > -#define SPE_HEADER0_OP_TYPE		0x8
> > -#define SPE_HEADER0_OP_TYPE_MASK	0x3c
> > -#define SPE_HEADER1_ALIGNMENT		0x0
> > -#define SPE_HEADER1_ADDRESS		0xb0 /* address packet (extended) */
> > -#define SPE_HEADER1_ADDRESS_MASK	0xf8
> > -#define SPE_HEADER1_COUNTER		0x98 /* counter packet (extended) */
> > -#define SPE_HEADER1_COUNTER_MASK	0xf8
> > -
> >  #if __BYTE_ORDER == __BIG_ENDIAN
> >  #define le16_to_cpu bswap_16
> >  #define le32_to_cpu bswap_32
> > @@ -198,46 +176,58 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
> >  static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
> >  				 struct arm_spe_pkt *packet)
> >  {
> > -	unsigned int byte;
> > +	unsigned int hdr;
> > +	unsigned char ext_hdr = 0;
> >  
> >  	memset(packet, 0, sizeof(struct arm_spe_pkt));
> >  
> >  	if (!len)
> >  		return ARM_SPE_NEED_MORE_BYTES;
> >  
> > -	byte = buf[0];
> > -	if (byte == SPE_HEADER0_PAD)
> > +	hdr = buf[0];
> > +
> > +	if (hdr == SPE_HEADER0_PAD)
> >  		return arm_spe_get_pad(packet);
> > -	else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
> > +
> > +	if (hdr == SPE_HEADER0_END) /* no timestamp at end of record */
> >  		return arm_spe_get_end(packet);
> > -	else if (byte & 0xc0 /* 0y11xxxxxx */) {
> > -		if (byte & 0x80) {
> > -			if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
> > -				return arm_spe_get_addr(buf, len, 0, packet);
> > -			if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
> > -				return arm_spe_get_counter(buf, len, 0, packet);
> > -		} else
> > -			if (byte == SPE_HEADER0_TIMESTAMP)
> > -				return arm_spe_get_timestamp(buf, len, packet);
> > -			else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
> > -				return arm_spe_get_events(buf, len, packet);
> > -			else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
> > -				return arm_spe_get_data_source(buf, len, packet);
> > -			else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
> > -				return arm_spe_get_context(buf, len, packet);
> > -			else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
> > -				return arm_spe_get_op_type(buf, len, packet);
> > -	} else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
> > -		/* 16-bit header */
> > -		byte = buf[1];
> > -		if (byte == SPE_HEADER1_ALIGNMENT)
> > +
> > +	if (hdr == SPE_HEADER0_TIMESTAMP)
> > +		return arm_spe_get_timestamp(buf, len, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_EVENTS)
> > +		return arm_spe_get_events(buf, len, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK1) == SPE_HEADER0_SOURCE)
> > +		return arm_spe_get_data_source(buf, len, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_CONTEXT)
> > +		return arm_spe_get_context(buf, len, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_OPERATION)
> > +		return arm_spe_get_op_type(buf, len, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK3) == SPE_HEADER0_EXTENDED) {
> 
> Is there any reason you are using MASK3 here, and not MASK2? The ARM ARM
> seems to suggest that bits [7:2] make up the mask for the extended
> header type, as the actual subtype is handled in the next byte.

You are right, here I introduced confusion for MASK3 for bits [7:5];
will change to use MASK2 for the mask of the extended header type.

> > +		/* 16-bit extended format header */
> > +		ext_hdr = 1;
> > +
> > +		hdr = buf[1];
> > +		if (hdr == SPE_HEADER1_ALIGNMENT)
> >  			return arm_spe_get_alignment(buf, len, packet);
> > -		else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
> > -			return arm_spe_get_addr(buf, len, 1, packet);
> > -		else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
> > -			return arm_spe_get_counter(buf, len, 1, packet);
> >  	}
> >  
> > +	/*
> > +	 * The short format header's byte 0 or the extended format header's
> > +	 * byte 1 has been assigned to 'hdr', which uses the same encoding for
> > +	 * address packet and counter packet, so don't need to distinguish if
> > +	 * it's short format or extended format and handle in once.
> > +	 */
> > +	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_ADDRESS)
> > +		return arm_spe_get_addr(buf, len, ext_hdr, packet);
> > +
> > +	if ((hdr & SPE_HEADER0_MASK4) == SPE_HEADER0_COUNTER)
> > +		return arm_spe_get_counter(buf, len, ext_hdr, packet);
> > +
> >  	return ARM_SPE_BAD_PACKET;
> >  }
> >  
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > index f2d0af39a58c..a30fe3c5ab67 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > @@ -37,6 +37,27 @@ struct arm_spe_pkt {
> >  	uint64_t		payload;
> >  };
> >  
> > +/* Short header (HEADER0) and extended header (HEADER1) */
> > +#define SPE_HEADER0_PAD			0x0
> > +#define SPE_HEADER0_END			0x1
> > +#define SPE_HEADER0_TIMESTAMP		0x71
> > +/* Mask for event & data source */
> > +#define SPE_HEADER0_MASK1		(GENMASK_ULL(7, 6) | GENMASK_ULL(3, 0))
> > +#define SPE_HEADER0_EVENTS		0x42
> > +#define SPE_HEADER0_SOURCE		0x43
> > +/* Mask for context & operation */
> > +#define SPE_HEADER0_MASK2		GENMASK_ULL(7, 2)
> > +#define SPE_HEADER0_CONTEXT		0x64
> > +#define SPE_HEADER0_OPERATION		0x48
> 
> Just a nit, but should the name be ..._OP_TYPE instead?

Exactly, will apply this in next spin.

Very appreicate for your detailed reviewing.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow
  2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
                   ` (13 preceding siblings ...)
  2020-09-29 13:39 ` [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE Leo Yan
@ 2020-10-13 14:53 ` Arnaldo Carvalho de Melo
  2020-10-13 15:19   ` Leo Yan
  14 siblings, 1 reply; 43+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-10-13 14:53 UTC (permalink / raw)
  To: Leo Yan
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Wei Li, James Clark, Andre Przywara,
	Dave Martin, linux-kernel, Al Grant

Em Tue, Sep 29, 2020 at 02:39:03PM +0100, Leo Yan escreveu:
> The prominent issue for the SPE trace decoding and dumping is the packet
> header and payload values are hard coded with numbers and it's not
> readable and difficult to maintain; and has other minor issues, e.g. the
> packet length (header + payload) calculation is not correct for some
> packet types, and the dumping flow misses to support specific sub
> classes for operation packet, etc.
> 
> So this patch set is to refactor the Arm SPE decoding SPE with:
> - Patches 01, 02 are minor cleans up;
> - Patches 03, 04 are used to fix and polish the packet and payload
>   length calculation;
> - Patch 05 is to add a helper to wrap up printing strings, this can
>   avoid bunch of duplicate code lines;
> - Patches 06 ~ 12 are used to refactor decoding for different types
>   packet one by one (address packet, context packet, counter packet,
>   event packet, operation packet);
> - Patch 13 is coming from Andre to dump memory tagging;
> - Patch 14 is coming from Wei Li to add decoding for ARMv8.3
>   extension, in this version it has been improved to use defined
>   macros, also is improved for failure handling and commit log.
> 
> This patch set is cleanly applied on the top of perf/core branch
> with commit a55b7bb1c146 ("perf test: Fix msan uninitialized use."),
> and the patches have been verified on Hisilicon D06 platform and I
> manually inspected the dumping result.
> 
> Changes from v1:
> - Heavily rewrote the patch 05 for refactoring printing strings; this
>   is fundamental change, so adjusted the sequence for patches and moved
>   the printing string patch ahead from patch 10 (v1) to patch 05;
> - Changed to use GENMASK_ULL() for bits mask;
> - Added Andre's patch 13 for dumping memory tagging;
> - Refined patch 12 for adding sub classes for Operation packet, merged
>   some commit log from Andre's patch, which allows commit log and code
>   to be more clear; Added "Co-developed-by: Andre Przywara" tag to
>   reflect this.

Ok, so I'll wait for v3, as Leo indicated he'll respin.

Thanks,

- Arnaldo
 
> 
> Andre Przywara (1):
>   perf arm_spe: Decode memory tagging properties
> 
> Leo Yan (12):
>   perf arm-spe: Include bitops.h for BIT() macro
>   perf arm-spe: Fix a typo in comment
>   perf arm-spe: Refactor payload length calculation
>   perf arm-spe: Fix packet length handling
>   perf arm-spe: Refactor printing string to buffer
>   perf arm-spe: Refactor packet header parsing
>   perf arm-spe: Refactor address packet handling
>   perf arm-spe: Refactor context packet handling
>   perf arm-spe: Refactor counter packet handling
>   perf arm-spe: Refactor event type handling
>   perf arm-spe: Refactor operation packet handling
>   perf arm-spe: Add more sub classes for operation packet
> 
> Wei Li (1):
>   perf arm-spe: Add support for ARMv8.3-SPE
> 
>  .../util/arm-spe-decoder/arm-spe-decoder.c    |  54 +-
>  .../util/arm-spe-decoder/arm-spe-decoder.h    |  17 -
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 567 +++++++++++-------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 124 +++-
>  4 files changed, 478 insertions(+), 284 deletions(-)
> 
> -- 
> 2.20.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow
  2020-10-13 14:53 ` [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Arnaldo Carvalho de Melo
@ 2020-10-13 15:19   ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-13 15:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Wei Li, James Clark, Andre Przywara,
	Dave Martin, linux-kernel, Al Grant

Hi Arnaldo,

On Tue, Oct 13, 2020 at 11:53:32AM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > Changes from v1:
> > - Heavily rewrote the patch 05 for refactoring printing strings; this
> >   is fundamental change, so adjusted the sequence for patches and moved
> >   the printing string patch ahead from patch 10 (v1) to patch 05;
> > - Changed to use GENMASK_ULL() for bits mask;
> > - Added Andre's patch 13 for dumping memory tagging;
> > - Refined patch 12 for adding sub classes for Operation packet, merged
> >   some commit log from Andre's patch, which allows commit log and code
> >   to be more clear; Added "Co-developed-by: Andre Przywara" tag to
> >   reflect this.
> 
> Ok, so I'll wait for v3, as Leo indicated he'll respin.

Yes, please hold on and I will send out patch set v3 according to
Andre's reviewing.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 07/14] perf arm-spe: Refactor address packet handling
  2020-09-29 13:39 ` [PATCH v2 07/14] perf arm-spe: Refactor address packet handling Leo Yan
@ 2020-10-19  9:01   ` André Przywara
  2020-10-19 10:41     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-19  9:01 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi Leo,

> This patch is to refactor address packet handling, it defines macros for
> address packet's header and payload, these macros are used by decoder
> and the dump flow.

So I was thinking about these next few patches a bit. I understand that
it's common ground to not use numbers in code directly, but put names to
them (and there is good rationale for that).

However those long and complicated names don't make it really easier to
read, I think.

See below for an idea:

> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c    | 33 ++++++++++---------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 25 +++++++-------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 27 ++++++++++-----
>  3 files changed, 49 insertions(+), 36 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index cc18a1e8c212..9d3de163d47c 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -24,36 +24,37 @@
>  
>  static u64 arm_spe_calc_ip(int index, u64 payload)
>  {
> -	u8 *addr = (u8 *)&payload;
> -	int ns, el;
> +	u64 ns, el;

This (and the "u64 vs. u8[]" changes below) looks like a nice cleanup.
>  	/* Instruction virtual address or Branch target address */
>  	if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
>  	    index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> -		ns = addr[7] & SPE_ADDR_PKT_NS;
> -		el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> +		ns = payload & SPE_ADDR_PKT_INST_VA_NS;
> +		el = (payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
> +			>> SPE_ADDR_PKT_INST_VA_EL_SHIFT;

So if I see this correctly, this _EL_SHIFT and _EL_MASK are only used
together, and only to read values, not to construct them.
So can you fuse them together in the header file below, like:
	el = SPE_ADDR_PKT_INST_VA_GET_EL(payload);

That should help readablity, I guess, while still keeping the actual
numbers in one place. _SHIFT and _MASK are useful when we use them to
both extract *and construct* values, but here we only parse the buffer.

Similar for other places where you just extract bits from a bitfield or
integer.

Cheers,
Andre


> +
> +		/* Clean highest byte */
> +		payload &= SPE_ADDR_PKT_ADDR_MASK;
>  
>  		/* Fill highest byte for EL1 or EL2 (VHE) mode */
> -		if (ns && (el == SPE_ADDR_PKT_EL1 || el == SPE_ADDR_PKT_EL2))
> -			addr[7] = 0xff;
> -		/* Clean highest byte for other cases */
> -		else
> -			addr[7] = 0x0;
> +		if (ns && (el == SPE_ADDR_PKT_INST_VA_EL1 ||
> +			   el == SPE_ADDR_PKT_INST_VA_EL2))
> +			payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>  	/* Data access virtual address */
>  	} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT) {
>  
> +		/* Clean tags */
> +		payload &= SPE_ADDR_PKT_ADDR_MASK;
> +
>  		/* Fill highest byte if bits [48..55] is 0xff */
> -		if (addr[6] == 0xff)
> -			addr[7] = 0xff;
> -		/* Otherwise, cleanup tags */
> -		else
> -			addr[7] = 0x0;
> +		if ((payload >> 48) == 0xffULL)
> +			payload |= 0xffULL << SPE_ADDR_PKT_ADDR_BYTE7_SHIFT;
>  
>  	/* Data access physical address */
>  	} else if (index == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS) {
> -		/* Cleanup byte 7 */
> -		addr[7] = 0x0;
> +		/* Clean highest byte */
> +		payload &= SPE_ADDR_PKT_ADDR_MASK;
>  	} else {
>  		pr_err("unsupported address packet index: 0x%x\n", index);
>  	}
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index e738bd04f209..b51a2207e4a0 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -13,9 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -#define NS_FLAG		BIT(63)
> -#define EL_FLAG		(BIT(62) | BIT(61))
> -
>  #if __BYTE_ORDER == __BIG_ENDIAN
>  #define le16_to_cpu bswap_16
>  #define le32_to_cpu bswap_32
> @@ -166,9 +163,10 @@ static int arm_spe_get_addr(const unsigned char *buf, size_t len,
>  {
>  	packet->type = ARM_SPE_ADDRESS;
>  	if (ext_hdr)
> -		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> +		packet->index = (((buf[0] & SPE_ADDR_PKT_HDR_EXT_INDEX_MASK) << 3) |
> +				  (buf[1] & SPE_ADDR_PKT_HDR_INDEX_MASK));
>  	else
> -		packet->index = buf[0] & 0x7;
> +		packet->index = buf[0] & SPE_ADDR_PKT_HDR_INDEX_MASK;
>  
>  	return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -403,18 +401,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  		return arm_spe_pkt_snprintf(&buf, &blen, "%s %lld", name, payload);
>  	case ARM_SPE_ADDRESS:
>  		switch (idx) {
> -		case 0:
> -		case 1: ns = !!(packet->payload & NS_FLAG);
> -			el = (packet->payload & EL_FLAG) >> 61;
> -			payload &= ~(0xffULL << 56);
> +		case SPE_ADDR_PKT_HDR_INDEX_INS:
> +		case SPE_ADDR_PKT_HDR_INDEX_BRANCH:
> +			ns = !!(packet->payload & SPE_ADDR_PKT_INST_VA_NS);
> +			el = (packet->payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
> +				>> SPE_ADDR_PKT_INST_VA_EL_SHIFT;
> +			payload &= SPE_ADDR_PKT_ADDR_MASK;
>  			return arm_spe_pkt_snprintf(&buf, &blen,
>  					"%s 0x%llx el%d ns=%d",
>  				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
> -		case 2:
> +		case SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT:
>  			return arm_spe_pkt_snprintf(&buf, &blen,
>  						    "VA 0x%llx", payload);
> -		case 3:	ns = !!(packet->payload & NS_FLAG);
> -			payload &= ~(0xffULL << 56);
> +		case SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS:
> +			ns = !!(packet->payload & SPE_ADDR_PKT_INST_VA_NS);
> +			payload &= SPE_ADDR_PKT_ADDR_MASK;
>  			return arm_spe_pkt_snprintf(&buf, &blen,
>  						    "PA 0x%llx ns=%d", payload, ns);
>  		default:
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index a30fe3c5ab67..88d2231c76da 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -61,19 +61,30 @@ struct arm_spe_pkt {
>  #define SPE_HEADER_SZ_SHIFT		(4)
>  #define SPE_HEADER_SZ_MASK		GENMASK_ULL(5, 4)
>  
> +/* Address packet header */
> +#define SPE_ADDR_PKT_HDR_INDEX_MASK		GENMASK_ULL(2, 0)
>  #define SPE_ADDR_PKT_HDR_INDEX_INS		(0x0)
>  #define SPE_ADDR_PKT_HDR_INDEX_BRANCH		(0x1)
>  #define SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT	(0x2)
>  #define SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS	(0x3)
>  
> -#define SPE_ADDR_PKT_NS				BIT(7)
> -#define SPE_ADDR_PKT_CH				BIT(6)
> -#define SPE_ADDR_PKT_EL_OFFSET			(5)
> -#define SPE_ADDR_PKT_EL_MASK			(0x3 << SPE_ADDR_PKT_EL_OFFSET)
> -#define SPE_ADDR_PKT_EL0			(0)
> -#define SPE_ADDR_PKT_EL1			(1)
> -#define SPE_ADDR_PKT_EL2			(2)
> -#define SPE_ADDR_PKT_EL3			(3)
> +#define SPE_ADDR_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
> +
> +/* Address packet payload for data access physical address */
> +#define SPE_ADDR_PKT_ADDR_BYTE7_SHIFT		(56)
> +#define SPE_ADDR_PKT_ADDR_MASK			GENMASK_ULL(55, 0)
> +
> +#define SPE_ADDR_PKT_DATA_PA_NS			BIT(63)
> +#define SPE_ADDR_PKT_DATA_PA_CH			BIT(62)
> +
> +/* Address packet payload for instrcution virtual address */
> +#define SPE_ADDR_PKT_INST_VA_NS			BIT(63)
> +#define SPE_ADDR_PKT_INST_VA_EL_SHIFT		(61)
> +#define SPE_ADDR_PKT_INST_VA_EL_MASK		GENMASK_ULL(62, 61)
> +#define SPE_ADDR_PKT_INST_VA_EL0		(0)
> +#define SPE_ADDR_PKT_INST_VA_EL1		(1)
> +#define SPE_ADDR_PKT_INST_VA_EL2		(2)
> +#define SPE_ADDR_PKT_INST_VA_EL3		(3)
>  
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 07/14] perf arm-spe: Refactor address packet handling
  2020-10-19  9:01   ` André Przywara
@ 2020-10-19 10:41     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-19 10:41 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

Hi Andre,

On Mon, Oct 19, 2020 at 10:01:28AM +0100, André Przywara wrote:

[...]

> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > index cc18a1e8c212..9d3de163d47c 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > @@ -24,36 +24,37 @@
> >  
> >  static u64 arm_spe_calc_ip(int index, u64 payload)
> >  {
> > -	u8 *addr = (u8 *)&payload;
> > -	int ns, el;
> > +	u64 ns, el;
> 
> This (and the "u64 vs. u8[]" changes below) looks like a nice cleanup.
>
> >  	/* Instruction virtual address or Branch target address */
> >  	if (index == SPE_ADDR_PKT_HDR_INDEX_INS ||
> >  	    index == SPE_ADDR_PKT_HDR_INDEX_BRANCH) {
> > -		ns = addr[7] & SPE_ADDR_PKT_NS;
> > -		el = (addr[7] & SPE_ADDR_PKT_EL_MASK) >> SPE_ADDR_PKT_EL_OFFSET;
> > +		ns = payload & SPE_ADDR_PKT_INST_VA_NS;
> > +		el = (payload & SPE_ADDR_PKT_INST_VA_EL_MASK)
> > +			>> SPE_ADDR_PKT_INST_VA_EL_SHIFT;
> 
> So if I see this correctly, this _EL_SHIFT and _EL_MASK are only used
> together, and only to read values, not to construct them.
> So can you fuse them together in the header file below, like:
> 	el = SPE_ADDR_PKT_INST_VA_GET_EL(payload);

Agreed that this is more neat.

> That should help readablity, I guess, while still keeping the actual
> numbers in one place. _SHIFT and _MASK are useful when we use them to
> both extract *and construct* values, but here we only parse the buffer.
> 
> Similar for other places where you just extract bits from a bitfield or
> integer.

Will apply the suggestion crossing the patch set.

Thanks a lot for your reviewing and suggestions.

Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 08/14] perf arm-spe: Refactor context packet handling
  2020-09-29 13:39 ` [PATCH v2 08/14] perf arm-spe: Refactor context " Leo Yan
@ 2020-10-20 21:53   ` André Przywara
  0 siblings, 0 replies; 43+ messages in thread
From: André Przywara @ 2020-10-20 21:53 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:
> Minor refactoring to use macro for index mask.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>

Reviewed-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre

> ---
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 2 +-
>  tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index b51a2207e4a0..00a2cd1af422 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -134,7 +134,7 @@ static int arm_spe_get_context(const unsigned char *buf, size_t len,
>  			       struct arm_spe_pkt *packet)
>  {
>  	packet->type = ARM_SPE_CONTEXT;
> -	packet->index = buf[0] & 0x3;
> +	packet->index = buf[0] & SPE_CTX_PKT_HDR_INDEX_MASK;
>  	return arm_spe_get_payload(buf, len, 0, packet);
>  }
>  
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 88d2231c76da..62db4ff91832 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -86,6 +86,9 @@ struct arm_spe_pkt {
>  #define SPE_ADDR_PKT_INST_VA_EL2		(2)
>  #define SPE_ADDR_PKT_INST_VA_EL3		(3)
>  
> +/* Context packet header */
> +#define SPE_CTX_PKT_HDR_INDEX_MASK		GENMASK_ULL(1, 0)
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 09/14] perf arm-spe: Refactor counter packet handling
  2020-09-29 13:39 ` [PATCH v2 09/14] perf arm-spe: Refactor counter " Leo Yan
@ 2020-10-20 21:53   ` André Przywara
  2020-10-21  3:52     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-20 21:53 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> This patch defines macros for counter packet header, and uses macro to
> replace hard code values for packet parsing.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c  | 17 ++++++++++-------
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.h  |  9 +++++++++
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 00a2cd1af422..ed0f4c74dfc5 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -150,10 +150,13 @@ static int arm_spe_get_counter(const unsigned char *buf, size_t len,
>  			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
>  {
>  	packet->type = ARM_SPE_COUNTER;
> -	if (ext_hdr)
> -		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> -	else
> -		packet->index = buf[0] & 0x7;
> +	if (ext_hdr) {
> +		packet->index  = (buf[1] & SPE_CNT_PKT_HDR_INDEX_MASK);
> +		packet->index |= ((buf[0] & SPE_CNT_PKT_HDR_EXT_INDEX_MASK)
> +			<< SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT);
> +	} else {
> +		packet->index = buf[0] & SPE_CNT_PKT_HDR_INDEX_MASK;

That looks suspiciously similar to the extended header in the address
packet. Can you use the same name for that?
And, similar to the address packet, what about:
	packet->index |= SPE_PKT_EXT_HEADER_INDEX(buf[0]);

(merging the mask and the shift in the macro definition)

> +	}
>  
>  	return arm_spe_get_payload(buf, len, ext_hdr, packet);
>  }
> @@ -431,17 +434,17 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  			return ret;
>  
>  		switch (idx) {
> -		case 0:
> +		case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
>  			if (ret < 0)
>  				return ret;
>  			break;
> -		case 1:
> +		case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
>  			if (ret < 0)
>  				return ret;
>  			break;
> -		case 2:
> +		case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
>  			if (ret < 0)
>  				return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 62db4ff91832..18667a63f5ba 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -89,6 +89,15 @@ struct arm_spe_pkt {
>  /* Context packet header */
>  #define SPE_CTX_PKT_HDR_INDEX_MASK		GENMASK_ULL(1, 0)
>  
> +/* Counter packet header */
> +#define SPE_CNT_PKT_HDR_INDEX_MASK		GENMASK_ULL(2, 0)
> +#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT		(0x0)
> +#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT		(0x1)
> +#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT		(0x2)

I think the Linux kernel coding style does not mention parentheses just
around numbers, so just 0x2 would suffice, for instance.
See section 12) in Documentation/process/coding-style.rst

Cheers,
Andre


> +
> +#define SPE_CNT_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
> +#define SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT		(3)
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling
  2020-09-29 13:39 ` [PATCH v2 10/14] perf arm-spe: Refactor event type handling Leo Yan
@ 2020-10-20 21:54   ` André Przywara
  2020-10-21  4:54     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-20 21:54 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> Use macros instead of the enum values for event types, this is more
> directive and without bit shifting when parse packet.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c    | 16 +++++++-------
>  .../util/arm-spe-decoder/arm-spe-decoder.h    | 17 --------------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 22 +++++++++----------
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 16 ++++++++++++++
>  4 files changed, 35 insertions(+), 36 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 9d3de163d47c..ac66e7f42a58 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
>  		case ARM_SPE_OP_TYPE:
>  			break;
>  		case ARM_SPE_EVENTS:
> -			if (payload & BIT(EV_L1D_REFILL))
> +			if (payload & SPE_EVT_PKT_L1D_REFILL)

Not sure this (and the others below) are an improvement? I liked the
enum below, and reading BIT() here tells me that it's a bitmask.

>  				decoder->record.type |= ARM_SPE_L1D_MISS;
>  
> -			if (payload & BIT(EV_L1D_ACCESS))
> +			if (payload & SPE_EVT_PKT_L1D_ACCESS)
>  				decoder->record.type |= ARM_SPE_L1D_ACCESS;
>  
> -			if (payload & BIT(EV_TLB_WALK))
> +			if (payload & SPE_EVT_PKT_TLB_WALK)
>  				decoder->record.type |= ARM_SPE_TLB_MISS;
>  
> -			if (payload & BIT(EV_TLB_ACCESS))
> +			if (payload & SPE_EVT_PKT_TLB_ACCESS)
>  				decoder->record.type |= ARM_SPE_TLB_ACCESS;
>  
>  			if ((idx == 2 || idx == 4 || idx == 8) &&
> -			    (payload & BIT(EV_LLC_MISS)))
> +			    (payload & SPE_EVT_PKT_LLC_MISS))
>  				decoder->record.type |= ARM_SPE_LLC_MISS;
>  
>  			if ((idx == 2 || idx == 4 || idx == 8) &&
> -			    (payload & BIT(EV_LLC_ACCESS)))
> +			    (payload & SPE_EVT_PKT_LLC_ACCESS))
>  				decoder->record.type |= ARM_SPE_LLC_ACCESS;
>  
>  			if ((idx == 2 || idx == 4 || idx == 8) &&
> -			    (payload & BIT(EV_REMOTE_ACCESS)))
> +			    (payload & SPE_EVT_PKT_REMOTE_ACCESS))
>  				decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>  
> -			if (payload & BIT(EV_MISPRED))
> +			if (payload & SPE_EVT_PKT_MISPREDICTED)
>  				decoder->record.type |= ARM_SPE_BRANCH_MISS;
>  
>  			break;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> index a5111a8d4360..24727b8ca7ff 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> @@ -13,23 +13,6 @@
>  
>  #include "arm-spe-pkt-decoder.h"
>  
> -enum arm_spe_events {
> -	EV_EXCEPTION_GEN	= 0,
> -	EV_RETIRED		= 1,
> -	EV_L1D_ACCESS		= 2,
> -	EV_L1D_REFILL		= 3,
> -	EV_TLB_ACCESS		= 4,
> -	EV_TLB_WALK		= 5,
> -	EV_NOT_TAKEN		= 6,
> -	EV_MISPRED		= 7,
> -	EV_LLC_ACCESS		= 8,
> -	EV_LLC_MISS		= 9,
> -	EV_REMOTE_ACCESS	= 10,
> -	EV_ALIGNMENT		= 11,
> -	EV_PARTIAL_PREDICATE	= 17,
> -	EV_EMPTY_PREDICATE	= 18,
> -};

So what about keeping this, but moving it into the other header file?
coding-style.rst says: "Enums are preferred when defining several
related constants."

> -
>  enum arm_spe_sample_type {
>  	ARM_SPE_L1D_ACCESS	= 1 << 0,
>  	ARM_SPE_L1D_MISS	= 1 << 1,
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index ed0f4c74dfc5..b8f343320abf 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  		if (ret < 0)
>  			return ret;
>  
> -		if (payload & 0x1) {
> +		if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {

Having the bitmask here directly is indeed not very nice and error
prone. But I would rather see the above solution:
		if (payload & BIT(EV_EXCEPTION_GEN)) {

>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x2) {
> +		if (payload & SPE_EVT_PKT_ARCH_RETIRED) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x4) {
> +		if (payload & SPE_EVT_PKT_L1D_ACCESS) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x8) {
> +		if (payload & SPE_EVT_PKT_L1D_REFILL) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x10) {
> +		if (payload & SPE_EVT_PKT_TLB_ACCESS) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x20) {
> +		if (payload & SPE_EVT_PKT_TLB_WALK) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x40) {
> +		if (payload & SPE_EVT_PKT_NOT_TAKEN) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
>  			if (ret < 0)
>  				return ret;
>  		}
> -		if (payload & 0x80) {
> +		if (payload & SPE_EVT_PKT_MISPREDICTED) {
>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
>  			if (ret < 0)
>  				return ret;
>  		}
>  		if (idx > 1) {

Do you know what the purpose of this comparison is? Surely payload would
not contain more bits than would fit in "idx" bytes? So is this some
attempt of an optimisation? If so, I doubt it's really useful, the
compiler might find a smarter solution to the problem. Just continuing
with the bit mask comparison would make it look nicer, I think.

Cheers,
Andre

> -			if (payload & 0x100) {
> +			if (payload & SPE_EVT_PKT_LLC_ACCESS) {
>  				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-ACCESS");
>  				if (ret < 0)
>  					return ret;
>  			}
> -			if (payload & 0x200) {
> +			if (payload & SPE_EVT_PKT_LLC_MISS) {
>  				ret = arm_spe_pkt_snprintf(&buf, &blen, " LLC-REFILL");
>  				if (ret < 0)
>  					return ret;
>  			}
> -			if (payload & 0x400) {
> +			if (payload & SPE_EVT_PKT_REMOTE_ACCESS) {
>  				ret = arm_spe_pkt_snprintf(&buf, &blen, " REMOTE-ACCESS");
>  				if (ret < 0)
>  					return ret;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 18667a63f5ba..e9a88cf685bb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -98,6 +98,22 @@ struct arm_spe_pkt {
>  #define SPE_CNT_PKT_HDR_EXT_INDEX_MASK		GENMASK_ULL(1, 0)
>  #define SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT		(3)
>  
> +/* Event packet payload */
> +#define SPE_EVT_PKT_SVE_EMPTY_PREDICATE		BIT(18)
> +#define SPE_EVT_PKT_SVE_PARTIAL_PREDICATE	BIT(17)
> +#define SPE_EVT_PKT_ALIGNMENT			BIT(11)
> +#define SPE_EVT_PKT_REMOTE_ACCESS		BIT(10)
> +#define SPE_EVT_PKT_LLC_MISS			BIT(9)
> +#define SPE_EVT_PKT_LLC_ACCESS			BIT(8)
> +#define SPE_EVT_PKT_MISPREDICTED		BIT(7)
> +#define SPE_EVT_PKT_NOT_TAKEN			BIT(6)
> +#define SPE_EVT_PKT_TLB_WALK			BIT(5)
> +#define SPE_EVT_PKT_TLB_ACCESS			BIT(4)
> +#define SPE_EVT_PKT_L1D_REFILL			BIT(3)
> +#define SPE_EVT_PKT_L1D_ACCESS			BIT(2)
> +#define SPE_EVT_PKT_ARCH_RETIRED		BIT(1)
> +#define SPE_EVT_PKT_GEN_EXCEPTION		BIT(0)
> +
>  const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
>  
>  int arm_spe_get_packet(const unsigned char *buf, size_t len,
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-09-29 13:39 ` [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE Leo Yan
@ 2020-10-20 21:54   ` André Przywara
  2020-10-21  5:10     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-20 21:54 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> From: Wei Li <liwei391@huawei.com>
> 
> This patch is to support Armv8.3 extension for SPE, it adds alignment
> field in the Events packet and it supports the Scalable Vector Extension
> (SVE) for Operation packet and Events packet with two additions:
> 
>   - The vector length for SVE operations in the Operation Type packet;
>   - The incomplete predicate and empty predicate fields in the Events
>     packet.
> 
> Signed-off-by: Wei Li <liwei391@huawei.com>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
>  2 files changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index 05a4c74399d7..3ec381fddfcb 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  					return ret;
>  			}
>  		}
> +		if (idx > 2) {

As I mentioned in the other patch, I doubt this extra comparison is
useful. Does that protect us from anything?

> +			if (payload & SPE_EVT_PKT_ALIGNMENT) {

Mmh, but this is bit 11, right? So would need to go into the (idx > 1)
section (covering bits 8-15)? Another reason to ditch this comparison above.

> +				ret = snprintf(buf, buf_len, " ALIGNMENT");
> +				if (ret < 0)
> +					return ret;
> +				buf += ret;
> +				blen -= ret;

Shouldn't we use the new arm_spe_pkt_snprintf() function here as well?
Or is there a reason that this doesn't work?

> +			}
> +			if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
> +				ret = snprintf(buf, buf_len, " SVE-PARTIAL-PRED");
> +				if (ret < 0)
> +					return ret;
> +				buf += ret;
> +				blen -= ret;
> +			}
> +			if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
> +				ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
> +				if (ret < 0)
> +					return ret;
> +				buf += ret;
> +				blen -= ret;
> +			}
> +		}
> +
>  		return buf_len - blen;
>  
>  	case ARM_SPE_OP_TYPE:
>  		switch (idx) {
>  		case SPE_OP_PKT_HDR_CLASS_OTHER:
> -			return arm_spe_pkt_snprintf(&buf, &blen,
> -					payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
> -					"COND-SELECT" : "INSN-OTHER");
> +			if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
> +					SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
> +
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
> +				if (ret < 0)
> +					return ret;
> +
> +				/* Effective vector length: step is 32 bits */
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
> +						SPE_OP_PKT_SVE_EVL_SHIFT));

Can you move this into a macro, and add a comment about how this works?
People might get confused over the "32 << something".

Cheers,
Andre

> +				if (ret < 0)
> +					return ret;
> +
> +				if (payload & SPE_OP_PKT_SVE_FP) {
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " FP");
> +					if (ret < 0)
> +						return ret;
> +				}
> +				if (payload & SPE_OP_PKT_SVE_PRED) {
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
> +					if (ret < 0)
> +						return ret;
> +				}
> +			} else {
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, "OTHER");
> +				if (ret < 0)
> +					return ret;
> +
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " %s",
> +					       payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
> +					       "COND-SELECT" : "UNCOND");
> +				if (ret < 0)
> +					return ret;
> +			}
> +
> +			return buf_len - blen;
> +
>  		case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC:
>  			ret = arm_spe_pkt_snprintf(&buf, &blen,
>  						   payload & SPE_OP_PKT_LDST ?
> @@ -394,6 +453,25 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  				ret = arm_spe_pkt_snprintf(&buf, &blen, " NV-SYSREG");
>  				if (ret < 0)
>  					return ret;
> +			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_SVE_MASK) ==
> +					SPE_OP_PKT_LDST_SUBCLASS_SVE) {
> +				/* Effective vector length: step is 32 bits */
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> +					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
> +						SPE_OP_PKT_SVE_EVL_SHIFT));
> +				if (ret < 0)
> +					return ret;
> +
> +				if (payload & SPE_OP_PKT_SVE_PRED) {
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " PRED");
> +					if (ret < 0)
> +						return ret;
> +				}
> +				if (payload & SPE_OP_PKT_SVE_SG) {
> +					ret = arm_spe_pkt_snprintf(&buf, &blen, " SG");
> +					if (ret < 0)
> +						return ret;
> +				}
>  			}
>  
>  			return buf_len - blen;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> index 1847cad517db..80266bfebec2 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> @@ -147,6 +147,12 @@ struct arm_spe_pkt {
>  #define SPE_OP_PKT_LDST_SUBCLASS_SVE_MASK	(GENMASK_ULL(3, 3) | GENMASK_ULL(1, 1))
>  #define SPE_OP_PKT_LDST_SUBCLASS_SVE		(0x8)
>  
> +#define SPE_OP_PKT_SVE_SG			BIT(7)
> +#define SPE_OP_PKT_SVE_EVL_MASK			GENMASK_ULL(6, 4)
> +#define SPE_OP_PKT_SVE_EVL_SHIFT		(4)
> +#define SPE_OP_PKT_SVE_PRED			BIT(2)
> +#define SPE_OP_PKT_SVE_FP			BIT(1)
> +
>  #define SPE_OP_PKT_AR				BIT(4)
>  #define SPE_OP_PKT_EXCL				BIT(3)
>  #define SPE_OP_PKT_AT				BIT(2)
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet
  2020-09-29 13:39 ` [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet Leo Yan
@ 2020-10-20 21:54   ` André Przywara
  2020-10-21  5:16     ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-20 21:54 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 29/09/2020 14:39, Leo Yan wrote:

Hi,

> For the operation type packet payload with load/store class, it misses
> to support these sub classes:
> 
>   - A load/store targeting the general-purpose registers;
>   - A load/store targeting unspecified registers;
>   - The ARMv8.4 nested virtualisation extension can redirect system
>     register accesses to a memory page controlled by the hypervisor.
>     The SPE profiling feature in newer implementations can tag those
>     memory accesses accordingly.
> 
> Add the bit pattern describing load/store sub classes, so that the perf
> tool can decode it properly.
> 
> Inspired by Andre Przywara, refined the commit log and code for more
> clear description.
> 
> Co-developed-by: Andre Przywara <andre.przywara@arm.com>
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c    | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> index a848c784f4cf..57a2d5494838 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> @@ -378,6 +378,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>  				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
>  				if (ret < 0)
>  					return ret;
> +			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==

These three and the one above use the same mask, should this go into a
switch case? Move this block to the end, then do:
	switch (payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) {
	case SPE_OP_PKT_LDST_SUBCLASS_GP_REG:
		...
	case SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG:
		...
Maybe even assign just a string pointer inside, then have one snprintf.
Haven't checked it that *really* looks better, though.

Also those later checks are quite indented, shall those be moved to
helper functions? Again just an idea ....

Cheers,
Andre


> +					SPE_OP_PKT_LDST_SUBCLASS_GP_REG) {
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " GP-REG");
> +				if (ret < 0)
> +					return ret;
> +			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
> +					SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG) {
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " UNSPEC-REG");
> +				if (ret < 0)
> +					return ret;
> +			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
> +					SPE_OP_PKT_LDST_SUBCLASS_NV_SYSREG) {
> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " NV-SYSREG");
> +				if (ret < 0)
> +					return ret;
>  			}
>  
>  			return buf_len - blen;
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 09/14] perf arm-spe: Refactor counter packet handling
  2020-10-20 21:53   ` André Przywara
@ 2020-10-21  3:52     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-21  3:52 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Tue, Oct 20, 2020 at 10:53:47PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi,
> 
> > This patch defines macros for counter packet header, and uses macro to
> > replace hard code values for packet parsing.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c  | 17 ++++++++++-------
> >  .../util/arm-spe-decoder/arm-spe-pkt-decoder.h  |  9 +++++++++
> >  2 files changed, 19 insertions(+), 7 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 00a2cd1af422..ed0f4c74dfc5 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -150,10 +150,13 @@ static int arm_spe_get_counter(const unsigned char *buf, size_t len,
> >  			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
> >  {
> >  	packet->type = ARM_SPE_COUNTER;
> > -	if (ext_hdr)
> > -		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
> > -	else
> > -		packet->index = buf[0] & 0x7;
> > +	if (ext_hdr) {
> > +		packet->index  = (buf[1] & SPE_CNT_PKT_HDR_INDEX_MASK);
> > +		packet->index |= ((buf[0] & SPE_CNT_PKT_HDR_EXT_INDEX_MASK)
> > +			<< SPE_CNT_PKT_HDR_EXT_INDEX_SHIFT);
> > +	} else {
> > +		packet->index = buf[0] & SPE_CNT_PKT_HDR_INDEX_MASK;
> 
> That looks suspiciously similar to the extended header in the address
> packet. Can you use the same name for that?

Checked for counter packet and address packet, they are using the same
format for index.  Will use the same name.

> And, similar to the address packet, what about:
> 	packet->index |= SPE_PKT_EXT_HEADER_INDEX(buf[0]);

Will do.

> (merging the mask and the shift in the macro definition)
> 
> > +	}
> >  
> >  	return arm_spe_get_payload(buf, len, ext_hdr, packet);
> >  }
> > @@ -431,17 +434,17 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >  			return ret;
> >  
> >  		switch (idx) {
> > -		case 0:
> > +		case SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT:
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, "TOT");
> >  			if (ret < 0)
> >  				return ret;
> >  			break;
> > -		case 1:
> > +		case SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT:
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, "ISSUE");
> >  			if (ret < 0)
> >  				return ret;
> >  			break;
> > -		case 2:
> > +		case SPE_CNT_PKT_HDR_INDEX_TRANS_LAT:
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, "XLAT");
> >  			if (ret < 0)
> >  				return ret;
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > index 62db4ff91832..18667a63f5ba 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.h
> > @@ -89,6 +89,15 @@ struct arm_spe_pkt {
> >  /* Context packet header */
> >  #define SPE_CTX_PKT_HDR_INDEX_MASK		GENMASK_ULL(1, 0)
> >  
> > +/* Counter packet header */
> > +#define SPE_CNT_PKT_HDR_INDEX_MASK		GENMASK_ULL(2, 0)
> > +#define SPE_CNT_PKT_HDR_INDEX_TOTAL_LAT		(0x0)
> > +#define SPE_CNT_PKT_HDR_INDEX_ISSUE_LAT		(0x1)
> > +#define SPE_CNT_PKT_HDR_INDEX_TRANS_LAT		(0x2)
> 
> I think the Linux kernel coding style does not mention parentheses just
> around numbers, so just 0x2 would suffice, for instance.
> See section 12) in Documentation/process/coding-style.rst

Yeah, it gives example as "#define CONSTANT 0x4000"; will follow the
coding style.

Thanks for suggestions!
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling
  2020-10-20 21:54   ` André Przywara
@ 2020-10-21  4:54     ` Leo Yan
  2020-10-21  9:20       ` André Przywara
  0 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-10-21  4:54 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Tue, Oct 20, 2020 at 10:54:16PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi,
> 
> > Use macros instead of the enum values for event types, this is more
> > directive and without bit shifting when parse packet.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../util/arm-spe-decoder/arm-spe-decoder.c    | 16 +++++++-------
> >  .../util/arm-spe-decoder/arm-spe-decoder.h    | 17 --------------
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 22 +++++++++----------
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 16 ++++++++++++++
> >  4 files changed, 35 insertions(+), 36 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > index 9d3de163d47c..ac66e7f42a58 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> > @@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
> >  		case ARM_SPE_OP_TYPE:
> >  			break;
> >  		case ARM_SPE_EVENTS:
> > -			if (payload & BIT(EV_L1D_REFILL))
> > +			if (payload & SPE_EVT_PKT_L1D_REFILL)
> 
> Not sure this (and the others below) are an improvement? I liked the
> enum below, and reading BIT() here tells me that it's a bitmask.

Agreed.

> >  				decoder->record.type |= ARM_SPE_L1D_MISS;
> >  
> > -			if (payload & BIT(EV_L1D_ACCESS))
> > +			if (payload & SPE_EVT_PKT_L1D_ACCESS)
> >  				decoder->record.type |= ARM_SPE_L1D_ACCESS;
> >  
> > -			if (payload & BIT(EV_TLB_WALK))
> > +			if (payload & SPE_EVT_PKT_TLB_WALK)
> >  				decoder->record.type |= ARM_SPE_TLB_MISS;
> >  
> > -			if (payload & BIT(EV_TLB_ACCESS))
> > +			if (payload & SPE_EVT_PKT_TLB_ACCESS)
> >  				decoder->record.type |= ARM_SPE_TLB_ACCESS;
> >  
> >  			if ((idx == 2 || idx == 4 || idx == 8) &&
> > -			    (payload & BIT(EV_LLC_MISS)))
> > +			    (payload & SPE_EVT_PKT_LLC_MISS))
> >  				decoder->record.type |= ARM_SPE_LLC_MISS;
> >  
> >  			if ((idx == 2 || idx == 4 || idx == 8) &&
> > -			    (payload & BIT(EV_LLC_ACCESS)))
> > +			    (payload & SPE_EVT_PKT_LLC_ACCESS))
> >  				decoder->record.type |= ARM_SPE_LLC_ACCESS;
> >  
> >  			if ((idx == 2 || idx == 4 || idx == 8) &&
> > -			    (payload & BIT(EV_REMOTE_ACCESS)))
> > +			    (payload & SPE_EVT_PKT_REMOTE_ACCESS))
> >  				decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
> >  
> > -			if (payload & BIT(EV_MISPRED))
> > +			if (payload & SPE_EVT_PKT_MISPREDICTED)
> >  				decoder->record.type |= ARM_SPE_BRANCH_MISS;
> >  
> >  			break;
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> > index a5111a8d4360..24727b8ca7ff 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> > @@ -13,23 +13,6 @@
> >  
> >  #include "arm-spe-pkt-decoder.h"
> >  
> > -enum arm_spe_events {
> > -	EV_EXCEPTION_GEN	= 0,
> > -	EV_RETIRED		= 1,
> > -	EV_L1D_ACCESS		= 2,
> > -	EV_L1D_REFILL		= 3,
> > -	EV_TLB_ACCESS		= 4,
> > -	EV_TLB_WALK		= 5,
> > -	EV_NOT_TAKEN		= 6,
> > -	EV_MISPRED		= 7,
> > -	EV_LLC_ACCESS		= 8,
> > -	EV_LLC_MISS		= 9,
> > -	EV_REMOTE_ACCESS	= 10,
> > -	EV_ALIGNMENT		= 11,
> > -	EV_PARTIAL_PREDICATE	= 17,
> > -	EV_EMPTY_PREDICATE	= 18,
> > -};
> 
> So what about keeping this, but moving it into the other header file?

Will do.  This is more directive, especially if we consider every bit
indicates an event type :)

> coding-style.rst says: "Enums are preferred when defining several
> related constants."

Thanks for pasting the coding style, it's useful.  I agree that using
BIT() + enum is better form, will refine the patch for this.

> > -
> >  enum arm_spe_sample_type {
> >  	ARM_SPE_L1D_ACCESS	= 1 << 0,
> >  	ARM_SPE_L1D_MISS	= 1 << 1,
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index ed0f4c74dfc5..b8f343320abf 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >  		if (ret < 0)
> >  			return ret;
> >  
> > -		if (payload & 0x1) {
> > +		if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {
> 
> Having the bitmask here directly is indeed not very nice and error
> prone. But I would rather see the above solution:
> 		if (payload & BIT(EV_EXCEPTION_GEN)) {

Will do.

> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x2) {
> > +		if (payload & SPE_EVT_PKT_ARCH_RETIRED) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x4) {
> > +		if (payload & SPE_EVT_PKT_L1D_ACCESS) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x8) {
> > +		if (payload & SPE_EVT_PKT_L1D_REFILL) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x10) {
> > +		if (payload & SPE_EVT_PKT_TLB_ACCESS) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x20) {
> > +		if (payload & SPE_EVT_PKT_TLB_WALK) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x40) {
> > +		if (payload & SPE_EVT_PKT_NOT_TAKEN) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> > -		if (payload & 0x80) {
> > +		if (payload & SPE_EVT_PKT_MISPREDICTED) {
> >  			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> >  			if (ret < 0)
> >  				return ret;
> >  		}
> >  		if (idx > 1) {
> 
> Do you know what the purpose of this comparison is? Surely payload would
> not contain more bits than would fit in "idx" bytes? So is this some
> attempt of an optimisation?

Here "idx" is for payload size (in bytes); you could see function
arm_spe_get_events() calculate the payload size:

  packet->index = PAYLOAD_LEN(buf[0]);

Please note, the raw payload size (field "sz" in header) value is:

  0b00 Byte.
  0b01 Halfword.
  0b10 Word.
  0b11 Doubleword.

After using PAYLOAD_LEN(), the payload size is converted to value in
byte, so:

  packet->index = 1 << "sz";

  1  Byte
  2  Halfword
  4  Word
  8  Doubleword

In Armv8 ARM, chapter "D10.2.6 Events packet", we can see the events
"Remote access", "Last Level cache miss" and "Last Level cache access"
are only valid when "sz" is equal or longer than Halfword, thus idx is
2/4/8; this is why here checks the condition "if (idx > 1)".

> If so, I doubt it's really useful, the
> compiler might find a smarter solution to the problem. Just continuing
> with the bit mask comparison would make it look nicer, I think.

ARMv8 ARM gives out "Otherwise this bit reads-as-zero.", IIUC this
suggests to firstly check the size, if cannot meet the size requirement,
then the Event bit should be reads-as-zero.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-10-20 21:54   ` André Przywara
@ 2020-10-21  5:10     ` Leo Yan
  2020-10-21  9:26       ` André Przywara
  0 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-10-21  5:10 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Tue, Oct 20, 2020 at 10:54:44PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi,
> 
> > From: Wei Li <liwei391@huawei.com>
> > 
> > This patch is to support Armv8.3 extension for SPE, it adds alignment
> > field in the Events packet and it supports the Scalable Vector Extension
> > (SVE) for Operation packet and Events packet with two additions:
> > 
> >   - The vector length for SVE operations in the Operation Type packet;
> >   - The incomplete predicate and empty predicate fields in the Events
> >     packet.
> > 
> > Signed-off-by: Wei Li <liwei391@huawei.com>
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
> >  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
> >  2 files changed, 87 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index 05a4c74399d7..3ec381fddfcb 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >  					return ret;
> >  			}
> >  		}
> > +		if (idx > 2) {
> 
> As I mentioned in the other patch, I doubt this extra comparison is
> useful. Does that protect us from anything?

It's the same reason with Event packet which have explained for replying
patch 10, the condition is to respect the SPE specifiction:

  E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
     Alignment.
     ...
     Otherwise this bit reads-as-zero.

So we gives higher priority for checking payload size than the Event
bit setting; if you have other thinking for this, please let me know.

> > +			if (payload & SPE_EVT_PKT_ALIGNMENT) {
> 
> Mmh, but this is bit 11, right?

Yes.

> So would need to go into the (idx > 1)
> section (covering bits 8-15)? Another reason to ditch this comparison above.

As has explained in patch 10, idx is not the same thing with "sz"
field; "idx" stands for payload length in bytes, so:

  idx = 1 << sz

The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
here use the condition "(idx > 2)".

I think here need to refine code for more explict expression so can
avoid confusion.  So I think it's better to condition such like:

  if (payload_len >= 4) {

     ...

  }

> > +				ret = snprintf(buf, buf_len, " ALIGNMENT");
> > +				if (ret < 0)
> > +					return ret;
> > +				buf += ret;
> > +				blen -= ret;
> 
> Shouldn't we use the new arm_spe_pkt_snprintf() function here as well?
> Or is there a reason that this doesn't work?

Goot point.  Will change to use arm_spe_pkt_snprintf().

> > +			}
> > +			if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
> > +				ret = snprintf(buf, buf_len, " SVE-PARTIAL-PRED");
> > +				if (ret < 0)
> > +					return ret;
> > +				buf += ret;
> > +				blen -= ret;
> > +			}
> > +			if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
> > +				ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
> > +				if (ret < 0)
> > +					return ret;
> > +				buf += ret;
> > +				blen -= ret;
> > +			}
> > +		}
> > +
> >  		return buf_len - blen;
> >  
> >  	case ARM_SPE_OP_TYPE:
> >  		switch (idx) {
> >  		case SPE_OP_PKT_HDR_CLASS_OTHER:
> > -			return arm_spe_pkt_snprintf(&buf, &blen,
> > -					payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
> > -					"COND-SELECT" : "INSN-OTHER");
> > +			if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
> > +					SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
> > +
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
> > +				if (ret < 0)
> > +					return ret;
> > +
> > +				/* Effective vector length: step is 32 bits */
> > +				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
> > +					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
> > +						SPE_OP_PKT_SVE_EVL_SHIFT));
> 
> Can you move this into a macro, and add a comment about how this works?
> People might get confused over the "32 << something".

Yeah, will refine for it.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet
  2020-10-20 21:54   ` André Przywara
@ 2020-10-21  5:16     ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-21  5:16 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Tue, Oct 20, 2020 at 10:54:57PM +0100, André Przywara wrote:
> On 29/09/2020 14:39, Leo Yan wrote:
> 
> Hi,
> 
> > For the operation type packet payload with load/store class, it misses
> > to support these sub classes:
> > 
> >   - A load/store targeting the general-purpose registers;
> >   - A load/store targeting unspecified registers;
> >   - The ARMv8.4 nested virtualisation extension can redirect system
> >     register accesses to a memory page controlled by the hypervisor.
> >     The SPE profiling feature in newer implementations can tag those
> >     memory accesses accordingly.
> > 
> > Add the bit pattern describing load/store sub classes, so that the perf
> > tool can decode it properly.
> > 
> > Inspired by Andre Przywara, refined the commit log and code for more
> > clear description.
> > 
> > Co-developed-by: Andre Przywara <andre.przywara@arm.com>
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  .../util/arm-spe-decoder/arm-spe-pkt-decoder.c    | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > index a848c784f4cf..57a2d5494838 100644
> > --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> > @@ -378,6 +378,21 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >  				ret = arm_spe_pkt_snprintf(&buf, &blen, " SIMD-FP");
> >  				if (ret < 0)
> >  					return ret;
> > +			} else if ((payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) ==
> 
> These three and the one above use the same mask, should this go into a
> switch case? Move this block to the end, then do:
> 	switch (payload & SPE_OP_PKT_LDST_SUBCLASS_MASK) {
> 	case SPE_OP_PKT_LDST_SUBCLASS_GP_REG:
> 		...
> 	case SPE_OP_PKT_LDST_SUBCLASS_UNSPEC_REG:
> 		...
> Maybe even assign just a string pointer inside, then have one snprintf.
> Haven't checked it that *really* looks better, though.

Good point, will follow up this suggestion.

> Also those later checks are quite indented, shall those be moved to
> helper functions? Again just an idea ....

Yeah, I also considered to divide into small functions to handle
different types of packets so can avoid deep indention.  Will tweak
patches for this.

Very appreciate for your detailed reviewing and suggestions throughout
the patch set!

Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling
  2020-10-21  4:54     ` Leo Yan
@ 2020-10-21  9:20       ` André Przywara
  2020-10-21 10:13         ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-21  9:20 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 21/10/2020 05:54, Leo Yan wrote:

Hi Leo,

> On Tue, Oct 20, 2020 at 10:54:16PM +0100, Andr� Przywara wrote:
>> On 29/09/2020 14:39, Leo Yan wrote:
>>
>> Hi,
>>
>>> Use macros instead of the enum values for event types, this is more
>>> directive and without bit shifting when parse packet.
>>>
>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>> ---
>>>  .../util/arm-spe-decoder/arm-spe-decoder.c    | 16 +++++++-------
>>>  .../util/arm-spe-decoder/arm-spe-decoder.h    | 17 --------------
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 22 +++++++++----------
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     | 16 ++++++++++++++
>>>  4 files changed, 35 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> index 9d3de163d47c..ac66e7f42a58 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
>>> @@ -168,31 +168,31 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
>>>  		case ARM_SPE_OP_TYPE:
>>>  			break;
>>>  		case ARM_SPE_EVENTS:
>>> -			if (payload & BIT(EV_L1D_REFILL))
>>> +			if (payload & SPE_EVT_PKT_L1D_REFILL)
>>
>> Not sure this (and the others below) are an improvement? I liked the
>> enum below, and reading BIT() here tells me that it's a bitmask.
> 
> Agreed.
> 
>>>  				decoder->record.type |= ARM_SPE_L1D_MISS;
>>>  
>>> -			if (payload & BIT(EV_L1D_ACCESS))
>>> +			if (payload & SPE_EVT_PKT_L1D_ACCESS)
>>>  				decoder->record.type |= ARM_SPE_L1D_ACCESS;
>>>  
>>> -			if (payload & BIT(EV_TLB_WALK))
>>> +			if (payload & SPE_EVT_PKT_TLB_WALK)
>>>  				decoder->record.type |= ARM_SPE_TLB_MISS;
>>>  
>>> -			if (payload & BIT(EV_TLB_ACCESS))
>>> +			if (payload & SPE_EVT_PKT_TLB_ACCESS)
>>>  				decoder->record.type |= ARM_SPE_TLB_ACCESS;
>>>  
>>>  			if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -			    (payload & BIT(EV_LLC_MISS)))
>>> +			    (payload & SPE_EVT_PKT_LLC_MISS))
>>>  				decoder->record.type |= ARM_SPE_LLC_MISS;
>>>  
>>>  			if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -			    (payload & BIT(EV_LLC_ACCESS)))
>>> +			    (payload & SPE_EVT_PKT_LLC_ACCESS))
>>>  				decoder->record.type |= ARM_SPE_LLC_ACCESS;
>>>  
>>>  			if ((idx == 2 || idx == 4 || idx == 8) &&
>>> -			    (payload & BIT(EV_REMOTE_ACCESS)))
>>> +			    (payload & SPE_EVT_PKT_REMOTE_ACCESS))
>>>  				decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
>>>  
>>> -			if (payload & BIT(EV_MISPRED))
>>> +			if (payload & SPE_EVT_PKT_MISPREDICTED)
>>>  				decoder->record.type |= ARM_SPE_BRANCH_MISS;
>>>  
>>>  			break;
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> index a5111a8d4360..24727b8ca7ff 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
>>> @@ -13,23 +13,6 @@
>>>  
>>>  #include "arm-spe-pkt-decoder.h"
>>>  
>>> -enum arm_spe_events {
>>> -	EV_EXCEPTION_GEN	= 0,
>>> -	EV_RETIRED		= 1,
>>> -	EV_L1D_ACCESS		= 2,
>>> -	EV_L1D_REFILL		= 3,
>>> -	EV_TLB_ACCESS		= 4,
>>> -	EV_TLB_WALK		= 5,
>>> -	EV_NOT_TAKEN		= 6,
>>> -	EV_MISPRED		= 7,
>>> -	EV_LLC_ACCESS		= 8,
>>> -	EV_LLC_MISS		= 9,
>>> -	EV_REMOTE_ACCESS	= 10,
>>> -	EV_ALIGNMENT		= 11,
>>> -	EV_PARTIAL_PREDICATE	= 17,
>>> -	EV_EMPTY_PREDICATE	= 18,
>>> -};
>>
>> So what about keeping this, but moving it into the other header file?
> 
> Will do.  This is more directive, especially if we consider every bit
> indicates an event type :)
> 
>> coding-style.rst says: "Enums are preferred when defining several
>> related constants."
> 
> Thanks for pasting the coding style, it's useful.  I agree that using
> BIT() + enum is better form, will refine the patch for this.
> 
>>> -
>>>  enum arm_spe_sample_type {
>>>  	ARM_SPE_L1D_ACCESS	= 1 << 0,
>>>  	ARM_SPE_L1D_MISS	= 1 << 1,
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> index ed0f4c74dfc5..b8f343320abf 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>>>  		if (ret < 0)
>>>  			return ret;
>>>  
>>> -		if (payload & 0x1) {
>>> +		if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {
>>
>> Having the bitmask here directly is indeed not very nice and error
>> prone. But I would rather see the above solution:
>> 		if (payload & BIT(EV_EXCEPTION_GEN)) {
> 
> Will do.
> 
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x2) {
>>> +		if (payload & SPE_EVT_PKT_ARCH_RETIRED) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x4) {
>>> +		if (payload & SPE_EVT_PKT_L1D_ACCESS) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x8) {
>>> +		if (payload & SPE_EVT_PKT_L1D_REFILL) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x10) {
>>> +		if (payload & SPE_EVT_PKT_TLB_ACCESS) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x20) {
>>> +		if (payload & SPE_EVT_PKT_TLB_WALK) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x40) {
>>> +		if (payload & SPE_EVT_PKT_NOT_TAKEN) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>> -		if (payload & 0x80) {
>>> +		if (payload & SPE_EVT_PKT_MISPREDICTED) {
>>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
>>>  			if (ret < 0)
>>>  				return ret;
>>>  		}
>>>  		if (idx > 1) {
>>
>> Do you know what the purpose of this comparison is? Surely payload would
>> not contain more bits than would fit in "idx" bytes? So is this some
>> attempt of an optimisation?
> 
> Here "idx" is for payload size (in bytes); you could see function
> arm_spe_get_events() calculate the payload size:
> 
>   packet->index = PAYLOAD_LEN(buf[0]);
> 
> Please note, the raw payload size (field "sz" in header) value is:
> 
>   0b00 Byte.
>   0b01 Halfword.
>   0b10 Word.
>   0b11 Doubleword.
> 
> After using PAYLOAD_LEN(), the payload size is converted to value in
> byte, so:
> 
>   packet->index = 1 << "sz";
> 
>   1  Byte
>   2  Halfword
>   4  Word
>   8  Doubleword
> 
> In Armv8 ARM, chapter "D10.2.6 Events packet", we can see the events
> "Remote access", "Last Level cache miss" and "Last Level cache access"
> are only valid when "sz" is equal or longer than Halfword, thus idx is
> 2/4/8; this is why here checks the condition "if (idx > 1)".

Right, thanks for the explanation. But in the end this is just a lot of
words for: "You can only fit n*8 bits in n bytes.", isn't it?
So if the payload size is 1 bytes, we can't have bits 8 or higher.

And in arm_spe_get_payload() we load payload with casts, so the upper
bits, beyond the payload size, must always be 0? Regardless of what was
in the buffer. Or am I looking at the wrong function?
Even if that wouldn't be the case, I'd rather mask it here again, so
that we can rely on this, and lose the extra check.

> 
>> If so, I doubt it's really useful, the
>> compiler might find a smarter solution to the problem. Just continuing
>> with the bit mask comparison would make it look nicer, I think.
> 
> ARMv8 ARM gives out "Otherwise this bit reads-as-zero.", IIUC this
> suggests to firstly check the size, if cannot meet the size requirement,
> then the Event bit should be reads-as-zero.

But as mentioned above, we take care of this already:
        switch (payload_len) {
        case 1: packet->payload = *(uint8_t *)buf; break;
        case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break;
	...

Thanks,
Andre

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-10-21  5:10     ` Leo Yan
@ 2020-10-21  9:26       ` André Przywara
  2020-10-21 10:17         ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-21  9:26 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 21/10/2020 06:10, Leo Yan wrote:

Hi,

> On Tue, Oct 20, 2020 at 10:54:44PM +0100, Andr� Przywara wrote:
>> On 29/09/2020 14:39, Leo Yan wrote:
>>
>> Hi,
>>
>>> From: Wei Li <liwei391@huawei.com>
>>>
>>> This patch is to support Armv8.3 extension for SPE, it adds alignment
>>> field in the Events packet and it supports the Scalable Vector Extension
>>> (SVE) for Operation packet and Events packet with two additions:
>>>
>>>   - The vector length for SVE operations in the Operation Type packet;
>>>   - The incomplete predicate and empty predicate fields in the Events
>>>     packet.
>>>
>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>> ---
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
>>>  2 files changed, 87 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> index 05a4c74399d7..3ec381fddfcb 100644
>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>>>  					return ret;
>>>  			}
>>>  		}
>>> +		if (idx > 2) {
>>
>> As I mentioned in the other patch, I doubt this extra comparison is
>> useful. Does that protect us from anything?
> 
> It's the same reason with Event packet which have explained for replying
> patch 10, the condition is to respect the SPE specifiction:
> 
>   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
>      Alignment.
>      ...
>      Otherwise this bit reads-as-zero.
> 
> So we gives higher priority for checking payload size than the Event
> bit setting; if you have other thinking for this, please let me know.

Ah, thanks for pointing this out. It looks like a bug in the manual
then, because I don't see why bit 11 should be any different from bits
[10:8] and bits [15:12] in this respect. And in the diagrams above you
clearly see bit 11 being shown even when SZ == 0b01.

I will try to follow this up here.

>>> +			if (payload & SPE_EVT_PKT_ALIGNMENT) {
>>
>> Mmh, but this is bit 11, right?
> 
> Yes.
> 
>> So would need to go into the (idx > 1)
>> section (covering bits 8-15)? Another reason to ditch this comparison above.
> 
> As has explained in patch 10, idx is not the same thing with "sz"
> field; "idx" stands for payload length in bytes, so:
> 
>   idx = 1 << sz
> 
> The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
> here use the condition "(idx > 2)".
> 
> I think here need to refine code for more explict expression so can
> avoid confusion.  So I think it's better to condition such like:
> 
>   if (payload_len >= 4) {

Yes, that would be (or have been) more helpful, but as mentioned in the
other patch, I'd rather see those comparisons go entirely.

Cheers,
Andre

> 
>      ...
> 
>   }
> 
>>> +				ret = snprintf(buf, buf_len, " ALIGNMENT");
>>> +				if (ret < 0)
>>> +					return ret;
>>> +				buf += ret;
>>> +				blen -= ret;
>>
>> Shouldn't we use the new arm_spe_pkt_snprintf() function here as well?
>> Or is there a reason that this doesn't work?
> 
> Goot point.  Will change to use arm_spe_pkt_snprintf().
> 
>>> +			}
>>> +			if (payload & SPE_EVT_PKT_SVE_PARTIAL_PREDICATE) {
>>> +				ret = snprintf(buf, buf_len, " SVE-PARTIAL-PRED");
>>> +				if (ret < 0)
>>> +					return ret;
>>> +				buf += ret;
>>> +				blen -= ret;
>>> +			}
>>> +			if (payload & SPE_EVT_PKT_SVE_EMPTY_PREDICATE) {
>>> +				ret = snprintf(buf, buf_len, " SVE-EMPTY-PRED");
>>> +				if (ret < 0)
>>> +					return ret;
>>> +				buf += ret;
>>> +				blen -= ret;
>>> +			}
>>> +		}
>>> +
>>>  		return buf_len - blen;
>>>  
>>>  	case ARM_SPE_OP_TYPE:
>>>  		switch (idx) {
>>>  		case SPE_OP_PKT_HDR_CLASS_OTHER:
>>> -			return arm_spe_pkt_snprintf(&buf, &blen,
>>> -					payload & SPE_OP_PKT_OTHER_SUBCLASS_COND ?
>>> -					"COND-SELECT" : "INSN-OTHER");
>>> +			if ((payload & SPE_OP_PKT_OTHER_SVE_SUBCLASS_MASK) ==
>>> +					SPE_OP_PKT_OTHER_SUBCLASS_SVG_OP) {
>>> +
>>> +				ret = arm_spe_pkt_snprintf(&buf, &blen, "SVE-OTHER");
>>> +				if (ret < 0)
>>> +					return ret;
>>> +
>>> +				/* Effective vector length: step is 32 bits */
>>> +				ret = arm_spe_pkt_snprintf(&buf, &blen, " EVLEN %d",
>>> +					32 << ((payload & SPE_OP_PKT_SVE_EVL_MASK) >>
>>> +						SPE_OP_PKT_SVE_EVL_SHIFT));
>>
>> Can you move this into a macro, and add a comment about how this works?
>> People might get confused over the "32 << something".
> 
> Yeah, will refine for it.
> 
> Thanks,
> Leo
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 10/14] perf arm-spe: Refactor event type handling
  2020-10-21  9:20       ` André Przywara
@ 2020-10-21 10:13         ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-21 10:13 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

Hi Andre,

On Wed, Oct 21, 2020 at 10:20:24AM +0100, André Przywara wrote:

[...]

> >>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>> @@ -284,58 +284,58 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >>>  		if (ret < 0)
> >>>  			return ret;
> >>>  
> >>> -		if (payload & 0x1) {
> >>> +		if (payload & SPE_EVT_PKT_GEN_EXCEPTION) {
> >>
> >> Having the bitmask here directly is indeed not very nice and error
> >> prone. But I would rather see the above solution:
> >> 		if (payload & BIT(EV_EXCEPTION_GEN)) {
> > 
> > Will do.
> > 
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " EXCEPTION-GEN");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x2) {
> >>> +		if (payload & SPE_EVT_PKT_ARCH_RETIRED) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " RETIRED");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x4) {
> >>> +		if (payload & SPE_EVT_PKT_L1D_ACCESS) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-ACCESS");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x8) {
> >>> +		if (payload & SPE_EVT_PKT_L1D_REFILL) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " L1D-REFILL");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x10) {
> >>> +		if (payload & SPE_EVT_PKT_TLB_ACCESS) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-ACCESS");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x20) {
> >>> +		if (payload & SPE_EVT_PKT_TLB_WALK) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " TLB-REFILL");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x40) {
> >>> +		if (payload & SPE_EVT_PKT_NOT_TAKEN) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " NOT-TAKEN");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>> -		if (payload & 0x80) {
> >>> +		if (payload & SPE_EVT_PKT_MISPREDICTED) {
> >>>  			ret = arm_spe_pkt_snprintf(&buf, &blen, " MISPRED");
> >>>  			if (ret < 0)
> >>>  				return ret;
> >>>  		}
> >>>  		if (idx > 1) {
> >>
> >> Do you know what the purpose of this comparison is? Surely payload would
> >> not contain more bits than would fit in "idx" bytes? So is this some
> >> attempt of an optimisation?
> > 
> > Here "idx" is for payload size (in bytes); you could see function
> > arm_spe_get_events() calculate the payload size:
> > 
> >   packet->index = PAYLOAD_LEN(buf[0]);
> > 
> > Please note, the raw payload size (field "sz" in header) value is:
> > 
> >   0b00 Byte.
> >   0b01 Halfword.
> >   0b10 Word.
> >   0b11 Doubleword.
> > 
> > After using PAYLOAD_LEN(), the payload size is converted to value in
> > byte, so:
> > 
> >   packet->index = 1 << "sz";
> > 
> >   1  Byte
> >   2  Halfword
> >   4  Word
> >   8  Doubleword
> > 
> > In Armv8 ARM, chapter "D10.2.6 Events packet", we can see the events
> > "Remote access", "Last Level cache miss" and "Last Level cache access"
> > are only valid when "sz" is equal or longer than Halfword, thus idx is
> > 2/4/8; this is why here checks the condition "if (idx > 1)".
> 
> Right, thanks for the explanation. But in the end this is just a lot of
> words for: "You can only fit n*8 bits in n bytes.", isn't it?
> So if the payload size is 1 bytes, we can't have bits 8 or higher.
> 
> And in arm_spe_get_payload() we load payload with casts, so the upper
> bits, beyond the payload size, must always be 0? Regardless of what was
> in the buffer. Or am I looking at the wrong function?

You are right!  Sorry I didn't connect with the function
arm_spe_get_payload(), so packet->payload has been been casted, so we
don't need to do extra checking for payload size at here.

Will remove the condition checking in next spin.

Thanks a lot!

Leo

> Even if that wouldn't be the case, I'd rather mask it here again, so
> that we can rely on this, and lose the extra check.
> 
> > 
> >> If so, I doubt it's really useful, the
> >> compiler might find a smarter solution to the problem. Just continuing
> >> with the bit mask comparison would make it look nicer, I think.
> > 
> > ARMv8 ARM gives out "Otherwise this bit reads-as-zero.", IIUC this
> > suggests to firstly check the size, if cannot meet the size requirement,
> > then the Event bit should be reads-as-zero.
> 
> But as mentioned above, we take care of this already:
>         switch (payload_len) {
>         case 1: packet->payload = *(uint8_t *)buf; break;
>         case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break;
> 	...
> 
> Thanks,
> Andre

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-10-21  9:26       ` André Przywara
@ 2020-10-21 10:17         ` Leo Yan
  2020-10-21 14:53           ` André Przywara
  0 siblings, 1 reply; 43+ messages in thread
From: Leo Yan @ 2020-10-21 10:17 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On Wed, Oct 21, 2020 at 10:26:07AM +0100, André Przywara wrote:
> On 21/10/2020 06:10, Leo Yan wrote:
> 
> Hi,
> 
> > On Tue, Oct 20, 2020 at 10:54:44PM +0100, Andr� Przywara wrote:
> >> On 29/09/2020 14:39, Leo Yan wrote:
> >>
> >> Hi,
> >>
> >>> From: Wei Li <liwei391@huawei.com>
> >>>
> >>> This patch is to support Armv8.3 extension for SPE, it adds alignment
> >>> field in the Events packet and it supports the Scalable Vector Extension
> >>> (SVE) for Operation packet and Events packet with two additions:
> >>>
> >>>   - The vector length for SVE operations in the Operation Type packet;
> >>>   - The incomplete predicate and empty predicate fields in the Events
> >>>     packet.
> >>>
> >>> Signed-off-by: Wei Li <liwei391@huawei.com>
> >>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> >>> ---
> >>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
> >>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
> >>>  2 files changed, 87 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>> index 05a4c74399d7..3ec381fddfcb 100644
> >>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >>>  					return ret;
> >>>  			}
> >>>  		}
> >>> +		if (idx > 2) {
> >>
> >> As I mentioned in the other patch, I doubt this extra comparison is
> >> useful. Does that protect us from anything?
> > 
> > It's the same reason with Event packet which have explained for replying
> > patch 10, the condition is to respect the SPE specifiction:
> > 
> >   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
> >      Alignment.
> >      ...
> >      Otherwise this bit reads-as-zero.
> > 
> > So we gives higher priority for checking payload size than the Event
> > bit setting; if you have other thinking for this, please let me know.
> 
> Ah, thanks for pointing this out. It looks like a bug in the manual
> then, because I don't see why bit 11 should be any different from bits
> [10:8] and bits [15:12] in this respect. And in the diagrams above you
> clearly see bit 11 being shown even when SZ == 0b01.
> 
> I will try to follow this up here.

Thanks for following up!

> >>> +			if (payload & SPE_EVT_PKT_ALIGNMENT) {
> >>
> >> Mmh, but this is bit 11, right?
> > 
> > Yes.
> > 
> >> So would need to go into the (idx > 1)
> >> section (covering bits 8-15)? Another reason to ditch this comparison above.
> > 
> > As has explained in patch 10, idx is not the same thing with "sz"
> > field; "idx" stands for payload length in bytes, so:
> > 
> >   idx = 1 << sz
> > 
> > The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
> > here use the condition "(idx > 2)".
> > 
> > I think here need to refine code for more explict expression so can
> > avoid confusion.  So I think it's better to condition such like:
> > 
> >   if (payload_len >= 4) {
> 
> Yes, that would be (or have been) more helpful, but as mentioned in the
> other patch, I'd rather see those comparisons go entirely.

Agree.  Will remove comparisons in next version.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-10-21 10:17         ` Leo Yan
@ 2020-10-21 14:53           ` André Przywara
  2020-10-22  0:44             ` Leo Yan
  0 siblings, 1 reply; 43+ messages in thread
From: André Przywara @ 2020-10-21 14:53 UTC (permalink / raw)
  To: Leo Yan
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

On 21/10/2020 11:17, Leo Yan wrote:

Hi Leo,

> On Wed, Oct 21, 2020 at 10:26:07AM +0100, Andr� Przywara wrote:
>> On 21/10/2020 06:10, Leo Yan wrote:
>>
>> Hi,
>>
>>> On Tue, Oct 20, 2020 at 10:54:44PM +0100, Andr� Przywara wrote:
>>>> On 29/09/2020 14:39, Leo Yan wrote:
>>>>
>>>> Hi,
>>>>
>>>>> From: Wei Li <liwei391@huawei.com>
>>>>>
>>>>> This patch is to support Armv8.3 extension for SPE, it adds alignment
>>>>> field in the Events packet and it supports the Scalable Vector Extension
>>>>> (SVE) for Operation packet and Events packet with two additions:
>>>>>
>>>>>   - The vector length for SVE operations in the Operation Type packet;
>>>>>   - The incomplete predicate and empty predicate fields in the Events
>>>>>     packet.
>>>>>
>>>>> Signed-off-by: Wei Li <liwei391@huawei.com>
>>>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>>>> ---
>>>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.c     | 84 ++++++++++++++++++-
>>>>>  .../arm-spe-decoder/arm-spe-pkt-decoder.h     |  6 ++
>>>>>  2 files changed, 87 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> index 05a4c74399d7..3ec381fddfcb 100644
>>>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
>>>>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
>>>>>  					return ret;
>>>>>  			}
>>>>>  		}
>>>>> +		if (idx > 2) {
>>>>
>>>> As I mentioned in the other patch, I doubt this extra comparison is
>>>> useful. Does that protect us from anything?
>>>
>>> It's the same reason with Event packet which have explained for replying
>>> patch 10, the condition is to respect the SPE specifiction:
>>>
>>>   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
>>>      Alignment.
>>>      ...
>>>      Otherwise this bit reads-as-zero.
>>>
>>> So we gives higher priority for checking payload size than the Event
>>> bit setting; if you have other thinking for this, please let me know.
>>
>> Ah, thanks for pointing this out. It looks like a bug in the manual
>> then, because I don't see why bit 11 should be any different from bits
>> [10:8] and bits [15:12] in this respect. And in the diagrams above you
>> clearly see bit 11 being shown even when SZ == 0b01.
>>
>> I will try to follow this up here.
> 
> Thanks for following up!

Just got the confirmation that this is indeed a bug in the manual. It
will be fixed, but since the ARM ARM isn't published on a daily base, it
might take a while to trickle in.

Cheers,
Andre


> 
>>>>> +			if (payload & SPE_EVT_PKT_ALIGNMENT) {
>>>>
>>>> Mmh, but this is bit 11, right?
>>>
>>> Yes.
>>>
>>>> So would need to go into the (idx > 1)
>>>> section (covering bits 8-15)? Another reason to ditch this comparison above.
>>>
>>> As has explained in patch 10, idx is not the same thing with "sz"
>>> field; "idx" stands for payload length in bytes, so:
>>>
>>>   idx = 1 << sz
>>>
>>> The spec defines the sz is 2 or 3, thus idx is 4 or 8; so this is why
>>> here use the condition "(idx > 2)".
>>>
>>> I think here need to refine code for more explict expression so can
>>> avoid confusion.  So I think it's better to condition such like:
>>>
>>>   if (payload_len >= 4) {
>>
>> Yes, that would be (or have been) more helpful, but as mentioned in the
>> other patch, I'd rather see those comparisons go entirely.
> 
> Agree.  Will remove comparisons in next version.
> 
> Thanks,
> Leo
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE
  2020-10-21 14:53           ` André Przywara
@ 2020-10-22  0:44             ` Leo Yan
  0 siblings, 0 replies; 43+ messages in thread
From: Leo Yan @ 2020-10-22  0:44 UTC (permalink / raw)
  To: André Przywara
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Wei Li, James Clark, Dave Martin, linux-kernel, Al Grant

Hi Andre,

On Wed, Oct 21, 2020 at 03:53:31PM +0100, André Przywara wrote:

[...]

> >>>>> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>>>> index 05a4c74399d7..3ec381fddfcb 100644
> >>>>> --- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>>>> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
> >>>>> @@ -342,14 +342,73 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
> >>>>>  					return ret;
> >>>>>  			}
> >>>>>  		}
> >>>>> +		if (idx > 2) {
> >>>>
> >>>> As I mentioned in the other patch, I doubt this extra comparison is
> >>>> useful. Does that protect us from anything?
> >>>
> >>> It's the same reason with Event packet which have explained for replying
> >>> patch 10, the condition is to respect the SPE specifiction:
> >>>
> >>>   E[11], byte 1, bit [11], when SZ == 0b10 , or SZ == 0b11
> >>>      Alignment.
> >>>      ...
> >>>      Otherwise this bit reads-as-zero.
> >>>
> >>> So we gives higher priority for checking payload size than the Event
> >>> bit setting; if you have other thinking for this, please let me know.
> >>
> >> Ah, thanks for pointing this out. It looks like a bug in the manual
> >> then, because I don't see why bit 11 should be any different from bits
> >> [10:8] and bits [15:12] in this respect. And in the diagrams above you
> >> clearly see bit 11 being shown even when SZ == 0b01.
> >>
> >> I will try to follow this up here.
> > 
> > Thanks for following up!
> 
> Just got the confirmation that this is indeed a bug in the manual. It
> will be fixed, but since the ARM ARM isn't published on a daily base, it
> might take a while to trickle in.

Thanks for confirmation and sharing the info!

Leo

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2020-10-22  0:44 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-29 13:39 [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Leo Yan
2020-09-29 13:39 ` [PATCH v2 01/14] perf arm-spe: Include bitops.h for BIT() macro Leo Yan
2020-10-08 13:44   ` André Przywara
2020-09-29 13:39 ` [PATCH v2 02/14] perf arm-spe: Fix a typo in comment Leo Yan
2020-10-08 13:44   ` André Przywara
2020-09-29 13:39 ` [PATCH v2 03/14] perf arm-spe: Refactor payload length calculation Leo Yan
2020-10-08 13:44   ` André Przywara
2020-10-12  0:21     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 04/14] perf arm-spe: Fix packet length handling Leo Yan
2020-10-08 13:45   ` André Przywara
2020-09-29 13:39 ` [PATCH v2 05/14] perf arm-spe: Refactor printing string to buffer Leo Yan
2020-10-08 13:46   ` André Przywara
2020-10-12  0:29     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 06/14] perf arm-spe: Refactor packet header parsing Leo Yan
2020-10-08 19:49   ` André Przywara
2020-10-12  1:00     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 07/14] perf arm-spe: Refactor address packet handling Leo Yan
2020-10-19  9:01   ` André Przywara
2020-10-19 10:41     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 08/14] perf arm-spe: Refactor context " Leo Yan
2020-10-20 21:53   ` André Przywara
2020-09-29 13:39 ` [PATCH v2 09/14] perf arm-spe: Refactor counter " Leo Yan
2020-10-20 21:53   ` André Przywara
2020-10-21  3:52     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 10/14] perf arm-spe: Refactor event type handling Leo Yan
2020-10-20 21:54   ` André Przywara
2020-10-21  4:54     ` Leo Yan
2020-10-21  9:20       ` André Przywara
2020-10-21 10:13         ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 11/14] perf arm-spe: Refactor operation packet handling Leo Yan
2020-09-29 13:39 ` [PATCH v2 12/14] perf arm-spe: Add more sub classes for operation packet Leo Yan
2020-10-20 21:54   ` André Przywara
2020-10-21  5:16     ` Leo Yan
2020-09-29 13:39 ` [PATCH v2 13/14] perf arm_spe: Decode memory tagging properties Leo Yan
2020-09-29 13:39 ` [PATCH v2 14/14] perf arm-spe: Add support for ARMv8.3-SPE Leo Yan
2020-10-20 21:54   ` André Przywara
2020-10-21  5:10     ` Leo Yan
2020-10-21  9:26       ` André Przywara
2020-10-21 10:17         ` Leo Yan
2020-10-21 14:53           ` André Przywara
2020-10-22  0:44             ` Leo Yan
2020-10-13 14:53 ` [PATCH v2 00/14] perf arm-spe: Refactor decoding & dumping flow Arnaldo Carvalho de Melo
2020-10-13 15:19   ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).