All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] perf: Expand captured branch types
@ 2022-01-28  5:44 ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland

Branch Record Buffer Extension (BRBE) implementation on arm64 captures more
branch type classification which cannot be accommodated in the current perf
branch record format via perf_branch_entry.type element. To overcome this
limitation, perf_branch_entry.type needs to be expanded along with generic
branch classification.

This series achieves this expansion. But before that it adds more generic
branch classification types which can be accommodated without any changes,
while also updating the x86 implementation as required.

This series applies on v5.17-rc1

Please find BRBE captured branch type classification reference here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers/
BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en

Proposed perf branch stack enablement on arm64 platform can be found here.

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

These patches are split from the above series (PATCH 9/11, PATCH 10/11) as
these perf changes are pre-requisite for BRBE enablement. Although there is
another proposal (PATCH 11/11), extending struct perf_branch_entry further
to capture branch record privilege level information (priv, 2 bits), it is
not included here. But please do let me know if that proposal should also
be discussed in this series. Thank you.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (2):
  perf: Add more generic branch types
  perf: Expand perf_branch_entry.type

 arch/x86/events/intel/lbr.c           |  4 ++--
 include/uapi/linux/perf_event.h       | 15 +++++++++++++--
 tools/include/uapi/linux/perf_event.h | 15 +++++++++++++--
 tools/perf/util/branch.c              | 13 ++++++++++++-
 tools/perf/util/branch.h              |  4 ++--
 5 files changed, 42 insertions(+), 9 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 0/2] perf: Expand captured branch types
@ 2022-01-28  5:44 ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland

Branch Record Buffer Extension (BRBE) implementation on arm64 captures more
branch type classification which cannot be accommodated in the current perf
branch record format via perf_branch_entry.type element. To overcome this
limitation, perf_branch_entry.type needs to be expanded along with generic
branch classification.

This series achieves this expansion. But before that it adds more generic
branch classification types which can be accommodated without any changes,
while also updating the x86 implementation as required.

This series applies on v5.17-rc1

Please find BRBE captured branch type classification reference here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers/
BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en

Proposed perf branch stack enablement on arm64 platform can be found here.

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

These patches are split from the above series (PATCH 9/11, PATCH 10/11) as
these perf changes are pre-requisite for BRBE enablement. Although there is
another proposal (PATCH 11/11), extending struct perf_branch_entry further
to capture branch record privilege level information (priv, 2 bits), it is
not included here. But please do let me know if that proposal should also
be discussed in this series. Thank you.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (2):
  perf: Add more generic branch types
  perf: Expand perf_branch_entry.type

 arch/x86/events/intel/lbr.c           |  4 ++--
 include/uapi/linux/perf_event.h       | 15 +++++++++++++--
 tools/include/uapi/linux/perf_event.h | 15 +++++++++++++--
 tools/perf/util/branch.c              | 13 ++++++++++++-
 tools/perf/util/branch.h              |  4 ++--
 5 files changed, 42 insertions(+), 9 deletions(-)

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] perf: Add more generic branch types
  2022-01-28  5:44 ` Anshuman Khandual
@ 2022-01-28  5:44   ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

This expands generic branch type classification by adding some more entries
, that can still be represented with the existing 4 bit 'type' field. While
here this also updates the x86 implementation with these new branch types.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/x86/events/intel/lbr.c           | 4 ++--
 include/uapi/linux/perf_event.h       | 5 +++++
 tools/include/uapi/linux/perf_event.h | 5 +++++
 tools/perf/util/branch.c              | 7 ++++++-
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 8043213b75a5..9f86fac8c6a5 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
 	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
 	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
 	PERF_BR_UNKNOWN,	/* X86_BR_INT */
-	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
+	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
 	PERF_BR_COND,		/* X86_BR_JCC */
 	PERF_BR_UNCOND,		/* X86_BR_JMP */
-	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
+	PERF_BR_IRQ,		/* X86_BR_IRQ */
 	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
 	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
 	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1b65042ab1db..b91d0f575d0c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -251,6 +251,11 @@ enum {
 	PERF_BR_SYSRET		= 8,	/* syscall return */
 	PERF_BR_COND_CALL	= 9,	/* conditional function call */
 	PERF_BR_COND_RET	= 10,	/* conditional function return */
+	PERF_BR_EXPT_RET	= 11,	/* exception return */
+	PERF_BR_IRQ		= 12,	/* irq */
+	PERF_BR_FIQ		= 13,	/* fiq */
+	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
+	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
 	PERF_BR_MAX,
 };
 
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 4cd39aaccbe7..1882054e8684 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -251,6 +251,11 @@ enum {
 	PERF_BR_SYSRET		= 8,	/* syscall return */
 	PERF_BR_COND_CALL	= 9,	/* conditional function call */
 	PERF_BR_COND_RET	= 10,	/* conditional function return */
+	PERF_BR_EXPT_RET	= 11,	/* exception return */
+	PERF_BR_IRQ		= 12,	/* irq */
+	PERF_BR_FIQ		= 13,	/* fiq */
+	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
+	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
 	PERF_BR_MAX,
 };
 
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index 2285b1eb3128..74e5e67b1779 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -49,7 +49,12 @@ const char *branch_type_name(int type)
 		"SYSCALL",
 		"SYSRET",
 		"COND_CALL",
-		"COND_RET"
+		"COND_RET",
+		"EXPT_RET",
+		"IRQ",
+		"FIQ",
+		"DEBUG_HALT",
+		"DEBUG_EXIT"
 	};
 
 	if (type >= 0 && type < PERF_BR_MAX)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 1/2] perf: Add more generic branch types
@ 2022-01-28  5:44   ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

This expands generic branch type classification by adding some more entries
, that can still be represented with the existing 4 bit 'type' field. While
here this also updates the x86 implementation with these new branch types.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/x86/events/intel/lbr.c           | 4 ++--
 include/uapi/linux/perf_event.h       | 5 +++++
 tools/include/uapi/linux/perf_event.h | 5 +++++
 tools/perf/util/branch.c              | 7 ++++++-
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 8043213b75a5..9f86fac8c6a5 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
 	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
 	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
 	PERF_BR_UNKNOWN,	/* X86_BR_INT */
-	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
+	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
 	PERF_BR_COND,		/* X86_BR_JCC */
 	PERF_BR_UNCOND,		/* X86_BR_JMP */
-	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
+	PERF_BR_IRQ,		/* X86_BR_IRQ */
 	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
 	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
 	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1b65042ab1db..b91d0f575d0c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -251,6 +251,11 @@ enum {
 	PERF_BR_SYSRET		= 8,	/* syscall return */
 	PERF_BR_COND_CALL	= 9,	/* conditional function call */
 	PERF_BR_COND_RET	= 10,	/* conditional function return */
+	PERF_BR_EXPT_RET	= 11,	/* exception return */
+	PERF_BR_IRQ		= 12,	/* irq */
+	PERF_BR_FIQ		= 13,	/* fiq */
+	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
+	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
 	PERF_BR_MAX,
 };
 
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 4cd39aaccbe7..1882054e8684 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -251,6 +251,11 @@ enum {
 	PERF_BR_SYSRET		= 8,	/* syscall return */
 	PERF_BR_COND_CALL	= 9,	/* conditional function call */
 	PERF_BR_COND_RET	= 10,	/* conditional function return */
+	PERF_BR_EXPT_RET	= 11,	/* exception return */
+	PERF_BR_IRQ		= 12,	/* irq */
+	PERF_BR_FIQ		= 13,	/* fiq */
+	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
+	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
 	PERF_BR_MAX,
 };
 
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index 2285b1eb3128..74e5e67b1779 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -49,7 +49,12 @@ const char *branch_type_name(int type)
 		"SYSCALL",
 		"SYSRET",
 		"COND_CALL",
-		"COND_RET"
+		"COND_RET",
+		"EXPT_RET",
+		"IRQ",
+		"FIQ",
+		"DEBUG_HALT",
+		"DEBUG_EXIT"
 	};
 
 	if (type >= 0 && type < PERF_BR_MAX)
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-01-28  5:44 ` Anshuman Khandual
@ 2022-01-28  5:44   ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

Current perf_branch_entry.type is a 4 bits field just enough to accommodate
16 generic branch types. This is insufficient to accommodate platforms like
arm64 which has much more branch types. Lets just expands this field into a
6 bits one, which can now hold 64 generic branch types. This also adds more
generic branch types.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/uapi/linux/perf_event.h       | 10 ++++++++--
 tools/include/uapi/linux/perf_event.h | 10 ++++++++--
 tools/perf/util/branch.c              |  8 +++++++-
 tools/perf/util/branch.h              |  4 ++--
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b91d0f575d0c..361fdc6b87a0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -256,6 +256,12 @@ enum {
 	PERF_BR_FIQ		= 13,	/* fiq */
 	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
 	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
+	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
+	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
+	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
+	PERF_BR_FAULT_DATA	= 19,	/* data fault */
+	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
+	PERF_BR_SERROR		= 21,	/* system error */
 	PERF_BR_MAX,
 };
 
@@ -1370,8 +1376,8 @@ struct perf_branch_entry {
 		in_tx:1,    /* in transaction */
 		abort:1,    /* transaction abort */
 		cycles:16,  /* cycle count to last branch */
-		type:4,     /* branch type */
-		reserved:40;
+		type:6,     /* branch type */
+		reserved:38;
 };
 
 union perf_sample_weight {
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 1882054e8684..9a82b8aaed93 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -256,6 +256,12 @@ enum {
 	PERF_BR_FIQ		= 13,	/* fiq */
 	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
 	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
+	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
+	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
+	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
+	PERF_BR_FAULT_DATA	= 19,	/* data fault */
+	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
+	PERF_BR_SERROR		= 21,	/* system error */
 	PERF_BR_MAX,
 };
 
@@ -1370,8 +1376,8 @@ struct perf_branch_entry {
 		in_tx:1,    /* in transaction */
 		abort:1,    /* transaction abort */
 		cycles:16,  /* cycle count to last branch */
-		type:4,     /* branch type */
-		reserved:40;
+		type:6,     /* branch type */
+		reserved:38;
 };
 
 union perf_sample_weight {
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index 74e5e67b1779..1e216ea2e2a8 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -54,7 +54,13 @@ const char *branch_type_name(int type)
 		"IRQ",
 		"FIQ",
 		"DEBUG_HALT",
-		"DEBUG_EXIT"
+		"DEBUG_EXIT",
+		"DEBUG_INST",
+		"DEBUG_DATA",
+		"FAULT_ALGN",
+		"FAULT_DATA",
+		"FAULT_INST",
+		"SERROR"
 	};
 
 	if (type >= 0 && type < PERF_BR_MAX)
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index 17b2ccc61094..875d99abdc36 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -23,8 +23,8 @@ struct branch_flags {
 			u64 in_tx:1;
 			u64 abort:1;
 			u64 cycles:16;
-			u64 type:4;
-			u64 reserved:40;
+			u64 type:6;
+			u64 reserved:38;
 		};
 	};
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-01-28  5:44   ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-01-28  5:44 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel
  Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

Current perf_branch_entry.type is a 4 bits field just enough to accommodate
16 generic branch types. This is insufficient to accommodate platforms like
arm64 which has much more branch types. Lets just expands this field into a
6 bits one, which can now hold 64 generic branch types. This also adds more
generic branch types.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/uapi/linux/perf_event.h       | 10 ++++++++--
 tools/include/uapi/linux/perf_event.h | 10 ++++++++--
 tools/perf/util/branch.c              |  8 +++++++-
 tools/perf/util/branch.h              |  4 ++--
 4 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b91d0f575d0c..361fdc6b87a0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -256,6 +256,12 @@ enum {
 	PERF_BR_FIQ		= 13,	/* fiq */
 	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
 	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
+	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
+	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
+	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
+	PERF_BR_FAULT_DATA	= 19,	/* data fault */
+	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
+	PERF_BR_SERROR		= 21,	/* system error */
 	PERF_BR_MAX,
 };
 
@@ -1370,8 +1376,8 @@ struct perf_branch_entry {
 		in_tx:1,    /* in transaction */
 		abort:1,    /* transaction abort */
 		cycles:16,  /* cycle count to last branch */
-		type:4,     /* branch type */
-		reserved:40;
+		type:6,     /* branch type */
+		reserved:38;
 };
 
 union perf_sample_weight {
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 1882054e8684..9a82b8aaed93 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -256,6 +256,12 @@ enum {
 	PERF_BR_FIQ		= 13,	/* fiq */
 	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
 	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
+	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
+	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
+	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
+	PERF_BR_FAULT_DATA	= 19,	/* data fault */
+	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
+	PERF_BR_SERROR		= 21,	/* system error */
 	PERF_BR_MAX,
 };
 
@@ -1370,8 +1376,8 @@ struct perf_branch_entry {
 		in_tx:1,    /* in transaction */
 		abort:1,    /* transaction abort */
 		cycles:16,  /* cycle count to last branch */
-		type:4,     /* branch type */
-		reserved:40;
+		type:6,     /* branch type */
+		reserved:38;
 };
 
 union perf_sample_weight {
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index 74e5e67b1779..1e216ea2e2a8 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -54,7 +54,13 @@ const char *branch_type_name(int type)
 		"IRQ",
 		"FIQ",
 		"DEBUG_HALT",
-		"DEBUG_EXIT"
+		"DEBUG_EXIT",
+		"DEBUG_INST",
+		"DEBUG_DATA",
+		"FAULT_ALGN",
+		"FAULT_DATA",
+		"FAULT_INST",
+		"SERROR"
 	};
 
 	if (type >= 0 && type < PERF_BR_MAX)
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index 17b2ccc61094..875d99abdc36 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -23,8 +23,8 @@ struct branch_flags {
 			u64 in_tx:1;
 			u64 abort:1;
 			u64 cycles:16;
-			u64 type:4;
-			u64 reserved:40;
+			u64 type:6;
+			u64 reserved:38;
 		};
 	};
 };
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-01-28  5:44   ` Anshuman Khandual
@ 2022-02-02 11:57     ` Mark Rutland
  -1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-02 11:57 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> Current perf_branch_entry.type is a 4 bits field just enough to accommodate
> 16 generic branch types. This is insufficient to accommodate platforms like
> arm64 which has much more branch types.

It would be good to mention BRBE specifically here, along with specific values
and a rought intro, e.g.

| The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of
| branch/exception/return: <rough summary here>. There's not enough space to
| describe these all in perf_branch_entry.type, as this is a 4-bit field.

That way reviewers (and anyone looking at the patch in future) have a lot more
rationale to work with. A rough summary of the distinct branch types would be
*really* helpful.

> Lets just expands this field into a 6 bits one, which can now hold 64 generic
> branch types.

Is it safe (ABI-wise) to extend a bit-field like this? Does that break any
combination of old/new userspace and old/new kernel? I'm not sure how bit
fields are managed w.r.t. endianness, but normally extending a field would
break BE, so this seems suspicious.

I suspect we might need to allocate a *separate* field for new values, and
possibly reserve a value in the existing field to say "go look at the new
field".

Do you have any rationale for 64 values specifically? e.g. is that mostly for
future extensibility? How many will we need for Arm's BRBE?

Do those types fall into a hierarchy, that we could split across separate
fields?

> This also adds more generic branch types.

This feels like ti should be in a separate/subsequent patch. If nothing else
that aids bisectability if changing the size of the field breaks anything.

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  include/uapi/linux/perf_event.h       | 10 ++++++++--
>  tools/include/uapi/linux/perf_event.h | 10 ++++++++--
>  tools/perf/util/branch.c              |  8 +++++++-
>  tools/perf/util/branch.h              |  4 ++--
>  4 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b91d0f575d0c..361fdc6b87a0 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -256,6 +256,12 @@ enum {
>  	PERF_BR_FIQ		= 13,	/* fiq */
>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */

This is really unclear. What is "instruction debug" vs "data debug" ?

Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK
instructions?

> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */

There are many other potential faults a CPU could take; are these specifically
what Arm's BRBE provides?

> +	PERF_BR_SERROR		= 21,	/* system error */

This is really arm-specific; IIUC the closest thing on x86 is an MCE.

>  	PERF_BR_MAX,
>  };
>  
> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>  		in_tx:1,    /* in transaction */
>  		abort:1,    /* transaction abort */
>  		cycles:16,  /* cycle count to last branch */
> -		type:4,     /* branch type */
> -		reserved:40;
> +		type:6,     /* branch type */

As above, is this a safe-change ABI-wise?

Thanks,
Mark.

> +		reserved:38;
>  };
>  
>  union perf_sample_weight {
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 1882054e8684..9a82b8aaed93 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -256,6 +256,12 @@ enum {
>  	PERF_BR_FIQ		= 13,	/* fiq */
>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
> +	PERF_BR_SERROR		= 21,	/* system error */
>  	PERF_BR_MAX,
>  };
>  
> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>  		in_tx:1,    /* in transaction */
>  		abort:1,    /* transaction abort */
>  		cycles:16,  /* cycle count to last branch */
> -		type:4,     /* branch type */
> -		reserved:40;
> +		type:6,     /* branch type */
> +		reserved:38;
>  };
>  
>  union perf_sample_weight {
> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
> index 74e5e67b1779..1e216ea2e2a8 100644
> --- a/tools/perf/util/branch.c
> +++ b/tools/perf/util/branch.c
> @@ -54,7 +54,13 @@ const char *branch_type_name(int type)
>  		"IRQ",
>  		"FIQ",
>  		"DEBUG_HALT",
> -		"DEBUG_EXIT"
> +		"DEBUG_EXIT",
> +		"DEBUG_INST",
> +		"DEBUG_DATA",
> +		"FAULT_ALGN",
> +		"FAULT_DATA",
> +		"FAULT_INST",
> +		"SERROR"
>  	};
>  
>  	if (type >= 0 && type < PERF_BR_MAX)
> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
> index 17b2ccc61094..875d99abdc36 100644
> --- a/tools/perf/util/branch.h
> +++ b/tools/perf/util/branch.h
> @@ -23,8 +23,8 @@ struct branch_flags {
>  			u64 in_tx:1;
>  			u64 abort:1;
>  			u64 cycles:16;
> -			u64 type:4;
> -			u64 reserved:40;
> +			u64 type:6;
> +			u64 reserved:38;
>  		};
>  	};
>  };
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-02-02 11:57     ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-02 11:57 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> Current perf_branch_entry.type is a 4 bits field just enough to accommodate
> 16 generic branch types. This is insufficient to accommodate platforms like
> arm64 which has much more branch types.

It would be good to mention BRBE specifically here, along with specific values
and a rought intro, e.g.

| The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of
| branch/exception/return: <rough summary here>. There's not enough space to
| describe these all in perf_branch_entry.type, as this is a 4-bit field.

That way reviewers (and anyone looking at the patch in future) have a lot more
rationale to work with. A rough summary of the distinct branch types would be
*really* helpful.

> Lets just expands this field into a 6 bits one, which can now hold 64 generic
> branch types.

Is it safe (ABI-wise) to extend a bit-field like this? Does that break any
combination of old/new userspace and old/new kernel? I'm not sure how bit
fields are managed w.r.t. endianness, but normally extending a field would
break BE, so this seems suspicious.

I suspect we might need to allocate a *separate* field for new values, and
possibly reserve a value in the existing field to say "go look at the new
field".

Do you have any rationale for 64 values specifically? e.g. is that mostly for
future extensibility? How many will we need for Arm's BRBE?

Do those types fall into a hierarchy, that we could split across separate
fields?

> This also adds more generic branch types.

This feels like ti should be in a separate/subsequent patch. If nothing else
that aids bisectability if changing the size of the field breaks anything.

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  include/uapi/linux/perf_event.h       | 10 ++++++++--
>  tools/include/uapi/linux/perf_event.h | 10 ++++++++--
>  tools/perf/util/branch.c              |  8 +++++++-
>  tools/perf/util/branch.h              |  4 ++--
>  4 files changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b91d0f575d0c..361fdc6b87a0 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -256,6 +256,12 @@ enum {
>  	PERF_BR_FIQ		= 13,	/* fiq */
>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */

This is really unclear. What is "instruction debug" vs "data debug" ?

Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK
instructions?

> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */

There are many other potential faults a CPU could take; are these specifically
what Arm's BRBE provides?

> +	PERF_BR_SERROR		= 21,	/* system error */

This is really arm-specific; IIUC the closest thing on x86 is an MCE.

>  	PERF_BR_MAX,
>  };
>  
> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>  		in_tx:1,    /* in transaction */
>  		abort:1,    /* transaction abort */
>  		cycles:16,  /* cycle count to last branch */
> -		type:4,     /* branch type */
> -		reserved:40;
> +		type:6,     /* branch type */

As above, is this a safe-change ABI-wise?

Thanks,
Mark.

> +		reserved:38;
>  };
>  
>  union perf_sample_weight {
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 1882054e8684..9a82b8aaed93 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -256,6 +256,12 @@ enum {
>  	PERF_BR_FIQ		= 13,	/* fiq */
>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
> +	PERF_BR_SERROR		= 21,	/* system error */
>  	PERF_BR_MAX,
>  };
>  
> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>  		in_tx:1,    /* in transaction */
>  		abort:1,    /* transaction abort */
>  		cycles:16,  /* cycle count to last branch */
> -		type:4,     /* branch type */
> -		reserved:40;
> +		type:6,     /* branch type */
> +		reserved:38;
>  };
>  
>  union perf_sample_weight {
> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
> index 74e5e67b1779..1e216ea2e2a8 100644
> --- a/tools/perf/util/branch.c
> +++ b/tools/perf/util/branch.c
> @@ -54,7 +54,13 @@ const char *branch_type_name(int type)
>  		"IRQ",
>  		"FIQ",
>  		"DEBUG_HALT",
> -		"DEBUG_EXIT"
> +		"DEBUG_EXIT",
> +		"DEBUG_INST",
> +		"DEBUG_DATA",
> +		"FAULT_ALGN",
> +		"FAULT_DATA",
> +		"FAULT_INST",
> +		"SERROR"
>  	};
>  
>  	if (type >= 0 && type < PERF_BR_MAX)
> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
> index 17b2ccc61094..875d99abdc36 100644
> --- a/tools/perf/util/branch.h
> +++ b/tools/perf/util/branch.h
> @@ -23,8 +23,8 @@ struct branch_flags {
>  			u64 in_tx:1;
>  			u64 abort:1;
>  			u64 cycles:16;
> -			u64 type:4;
> -			u64 reserved:40;
> +			u64 type:6;
> +			u64 reserved:38;
>  		};
>  	};
>  };
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
  2022-01-28  5:44   ` Anshuman Khandual
@ 2022-02-02 11:58     ` Mark Rutland
  -1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-02 11:58 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

Hi Anshuman,

On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
> This expands generic branch type classification by adding some more entries
> , that can still be represented with the existing 4 bit 'type' field. While
> here this also updates the x86 implementation with these new branch types.

It looks like there's some whitespace damage here.

From a quick scan of the below, I think the "exception return" and "IRQ
exception" types are somewhat generic, and could be added now (aside from any
bikeshedding over names), but:

* For IRQ vs FIQ, we might just want to have a top-level "asynchronous
  exception" type, and then further divide that with a separate field. That way
  it's easier to extend in future if new exceptions are added.

* I don't think the debug state entry/exits make sense as generic branch types,
  since those are somewhat specific to the ARM architecutre, and it might make
  sense to define generic PERF_BR_ARCH* definitions instead.

* Given the next patch extends the field, and therei are potential ABI problems
  with that, we might need to reserve a value for ABI extensibility purposes,
  and I suspect we need to do that *first*. More comments on the subsequent
  patch.
 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/x86/events/intel/lbr.c           | 4 ++--
>  include/uapi/linux/perf_event.h       | 5 +++++
>  tools/include/uapi/linux/perf_event.h | 5 +++++
>  tools/perf/util/branch.c              | 7 ++++++-
>  4 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 8043213b75a5..9f86fac8c6a5 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
>  	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
>  	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
>  	PERF_BR_UNKNOWN,	/* X86_BR_INT */
> -	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
> +	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
>  	PERF_BR_COND,		/* X86_BR_JCC */
>  	PERF_BR_UNCOND,		/* X86_BR_JMP */
> -	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
> +	PERF_BR_IRQ,		/* X86_BR_IRQ */
>  	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
>  	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
>  	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */

This presumably changes the values reported to userspace, so the commit message
should mention that and explain why that is not a problem.

> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 1b65042ab1db..b91d0f575d0c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -251,6 +251,11 @@ enum {
>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
> +	PERF_BR_EXPT_RET	= 11,	/* exception return */

We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'.
IIUC that's the naming the x86 FRED stuff is going to use anyhow.

> +	PERF_BR_IRQ		= 12,	/* irq */

This looks somewhat generic, so adding it now makes sense to me, but ...

> +	PERF_BR_FIQ		= 13,	/* fiq */

... this is arguably just a idfferent class of interrupt from the PoV of Linux,
and the naming is ARM-specific, so I don't think this makes sense to add *now*.
As above, maybe it would be better to have a generic "aynchronous exception" or
"interrupt" type, and a separate field to distinguish specific types of those.

> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */

For the benefit of those not familiar with the ARM architecture, "debug halt"
and "debug exit" usually refer to "debug state", which is what an external
(e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that
Linux uses.

Given that, I'm not sure these are very generic, and I suspect it would be
better to have more generic PERF_BR_ARCH_* entries for things like this.

Thanks,
Mark.

>  	PERF_BR_MAX,
>  };
>  
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 4cd39aaccbe7..1882054e8684 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -251,6 +251,11 @@ enum {
>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
> +	PERF_BR_IRQ		= 12,	/* irq */
> +	PERF_BR_FIQ		= 13,	/* fiq */
> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>  	PERF_BR_MAX,
>  };
>  
> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
> index 2285b1eb3128..74e5e67b1779 100644
> --- a/tools/perf/util/branch.c
> +++ b/tools/perf/util/branch.c
> @@ -49,7 +49,12 @@ const char *branch_type_name(int type)
>  		"SYSCALL",
>  		"SYSRET",
>  		"COND_CALL",
> -		"COND_RET"
> +		"COND_RET",
> +		"EXPT_RET",
> +		"IRQ",
> +		"FIQ",
> +		"DEBUG_HALT",
> +		"DEBUG_EXIT"
>  	};
>  
>  	if (type >= 0 && type < PERF_BR_MAX)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
@ 2022-02-02 11:58     ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-02 11:58 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

Hi Anshuman,

On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
> This expands generic branch type classification by adding some more entries
> , that can still be represented with the existing 4 bit 'type' field. While
> here this also updates the x86 implementation with these new branch types.

It looks like there's some whitespace damage here.

From a quick scan of the below, I think the "exception return" and "IRQ
exception" types are somewhat generic, and could be added now (aside from any
bikeshedding over names), but:

* For IRQ vs FIQ, we might just want to have a top-level "asynchronous
  exception" type, and then further divide that with a separate field. That way
  it's easier to extend in future if new exceptions are added.

* I don't think the debug state entry/exits make sense as generic branch types,
  since those are somewhat specific to the ARM architecutre, and it might make
  sense to define generic PERF_BR_ARCH* definitions instead.

* Given the next patch extends the field, and therei are potential ABI problems
  with that, we might need to reserve a value for ABI extensibility purposes,
  and I suspect we need to do that *first*. More comments on the subsequent
  patch.
 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/x86/events/intel/lbr.c           | 4 ++--
>  include/uapi/linux/perf_event.h       | 5 +++++
>  tools/include/uapi/linux/perf_event.h | 5 +++++
>  tools/perf/util/branch.c              | 7 ++++++-
>  4 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 8043213b75a5..9f86fac8c6a5 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
>  	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
>  	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
>  	PERF_BR_UNKNOWN,	/* X86_BR_INT */
> -	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
> +	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
>  	PERF_BR_COND,		/* X86_BR_JCC */
>  	PERF_BR_UNCOND,		/* X86_BR_JMP */
> -	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
> +	PERF_BR_IRQ,		/* X86_BR_IRQ */
>  	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
>  	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
>  	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */

This presumably changes the values reported to userspace, so the commit message
should mention that and explain why that is not a problem.

> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 1b65042ab1db..b91d0f575d0c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -251,6 +251,11 @@ enum {
>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
> +	PERF_BR_EXPT_RET	= 11,	/* exception return */

We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'.
IIUC that's the naming the x86 FRED stuff is going to use anyhow.

> +	PERF_BR_IRQ		= 12,	/* irq */

This looks somewhat generic, so adding it now makes sense to me, but ...

> +	PERF_BR_FIQ		= 13,	/* fiq */

... this is arguably just a idfferent class of interrupt from the PoV of Linux,
and the naming is ARM-specific, so I don't think this makes sense to add *now*.
As above, maybe it would be better to have a generic "aynchronous exception" or
"interrupt" type, and a separate field to distinguish specific types of those.

> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */

For the benefit of those not familiar with the ARM architecture, "debug halt"
and "debug exit" usually refer to "debug state", which is what an external
(e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that
Linux uses.

Given that, I'm not sure these are very generic, and I suspect it would be
better to have more generic PERF_BR_ARCH_* entries for things like this.

Thanks,
Mark.

>  	PERF_BR_MAX,
>  };
>  
> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
> index 4cd39aaccbe7..1882054e8684 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -251,6 +251,11 @@ enum {
>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
> +	PERF_BR_IRQ		= 12,	/* irq */
> +	PERF_BR_FIQ		= 13,	/* fiq */
> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>  	PERF_BR_MAX,
>  };
>  
> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
> index 2285b1eb3128..74e5e67b1779 100644
> --- a/tools/perf/util/branch.c
> +++ b/tools/perf/util/branch.c
> @@ -49,7 +49,12 @@ const char *branch_type_name(int type)
>  		"SYSCALL",
>  		"SYSRET",
>  		"COND_CALL",
> -		"COND_RET"
> +		"COND_RET",
> +		"EXPT_RET",
> +		"IRQ",
> +		"FIQ",
> +		"DEBUG_HALT",
> +		"DEBUG_EXIT"
>  	};
>  
>  	if (type >= 0 && type < PERF_BR_MAX)
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-02-02 11:57     ` Mark Rutland
@ 2022-02-04  4:55       ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-04  4:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/2/22 5:27 PM, Mark Rutland wrote:
> On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
>> Current perf_branch_entry.type is a 4 bits field just enough to accommodate
>> 16 generic branch types. This is insufficient to accommodate platforms like
>> arm64 which has much more branch types.
> 
> It would be good to mention BRBE specifically here, along with specific values
> and a rought intro, e.g.
> 
> | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of
> | branch/exception/return: <rough summary here>. There's not enough space to
> | describe these all in perf_branch_entry.type, as this is a 4-bit field.
> 
> That way reviewers (and anyone looking at the patch in future) have a lot more
> rationale to work with. A rough summary of the distinct branch types would be
> *really* helpful.

Sure, will add a summary describing the need for an ABI extension as suggested.

> 
>> Lets just expands this field into a 6 bits one, which can now hold 64 generic
>> branch types.
> 
> Is it safe (ABI-wise) to extend a bit-field like this? Does that break any
> combination of old/new userspace and old/new kernel? I'm not sure how bit
> fields are managed w.r.t. endianness, but normally extending a field would
> break BE, so this seems suspicious.

Probably. I guess we would need some more inputs/suggestions from others
regarding any potential issues and possible workarounds.

> 
> I suspect we might need to allocate a *separate* field for new values, and
> possibly reserve a value in the existing field to say "go look at the new
> field".

In that case there might be another level of indirection.

> 
> Do you have any rationale for 64 values specifically? e.g. is that mostly for
> future extensibility? How many will we need for Arm's BRBE?

Yeah that is mostly for future extensibility. BRBE's current requirement
will be well within 32 types itself. 19 to be specific.

> 
> Do those types fall into a hierarchy, that we could split across separate
> fields?
I will take a look and get back on this. But as mentioned before it will
cause additional level of indirection for a look up.

> 
>> This also adds more generic branch types.
> 
> This feels like ti should be in a separate/subsequent patch. If nothing else
> that aids bisectability if changing the size of the field breaks anything.

Makes sense, will split them.

> 
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-perf-users@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  include/uapi/linux/perf_event.h       | 10 ++++++++--
>>  tools/include/uapi/linux/perf_event.h | 10 ++++++++--
>>  tools/perf/util/branch.c              |  8 +++++++-
>>  tools/perf/util/branch.h              |  4 ++--
>>  4 files changed, 25 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index b91d0f575d0c..361fdc6b87a0 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>>  	PERF_BR_FIQ		= 13,	/* fiq */
>>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
>> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
> 
> This is really unclear. What is "instruction debug" vs "data debug" ?
> 
> Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK
> instructions?

Unfortunately the spec does not expand much in detail but will try
and find some more clarity.

> 
>> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
>> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
>> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
> 
> There are many other potential faults a CPU could take; are these specifically
> what Arm's BRBE provides?

Right, these are what BRBE captures for now.

> 
>> +	PERF_BR_SERROR		= 21,	/* system error */
> 
> This is really arm-specific; IIUC the closest thing on x86 is an MCE.

But ('unhandled' ?) system error can be a generic control flow change.

> 
>>  	PERF_BR_MAX,
>>  };
>>  
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>>  		in_tx:1,    /* in transaction */
>>  		abort:1,    /* transaction abort */
>>  		cycles:16,  /* cycle count to last branch */
>> -		type:4,     /* branch type */
>> -		reserved:40;
>> +		type:6,     /* branch type */
> 
> As above, is this a safe-change ABI-wise?

If the bit fields here cannot be expanded without breaking ABI, then
there is a fundamental problem. Only remaining option will be to add
new fields (with new width value) which could accommodate these new
required branch types.

> 
> Thanks,
> Mark.
> 
>> +		reserved:38;
>>  };
>>  
>>  union perf_sample_weight {
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index 1882054e8684..9a82b8aaed93 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>>  	PERF_BR_FIQ		= 13,	/* fiq */
>>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
>> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
>> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
>> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
>> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
>> +	PERF_BR_SERROR		= 21,	/* system error */
>>  	PERF_BR_MAX,
>>  };
>>  
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>>  		in_tx:1,    /* in transaction */
>>  		abort:1,    /* transaction abort */
>>  		cycles:16,  /* cycle count to last branch */
>> -		type:4,     /* branch type */
>> -		reserved:40;
>> +		type:6,     /* branch type */
>> +		reserved:38;
>>  };
>>  
>>  union perf_sample_weight {
>> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
>> index 74e5e67b1779..1e216ea2e2a8 100644
>> --- a/tools/perf/util/branch.c
>> +++ b/tools/perf/util/branch.c
>> @@ -54,7 +54,13 @@ const char *branch_type_name(int type)
>>  		"IRQ",
>>  		"FIQ",
>>  		"DEBUG_HALT",
>> -		"DEBUG_EXIT"
>> +		"DEBUG_EXIT",
>> +		"DEBUG_INST",
>> +		"DEBUG_DATA",
>> +		"FAULT_ALGN",
>> +		"FAULT_DATA",
>> +		"FAULT_INST",
>> +		"SERROR"
>>  	};
>>  
>>  	if (type >= 0 && type < PERF_BR_MAX)
>> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
>> index 17b2ccc61094..875d99abdc36 100644
>> --- a/tools/perf/util/branch.h
>> +++ b/tools/perf/util/branch.h
>> @@ -23,8 +23,8 @@ struct branch_flags {
>>  			u64 in_tx:1;
>>  			u64 abort:1;
>>  			u64 cycles:16;
>> -			u64 type:4;
>> -			u64 reserved:40;
>> +			u64 type:6;
>> +			u64 reserved:38;
>>  		};
>>  	};
>>  };
>> -- 
>> 2.25.1

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-02-04  4:55       ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-04  4:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/2/22 5:27 PM, Mark Rutland wrote:
> On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
>> Current perf_branch_entry.type is a 4 bits field just enough to accommodate
>> 16 generic branch types. This is insufficient to accommodate platforms like
>> arm64 which has much more branch types.
> 
> It would be good to mention BRBE specifically here, along with specific values
> and a rought intro, e.g.
> 
> | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of
> | branch/exception/return: <rough summary here>. There's not enough space to
> | describe these all in perf_branch_entry.type, as this is a 4-bit field.
> 
> That way reviewers (and anyone looking at the patch in future) have a lot more
> rationale to work with. A rough summary of the distinct branch types would be
> *really* helpful.

Sure, will add a summary describing the need for an ABI extension as suggested.

> 
>> Lets just expands this field into a 6 bits one, which can now hold 64 generic
>> branch types.
> 
> Is it safe (ABI-wise) to extend a bit-field like this? Does that break any
> combination of old/new userspace and old/new kernel? I'm not sure how bit
> fields are managed w.r.t. endianness, but normally extending a field would
> break BE, so this seems suspicious.

Probably. I guess we would need some more inputs/suggestions from others
regarding any potential issues and possible workarounds.

> 
> I suspect we might need to allocate a *separate* field for new values, and
> possibly reserve a value in the existing field to say "go look at the new
> field".

In that case there might be another level of indirection.

> 
> Do you have any rationale for 64 values specifically? e.g. is that mostly for
> future extensibility? How many will we need for Arm's BRBE?

Yeah that is mostly for future extensibility. BRBE's current requirement
will be well within 32 types itself. 19 to be specific.

> 
> Do those types fall into a hierarchy, that we could split across separate
> fields?
I will take a look and get back on this. But as mentioned before it will
cause additional level of indirection for a look up.

> 
>> This also adds more generic branch types.
> 
> This feels like ti should be in a separate/subsequent patch. If nothing else
> that aids bisectability if changing the size of the field breaks anything.

Makes sense, will split them.

> 
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-perf-users@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  include/uapi/linux/perf_event.h       | 10 ++++++++--
>>  tools/include/uapi/linux/perf_event.h | 10 ++++++++--
>>  tools/perf/util/branch.c              |  8 +++++++-
>>  tools/perf/util/branch.h              |  4 ++--
>>  4 files changed, 25 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index b91d0f575d0c..361fdc6b87a0 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>>  	PERF_BR_FIQ		= 13,	/* fiq */
>>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
>> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
> 
> This is really unclear. What is "instruction debug" vs "data debug" ?
> 
> Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK
> instructions?

Unfortunately the spec does not expand much in detail but will try
and find some more clarity.

> 
>> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
>> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
>> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
> 
> There are many other potential faults a CPU could take; are these specifically
> what Arm's BRBE provides?

Right, these are what BRBE captures for now.

> 
>> +	PERF_BR_SERROR		= 21,	/* system error */
> 
> This is really arm-specific; IIUC the closest thing on x86 is an MCE.

But ('unhandled' ?) system error can be a generic control flow change.

> 
>>  	PERF_BR_MAX,
>>  };
>>  
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>>  		in_tx:1,    /* in transaction */
>>  		abort:1,    /* transaction abort */
>>  		cycles:16,  /* cycle count to last branch */
>> -		type:4,     /* branch type */
>> -		reserved:40;
>> +		type:6,     /* branch type */
> 
> As above, is this a safe-change ABI-wise?

If the bit fields here cannot be expanded without breaking ABI, then
there is a fundamental problem. Only remaining option will be to add
new fields (with new width value) which could accommodate these new
required branch types.

> 
> Thanks,
> Mark.
> 
>> +		reserved:38;
>>  };
>>  
>>  union perf_sample_weight {
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index 1882054e8684..9a82b8aaed93 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -256,6 +256,12 @@ enum {
>>  	PERF_BR_FIQ		= 13,	/* fiq */
>>  	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>>  	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>> +	PERF_BR_DEBUG_INST	= 16,	/* instruciton debug */
>> +	PERF_BR_DEBUG_DATA	= 17,	/* data debug */
>> +	PERF_BR_FAULT_ALGN	= 18,	/* alignment fault */
>> +	PERF_BR_FAULT_DATA	= 19,	/* data fault */
>> +	PERF_BR_FAULT_INST	= 20,	/* instruction fault */
>> +	PERF_BR_SERROR		= 21,	/* system error */
>>  	PERF_BR_MAX,
>>  };
>>  
>> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
>>  		in_tx:1,    /* in transaction */
>>  		abort:1,    /* transaction abort */
>>  		cycles:16,  /* cycle count to last branch */
>> -		type:4,     /* branch type */
>> -		reserved:40;
>> +		type:6,     /* branch type */
>> +		reserved:38;
>>  };
>>  
>>  union perf_sample_weight {
>> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
>> index 74e5e67b1779..1e216ea2e2a8 100644
>> --- a/tools/perf/util/branch.c
>> +++ b/tools/perf/util/branch.c
>> @@ -54,7 +54,13 @@ const char *branch_type_name(int type)
>>  		"IRQ",
>>  		"FIQ",
>>  		"DEBUG_HALT",
>> -		"DEBUG_EXIT"
>> +		"DEBUG_EXIT",
>> +		"DEBUG_INST",
>> +		"DEBUG_DATA",
>> +		"FAULT_ALGN",
>> +		"FAULT_DATA",
>> +		"FAULT_INST",
>> +		"SERROR"
>>  	};
>>  
>>  	if (type >= 0 && type < PERF_BR_MAX)
>> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
>> index 17b2ccc61094..875d99abdc36 100644
>> --- a/tools/perf/util/branch.h
>> +++ b/tools/perf/util/branch.h
>> @@ -23,8 +23,8 @@ struct branch_flags {
>>  			u64 in_tx:1;
>>  			u64 abort:1;
>>  			u64 cycles:16;
>> -			u64 type:4;
>> -			u64 reserved:40;
>> +			u64 type:6;
>> +			u64 reserved:38;
>>  		};
>>  	};
>>  };
>> -- 
>> 2.25.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
  2022-02-02 11:58     ` Mark Rutland
@ 2022-02-04  4:56       ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-04  4:56 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon


On 2/2/22 5:28 PM, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>> This expands generic branch type classification by adding some more entries
>> , that can still be represented with the existing 4 bit 'type' field. While

    ^
>> here this also updates the x86 implementation with these new branch types.
> 
> It looks like there's some whitespace damage here.

Are you referring the above ? I will have a look.

> 
>>From a quick scan of the below, I think the "exception return" and "IRQ
> exception" types are somewhat generic, and could be added now (aside from any
> bikeshedding over names), but:
> 
> * For IRQ vs FIQ, we might just want to have a top-level "asynchronous
>   exception" type, and then further divide that with a separate field. That way
>   it's easier to extend in future if new exceptions are added.

Okay. But that might lead to a hierarchical bit fields design where as the
current one is just linear.

> 
> * I don't think the debug state entry/exits make sense as generic branch types,

From BRBE perspective, a branch is any control flow change including exception
level change, debug enter, debug exit etc. If exception and its return can be
classified as 'branches' why not debug state change ? Are there no similar
debug states transition on other platforms ?

>   since those are somewhat specific to the ARM architecutre, and it might make
>   sense to define generic PERF_BR_ARCH* definitions instead.

Makes sense but corresponding bit field layout change in branch_sample_type
will remain a challenge.

> 
> * Given the next patch extends the field, and therei are potential ABI problems
>   with that, we might need to reserve a value for ABI extensibility purposes,
>   and I suspect we need to do that *first*. More comments on the subsequent
>   patch.

Sure, understood.

>  
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-perf-users@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/x86/events/intel/lbr.c           | 4 ++--
>>  include/uapi/linux/perf_event.h       | 5 +++++
>>  tools/include/uapi/linux/perf_event.h | 5 +++++
>>  tools/perf/util/branch.c              | 7 ++++++-
>>  4 files changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
>> index 8043213b75a5..9f86fac8c6a5 100644
>> --- a/arch/x86/events/intel/lbr.c
>> +++ b/arch/x86/events/intel/lbr.c
>> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
>>  	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
>>  	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_INT */
>> -	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
>> +	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
>>  	PERF_BR_COND,		/* X86_BR_JCC */
>>  	PERF_BR_UNCOND,		/* X86_BR_JMP */
>> -	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
>> +	PERF_BR_IRQ,		/* X86_BR_IRQ */
>>  	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */
> 
> This presumably changes the values reported to userspace, so the commit message
> should mention that and explain why that is not a problem.

Okay.

> 
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 1b65042ab1db..b91d0f575d0c 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -251,6 +251,11 @@ enum {
>>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
>> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
> 
> We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'.
> IIUC that's the naming the x86 FRED stuff is going to use anyhow.

Sure, will change.

> 
>> +	PERF_BR_IRQ		= 12,	/* irq */
> 
> This looks somewhat generic, so adding it now makes sense to me, but ...
> 
>> +	PERF_BR_FIQ		= 13,	/* fiq */
> 
> ... this is arguably just a idfferent class of interrupt from the PoV of Linux,
> and the naming is ARM-specific, so I don't think this makes sense to add *now*.

I assume 'now' --> without ABI extension.

> As above, maybe it would be better to have a generic "aynchronous exception" or
> "interrupt" type, and a separate field to distinguish specific types of those.
> 
>> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> 
> For the benefit of those not familiar with the ARM architecture, "debug halt"
> and "debug exit" usually refer to "debug state", which is what an external
> (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that
> Linux uses.
> 
> Given that, I'm not sure these are very generic, and I suspect it would be
> better to have more generic PERF_BR_ARCH_* entries for things like this.

Sure, will try and come up with something similar.

> 
> Thanks,
> Mark.
> 
>>  	PERF_BR_MAX,
>>  };
>>  
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index 4cd39aaccbe7..1882054e8684 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -251,6 +251,11 @@ enum {
>>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
>> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
>> +	PERF_BR_IRQ		= 12,	/* irq */
>> +	PERF_BR_FIQ		= 13,	/* fiq */
>> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>>  	PERF_BR_MAX,
>>  };
>>  
>> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
>> index 2285b1eb3128..74e5e67b1779 100644
>> --- a/tools/perf/util/branch.c
>> +++ b/tools/perf/util/branch.c
>> @@ -49,7 +49,12 @@ const char *branch_type_name(int type)
>>  		"SYSCALL",
>>  		"SYSRET",
>>  		"COND_CALL",
>> -		"COND_RET"
>> +		"COND_RET",
>> +		"EXPT_RET",
>> +		"IRQ",
>> +		"FIQ",
>> +		"DEBUG_HALT",
>> +		"DEBUG_EXIT"
>>  	};
>>  
>>  	if (type >= 0 && type < PERF_BR_MAX)
>> -- 
>> 2.25.1
>>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
@ 2022-02-04  4:56       ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-04  4:56 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon


On 2/2/22 5:28 PM, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>> This expands generic branch type classification by adding some more entries
>> , that can still be represented with the existing 4 bit 'type' field. While

    ^
>> here this also updates the x86 implementation with these new branch types.
> 
> It looks like there's some whitespace damage here.

Are you referring the above ? I will have a look.

> 
>>From a quick scan of the below, I think the "exception return" and "IRQ
> exception" types are somewhat generic, and could be added now (aside from any
> bikeshedding over names), but:
> 
> * For IRQ vs FIQ, we might just want to have a top-level "asynchronous
>   exception" type, and then further divide that with a separate field. That way
>   it's easier to extend in future if new exceptions are added.

Okay. But that might lead to a hierarchical bit fields design where as the
current one is just linear.

> 
> * I don't think the debug state entry/exits make sense as generic branch types,

From BRBE perspective, a branch is any control flow change including exception
level change, debug enter, debug exit etc. If exception and its return can be
classified as 'branches' why not debug state change ? Are there no similar
debug states transition on other platforms ?

>   since those are somewhat specific to the ARM architecutre, and it might make
>   sense to define generic PERF_BR_ARCH* definitions instead.

Makes sense but corresponding bit field layout change in branch_sample_type
will remain a challenge.

> 
> * Given the next patch extends the field, and therei are potential ABI problems
>   with that, we might need to reserve a value for ABI extensibility purposes,
>   and I suspect we need to do that *first*. More comments on the subsequent
>   patch.

Sure, understood.

>  
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-perf-users@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/x86/events/intel/lbr.c           | 4 ++--
>>  include/uapi/linux/perf_event.h       | 5 +++++
>>  tools/include/uapi/linux/perf_event.h | 5 +++++
>>  tools/perf/util/branch.c              | 7 ++++++-
>>  4 files changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
>> index 8043213b75a5..9f86fac8c6a5 100644
>> --- a/arch/x86/events/intel/lbr.c
>> +++ b/arch/x86/events/intel/lbr.c
>> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = {
>>  	PERF_BR_SYSCALL,	/* X86_BR_SYSCALL */
>>  	PERF_BR_SYSRET,		/* X86_BR_SYSRET */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_INT */
>> -	PERF_BR_UNKNOWN,	/* X86_BR_IRET */
>> +	PERF_BR_EXPT_RET,	/* X86_BR_IRET */
>>  	PERF_BR_COND,		/* X86_BR_JCC */
>>  	PERF_BR_UNCOND,		/* X86_BR_JMP */
>> -	PERF_BR_UNKNOWN,	/* X86_BR_IRQ */
>> +	PERF_BR_IRQ,		/* X86_BR_IRQ */
>>  	PERF_BR_IND_CALL,	/* X86_BR_IND_CALL */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_ABORT */
>>  	PERF_BR_UNKNOWN,	/* X86_BR_IN_TX */
> 
> This presumably changes the values reported to userspace, so the commit message
> should mention that and explain why that is not a problem.

Okay.

> 
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 1b65042ab1db..b91d0f575d0c 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -251,6 +251,11 @@ enum {
>>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
>> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
> 
> We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'.
> IIUC that's the naming the x86 FRED stuff is going to use anyhow.

Sure, will change.

> 
>> +	PERF_BR_IRQ		= 12,	/* irq */
> 
> This looks somewhat generic, so adding it now makes sense to me, but ...
> 
>> +	PERF_BR_FIQ		= 13,	/* fiq */
> 
> ... this is arguably just a idfferent class of interrupt from the PoV of Linux,
> and the naming is ARM-specific, so I don't think this makes sense to add *now*.

I assume 'now' --> without ABI extension.

> As above, maybe it would be better to have a generic "aynchronous exception" or
> "interrupt" type, and a separate field to distinguish specific types of those.
> 
>> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
> 
> For the benefit of those not familiar with the ARM architecture, "debug halt"
> and "debug exit" usually refer to "debug state", which is what an external
> (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that
> Linux uses.
> 
> Given that, I'm not sure these are very generic, and I suspect it would be
> better to have more generic PERF_BR_ARCH_* entries for things like this.

Sure, will try and come up with something similar.

> 
> Thanks,
> Mark.
> 
>>  	PERF_BR_MAX,
>>  };
>>  
>> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
>> index 4cd39aaccbe7..1882054e8684 100644
>> --- a/tools/include/uapi/linux/perf_event.h
>> +++ b/tools/include/uapi/linux/perf_event.h
>> @@ -251,6 +251,11 @@ enum {
>>  	PERF_BR_SYSRET		= 8,	/* syscall return */
>>  	PERF_BR_COND_CALL	= 9,	/* conditional function call */
>>  	PERF_BR_COND_RET	= 10,	/* conditional function return */
>> +	PERF_BR_EXPT_RET	= 11,	/* exception return */
>> +	PERF_BR_IRQ		= 12,	/* irq */
>> +	PERF_BR_FIQ		= 13,	/* fiq */
>> +	PERF_BR_DEBUG_HALT	= 14,	/* debug halt */
>> +	PERF_BR_DEBUG_EXIT	= 15,	/* debug exit */
>>  	PERF_BR_MAX,
>>  };
>>  
>> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
>> index 2285b1eb3128..74e5e67b1779 100644
>> --- a/tools/perf/util/branch.c
>> +++ b/tools/perf/util/branch.c
>> @@ -49,7 +49,12 @@ const char *branch_type_name(int type)
>>  		"SYSCALL",
>>  		"SYSRET",
>>  		"COND_CALL",
>> -		"COND_RET"
>> +		"COND_RET",
>> +		"EXPT_RET",
>> +		"IRQ",
>> +		"FIQ",
>> +		"DEBUG_HALT",
>> +		"DEBUG_EXIT"
>>  	};
>>  
>>  	if (type >= 0 && type < PERF_BR_MAX)
>> -- 
>> 2.25.1
>>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-02-04  4:55       ` Anshuman Khandual
@ 2022-02-04 16:02         ` Mark Rutland
  -1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-04 16:02 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> On 2/2/22 5:27 PM, Mark Rutland wrote:
> > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> >>  		in_tx:1,    /* in transaction */
> >>  		abort:1,    /* transaction abort */
> >>  		cycles:16,  /* cycle count to last branch */
> >> -		type:4,     /* branch type */
> >> -		reserved:40;
> >> +		type:6,     /* branch type */
> > 
> > As above, is this a safe-change ABI-wise?
> 
> If the bit fields here cannot be expanded without breaking ABI, then
> there is a fundamental problem. Only remaining option will be to add
> new fields (with new width value) which could accommodate these new
> required branch types.

Unfortunately, I think expanding this does break ABI, and is a fundamental
problem, as:

(a) Any new values in the expanded field will be truncated when read by old
    userspace, and so those may be mis-reported. Maybe we're not too worried
    about this case.

(b) Depending on how the field is placed, existing values might get stored
    differently. This could break any mismatched combination of
    {old,new}-kernel and {old,new}-userspace.

    In practice, I think this means that this is broken for BE, and happens to
    work for LE, but I don't know how bitfields are defined for each
    architecture, so there could be other brokenness.

Consider the test case below:

--------
#include <stdbool.h>

struct bfv1 {
        unsigned long   a:20,
                        b:20,
                        c:4,
                        d:20;
};

struct bfv2 {
        unsigned long   a:20,
                        b:20,
                        c:6,
                        d:18;
};

union bf {
        struct bfv1 v1;
        struct bfv2 v2;
};

bool bfv1_w_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 4) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v1.c = v_old;
        v_new = bf.v1.c;

        return v_old == v_new;
}

bool bfv2_w_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 6) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v2.c = v_old;
        v_new = bf.v2.c;

        return v_old == v_new;
}

bool bfv1_w_bfv2_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 4) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v1.c = v_old;
        v_new = bf.v2.c;

        return v_old == v_new;
}
--------

When compiled for little-endian AArch64, GCC thinks all old values will be
interpreted the same, and optimizes all the round-trip tests to return true:

| [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mlittle-endian
| [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o
|
| bitfield-test.o:     file format elf64-littleaarch64
|
|
| Disassembly of section .text:
|
| 0000000000000000 <bfv1_w_r>:
|    0:   52800020        mov     w0, #0x1                        // #1
|    4:   d65f03c0        ret
|
| 0000000000000008 <bfv2_w_r>:
|    8:   52800020        mov     w0, #0x1                        // #1
|    c:   d65f03c0        ret
|
| 0000000000000010 <bfv1_w_bfv2_r>:
|   10:   52800020        mov     w0, #0x1                        // #1
|   14:   d65f03c0        ret

But when compiled for big-endian AArch64, it doesn't believe that at all:

| [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mbig-endian
| [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o
|
| bitfield-test.o:     file format elf64-bigaarch64
|
|
| Disassembly of section .text:
|
| 0000000000000000 <bfv1_w_r>:
|    0:   52800020        mov     w0, #0x1                        // #1
|    4:   d65f03c0        ret
|
| 0000000000000008 <bfv2_w_r>:
|    8:   52800020        mov     w0, #0x1                        // #1
|    c:   d65f03c0        ret
|
| 0000000000000010 <bfv1_w_bfv2_r>:
|   10:   12001c00        and     w0, w0, #0xff
|   14:   12000c01        and     w1, w0, #0xf
|   18:   531e0c00        ubfiz   w0, w0, #2, #4
|   1c:   6b01001f        cmp     w0, w1
|   20:   1a9f17e0        cset    w0, eq  // eq = none
|   24:   d65f03c0        ret

If we add the following:

| void write_bfv1_c(union bf *bf, unsigned char v)
| {
|         bf->v1.c = v;
| }
| 
| void write_bfv2_c(union bf *bf, unsigned char v)
| {
|         bf->v2.c = v;
| }
| 
| unsigned char read_bfv1_c(union bf *bf)
| {
|         return bf->v1.c;
| }
| 
| unsigned char read_bfv2_c(union bf *bf)
| {
|         return bf->v2.c;
| }

For LE we get:

| 0000000000000018 <write_bfv1_c>:
|   18:   39401402        ldrb    w2, [x0, #5]
|   1c:   33000c22        bfxil   w2, w1, #0, #4
|   20:   39001402        strb    w2, [x0, #5]
|   24:   d65f03c0        ret
| 
| 0000000000000028 <write_bfv2_c>:
|   28:   39401402        ldrb    w2, [x0, #5]
|   2c:   33001422        bfxil   w2, w1, #0, #6
|   30:   39001402        strb    w2, [x0, #5]
|   34:   d65f03c0        ret
| 
| 0000000000000038 <read_bfv1_c>:
|   38:   39401400        ldrb    w0, [x0, #5]
|   3c:   92400c00        and     x0, x0, #0xf
|   40:   d65f03c0        ret
|   44:   d503201f        nop
| 
| 0000000000000048 <read_bfv2_c>:
|   48:   39401400        ldrb    w0, [x0, #5]
|   4c:   92401400        and     x0, x0, #0x3f
|   50:   d65f03c0        ret

... where the low bits of the field stay in the same place, so for all existing
values this happens to work.

For BE we get:

| 0000000000000028 <write_bfv1_c>:
|   28:   39401402        ldrb    w2, [x0, #5]
|   2c:   331c0c22        bfi     w2, w1, #4, #4
|   30:   39001402        strb    w2, [x0, #5]
|   34:   d65f03c0        ret
| 
| 0000000000000038 <write_bfv2_c>:
|   38:   39401402        ldrb    w2, [x0, #5]
|   3c:   331e1422        bfi     w2, w1, #2, #6
|   40:   39001402        strb    w2, [x0, #5]
|   44:   d65f03c0        ret
| 
| 0000000000000048 <read_bfv1_c>:
|   48:   39401400        ldrb    w0, [x0, #5]
|   4c:   53041c00        ubfx    w0, w0, #4, #4
|   50:   d65f03c0        ret
|   54:   d503201f        nop
| 
| 0000000000000058 <read_bfv2_c>:
|   58:   39401400        ldrb    w0, [x0, #5]
|   5c:   53021c00        ubfx    w0, w0, #2, #6
|   60:   d65f03c0        ret

... where the low bits of the field have moved, and so this is broken even for
existing values!

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-02-04 16:02         ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-04 16:02 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> On 2/2/22 5:27 PM, Mark Rutland wrote:
> > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> >>  		in_tx:1,    /* in transaction */
> >>  		abort:1,    /* transaction abort */
> >>  		cycles:16,  /* cycle count to last branch */
> >> -		type:4,     /* branch type */
> >> -		reserved:40;
> >> +		type:6,     /* branch type */
> > 
> > As above, is this a safe-change ABI-wise?
> 
> If the bit fields here cannot be expanded without breaking ABI, then
> there is a fundamental problem. Only remaining option will be to add
> new fields (with new width value) which could accommodate these new
> required branch types.

Unfortunately, I think expanding this does break ABI, and is a fundamental
problem, as:

(a) Any new values in the expanded field will be truncated when read by old
    userspace, and so those may be mis-reported. Maybe we're not too worried
    about this case.

(b) Depending on how the field is placed, existing values might get stored
    differently. This could break any mismatched combination of
    {old,new}-kernel and {old,new}-userspace.

    In practice, I think this means that this is broken for BE, and happens to
    work for LE, but I don't know how bitfields are defined for each
    architecture, so there could be other brokenness.

Consider the test case below:

--------
#include <stdbool.h>

struct bfv1 {
        unsigned long   a:20,
                        b:20,
                        c:4,
                        d:20;
};

struct bfv2 {
        unsigned long   a:20,
                        b:20,
                        c:6,
                        d:18;
};

union bf {
        struct bfv1 v1;
        struct bfv2 v2;
};

bool bfv1_w_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 4) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v1.c = v_old;
        v_new = bf.v1.c;

        return v_old == v_new;
}

bool bfv2_w_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 6) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v2.c = v_old;
        v_new = bf.v2.c;

        return v_old == v_new;
}

bool bfv1_w_bfv2_r(unsigned char v)
{
        unsigned char v_old = v & ((1UL << 4) - 1);
        unsigned char v_new;
        union bf bf = { 0 };

        bf.v1.c = v_old;
        v_new = bf.v2.c;

        return v_old == v_new;
}
--------

When compiled for little-endian AArch64, GCC thinks all old values will be
interpreted the same, and optimizes all the round-trip tests to return true:

| [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mlittle-endian
| [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o
|
| bitfield-test.o:     file format elf64-littleaarch64
|
|
| Disassembly of section .text:
|
| 0000000000000000 <bfv1_w_r>:
|    0:   52800020        mov     w0, #0x1                        // #1
|    4:   d65f03c0        ret
|
| 0000000000000008 <bfv2_w_r>:
|    8:   52800020        mov     w0, #0x1                        // #1
|    c:   d65f03c0        ret
|
| 0000000000000010 <bfv1_w_bfv2_r>:
|   10:   52800020        mov     w0, #0x1                        // #1
|   14:   d65f03c0        ret

But when compiled for big-endian AArch64, it doesn't believe that at all:

| [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mbig-endian
| [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o
|
| bitfield-test.o:     file format elf64-bigaarch64
|
|
| Disassembly of section .text:
|
| 0000000000000000 <bfv1_w_r>:
|    0:   52800020        mov     w0, #0x1                        // #1
|    4:   d65f03c0        ret
|
| 0000000000000008 <bfv2_w_r>:
|    8:   52800020        mov     w0, #0x1                        // #1
|    c:   d65f03c0        ret
|
| 0000000000000010 <bfv1_w_bfv2_r>:
|   10:   12001c00        and     w0, w0, #0xff
|   14:   12000c01        and     w1, w0, #0xf
|   18:   531e0c00        ubfiz   w0, w0, #2, #4
|   1c:   6b01001f        cmp     w0, w1
|   20:   1a9f17e0        cset    w0, eq  // eq = none
|   24:   d65f03c0        ret

If we add the following:

| void write_bfv1_c(union bf *bf, unsigned char v)
| {
|         bf->v1.c = v;
| }
| 
| void write_bfv2_c(union bf *bf, unsigned char v)
| {
|         bf->v2.c = v;
| }
| 
| unsigned char read_bfv1_c(union bf *bf)
| {
|         return bf->v1.c;
| }
| 
| unsigned char read_bfv2_c(union bf *bf)
| {
|         return bf->v2.c;
| }

For LE we get:

| 0000000000000018 <write_bfv1_c>:
|   18:   39401402        ldrb    w2, [x0, #5]
|   1c:   33000c22        bfxil   w2, w1, #0, #4
|   20:   39001402        strb    w2, [x0, #5]
|   24:   d65f03c0        ret
| 
| 0000000000000028 <write_bfv2_c>:
|   28:   39401402        ldrb    w2, [x0, #5]
|   2c:   33001422        bfxil   w2, w1, #0, #6
|   30:   39001402        strb    w2, [x0, #5]
|   34:   d65f03c0        ret
| 
| 0000000000000038 <read_bfv1_c>:
|   38:   39401400        ldrb    w0, [x0, #5]
|   3c:   92400c00        and     x0, x0, #0xf
|   40:   d65f03c0        ret
|   44:   d503201f        nop
| 
| 0000000000000048 <read_bfv2_c>:
|   48:   39401400        ldrb    w0, [x0, #5]
|   4c:   92401400        and     x0, x0, #0x3f
|   50:   d65f03c0        ret

... where the low bits of the field stay in the same place, so for all existing
values this happens to work.

For BE we get:

| 0000000000000028 <write_bfv1_c>:
|   28:   39401402        ldrb    w2, [x0, #5]
|   2c:   331c0c22        bfi     w2, w1, #4, #4
|   30:   39001402        strb    w2, [x0, #5]
|   34:   d65f03c0        ret
| 
| 0000000000000038 <write_bfv2_c>:
|   38:   39401402        ldrb    w2, [x0, #5]
|   3c:   331e1422        bfi     w2, w1, #2, #6
|   40:   39001402        strb    w2, [x0, #5]
|   44:   d65f03c0        ret
| 
| 0000000000000048 <read_bfv1_c>:
|   48:   39401400        ldrb    w0, [x0, #5]
|   4c:   53041c00        ubfx    w0, w0, #4, #4
|   50:   d65f03c0        ret
|   54:   d503201f        nop
| 
| 0000000000000058 <read_bfv2_c>:
|   58:   39401400        ldrb    w0, [x0, #5]
|   5c:   53021c00        ubfx    w0, w0, #2, #6
|   60:   d65f03c0        ret

... where the low bits of the field have moved, and so this is broken even for
existing values!

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-02-04 16:02         ` Mark Rutland
@ 2022-02-15 19:01           ` Rob Herring
  -1 siblings, 0 replies; 24+ messages in thread
From: Rob Herring @ 2022-02-15 19:01 UTC (permalink / raw)
  To: Mark Rutland, Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote:
> On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> > On 2/2/22 5:27 PM, Mark Rutland wrote:
> > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> > >>  		in_tx:1,    /* in transaction */
> > >>  		abort:1,    /* transaction abort */
> > >>  		cycles:16,  /* cycle count to last branch */
> > >> -		type:4,     /* branch type */
> > >> -		reserved:40;
> > >> +		type:6,     /* branch type */
> > > 
> > > As above, is this a safe-change ABI-wise?
> > 
> > If the bit fields here cannot be expanded without breaking ABI, then
> > there is a fundamental problem. Only remaining option will be to add
> > new fields (with new width value) which could accommodate these new
> > required branch types.
> 
> Unfortunately, I think expanding this does break ABI, and is a fundamental
> problem, as:
> 
> (a) Any new values in the expanded field will be truncated when read by old
>     userspace, and so those may be mis-reported. Maybe we're not too worried
>     about this case.

'type' or specfically branch stack is not currently supported on arm64. 
Do we expect an old userspace which this didn't work on to start working 
with a new kernel?

Given at least some of the new types are arch specific, perhaps 
the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 
'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a 
new 'arch_type' field.

Another option is maybe some of these additional types just shouldn't be 
exposed to userspace? For example, are branches to FIQ useful or leaking 
any info about secure world? Debug mode branches also seem minimally 
useful to me (though I'm no expert in how this is used).


> (b) Depending on how the field is placed, existing values might get stored
>     differently. This could break any mismatched combination of
>     {old,new}-kernel and {old,new}-userspace.
> 
>     In practice, I think this means that this is broken for BE, and happens to
>     work for LE, but I don't know how bitfields are defined for each
>     architecture, so there could be other brokenness.
> 
> Consider the test case below:

[...]

> ... where the low bits of the field have moved, and so this is broken even for
> existing values!

So that is a separate issue to be fixed and not directly related to the 
size of 'type'. Looks like it needs similar '#if 
defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct 
bitfields. Though somehow BE PPC hasn't had issues?

Rob

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-02-15 19:01           ` Rob Herring
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Herring @ 2022-02-15 19:01 UTC (permalink / raw)
  To: Mark Rutland, Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Will Deacon

On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote:
> On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> > On 2/2/22 5:27 PM, Mark Rutland wrote:
> > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> > >>  		in_tx:1,    /* in transaction */
> > >>  		abort:1,    /* transaction abort */
> > >>  		cycles:16,  /* cycle count to last branch */
> > >> -		type:4,     /* branch type */
> > >> -		reserved:40;
> > >> +		type:6,     /* branch type */
> > > 
> > > As above, is this a safe-change ABI-wise?
> > 
> > If the bit fields here cannot be expanded without breaking ABI, then
> > there is a fundamental problem. Only remaining option will be to add
> > new fields (with new width value) which could accommodate these new
> > required branch types.
> 
> Unfortunately, I think expanding this does break ABI, and is a fundamental
> problem, as:
> 
> (a) Any new values in the expanded field will be truncated when read by old
>     userspace, and so those may be mis-reported. Maybe we're not too worried
>     about this case.

'type' or specfically branch stack is not currently supported on arm64. 
Do we expect an old userspace which this didn't work on to start working 
with a new kernel?

Given at least some of the new types are arch specific, perhaps 
the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 
'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a 
new 'arch_type' field.

Another option is maybe some of these additional types just shouldn't be 
exposed to userspace? For example, are branches to FIQ useful or leaking 
any info about secure world? Debug mode branches also seem minimally 
useful to me (though I'm no expert in how this is used).


> (b) Depending on how the field is placed, existing values might get stored
>     differently. This could break any mismatched combination of
>     {old,new}-kernel and {old,new}-userspace.
> 
>     In practice, I think this means that this is broken for BE, and happens to
>     work for LE, but I don't know how bitfields are defined for each
>     architecture, so there could be other brokenness.
> 
> Consider the test case below:

[...]

> ... where the low bits of the field have moved, and so this is broken even for
> existing values!

So that is a separate issue to be fixed and not directly related to the 
size of 'type'. Looks like it needs similar '#if 
defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct 
bitfields. Though somehow BE PPC hasn't had issues?

Rob

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
  2022-02-15 19:01           ` Rob Herring
@ 2022-02-15 19:45             ` Mark Rutland
  -1 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-15 19:45 UTC (permalink / raw)
  To: Rob Herring
  Cc: Anshuman Khandual, linux-kernel, linux-perf-users,
	linux-arm-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Will Deacon

On Tue, Feb 15, 2022 at 01:01:06PM -0600, Rob Herring wrote:
> On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote:
> > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> > > On 2/2/22 5:27 PM, Mark Rutland wrote:
> > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> > > >>  		in_tx:1,    /* in transaction */
> > > >>  		abort:1,    /* transaction abort */
> > > >>  		cycles:16,  /* cycle count to last branch */
> > > >> -		type:4,     /* branch type */
> > > >> -		reserved:40;
> > > >> +		type:6,     /* branch type */
> > > > 
> > > > As above, is this a safe-change ABI-wise?
> > > 
> > > If the bit fields here cannot be expanded without breaking ABI, then
> > > there is a fundamental problem. Only remaining option will be to add
> > > new fields (with new width value) which could accommodate these new
> > > required branch types.
> > 
> > Unfortunately, I think expanding this does break ABI, and is a fundamental
> > problem, as:
> > 
> > (a) Any new values in the expanded field will be truncated when read by old
> >     userspace, and so those may be mis-reported. Maybe we're not too worried
> >     about this case.
> 
> 'type' or specfically branch stack is not currently supported on arm64. 
> Do we expect an old userspace which this didn't work on to start working 
> with a new kernel?

I agree for arm64 specifically this probably doesn't matter; I just wanted to
have a clear explanation of why this *could* be a problem, since this could
affect other architectures.

> Given at least some of the new types are arch specific, perhaps 
> the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 
> 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a 
> new 'arch_type' field.

Yup; something of that shape sounds good to me -- that was roughly what I had
suggested elsewhere.

> Another option is maybe some of these additional types just shouldn't be 
> exposed to userspace? For example, are branches to FIQ useful or leaking 
> any info about secure world? Debug mode branches also seem minimally 
> useful to me (though I'm no expert in how this is used).

I agree; this wasn't clear to me, and regardless I think many of the types
added in the prior patch should not be generic since they're very specific to
the Arm architecture.

> > (b) Depending on how the field is placed, existing values might get stored
> >     differently. This could break any mismatched combination of
> >     {old,new}-kernel and {old,new}-userspace.
> > 
> >     In practice, I think this means that this is broken for BE, and happens to
> >     work for LE, but I don't know how bitfields are defined for each
> >     architecture, so there could be other brokenness.
> > 
> > Consider the test case below:
> 
> [...]
> 
> > ... where the low bits of the field have moved, and so this is broken even for
> > existing values!
> 
> So that is a separate issue to be fixed and not directly related to the 
> size of 'type'. 

I agree if you moved the entire field that's broken everywhere, but in this
case it *is* directly related to the size changing. In my example the meaning
of specific bits changed *because* the size of the field changed and in BE that
meant the low bits of the field moved, even though the field started at the
same position.

> Looks like it needs similar '#if 
> defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct 
> bitfields. Though somehow BE PPC hasn't had issues?

IIRC there were recent problems in this area, and I think historically we've
broken ABI and people only noticed much later.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type
@ 2022-02-15 19:45             ` Mark Rutland
  0 siblings, 0 replies; 24+ messages in thread
From: Mark Rutland @ 2022-02-15 19:45 UTC (permalink / raw)
  To: Rob Herring
  Cc: Anshuman Khandual, linux-kernel, linux-perf-users,
	linux-arm-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Will Deacon

On Tue, Feb 15, 2022 at 01:01:06PM -0600, Rob Herring wrote:
> On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote:
> > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote:
> > > On 2/2/22 5:27 PM, Mark Rutland wrote:
> > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote:
> > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry {
> > > >>  		in_tx:1,    /* in transaction */
> > > >>  		abort:1,    /* transaction abort */
> > > >>  		cycles:16,  /* cycle count to last branch */
> > > >> -		type:4,     /* branch type */
> > > >> -		reserved:40;
> > > >> +		type:6,     /* branch type */
> > > > 
> > > > As above, is this a safe-change ABI-wise?
> > > 
> > > If the bit fields here cannot be expanded without breaking ABI, then
> > > there is a fundamental problem. Only remaining option will be to add
> > > new fields (with new width value) which could accommodate these new
> > > required branch types.
> > 
> > Unfortunately, I think expanding this does break ABI, and is a fundamental
> > problem, as:
> > 
> > (a) Any new values in the expanded field will be truncated when read by old
> >     userspace, and so those may be mis-reported. Maybe we're not too worried
> >     about this case.
> 
> 'type' or specfically branch stack is not currently supported on arm64. 
> Do we expect an old userspace which this didn't work on to start working 
> with a new kernel?

I agree for arm64 specifically this probably doesn't matter; I just wanted to
have a clear explanation of why this *could* be a problem, since this could
affect other architectures.

> Given at least some of the new types are arch specific, perhaps 
> the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 
> 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a 
> new 'arch_type' field.

Yup; something of that shape sounds good to me -- that was roughly what I had
suggested elsewhere.

> Another option is maybe some of these additional types just shouldn't be 
> exposed to userspace? For example, are branches to FIQ useful or leaking 
> any info about secure world? Debug mode branches also seem minimally 
> useful to me (though I'm no expert in how this is used).

I agree; this wasn't clear to me, and regardless I think many of the types
added in the prior patch should not be generic since they're very specific to
the Arm architecture.

> > (b) Depending on how the field is placed, existing values might get stored
> >     differently. This could break any mismatched combination of
> >     {old,new}-kernel and {old,new}-userspace.
> > 
> >     In practice, I think this means that this is broken for BE, and happens to
> >     work for LE, but I don't know how bitfields are defined for each
> >     architecture, so there could be other brokenness.
> > 
> > Consider the test case below:
> 
> [...]
> 
> > ... where the low bits of the field have moved, and so this is broken even for
> > existing values!
> 
> So that is a separate issue to be fixed and not directly related to the 
> size of 'type'. 

I agree if you moved the entire field that's broken everywhere, but in this
case it *is* directly related to the size changing. In my example the meaning
of specific bits changed *because* the size of the field changed and in BE that
meant the low bits of the field moved, even though the field started at the
same position.

> Looks like it needs similar '#if 
> defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct 
> bitfields. Though somehow BE PPC hasn't had issues?

IIRC there were recent problems in this area, and I think historically we've
broken ABI and people only noticed much later.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
  2022-02-02 11:58     ` Mark Rutland
@ 2022-02-24  5:51       ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-24  5:51 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/2/22 5:28 PM, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>> This expands generic branch type classification by adding some more entries
>> , that can still be represented with the existing 4 bit 'type' field. While
>> here this also updates the x86 implementation with these new branch types.
> It looks like there's some whitespace damage here.
> 
>>From a quick scan of the below, I think the "exception return" and "IRQ
> exception" types are somewhat generic, and could be added now (aside from any
> bikeshedding over names), but:

Hi Mark,

I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic
enough to be included now. The 'type' field still got 3 more places for future
use. Hence we should try and include two more branch types, before adding the
last entry for ABI expansion (e.g PERF_BR_EXTENDED).

https://lore.kernel.org/all/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com/

- Anshuman

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
@ 2022-02-24  5:51       ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-24  5:51 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/2/22 5:28 PM, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>> This expands generic branch type classification by adding some more entries
>> , that can still be represented with the existing 4 bit 'type' field. While
>> here this also updates the x86 implementation with these new branch types.
> It looks like there's some whitespace damage here.
> 
>>From a quick scan of the below, I think the "exception return" and "IRQ
> exception" types are somewhat generic, and could be added now (aside from any
> bikeshedding over names), but:

Hi Mark,

I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic
enough to be included now. The 'type' field still got 3 more places for future
use. Hence we should try and include two more branch types, before adding the
last entry for ABI expansion (e.g PERF_BR_EXTENDED).

https://lore.kernel.org/all/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com/

- Anshuman

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
  2022-02-24  5:51       ` Anshuman Khandual
@ 2022-02-24  7:10         ` Anshuman Khandual
  -1 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-24  7:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/24/22 11:21 AM, Anshuman Khandual wrote:
> 
> 
> On 2/2/22 5:28 PM, Mark Rutland wrote:
>> Hi Anshuman,
>>
>> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>>> This expands generic branch type classification by adding some more entries
>>> , that can still be represented with the existing 4 bit 'type' field. While
>>> here this also updates the x86 implementation with these new branch types.
>> It looks like there's some whitespace damage here.
>>
>> >From a quick scan of the below, I think the "exception return" and "IRQ
>> exception" types are somewhat generic, and could be added now (aside from any
>> bikeshedding over names), but:
> 
> Hi Mark,
> 
> I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic
> enough to be included now. The 'type' field still got 3 more places for future
> use. Hence we should try and include two more branch types, before adding the

The two additional generic branch types could be

- PERF_BR_SERROR (possible arm64 equivalent for x86 MCE)
- PERF_BR_NO_TX  (only missing TX related branch type in perf_branch_entry)

perf_branch_entry.[in_tx | abort] are already available. Also PERF_BR_NO_TX
could be used right away on x86 platform, via X86_BR_NO_TX.

> last entry for ABI expansion (e.g PERF_BR_EXTENDED).

PERF_BR_EXTENDED could help expand into another 4 bits 'new_type' field
following the existing 4 bits 'type' field. Does this sound feasible ?

enum {
        PERF_BR_UNKNOWN         = 0,    /* unknown */
        PERF_BR_COND            = 1,    /* conditional */
        PERF_BR_UNCOND          = 2,    /* unconditional  */
        PERF_BR_IND             = 3,    /* indirect */
        PERF_BR_CALL            = 4,    /* function call */
        PERF_BR_IND_CALL        = 5,    /* indirect function call */
        PERF_BR_RET             = 6,    /* function return */
        PERF_BR_SYSCALL         = 7,    /* syscall */
        PERF_BR_SYSRET          = 8,    /* syscall return */
        PERF_BR_COND_CALL       = 9,    /* conditional function call */
        PERF_BR_COND_RET        = 10,   /* conditional function return */
        PERF_BR_ERET            = 11,   /* exception return */
        PERF_BR_IRQ             = 12,   /* irq */
        PERF_BR_SERROR          = 13,   /* System error */
        PERF_BR_NO_TX           = 14,   /* No transaction */
	PERF_BR_EXTEND_ABI	= 15 	/* Extended ABI */
        PERF_BR_MAX,
};

enum {
       PERF_BR_NEW_FAULT_ALGN          = 0,    /* Alignment fault */
       PERF_BR_NEW_FAULT_DATA          = 1,    /* Data fault */
       PERF_BR_NEW_FAULT_INST          = 2,    /* Inst fault */
       PERF_BR_NEW_ARCH_1              = 3,    /* Architecture specific */
       PERF_BR_NEW_ARCH_2              = 4,    /* Architecture specific */
       PERF_BR_NEW_ARCH_3              = 5,    /* Architecture specific */
       PERF_BR_NEW_ARCH_4              = 6,    /* Architecture specific */
       PERF_BR_NEW_ARCH_5              = 7,    /* Architecture specific */
       PERF_BR_NEW_MAX,
};

#ifdef CONFIG_ARM64
#define PERF_BR_FIQ            PERF_BR_NEW_ARCH_1
#define PERF_BR_DEBUG_HALT     PERF_BR_NEW_ARCH_2
#define PERF_BR_DEBUG_EXIT     PERF_BR_NEW_ARCH_3
#define PERF_BR_DEBUG_INST     PERF_BR_NEW_ARCH_4
#define PERF_BR_DEBUG_DATA     PERF_BR_NEW_ARCH_5
#endif

User space perf tool will look into perf_branch_entry.new_type, only
when (perf_branch_entry.type == PERF_BR_EXTEND_ABI). Older perf tool
on a newer kernel will be safe via old PERF_BR_MAX, and will ignore
[PERF_BR_ERET - PERF_BR_EXTEND_ABI] values possibly with an warning.

- Anshuman

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] perf: Add more generic branch types
@ 2022-02-24  7:10         ` Anshuman Khandual
  0 siblings, 0 replies; 24+ messages in thread
From: Anshuman Khandual @ 2022-02-24  7:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon



On 2/24/22 11:21 AM, Anshuman Khandual wrote:
> 
> 
> On 2/2/22 5:28 PM, Mark Rutland wrote:
>> Hi Anshuman,
>>
>> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote:
>>> This expands generic branch type classification by adding some more entries
>>> , that can still be represented with the existing 4 bit 'type' field. While
>>> here this also updates the x86 implementation with these new branch types.
>> It looks like there's some whitespace damage here.
>>
>> >From a quick scan of the below, I think the "exception return" and "IRQ
>> exception" types are somewhat generic, and could be added now (aside from any
>> bikeshedding over names), but:
> 
> Hi Mark,
> 
> I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic
> enough to be included now. The 'type' field still got 3 more places for future
> use. Hence we should try and include two more branch types, before adding the

The two additional generic branch types could be

- PERF_BR_SERROR (possible arm64 equivalent for x86 MCE)
- PERF_BR_NO_TX  (only missing TX related branch type in perf_branch_entry)

perf_branch_entry.[in_tx | abort] are already available. Also PERF_BR_NO_TX
could be used right away on x86 platform, via X86_BR_NO_TX.

> last entry for ABI expansion (e.g PERF_BR_EXTENDED).

PERF_BR_EXTENDED could help expand into another 4 bits 'new_type' field
following the existing 4 bits 'type' field. Does this sound feasible ?

enum {
        PERF_BR_UNKNOWN         = 0,    /* unknown */
        PERF_BR_COND            = 1,    /* conditional */
        PERF_BR_UNCOND          = 2,    /* unconditional  */
        PERF_BR_IND             = 3,    /* indirect */
        PERF_BR_CALL            = 4,    /* function call */
        PERF_BR_IND_CALL        = 5,    /* indirect function call */
        PERF_BR_RET             = 6,    /* function return */
        PERF_BR_SYSCALL         = 7,    /* syscall */
        PERF_BR_SYSRET          = 8,    /* syscall return */
        PERF_BR_COND_CALL       = 9,    /* conditional function call */
        PERF_BR_COND_RET        = 10,   /* conditional function return */
        PERF_BR_ERET            = 11,   /* exception return */
        PERF_BR_IRQ             = 12,   /* irq */
        PERF_BR_SERROR          = 13,   /* System error */
        PERF_BR_NO_TX           = 14,   /* No transaction */
	PERF_BR_EXTEND_ABI	= 15 	/* Extended ABI */
        PERF_BR_MAX,
};

enum {
       PERF_BR_NEW_FAULT_ALGN          = 0,    /* Alignment fault */
       PERF_BR_NEW_FAULT_DATA          = 1,    /* Data fault */
       PERF_BR_NEW_FAULT_INST          = 2,    /* Inst fault */
       PERF_BR_NEW_ARCH_1              = 3,    /* Architecture specific */
       PERF_BR_NEW_ARCH_2              = 4,    /* Architecture specific */
       PERF_BR_NEW_ARCH_3              = 5,    /* Architecture specific */
       PERF_BR_NEW_ARCH_4              = 6,    /* Architecture specific */
       PERF_BR_NEW_ARCH_5              = 7,    /* Architecture specific */
       PERF_BR_NEW_MAX,
};

#ifdef CONFIG_ARM64
#define PERF_BR_FIQ            PERF_BR_NEW_ARCH_1
#define PERF_BR_DEBUG_HALT     PERF_BR_NEW_ARCH_2
#define PERF_BR_DEBUG_EXIT     PERF_BR_NEW_ARCH_3
#define PERF_BR_DEBUG_INST     PERF_BR_NEW_ARCH_4
#define PERF_BR_DEBUG_DATA     PERF_BR_NEW_ARCH_5
#endif

User space perf tool will look into perf_branch_entry.new_type, only
when (perf_branch_entry.type == PERF_BR_EXTEND_ABI). Older perf tool
on a newer kernel will be safe via old PERF_BR_MAX, and will ignore
[PERF_BR_ERET - PERF_BR_EXTEND_ABI] values possibly with an warning.

- Anshuman

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-02-24  7:11 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28  5:44 [PATCH 0/2] perf: Expand captured branch types Anshuman Khandual
2022-01-28  5:44 ` Anshuman Khandual
2022-01-28  5:44 ` [PATCH 1/2] perf: Add more generic " Anshuman Khandual
2022-01-28  5:44   ` Anshuman Khandual
2022-02-02 11:58   ` Mark Rutland
2022-02-02 11:58     ` Mark Rutland
2022-02-04  4:56     ` Anshuman Khandual
2022-02-04  4:56       ` Anshuman Khandual
2022-02-24  5:51     ` Anshuman Khandual
2022-02-24  5:51       ` Anshuman Khandual
2022-02-24  7:10       ` Anshuman Khandual
2022-02-24  7:10         ` Anshuman Khandual
2022-01-28  5:44 ` [PATCH 2/2] perf: Expand perf_branch_entry.type Anshuman Khandual
2022-01-28  5:44   ` Anshuman Khandual
2022-02-02 11:57   ` Mark Rutland
2022-02-02 11:57     ` Mark Rutland
2022-02-04  4:55     ` Anshuman Khandual
2022-02-04  4:55       ` Anshuman Khandual
2022-02-04 16:02       ` Mark Rutland
2022-02-04 16:02         ` Mark Rutland
2022-02-15 19:01         ` Rob Herring
2022-02-15 19:01           ` Rob Herring
2022-02-15 19:45           ` Mark Rutland
2022-02-15 19:45             ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.