* [PATCH 0/2] perf: Expand captured branch types @ 2022-01-28 5:44 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland Branch Record Buffer Extension (BRBE) implementation on arm64 captures more branch type classification which cannot be accommodated in the current perf branch record format via perf_branch_entry.type element. To overcome this limitation, perf_branch_entry.type needs to be expanded along with generic branch classification. This series achieves this expansion. But before that it adds more generic branch classification types which can be accommodated without any changes, while also updating the x86 implementation as required. This series applies on v5.17-rc1 Please find BRBE captured branch type classification reference here. https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers/ BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en Proposed perf branch stack enablement on arm64 platform can be found here. https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/ These patches are split from the above series (PATCH 9/11, PATCH 10/11) as these perf changes are pre-requisite for BRBE enablement. Although there is another proposal (PATCH 11/11), extending struct perf_branch_entry further to capture branch record privilege level information (priv, 2 bits), it is not included here. But please do let me know if that proposal should also be discussed in this series. Thank you. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Anshuman Khandual (2): perf: Add more generic branch types perf: Expand perf_branch_entry.type arch/x86/events/intel/lbr.c | 4 ++-- include/uapi/linux/perf_event.h | 15 +++++++++++++-- tools/include/uapi/linux/perf_event.h | 15 +++++++++++++-- tools/perf/util/branch.c | 13 ++++++++++++- tools/perf/util/branch.h | 4 ++-- 5 files changed, 42 insertions(+), 9 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 0/2] perf: Expand captured branch types @ 2022-01-28 5:44 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland Branch Record Buffer Extension (BRBE) implementation on arm64 captures more branch type classification which cannot be accommodated in the current perf branch record format via perf_branch_entry.type element. To overcome this limitation, perf_branch_entry.type needs to be expanded along with generic branch classification. This series achieves this expansion. But before that it adds more generic branch classification types which can be accommodated without any changes, while also updating the x86 implementation as required. This series applies on v5.17-rc1 Please find BRBE captured branch type classification reference here. https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers/ BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en Proposed perf branch stack enablement on arm64 platform can be found here. https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/ These patches are split from the above series (PATCH 9/11, PATCH 10/11) as these perf changes are pre-requisite for BRBE enablement. Although there is another proposal (PATCH 11/11), extending struct perf_branch_entry further to capture branch record privilege level information (priv, 2 bits), it is not included here. But please do let me know if that proposal should also be discussed in this series. Thank you. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Anshuman Khandual (2): perf: Add more generic branch types perf: Expand perf_branch_entry.type arch/x86/events/intel/lbr.c | 4 ++-- include/uapi/linux/perf_event.h | 15 +++++++++++++-- tools/include/uapi/linux/perf_event.h | 15 +++++++++++++-- tools/perf/util/branch.c | 13 ++++++++++++- tools/perf/util/branch.h | 4 ++-- 5 files changed, 42 insertions(+), 9 deletions(-) -- 2.25.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/2] perf: Add more generic branch types 2022-01-28 5:44 ` Anshuman Khandual @ 2022-01-28 5:44 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon This expands generic branch type classification by adding some more entries , that can still be represented with the existing 4 bit 'type' field. While here this also updates the x86 implementation with these new branch types. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- arch/x86/events/intel/lbr.c | 4 ++-- include/uapi/linux/perf_event.h | 5 +++++ tools/include/uapi/linux/perf_event.h | 5 +++++ tools/perf/util/branch.c | 7 ++++++- 4 files changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index 8043213b75a5..9f86fac8c6a5 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ PERF_BR_SYSRET, /* X86_BR_SYSRET */ PERF_BR_UNKNOWN, /* X86_BR_INT */ - PERF_BR_UNKNOWN, /* X86_BR_IRET */ + PERF_BR_EXPT_RET, /* X86_BR_IRET */ PERF_BR_COND, /* X86_BR_JCC */ PERF_BR_UNCOND, /* X86_BR_JMP */ - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ + PERF_BR_IRQ, /* X86_BR_IRQ */ PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ PERF_BR_UNKNOWN, /* X86_BR_ABORT */ PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 1b65042ab1db..b91d0f575d0c 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -251,6 +251,11 @@ enum { PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ + PERF_BR_EXPT_RET = 11, /* exception return */ + PERF_BR_IRQ = 12, /* irq */ + PERF_BR_FIQ = 13, /* fiq */ + PERF_BR_DEBUG_HALT = 14, /* debug halt */ + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ PERF_BR_MAX, }; diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 4cd39aaccbe7..1882054e8684 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -251,6 +251,11 @@ enum { PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ + PERF_BR_EXPT_RET = 11, /* exception return */ + PERF_BR_IRQ = 12, /* irq */ + PERF_BR_FIQ = 13, /* fiq */ + PERF_BR_DEBUG_HALT = 14, /* debug halt */ + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ PERF_BR_MAX, }; diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c index 2285b1eb3128..74e5e67b1779 100644 --- a/tools/perf/util/branch.c +++ b/tools/perf/util/branch.c @@ -49,7 +49,12 @@ const char *branch_type_name(int type) "SYSCALL", "SYSRET", "COND_CALL", - "COND_RET" + "COND_RET", + "EXPT_RET", + "IRQ", + "FIQ", + "DEBUG_HALT", + "DEBUG_EXIT" }; if (type >= 0 && type < PERF_BR_MAX) -- 2.25.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 1/2] perf: Add more generic branch types @ 2022-01-28 5:44 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon This expands generic branch type classification by adding some more entries , that can still be represented with the existing 4 bit 'type' field. While here this also updates the x86 implementation with these new branch types. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- arch/x86/events/intel/lbr.c | 4 ++-- include/uapi/linux/perf_event.h | 5 +++++ tools/include/uapi/linux/perf_event.h | 5 +++++ tools/perf/util/branch.c | 7 ++++++- 4 files changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index 8043213b75a5..9f86fac8c6a5 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ PERF_BR_SYSRET, /* X86_BR_SYSRET */ PERF_BR_UNKNOWN, /* X86_BR_INT */ - PERF_BR_UNKNOWN, /* X86_BR_IRET */ + PERF_BR_EXPT_RET, /* X86_BR_IRET */ PERF_BR_COND, /* X86_BR_JCC */ PERF_BR_UNCOND, /* X86_BR_JMP */ - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ + PERF_BR_IRQ, /* X86_BR_IRQ */ PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ PERF_BR_UNKNOWN, /* X86_BR_ABORT */ PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 1b65042ab1db..b91d0f575d0c 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -251,6 +251,11 @@ enum { PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ + PERF_BR_EXPT_RET = 11, /* exception return */ + PERF_BR_IRQ = 12, /* irq */ + PERF_BR_FIQ = 13, /* fiq */ + PERF_BR_DEBUG_HALT = 14, /* debug halt */ + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ PERF_BR_MAX, }; diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 4cd39aaccbe7..1882054e8684 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -251,6 +251,11 @@ enum { PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ + PERF_BR_EXPT_RET = 11, /* exception return */ + PERF_BR_IRQ = 12, /* irq */ + PERF_BR_FIQ = 13, /* fiq */ + PERF_BR_DEBUG_HALT = 14, /* debug halt */ + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ PERF_BR_MAX, }; diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c index 2285b1eb3128..74e5e67b1779 100644 --- a/tools/perf/util/branch.c +++ b/tools/perf/util/branch.c @@ -49,7 +49,12 @@ const char *branch_type_name(int type) "SYSCALL", "SYSRET", "COND_CALL", - "COND_RET" + "COND_RET", + "EXPT_RET", + "IRQ", + "FIQ", + "DEBUG_HALT", + "DEBUG_EXIT" }; if (type >= 0 && type < PERF_BR_MAX) -- 2.25.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types 2022-01-28 5:44 ` Anshuman Khandual @ 2022-02-02 11:58 ` Mark Rutland -1 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-02 11:58 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon Hi Anshuman, On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: > This expands generic branch type classification by adding some more entries > , that can still be represented with the existing 4 bit 'type' field. While > here this also updates the x86 implementation with these new branch types. It looks like there's some whitespace damage here. From a quick scan of the below, I think the "exception return" and "IRQ exception" types are somewhat generic, and could be added now (aside from any bikeshedding over names), but: * For IRQ vs FIQ, we might just want to have a top-level "asynchronous exception" type, and then further divide that with a separate field. That way it's easier to extend in future if new exceptions are added. * I don't think the debug state entry/exits make sense as generic branch types, since those are somewhat specific to the ARM architecutre, and it might make sense to define generic PERF_BR_ARCH* definitions instead. * Given the next patch extends the field, and therei are potential ABI problems with that, we might need to reserve a value for ABI extensibility purposes, and I suspect we need to do that *first*. More comments on the subsequent patch. > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-perf-users@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > arch/x86/events/intel/lbr.c | 4 ++-- > include/uapi/linux/perf_event.h | 5 +++++ > tools/include/uapi/linux/perf_event.h | 5 +++++ > tools/perf/util/branch.c | 7 ++++++- > 4 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c > index 8043213b75a5..9f86fac8c6a5 100644 > --- a/arch/x86/events/intel/lbr.c > +++ b/arch/x86/events/intel/lbr.c > @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { > PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ > PERF_BR_SYSRET, /* X86_BR_SYSRET */ > PERF_BR_UNKNOWN, /* X86_BR_INT */ > - PERF_BR_UNKNOWN, /* X86_BR_IRET */ > + PERF_BR_EXPT_RET, /* X86_BR_IRET */ > PERF_BR_COND, /* X86_BR_JCC */ > PERF_BR_UNCOND, /* X86_BR_JMP */ > - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ > + PERF_BR_IRQ, /* X86_BR_IRQ */ > PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ > PERF_BR_UNKNOWN, /* X86_BR_ABORT */ > PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ This presumably changes the values reported to userspace, so the commit message should mention that and explain why that is not a problem. > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 1b65042ab1db..b91d0f575d0c 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -251,6 +251,11 @@ enum { > PERF_BR_SYSRET = 8, /* syscall return */ > PERF_BR_COND_CALL = 9, /* conditional function call */ > PERF_BR_COND_RET = 10, /* conditional function return */ > + PERF_BR_EXPT_RET = 11, /* exception return */ We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'. IIUC that's the naming the x86 FRED stuff is going to use anyhow. > + PERF_BR_IRQ = 12, /* irq */ This looks somewhat generic, so adding it now makes sense to me, but ... > + PERF_BR_FIQ = 13, /* fiq */ ... this is arguably just a idfferent class of interrupt from the PoV of Linux, and the naming is ARM-specific, so I don't think this makes sense to add *now*. As above, maybe it would be better to have a generic "aynchronous exception" or "interrupt" type, and a separate field to distinguish specific types of those. > + PERF_BR_DEBUG_HALT = 14, /* debug halt */ > + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ For the benefit of those not familiar with the ARM architecture, "debug halt" and "debug exit" usually refer to "debug state", which is what an external (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that Linux uses. Given that, I'm not sure these are very generic, and I suspect it would be better to have more generic PERF_BR_ARCH_* entries for things like this. Thanks, Mark. > PERF_BR_MAX, > }; > > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index 4cd39aaccbe7..1882054e8684 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -251,6 +251,11 @@ enum { > PERF_BR_SYSRET = 8, /* syscall return */ > PERF_BR_COND_CALL = 9, /* conditional function call */ > PERF_BR_COND_RET = 10, /* conditional function return */ > + PERF_BR_EXPT_RET = 11, /* exception return */ > + PERF_BR_IRQ = 12, /* irq */ > + PERF_BR_FIQ = 13, /* fiq */ > + PERF_BR_DEBUG_HALT = 14, /* debug halt */ > + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > PERF_BR_MAX, > }; > > diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c > index 2285b1eb3128..74e5e67b1779 100644 > --- a/tools/perf/util/branch.c > +++ b/tools/perf/util/branch.c > @@ -49,7 +49,12 @@ const char *branch_type_name(int type) > "SYSCALL", > "SYSRET", > "COND_CALL", > - "COND_RET" > + "COND_RET", > + "EXPT_RET", > + "IRQ", > + "FIQ", > + "DEBUG_HALT", > + "DEBUG_EXIT" > }; > > if (type >= 0 && type < PERF_BR_MAX) > -- > 2.25.1 > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types @ 2022-02-02 11:58 ` Mark Rutland 0 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-02 11:58 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon Hi Anshuman, On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: > This expands generic branch type classification by adding some more entries > , that can still be represented with the existing 4 bit 'type' field. While > here this also updates the x86 implementation with these new branch types. It looks like there's some whitespace damage here. From a quick scan of the below, I think the "exception return" and "IRQ exception" types are somewhat generic, and could be added now (aside from any bikeshedding over names), but: * For IRQ vs FIQ, we might just want to have a top-level "asynchronous exception" type, and then further divide that with a separate field. That way it's easier to extend in future if new exceptions are added. * I don't think the debug state entry/exits make sense as generic branch types, since those are somewhat specific to the ARM architecutre, and it might make sense to define generic PERF_BR_ARCH* definitions instead. * Given the next patch extends the field, and therei are potential ABI problems with that, we might need to reserve a value for ABI extensibility purposes, and I suspect we need to do that *first*. More comments on the subsequent patch. > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-perf-users@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > arch/x86/events/intel/lbr.c | 4 ++-- > include/uapi/linux/perf_event.h | 5 +++++ > tools/include/uapi/linux/perf_event.h | 5 +++++ > tools/perf/util/branch.c | 7 ++++++- > 4 files changed, 18 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c > index 8043213b75a5..9f86fac8c6a5 100644 > --- a/arch/x86/events/intel/lbr.c > +++ b/arch/x86/events/intel/lbr.c > @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { > PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ > PERF_BR_SYSRET, /* X86_BR_SYSRET */ > PERF_BR_UNKNOWN, /* X86_BR_INT */ > - PERF_BR_UNKNOWN, /* X86_BR_IRET */ > + PERF_BR_EXPT_RET, /* X86_BR_IRET */ > PERF_BR_COND, /* X86_BR_JCC */ > PERF_BR_UNCOND, /* X86_BR_JMP */ > - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ > + PERF_BR_IRQ, /* X86_BR_IRQ */ > PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ > PERF_BR_UNKNOWN, /* X86_BR_ABORT */ > PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ This presumably changes the values reported to userspace, so the commit message should mention that and explain why that is not a problem. > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 1b65042ab1db..b91d0f575d0c 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -251,6 +251,11 @@ enum { > PERF_BR_SYSRET = 8, /* syscall return */ > PERF_BR_COND_CALL = 9, /* conditional function call */ > PERF_BR_COND_RET = 10, /* conditional function return */ > + PERF_BR_EXPT_RET = 11, /* exception return */ We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'. IIUC that's the naming the x86 FRED stuff is going to use anyhow. > + PERF_BR_IRQ = 12, /* irq */ This looks somewhat generic, so adding it now makes sense to me, but ... > + PERF_BR_FIQ = 13, /* fiq */ ... this is arguably just a idfferent class of interrupt from the PoV of Linux, and the naming is ARM-specific, so I don't think this makes sense to add *now*. As above, maybe it would be better to have a generic "aynchronous exception" or "interrupt" type, and a separate field to distinguish specific types of those. > + PERF_BR_DEBUG_HALT = 14, /* debug halt */ > + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ For the benefit of those not familiar with the ARM architecture, "debug halt" and "debug exit" usually refer to "debug state", which is what an external (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that Linux uses. Given that, I'm not sure these are very generic, and I suspect it would be better to have more generic PERF_BR_ARCH_* entries for things like this. Thanks, Mark. > PERF_BR_MAX, > }; > > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index 4cd39aaccbe7..1882054e8684 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -251,6 +251,11 @@ enum { > PERF_BR_SYSRET = 8, /* syscall return */ > PERF_BR_COND_CALL = 9, /* conditional function call */ > PERF_BR_COND_RET = 10, /* conditional function return */ > + PERF_BR_EXPT_RET = 11, /* exception return */ > + PERF_BR_IRQ = 12, /* irq */ > + PERF_BR_FIQ = 13, /* fiq */ > + PERF_BR_DEBUG_HALT = 14, /* debug halt */ > + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > PERF_BR_MAX, > }; > > diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c > index 2285b1eb3128..74e5e67b1779 100644 > --- a/tools/perf/util/branch.c > +++ b/tools/perf/util/branch.c > @@ -49,7 +49,12 @@ const char *branch_type_name(int type) > "SYSCALL", > "SYSRET", > "COND_CALL", > - "COND_RET" > + "COND_RET", > + "EXPT_RET", > + "IRQ", > + "FIQ", > + "DEBUG_HALT", > + "DEBUG_EXIT" > }; > > if (type >= 0 && type < PERF_BR_MAX) > -- > 2.25.1 > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types 2022-02-02 11:58 ` Mark Rutland @ 2022-02-04 4:56 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-04 4:56 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:28 PM, Mark Rutland wrote: > Hi Anshuman, > > On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >> This expands generic branch type classification by adding some more entries >> , that can still be represented with the existing 4 bit 'type' field. While ^ >> here this also updates the x86 implementation with these new branch types. > > It looks like there's some whitespace damage here. Are you referring the above ? I will have a look. > >>From a quick scan of the below, I think the "exception return" and "IRQ > exception" types are somewhat generic, and could be added now (aside from any > bikeshedding over names), but: > > * For IRQ vs FIQ, we might just want to have a top-level "asynchronous > exception" type, and then further divide that with a separate field. That way > it's easier to extend in future if new exceptions are added. Okay. But that might lead to a hierarchical bit fields design where as the current one is just linear. > > * I don't think the debug state entry/exits make sense as generic branch types, From BRBE perspective, a branch is any control flow change including exception level change, debug enter, debug exit etc. If exception and its return can be classified as 'branches' why not debug state change ? Are there no similar debug states transition on other platforms ? > since those are somewhat specific to the ARM architecutre, and it might make > sense to define generic PERF_BR_ARCH* definitions instead. Makes sense but corresponding bit field layout change in branch_sample_type will remain a challenge. > > * Given the next patch extends the field, and therei are potential ABI problems > with that, we might need to reserve a value for ABI extensibility purposes, > and I suspect we need to do that *first*. More comments on the subsequent > patch. Sure, understood. > >> Cc: Peter Zijlstra <peterz@infradead.org> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >> Cc: Mark Rutland <mark.rutland@arm.com> >> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Will Deacon <will@kernel.org> >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-perf-users@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> arch/x86/events/intel/lbr.c | 4 ++-- >> include/uapi/linux/perf_event.h | 5 +++++ >> tools/include/uapi/linux/perf_event.h | 5 +++++ >> tools/perf/util/branch.c | 7 ++++++- >> 4 files changed, 18 insertions(+), 3 deletions(-) >> >> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c >> index 8043213b75a5..9f86fac8c6a5 100644 >> --- a/arch/x86/events/intel/lbr.c >> +++ b/arch/x86/events/intel/lbr.c >> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { >> PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ >> PERF_BR_SYSRET, /* X86_BR_SYSRET */ >> PERF_BR_UNKNOWN, /* X86_BR_INT */ >> - PERF_BR_UNKNOWN, /* X86_BR_IRET */ >> + PERF_BR_EXPT_RET, /* X86_BR_IRET */ >> PERF_BR_COND, /* X86_BR_JCC */ >> PERF_BR_UNCOND, /* X86_BR_JMP */ >> - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ >> + PERF_BR_IRQ, /* X86_BR_IRQ */ >> PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ >> PERF_BR_UNKNOWN, /* X86_BR_ABORT */ >> PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ > > This presumably changes the values reported to userspace, so the commit message > should mention that and explain why that is not a problem. Okay. > >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> index 1b65042ab1db..b91d0f575d0c 100644 >> --- a/include/uapi/linux/perf_event.h >> +++ b/include/uapi/linux/perf_event.h >> @@ -251,6 +251,11 @@ enum { >> PERF_BR_SYSRET = 8, /* syscall return */ >> PERF_BR_COND_CALL = 9, /* conditional function call */ >> PERF_BR_COND_RET = 10, /* conditional function return */ >> + PERF_BR_EXPT_RET = 11, /* exception return */ > > We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'. > IIUC that's the naming the x86 FRED stuff is going to use anyhow. Sure, will change. > >> + PERF_BR_IRQ = 12, /* irq */ > > This looks somewhat generic, so adding it now makes sense to me, but ... > >> + PERF_BR_FIQ = 13, /* fiq */ > > ... this is arguably just a idfferent class of interrupt from the PoV of Linux, > and the naming is ARM-specific, so I don't think this makes sense to add *now*. I assume 'now' --> without ABI extension. > As above, maybe it would be better to have a generic "aynchronous exception" or > "interrupt" type, and a separate field to distinguish specific types of those. > >> + PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > > For the benefit of those not familiar with the ARM architecture, "debug halt" > and "debug exit" usually refer to "debug state", which is what an external > (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that > Linux uses. > > Given that, I'm not sure these are very generic, and I suspect it would be > better to have more generic PERF_BR_ARCH_* entries for things like this. Sure, will try and come up with something similar. > > Thanks, > Mark. > >> PERF_BR_MAX, >> }; >> >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h >> index 4cd39aaccbe7..1882054e8684 100644 >> --- a/tools/include/uapi/linux/perf_event.h >> +++ b/tools/include/uapi/linux/perf_event.h >> @@ -251,6 +251,11 @@ enum { >> PERF_BR_SYSRET = 8, /* syscall return */ >> PERF_BR_COND_CALL = 9, /* conditional function call */ >> PERF_BR_COND_RET = 10, /* conditional function return */ >> + PERF_BR_EXPT_RET = 11, /* exception return */ >> + PERF_BR_IRQ = 12, /* irq */ >> + PERF_BR_FIQ = 13, /* fiq */ >> + PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> PERF_BR_MAX, >> }; >> >> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c >> index 2285b1eb3128..74e5e67b1779 100644 >> --- a/tools/perf/util/branch.c >> +++ b/tools/perf/util/branch.c >> @@ -49,7 +49,12 @@ const char *branch_type_name(int type) >> "SYSCALL", >> "SYSRET", >> "COND_CALL", >> - "COND_RET" >> + "COND_RET", >> + "EXPT_RET", >> + "IRQ", >> + "FIQ", >> + "DEBUG_HALT", >> + "DEBUG_EXIT" >> }; >> >> if (type >= 0 && type < PERF_BR_MAX) >> -- >> 2.25.1 >> ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types @ 2022-02-04 4:56 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-04 4:56 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:28 PM, Mark Rutland wrote: > Hi Anshuman, > > On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >> This expands generic branch type classification by adding some more entries >> , that can still be represented with the existing 4 bit 'type' field. While ^ >> here this also updates the x86 implementation with these new branch types. > > It looks like there's some whitespace damage here. Are you referring the above ? I will have a look. > >>From a quick scan of the below, I think the "exception return" and "IRQ > exception" types are somewhat generic, and could be added now (aside from any > bikeshedding over names), but: > > * For IRQ vs FIQ, we might just want to have a top-level "asynchronous > exception" type, and then further divide that with a separate field. That way > it's easier to extend in future if new exceptions are added. Okay. But that might lead to a hierarchical bit fields design where as the current one is just linear. > > * I don't think the debug state entry/exits make sense as generic branch types, From BRBE perspective, a branch is any control flow change including exception level change, debug enter, debug exit etc. If exception and its return can be classified as 'branches' why not debug state change ? Are there no similar debug states transition on other platforms ? > since those are somewhat specific to the ARM architecutre, and it might make > sense to define generic PERF_BR_ARCH* definitions instead. Makes sense but corresponding bit field layout change in branch_sample_type will remain a challenge. > > * Given the next patch extends the field, and therei are potential ABI problems > with that, we might need to reserve a value for ABI extensibility purposes, > and I suspect we need to do that *first*. More comments on the subsequent > patch. Sure, understood. > >> Cc: Peter Zijlstra <peterz@infradead.org> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >> Cc: Mark Rutland <mark.rutland@arm.com> >> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Will Deacon <will@kernel.org> >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-perf-users@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> arch/x86/events/intel/lbr.c | 4 ++-- >> include/uapi/linux/perf_event.h | 5 +++++ >> tools/include/uapi/linux/perf_event.h | 5 +++++ >> tools/perf/util/branch.c | 7 ++++++- >> 4 files changed, 18 insertions(+), 3 deletions(-) >> >> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c >> index 8043213b75a5..9f86fac8c6a5 100644 >> --- a/arch/x86/events/intel/lbr.c >> +++ b/arch/x86/events/intel/lbr.c >> @@ -1336,10 +1336,10 @@ static int branch_map[X86_BR_TYPE_MAP_MAX] = { >> PERF_BR_SYSCALL, /* X86_BR_SYSCALL */ >> PERF_BR_SYSRET, /* X86_BR_SYSRET */ >> PERF_BR_UNKNOWN, /* X86_BR_INT */ >> - PERF_BR_UNKNOWN, /* X86_BR_IRET */ >> + PERF_BR_EXPT_RET, /* X86_BR_IRET */ >> PERF_BR_COND, /* X86_BR_JCC */ >> PERF_BR_UNCOND, /* X86_BR_JMP */ >> - PERF_BR_UNKNOWN, /* X86_BR_IRQ */ >> + PERF_BR_IRQ, /* X86_BR_IRQ */ >> PERF_BR_IND_CALL, /* X86_BR_IND_CALL */ >> PERF_BR_UNKNOWN, /* X86_BR_ABORT */ >> PERF_BR_UNKNOWN, /* X86_BR_IN_TX */ > > This presumably changes the values reported to userspace, so the commit message > should mention that and explain why that is not a problem. Okay. > >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> index 1b65042ab1db..b91d0f575d0c 100644 >> --- a/include/uapi/linux/perf_event.h >> +++ b/include/uapi/linux/perf_event.h >> @@ -251,6 +251,11 @@ enum { >> PERF_BR_SYSRET = 8, /* syscall return */ >> PERF_BR_COND_CALL = 9, /* conditional function call */ >> PERF_BR_COND_RET = 10, /* conditional function return */ >> + PERF_BR_EXPT_RET = 11, /* exception return */ > > We don't use 'EXPT' anywhere else, so it might be better to just use 'ERET'. > IIUC that's the naming the x86 FRED stuff is going to use anyhow. Sure, will change. > >> + PERF_BR_IRQ = 12, /* irq */ > > This looks somewhat generic, so adding it now makes sense to me, but ... > >> + PERF_BR_FIQ = 13, /* fiq */ > > ... this is arguably just a idfferent class of interrupt from the PoV of Linux, > and the naming is ARM-specific, so I don't think this makes sense to add *now*. I assume 'now' --> without ABI extension. > As above, maybe it would be better to have a generic "aynchronous exception" or > "interrupt" type, and a separate field to distinguish specific types of those. > >> + PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > > For the benefit of those not familiar with the ARM architecture, "debug halt" > and "debug exit" usually refer to "debug state", which is what an external > (e.g. JTAG) debugger uses rather than the usual self-hosted debug stuff that > Linux uses. > > Given that, I'm not sure these are very generic, and I suspect it would be > better to have more generic PERF_BR_ARCH_* entries for things like this. Sure, will try and come up with something similar. > > Thanks, > Mark. > >> PERF_BR_MAX, >> }; >> >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h >> index 4cd39aaccbe7..1882054e8684 100644 >> --- a/tools/include/uapi/linux/perf_event.h >> +++ b/tools/include/uapi/linux/perf_event.h >> @@ -251,6 +251,11 @@ enum { >> PERF_BR_SYSRET = 8, /* syscall return */ >> PERF_BR_COND_CALL = 9, /* conditional function call */ >> PERF_BR_COND_RET = 10, /* conditional function return */ >> + PERF_BR_EXPT_RET = 11, /* exception return */ >> + PERF_BR_IRQ = 12, /* irq */ >> + PERF_BR_FIQ = 13, /* fiq */ >> + PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> + PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> PERF_BR_MAX, >> }; >> >> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c >> index 2285b1eb3128..74e5e67b1779 100644 >> --- a/tools/perf/util/branch.c >> +++ b/tools/perf/util/branch.c >> @@ -49,7 +49,12 @@ const char *branch_type_name(int type) >> "SYSCALL", >> "SYSRET", >> "COND_CALL", >> - "COND_RET" >> + "COND_RET", >> + "EXPT_RET", >> + "IRQ", >> + "FIQ", >> + "DEBUG_HALT", >> + "DEBUG_EXIT" >> }; >> >> if (type >= 0 && type < PERF_BR_MAX) >> -- >> 2.25.1 >> _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types 2022-02-02 11:58 ` Mark Rutland @ 2022-02-24 5:51 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-24 5:51 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:28 PM, Mark Rutland wrote: > Hi Anshuman, > > On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >> This expands generic branch type classification by adding some more entries >> , that can still be represented with the existing 4 bit 'type' field. While >> here this also updates the x86 implementation with these new branch types. > It looks like there's some whitespace damage here. > >>From a quick scan of the below, I think the "exception return" and "IRQ > exception" types are somewhat generic, and could be added now (aside from any > bikeshedding over names), but: Hi Mark, I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic enough to be included now. The 'type' field still got 3 more places for future use. Hence we should try and include two more branch types, before adding the last entry for ABI expansion (e.g PERF_BR_EXTENDED). https://lore.kernel.org/all/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com/ - Anshuman ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types @ 2022-02-24 5:51 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-24 5:51 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:28 PM, Mark Rutland wrote: > Hi Anshuman, > > On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >> This expands generic branch type classification by adding some more entries >> , that can still be represented with the existing 4 bit 'type' field. While >> here this also updates the x86 implementation with these new branch types. > It looks like there's some whitespace damage here. > >>From a quick scan of the below, I think the "exception return" and "IRQ > exception" types are somewhat generic, and could be added now (aside from any > bikeshedding over names), but: Hi Mark, I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic enough to be included now. The 'type' field still got 3 more places for future use. Hence we should try and include two more branch types, before adding the last entry for ABI expansion (e.g PERF_BR_EXTENDED). https://lore.kernel.org/all/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com/ - Anshuman _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types 2022-02-24 5:51 ` Anshuman Khandual @ 2022-02-24 7:10 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-24 7:10 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/24/22 11:21 AM, Anshuman Khandual wrote: > > > On 2/2/22 5:28 PM, Mark Rutland wrote: >> Hi Anshuman, >> >> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >>> This expands generic branch type classification by adding some more entries >>> , that can still be represented with the existing 4 bit 'type' field. While >>> here this also updates the x86 implementation with these new branch types. >> It looks like there's some whitespace damage here. >> >> >From a quick scan of the below, I think the "exception return" and "IRQ >> exception" types are somewhat generic, and could be added now (aside from any >> bikeshedding over names), but: > > Hi Mark, > > I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic > enough to be included now. The 'type' field still got 3 more places for future > use. Hence we should try and include two more branch types, before adding the The two additional generic branch types could be - PERF_BR_SERROR (possible arm64 equivalent for x86 MCE) - PERF_BR_NO_TX (only missing TX related branch type in perf_branch_entry) perf_branch_entry.[in_tx | abort] are already available. Also PERF_BR_NO_TX could be used right away on x86 platform, via X86_BR_NO_TX. > last entry for ABI expansion (e.g PERF_BR_EXTENDED). PERF_BR_EXTENDED could help expand into another 4 bits 'new_type' field following the existing 4 bits 'type' field. Does this sound feasible ? enum { PERF_BR_UNKNOWN = 0, /* unknown */ PERF_BR_COND = 1, /* conditional */ PERF_BR_UNCOND = 2, /* unconditional */ PERF_BR_IND = 3, /* indirect */ PERF_BR_CALL = 4, /* function call */ PERF_BR_IND_CALL = 5, /* indirect function call */ PERF_BR_RET = 6, /* function return */ PERF_BR_SYSCALL = 7, /* syscall */ PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ PERF_BR_ERET = 11, /* exception return */ PERF_BR_IRQ = 12, /* irq */ PERF_BR_SERROR = 13, /* System error */ PERF_BR_NO_TX = 14, /* No transaction */ PERF_BR_EXTEND_ABI = 15 /* Extended ABI */ PERF_BR_MAX, }; enum { PERF_BR_NEW_FAULT_ALGN = 0, /* Alignment fault */ PERF_BR_NEW_FAULT_DATA = 1, /* Data fault */ PERF_BR_NEW_FAULT_INST = 2, /* Inst fault */ PERF_BR_NEW_ARCH_1 = 3, /* Architecture specific */ PERF_BR_NEW_ARCH_2 = 4, /* Architecture specific */ PERF_BR_NEW_ARCH_3 = 5, /* Architecture specific */ PERF_BR_NEW_ARCH_4 = 6, /* Architecture specific */ PERF_BR_NEW_ARCH_5 = 7, /* Architecture specific */ PERF_BR_NEW_MAX, }; #ifdef CONFIG_ARM64 #define PERF_BR_FIQ PERF_BR_NEW_ARCH_1 #define PERF_BR_DEBUG_HALT PERF_BR_NEW_ARCH_2 #define PERF_BR_DEBUG_EXIT PERF_BR_NEW_ARCH_3 #define PERF_BR_DEBUG_INST PERF_BR_NEW_ARCH_4 #define PERF_BR_DEBUG_DATA PERF_BR_NEW_ARCH_5 #endif User space perf tool will look into perf_branch_entry.new_type, only when (perf_branch_entry.type == PERF_BR_EXTEND_ABI). Older perf tool on a newer kernel will be safe via old PERF_BR_MAX, and will ignore [PERF_BR_ERET - PERF_BR_EXTEND_ABI] values possibly with an warning. - Anshuman ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/2] perf: Add more generic branch types @ 2022-02-24 7:10 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-24 7:10 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/24/22 11:21 AM, Anshuman Khandual wrote: > > > On 2/2/22 5:28 PM, Mark Rutland wrote: >> Hi Anshuman, >> >> On Fri, Jan 28, 2022 at 11:14:12AM +0530, Anshuman Khandual wrote: >>> This expands generic branch type classification by adding some more entries >>> , that can still be represented with the existing 4 bit 'type' field. While >>> here this also updates the x86 implementation with these new branch types. >> It looks like there's some whitespace damage here. >> >> >From a quick scan of the below, I think the "exception return" and "IRQ >> exception" types are somewhat generic, and could be added now (aside from any >> bikeshedding over names), but: > > Hi Mark, > > I have sent a patch adding just PERF_BR_ERET and PERF_BR_IRQ which are generic > enough to be included now. The 'type' field still got 3 more places for future > use. Hence we should try and include two more branch types, before adding the The two additional generic branch types could be - PERF_BR_SERROR (possible arm64 equivalent for x86 MCE) - PERF_BR_NO_TX (only missing TX related branch type in perf_branch_entry) perf_branch_entry.[in_tx | abort] are already available. Also PERF_BR_NO_TX could be used right away on x86 platform, via X86_BR_NO_TX. > last entry for ABI expansion (e.g PERF_BR_EXTENDED). PERF_BR_EXTENDED could help expand into another 4 bits 'new_type' field following the existing 4 bits 'type' field. Does this sound feasible ? enum { PERF_BR_UNKNOWN = 0, /* unknown */ PERF_BR_COND = 1, /* conditional */ PERF_BR_UNCOND = 2, /* unconditional */ PERF_BR_IND = 3, /* indirect */ PERF_BR_CALL = 4, /* function call */ PERF_BR_IND_CALL = 5, /* indirect function call */ PERF_BR_RET = 6, /* function return */ PERF_BR_SYSCALL = 7, /* syscall */ PERF_BR_SYSRET = 8, /* syscall return */ PERF_BR_COND_CALL = 9, /* conditional function call */ PERF_BR_COND_RET = 10, /* conditional function return */ PERF_BR_ERET = 11, /* exception return */ PERF_BR_IRQ = 12, /* irq */ PERF_BR_SERROR = 13, /* System error */ PERF_BR_NO_TX = 14, /* No transaction */ PERF_BR_EXTEND_ABI = 15 /* Extended ABI */ PERF_BR_MAX, }; enum { PERF_BR_NEW_FAULT_ALGN = 0, /* Alignment fault */ PERF_BR_NEW_FAULT_DATA = 1, /* Data fault */ PERF_BR_NEW_FAULT_INST = 2, /* Inst fault */ PERF_BR_NEW_ARCH_1 = 3, /* Architecture specific */ PERF_BR_NEW_ARCH_2 = 4, /* Architecture specific */ PERF_BR_NEW_ARCH_3 = 5, /* Architecture specific */ PERF_BR_NEW_ARCH_4 = 6, /* Architecture specific */ PERF_BR_NEW_ARCH_5 = 7, /* Architecture specific */ PERF_BR_NEW_MAX, }; #ifdef CONFIG_ARM64 #define PERF_BR_FIQ PERF_BR_NEW_ARCH_1 #define PERF_BR_DEBUG_HALT PERF_BR_NEW_ARCH_2 #define PERF_BR_DEBUG_EXIT PERF_BR_NEW_ARCH_3 #define PERF_BR_DEBUG_INST PERF_BR_NEW_ARCH_4 #define PERF_BR_DEBUG_DATA PERF_BR_NEW_ARCH_5 #endif User space perf tool will look into perf_branch_entry.new_type, only when (perf_branch_entry.type == PERF_BR_EXTEND_ABI). Older perf tool on a newer kernel will be safe via old PERF_BR_MAX, and will ignore [PERF_BR_ERET - PERF_BR_EXTEND_ABI] values possibly with an warning. - Anshuman _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-01-28 5:44 ` Anshuman Khandual @ 2022-01-28 5:44 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon Current perf_branch_entry.type is a 4 bits field just enough to accommodate 16 generic branch types. This is insufficient to accommodate platforms like arm64 which has much more branch types. Lets just expands this field into a 6 bits one, which can now hold 64 generic branch types. This also adds more generic branch types. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- include/uapi/linux/perf_event.h | 10 ++++++++-- tools/include/uapi/linux/perf_event.h | 10 ++++++++-- tools/perf/util/branch.c | 8 +++++++- tools/perf/util/branch.h | 4 ++-- 4 files changed, 25 insertions(+), 7 deletions(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index b91d0f575d0c..361fdc6b87a0 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -256,6 +256,12 @@ enum { PERF_BR_FIQ = 13, /* fiq */ PERF_BR_DEBUG_HALT = 14, /* debug halt */ PERF_BR_DEBUG_EXIT = 15, /* debug exit */ + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ + PERF_BR_DEBUG_DATA = 17, /* data debug */ + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ + PERF_BR_FAULT_DATA = 19, /* data fault */ + PERF_BR_FAULT_INST = 20, /* instruction fault */ + PERF_BR_SERROR = 21, /* system error */ PERF_BR_MAX, }; @@ -1370,8 +1376,8 @@ struct perf_branch_entry { in_tx:1, /* in transaction */ abort:1, /* transaction abort */ cycles:16, /* cycle count to last branch */ - type:4, /* branch type */ - reserved:40; + type:6, /* branch type */ + reserved:38; }; union perf_sample_weight { diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 1882054e8684..9a82b8aaed93 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -256,6 +256,12 @@ enum { PERF_BR_FIQ = 13, /* fiq */ PERF_BR_DEBUG_HALT = 14, /* debug halt */ PERF_BR_DEBUG_EXIT = 15, /* debug exit */ + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ + PERF_BR_DEBUG_DATA = 17, /* data debug */ + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ + PERF_BR_FAULT_DATA = 19, /* data fault */ + PERF_BR_FAULT_INST = 20, /* instruction fault */ + PERF_BR_SERROR = 21, /* system error */ PERF_BR_MAX, }; @@ -1370,8 +1376,8 @@ struct perf_branch_entry { in_tx:1, /* in transaction */ abort:1, /* transaction abort */ cycles:16, /* cycle count to last branch */ - type:4, /* branch type */ - reserved:40; + type:6, /* branch type */ + reserved:38; }; union perf_sample_weight { diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c index 74e5e67b1779..1e216ea2e2a8 100644 --- a/tools/perf/util/branch.c +++ b/tools/perf/util/branch.c @@ -54,7 +54,13 @@ const char *branch_type_name(int type) "IRQ", "FIQ", "DEBUG_HALT", - "DEBUG_EXIT" + "DEBUG_EXIT", + "DEBUG_INST", + "DEBUG_DATA", + "FAULT_ALGN", + "FAULT_DATA", + "FAULT_INST", + "SERROR" }; if (type >= 0 && type < PERF_BR_MAX) diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h index 17b2ccc61094..875d99abdc36 100644 --- a/tools/perf/util/branch.h +++ b/tools/perf/util/branch.h @@ -23,8 +23,8 @@ struct branch_flags { u64 in_tx:1; u64 abort:1; u64 cycles:16; - u64 type:4; - u64 reserved:40; + u64 type:6; + u64 reserved:38; }; }; }; -- 2.25.1 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-01-28 5:44 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-01-28 5:44 UTC (permalink / raw) To: linux-kernel, linux-perf-users, linux-arm-kernel Cc: robh, Anshuman Khandual, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon Current perf_branch_entry.type is a 4 bits field just enough to accommodate 16 generic branch types. This is insufficient to accommodate platforms like arm64 which has much more branch types. Lets just expands this field into a 6 bits one, which can now hold 64 generic branch types. This also adds more generic branch types. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> --- include/uapi/linux/perf_event.h | 10 ++++++++-- tools/include/uapi/linux/perf_event.h | 10 ++++++++-- tools/perf/util/branch.c | 8 +++++++- tools/perf/util/branch.h | 4 ++-- 4 files changed, 25 insertions(+), 7 deletions(-) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index b91d0f575d0c..361fdc6b87a0 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -256,6 +256,12 @@ enum { PERF_BR_FIQ = 13, /* fiq */ PERF_BR_DEBUG_HALT = 14, /* debug halt */ PERF_BR_DEBUG_EXIT = 15, /* debug exit */ + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ + PERF_BR_DEBUG_DATA = 17, /* data debug */ + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ + PERF_BR_FAULT_DATA = 19, /* data fault */ + PERF_BR_FAULT_INST = 20, /* instruction fault */ + PERF_BR_SERROR = 21, /* system error */ PERF_BR_MAX, }; @@ -1370,8 +1376,8 @@ struct perf_branch_entry { in_tx:1, /* in transaction */ abort:1, /* transaction abort */ cycles:16, /* cycle count to last branch */ - type:4, /* branch type */ - reserved:40; + type:6, /* branch type */ + reserved:38; }; union perf_sample_weight { diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 1882054e8684..9a82b8aaed93 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -256,6 +256,12 @@ enum { PERF_BR_FIQ = 13, /* fiq */ PERF_BR_DEBUG_HALT = 14, /* debug halt */ PERF_BR_DEBUG_EXIT = 15, /* debug exit */ + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ + PERF_BR_DEBUG_DATA = 17, /* data debug */ + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ + PERF_BR_FAULT_DATA = 19, /* data fault */ + PERF_BR_FAULT_INST = 20, /* instruction fault */ + PERF_BR_SERROR = 21, /* system error */ PERF_BR_MAX, }; @@ -1370,8 +1376,8 @@ struct perf_branch_entry { in_tx:1, /* in transaction */ abort:1, /* transaction abort */ cycles:16, /* cycle count to last branch */ - type:4, /* branch type */ - reserved:40; + type:6, /* branch type */ + reserved:38; }; union perf_sample_weight { diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c index 74e5e67b1779..1e216ea2e2a8 100644 --- a/tools/perf/util/branch.c +++ b/tools/perf/util/branch.c @@ -54,7 +54,13 @@ const char *branch_type_name(int type) "IRQ", "FIQ", "DEBUG_HALT", - "DEBUG_EXIT" + "DEBUG_EXIT", + "DEBUG_INST", + "DEBUG_DATA", + "FAULT_ALGN", + "FAULT_DATA", + "FAULT_INST", + "SERROR" }; if (type >= 0 && type < PERF_BR_MAX) diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h index 17b2ccc61094..875d99abdc36 100644 --- a/tools/perf/util/branch.h +++ b/tools/perf/util/branch.h @@ -23,8 +23,8 @@ struct branch_flags { u64 in_tx:1; u64 abort:1; u64 cycles:16; - u64 type:4; - u64 reserved:40; + u64 type:6; + u64 reserved:38; }; }; }; -- 2.25.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-01-28 5:44 ` Anshuman Khandual @ 2022-02-02 11:57 ` Mark Rutland -1 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-02 11:57 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > Current perf_branch_entry.type is a 4 bits field just enough to accommodate > 16 generic branch types. This is insufficient to accommodate platforms like > arm64 which has much more branch types. It would be good to mention BRBE specifically here, along with specific values and a rought intro, e.g. | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of | branch/exception/return: <rough summary here>. There's not enough space to | describe these all in perf_branch_entry.type, as this is a 4-bit field. That way reviewers (and anyone looking at the patch in future) have a lot more rationale to work with. A rough summary of the distinct branch types would be *really* helpful. > Lets just expands this field into a 6 bits one, which can now hold 64 generic > branch types. Is it safe (ABI-wise) to extend a bit-field like this? Does that break any combination of old/new userspace and old/new kernel? I'm not sure how bit fields are managed w.r.t. endianness, but normally extending a field would break BE, so this seems suspicious. I suspect we might need to allocate a *separate* field for new values, and possibly reserve a value in the existing field to say "go look at the new field". Do you have any rationale for 64 values specifically? e.g. is that mostly for future extensibility? How many will we need for Arm's BRBE? Do those types fall into a hierarchy, that we could split across separate fields? > This also adds more generic branch types. This feels like ti should be in a separate/subsequent patch. If nothing else that aids bisectability if changing the size of the field breaks anything. > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-perf-users@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > include/uapi/linux/perf_event.h | 10 ++++++++-- > tools/include/uapi/linux/perf_event.h | 10 ++++++++-- > tools/perf/util/branch.c | 8 +++++++- > tools/perf/util/branch.h | 4 ++-- > 4 files changed, 25 insertions(+), 7 deletions(-) > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index b91d0f575d0c..361fdc6b87a0 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -256,6 +256,12 @@ enum { > PERF_BR_FIQ = 13, /* fiq */ > PERF_BR_DEBUG_HALT = 14, /* debug halt */ > PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ > + PERF_BR_DEBUG_DATA = 17, /* data debug */ This is really unclear. What is "instruction debug" vs "data debug" ? Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK instructions? > + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ > + PERF_BR_FAULT_DATA = 19, /* data fault */ > + PERF_BR_FAULT_INST = 20, /* instruction fault */ There are many other potential faults a CPU could take; are these specifically what Arm's BRBE provides? > + PERF_BR_SERROR = 21, /* system error */ This is really arm-specific; IIUC the closest thing on x86 is an MCE. > PERF_BR_MAX, > }; > > @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - type:4, /* branch type */ > - reserved:40; > + type:6, /* branch type */ As above, is this a safe-change ABI-wise? Thanks, Mark. > + reserved:38; > }; > > union perf_sample_weight { > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index 1882054e8684..9a82b8aaed93 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -256,6 +256,12 @@ enum { > PERF_BR_FIQ = 13, /* fiq */ > PERF_BR_DEBUG_HALT = 14, /* debug halt */ > PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ > + PERF_BR_DEBUG_DATA = 17, /* data debug */ > + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ > + PERF_BR_FAULT_DATA = 19, /* data fault */ > + PERF_BR_FAULT_INST = 20, /* instruction fault */ > + PERF_BR_SERROR = 21, /* system error */ > PERF_BR_MAX, > }; > > @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - type:4, /* branch type */ > - reserved:40; > + type:6, /* branch type */ > + reserved:38; > }; > > union perf_sample_weight { > diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c > index 74e5e67b1779..1e216ea2e2a8 100644 > --- a/tools/perf/util/branch.c > +++ b/tools/perf/util/branch.c > @@ -54,7 +54,13 @@ const char *branch_type_name(int type) > "IRQ", > "FIQ", > "DEBUG_HALT", > - "DEBUG_EXIT" > + "DEBUG_EXIT", > + "DEBUG_INST", > + "DEBUG_DATA", > + "FAULT_ALGN", > + "FAULT_DATA", > + "FAULT_INST", > + "SERROR" > }; > > if (type >= 0 && type < PERF_BR_MAX) > diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h > index 17b2ccc61094..875d99abdc36 100644 > --- a/tools/perf/util/branch.h > +++ b/tools/perf/util/branch.h > @@ -23,8 +23,8 @@ struct branch_flags { > u64 in_tx:1; > u64 abort:1; > u64 cycles:16; > - u64 type:4; > - u64 reserved:40; > + u64 type:6; > + u64 reserved:38; > }; > }; > }; > -- > 2.25.1 > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-02-02 11:57 ` Mark Rutland 0 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-02 11:57 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > Current perf_branch_entry.type is a 4 bits field just enough to accommodate > 16 generic branch types. This is insufficient to accommodate platforms like > arm64 which has much more branch types. It would be good to mention BRBE specifically here, along with specific values and a rought intro, e.g. | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of | branch/exception/return: <rough summary here>. There's not enough space to | describe these all in perf_branch_entry.type, as this is a 4-bit field. That way reviewers (and anyone looking at the patch in future) have a lot more rationale to work with. A rough summary of the distinct branch types would be *really* helpful. > Lets just expands this field into a 6 bits one, which can now hold 64 generic > branch types. Is it safe (ABI-wise) to extend a bit-field like this? Does that break any combination of old/new userspace and old/new kernel? I'm not sure how bit fields are managed w.r.t. endianness, but normally extending a field would break BE, so this seems suspicious. I suspect we might need to allocate a *separate* field for new values, and possibly reserve a value in the existing field to say "go look at the new field". Do you have any rationale for 64 values specifically? e.g. is that mostly for future extensibility? How many will we need for Arm's BRBE? Do those types fall into a hierarchy, that we could split across separate fields? > This also adds more generic branch types. This feels like ti should be in a separate/subsequent patch. If nothing else that aids bisectability if changing the size of the field breaks anything. > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Jiri Olsa <jolsa@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Will Deacon <will@kernel.org> > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-perf-users@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- > include/uapi/linux/perf_event.h | 10 ++++++++-- > tools/include/uapi/linux/perf_event.h | 10 ++++++++-- > tools/perf/util/branch.c | 8 +++++++- > tools/perf/util/branch.h | 4 ++-- > 4 files changed, 25 insertions(+), 7 deletions(-) > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index b91d0f575d0c..361fdc6b87a0 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -256,6 +256,12 @@ enum { > PERF_BR_FIQ = 13, /* fiq */ > PERF_BR_DEBUG_HALT = 14, /* debug halt */ > PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ > + PERF_BR_DEBUG_DATA = 17, /* data debug */ This is really unclear. What is "instruction debug" vs "data debug" ? Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK instructions? > + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ > + PERF_BR_FAULT_DATA = 19, /* data fault */ > + PERF_BR_FAULT_INST = 20, /* instruction fault */ There are many other potential faults a CPU could take; are these specifically what Arm's BRBE provides? > + PERF_BR_SERROR = 21, /* system error */ This is really arm-specific; IIUC the closest thing on x86 is an MCE. > PERF_BR_MAX, > }; > > @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - type:4, /* branch type */ > - reserved:40; > + type:6, /* branch type */ As above, is this a safe-change ABI-wise? Thanks, Mark. > + reserved:38; > }; > > union perf_sample_weight { > diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h > index 1882054e8684..9a82b8aaed93 100644 > --- a/tools/include/uapi/linux/perf_event.h > +++ b/tools/include/uapi/linux/perf_event.h > @@ -256,6 +256,12 @@ enum { > PERF_BR_FIQ = 13, /* fiq */ > PERF_BR_DEBUG_HALT = 14, /* debug halt */ > PERF_BR_DEBUG_EXIT = 15, /* debug exit */ > + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ > + PERF_BR_DEBUG_DATA = 17, /* data debug */ > + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ > + PERF_BR_FAULT_DATA = 19, /* data fault */ > + PERF_BR_FAULT_INST = 20, /* instruction fault */ > + PERF_BR_SERROR = 21, /* system error */ > PERF_BR_MAX, > }; > > @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > in_tx:1, /* in transaction */ > abort:1, /* transaction abort */ > cycles:16, /* cycle count to last branch */ > - type:4, /* branch type */ > - reserved:40; > + type:6, /* branch type */ > + reserved:38; > }; > > union perf_sample_weight { > diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c > index 74e5e67b1779..1e216ea2e2a8 100644 > --- a/tools/perf/util/branch.c > +++ b/tools/perf/util/branch.c > @@ -54,7 +54,13 @@ const char *branch_type_name(int type) > "IRQ", > "FIQ", > "DEBUG_HALT", > - "DEBUG_EXIT" > + "DEBUG_EXIT", > + "DEBUG_INST", > + "DEBUG_DATA", > + "FAULT_ALGN", > + "FAULT_DATA", > + "FAULT_INST", > + "SERROR" > }; > > if (type >= 0 && type < PERF_BR_MAX) > diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h > index 17b2ccc61094..875d99abdc36 100644 > --- a/tools/perf/util/branch.h > +++ b/tools/perf/util/branch.h > @@ -23,8 +23,8 @@ struct branch_flags { > u64 in_tx:1; > u64 abort:1; > u64 cycles:16; > - u64 type:4; > - u64 reserved:40; > + u64 type:6; > + u64 reserved:38; > }; > }; > }; > -- > 2.25.1 > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-02-02 11:57 ` Mark Rutland @ 2022-02-04 4:55 ` Anshuman Khandual -1 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-04 4:55 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:27 PM, Mark Rutland wrote: > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: >> Current perf_branch_entry.type is a 4 bits field just enough to accommodate >> 16 generic branch types. This is insufficient to accommodate platforms like >> arm64 which has much more branch types. > > It would be good to mention BRBE specifically here, along with specific values > and a rought intro, e.g. > > | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of > | branch/exception/return: <rough summary here>. There's not enough space to > | describe these all in perf_branch_entry.type, as this is a 4-bit field. > > That way reviewers (and anyone looking at the patch in future) have a lot more > rationale to work with. A rough summary of the distinct branch types would be > *really* helpful. Sure, will add a summary describing the need for an ABI extension as suggested. > >> Lets just expands this field into a 6 bits one, which can now hold 64 generic >> branch types. > > Is it safe (ABI-wise) to extend a bit-field like this? Does that break any > combination of old/new userspace and old/new kernel? I'm not sure how bit > fields are managed w.r.t. endianness, but normally extending a field would > break BE, so this seems suspicious. Probably. I guess we would need some more inputs/suggestions from others regarding any potential issues and possible workarounds. > > I suspect we might need to allocate a *separate* field for new values, and > possibly reserve a value in the existing field to say "go look at the new > field". In that case there might be another level of indirection. > > Do you have any rationale for 64 values specifically? e.g. is that mostly for > future extensibility? How many will we need for Arm's BRBE? Yeah that is mostly for future extensibility. BRBE's current requirement will be well within 32 types itself. 19 to be specific. > > Do those types fall into a hierarchy, that we could split across separate > fields? I will take a look and get back on this. But as mentioned before it will cause additional level of indirection for a look up. > >> This also adds more generic branch types. > > This feels like ti should be in a separate/subsequent patch. If nothing else > that aids bisectability if changing the size of the field breaks anything. Makes sense, will split them. > >> Cc: Peter Zijlstra <peterz@infradead.org> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >> Cc: Mark Rutland <mark.rutland@arm.com> >> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Will Deacon <will@kernel.org> >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-perf-users@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> include/uapi/linux/perf_event.h | 10 ++++++++-- >> tools/include/uapi/linux/perf_event.h | 10 ++++++++-- >> tools/perf/util/branch.c | 8 +++++++- >> tools/perf/util/branch.h | 4 ++-- >> 4 files changed, 25 insertions(+), 7 deletions(-) >> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> index b91d0f575d0c..361fdc6b87a0 100644 >> --- a/include/uapi/linux/perf_event.h >> +++ b/include/uapi/linux/perf_event.h >> @@ -256,6 +256,12 @@ enum { >> PERF_BR_FIQ = 13, /* fiq */ >> PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ >> + PERF_BR_DEBUG_DATA = 17, /* data debug */ > > This is really unclear. What is "instruction debug" vs "data debug" ? > > Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK > instructions? Unfortunately the spec does not expand much in detail but will try and find some more clarity. > >> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ >> + PERF_BR_FAULT_DATA = 19, /* data fault */ >> + PERF_BR_FAULT_INST = 20, /* instruction fault */ > > There are many other potential faults a CPU could take; are these specifically > what Arm's BRBE provides? Right, these are what BRBE captures for now. > >> + PERF_BR_SERROR = 21, /* system error */ > > This is really arm-specific; IIUC the closest thing on x86 is an MCE. But ('unhandled' ?) system error can be a generic control flow change. > >> PERF_BR_MAX, >> }; >> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - type:4, /* branch type */ >> - reserved:40; >> + type:6, /* branch type */ > > As above, is this a safe-change ABI-wise? If the bit fields here cannot be expanded without breaking ABI, then there is a fundamental problem. Only remaining option will be to add new fields (with new width value) which could accommodate these new required branch types. > > Thanks, > Mark. > >> + reserved:38; >> }; >> >> union perf_sample_weight { >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h >> index 1882054e8684..9a82b8aaed93 100644 >> --- a/tools/include/uapi/linux/perf_event.h >> +++ b/tools/include/uapi/linux/perf_event.h >> @@ -256,6 +256,12 @@ enum { >> PERF_BR_FIQ = 13, /* fiq */ >> PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ >> + PERF_BR_DEBUG_DATA = 17, /* data debug */ >> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ >> + PERF_BR_FAULT_DATA = 19, /* data fault */ >> + PERF_BR_FAULT_INST = 20, /* instruction fault */ >> + PERF_BR_SERROR = 21, /* system error */ >> PERF_BR_MAX, >> }; >> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - type:4, /* branch type */ >> - reserved:40; >> + type:6, /* branch type */ >> + reserved:38; >> }; >> >> union perf_sample_weight { >> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c >> index 74e5e67b1779..1e216ea2e2a8 100644 >> --- a/tools/perf/util/branch.c >> +++ b/tools/perf/util/branch.c >> @@ -54,7 +54,13 @@ const char *branch_type_name(int type) >> "IRQ", >> "FIQ", >> "DEBUG_HALT", >> - "DEBUG_EXIT" >> + "DEBUG_EXIT", >> + "DEBUG_INST", >> + "DEBUG_DATA", >> + "FAULT_ALGN", >> + "FAULT_DATA", >> + "FAULT_INST", >> + "SERROR" >> }; >> >> if (type >= 0 && type < PERF_BR_MAX) >> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h >> index 17b2ccc61094..875d99abdc36 100644 >> --- a/tools/perf/util/branch.h >> +++ b/tools/perf/util/branch.h >> @@ -23,8 +23,8 @@ struct branch_flags { >> u64 in_tx:1; >> u64 abort:1; >> u64 cycles:16; >> - u64 type:4; >> - u64 reserved:40; >> + u64 type:6; >> + u64 reserved:38; >> }; >> }; >> }; >> -- >> 2.25.1 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-02-04 4:55 ` Anshuman Khandual 0 siblings, 0 replies; 24+ messages in thread From: Anshuman Khandual @ 2022-02-04 4:55 UTC (permalink / raw) To: Mark Rutland Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On 2/2/22 5:27 PM, Mark Rutland wrote: > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: >> Current perf_branch_entry.type is a 4 bits field just enough to accommodate >> 16 generic branch types. This is insufficient to accommodate platforms like >> arm64 which has much more branch types. > > It would be good to mention BRBE specifically here, along with specific values > and a rought intro, e.g. > > | The Arm Branch Record Buffer Extension (BRBE) distinguishes $N types of > | branch/exception/return: <rough summary here>. There's not enough space to > | describe these all in perf_branch_entry.type, as this is a 4-bit field. > > That way reviewers (and anyone looking at the patch in future) have a lot more > rationale to work with. A rough summary of the distinct branch types would be > *really* helpful. Sure, will add a summary describing the need for an ABI extension as suggested. > >> Lets just expands this field into a 6 bits one, which can now hold 64 generic >> branch types. > > Is it safe (ABI-wise) to extend a bit-field like this? Does that break any > combination of old/new userspace and old/new kernel? I'm not sure how bit > fields are managed w.r.t. endianness, but normally extending a field would > break BE, so this seems suspicious. Probably. I guess we would need some more inputs/suggestions from others regarding any potential issues and possible workarounds. > > I suspect we might need to allocate a *separate* field for new values, and > possibly reserve a value in the existing field to say "go look at the new > field". In that case there might be another level of indirection. > > Do you have any rationale for 64 values specifically? e.g. is that mostly for > future extensibility? How many will we need for Arm's BRBE? Yeah that is mostly for future extensibility. BRBE's current requirement will be well within 32 types itself. 19 to be specific. > > Do those types fall into a hierarchy, that we could split across separate > fields? I will take a look and get back on this. But as mentioned before it will cause additional level of indirection for a look up. > >> This also adds more generic branch types. > > This feels like ti should be in a separate/subsequent patch. If nothing else > that aids bisectability if changing the size of the field breaks anything. Makes sense, will split them. > >> Cc: Peter Zijlstra <peterz@infradead.org> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> >> Cc: Mark Rutland <mark.rutland@arm.com> >> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >> Cc: Jiri Olsa <jolsa@redhat.com> >> Cc: Namhyung Kim <namhyung@kernel.org> >> Cc: Will Deacon <will@kernel.org> >> Cc: linux-arm-kernel@lists.infradead.org >> Cc: linux-perf-users@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- >> include/uapi/linux/perf_event.h | 10 ++++++++-- >> tools/include/uapi/linux/perf_event.h | 10 ++++++++-- >> tools/perf/util/branch.c | 8 +++++++- >> tools/perf/util/branch.h | 4 ++-- >> 4 files changed, 25 insertions(+), 7 deletions(-) >> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> index b91d0f575d0c..361fdc6b87a0 100644 >> --- a/include/uapi/linux/perf_event.h >> +++ b/include/uapi/linux/perf_event.h >> @@ -256,6 +256,12 @@ enum { >> PERF_BR_FIQ = 13, /* fiq */ >> PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ >> + PERF_BR_DEBUG_DATA = 17, /* data debug */ > > This is really unclear. What is "instruction debug" vs "data debug" ? > > Are there meant for breakpoint/watchpoint exceptions? HW breakpoints vs BRK > instructions? Unfortunately the spec does not expand much in detail but will try and find some more clarity. > >> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ >> + PERF_BR_FAULT_DATA = 19, /* data fault */ >> + PERF_BR_FAULT_INST = 20, /* instruction fault */ > > There are many other potential faults a CPU could take; are these specifically > what Arm's BRBE provides? Right, these are what BRBE captures for now. > >> + PERF_BR_SERROR = 21, /* system error */ > > This is really arm-specific; IIUC the closest thing on x86 is an MCE. But ('unhandled' ?) system error can be a generic control flow change. > >> PERF_BR_MAX, >> }; >> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - type:4, /* branch type */ >> - reserved:40; >> + type:6, /* branch type */ > > As above, is this a safe-change ABI-wise? If the bit fields here cannot be expanded without breaking ABI, then there is a fundamental problem. Only remaining option will be to add new fields (with new width value) which could accommodate these new required branch types. > > Thanks, > Mark. > >> + reserved:38; >> }; >> >> union perf_sample_weight { >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h >> index 1882054e8684..9a82b8aaed93 100644 >> --- a/tools/include/uapi/linux/perf_event.h >> +++ b/tools/include/uapi/linux/perf_event.h >> @@ -256,6 +256,12 @@ enum { >> PERF_BR_FIQ = 13, /* fiq */ >> PERF_BR_DEBUG_HALT = 14, /* debug halt */ >> PERF_BR_DEBUG_EXIT = 15, /* debug exit */ >> + PERF_BR_DEBUG_INST = 16, /* instruciton debug */ >> + PERF_BR_DEBUG_DATA = 17, /* data debug */ >> + PERF_BR_FAULT_ALGN = 18, /* alignment fault */ >> + PERF_BR_FAULT_DATA = 19, /* data fault */ >> + PERF_BR_FAULT_INST = 20, /* instruction fault */ >> + PERF_BR_SERROR = 21, /* system error */ >> PERF_BR_MAX, >> }; >> >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - type:4, /* branch type */ >> - reserved:40; >> + type:6, /* branch type */ >> + reserved:38; >> }; >> >> union perf_sample_weight { >> diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c >> index 74e5e67b1779..1e216ea2e2a8 100644 >> --- a/tools/perf/util/branch.c >> +++ b/tools/perf/util/branch.c >> @@ -54,7 +54,13 @@ const char *branch_type_name(int type) >> "IRQ", >> "FIQ", >> "DEBUG_HALT", >> - "DEBUG_EXIT" >> + "DEBUG_EXIT", >> + "DEBUG_INST", >> + "DEBUG_DATA", >> + "FAULT_ALGN", >> + "FAULT_DATA", >> + "FAULT_INST", >> + "SERROR" >> }; >> >> if (type >= 0 && type < PERF_BR_MAX) >> diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h >> index 17b2ccc61094..875d99abdc36 100644 >> --- a/tools/perf/util/branch.h >> +++ b/tools/perf/util/branch.h >> @@ -23,8 +23,8 @@ struct branch_flags { >> u64 in_tx:1; >> u64 abort:1; >> u64 cycles:16; >> - u64 type:4; >> - u64 reserved:40; >> + u64 type:6; >> + u64 reserved:38; >> }; >> }; >> }; >> -- >> 2.25.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-02-04 4:55 ` Anshuman Khandual @ 2022-02-04 16:02 ` Mark Rutland -1 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-04 16:02 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > On 2/2/22 5:27 PM, Mark Rutland wrote: > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > >> in_tx:1, /* in transaction */ > >> abort:1, /* transaction abort */ > >> cycles:16, /* cycle count to last branch */ > >> - type:4, /* branch type */ > >> - reserved:40; > >> + type:6, /* branch type */ > > > > As above, is this a safe-change ABI-wise? > > If the bit fields here cannot be expanded without breaking ABI, then > there is a fundamental problem. Only remaining option will be to add > new fields (with new width value) which could accommodate these new > required branch types. Unfortunately, I think expanding this does break ABI, and is a fundamental problem, as: (a) Any new values in the expanded field will be truncated when read by old userspace, and so those may be mis-reported. Maybe we're not too worried about this case. (b) Depending on how the field is placed, existing values might get stored differently. This could break any mismatched combination of {old,new}-kernel and {old,new}-userspace. In practice, I think this means that this is broken for BE, and happens to work for LE, but I don't know how bitfields are defined for each architecture, so there could be other brokenness. Consider the test case below: -------- #include <stdbool.h> struct bfv1 { unsigned long a:20, b:20, c:4, d:20; }; struct bfv2 { unsigned long a:20, b:20, c:6, d:18; }; union bf { struct bfv1 v1; struct bfv2 v2; }; bool bfv1_w_r(unsigned char v) { unsigned char v_old = v & ((1UL << 4) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v1.c = v_old; v_new = bf.v1.c; return v_old == v_new; } bool bfv2_w_r(unsigned char v) { unsigned char v_old = v & ((1UL << 6) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v2.c = v_old; v_new = bf.v2.c; return v_old == v_new; } bool bfv1_w_bfv2_r(unsigned char v) { unsigned char v_old = v & ((1UL << 4) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v1.c = v_old; v_new = bf.v2.c; return v_old == v_new; } -------- When compiled for little-endian AArch64, GCC thinks all old values will be interpreted the same, and optimizes all the round-trip tests to return true: | [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mlittle-endian | [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o | | bitfield-test.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <bfv1_w_r>: | 0: 52800020 mov w0, #0x1 // #1 | 4: d65f03c0 ret | | 0000000000000008 <bfv2_w_r>: | 8: 52800020 mov w0, #0x1 // #1 | c: d65f03c0 ret | | 0000000000000010 <bfv1_w_bfv2_r>: | 10: 52800020 mov w0, #0x1 // #1 | 14: d65f03c0 ret But when compiled for big-endian AArch64, it doesn't believe that at all: | [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mbig-endian | [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o | | bitfield-test.o: file format elf64-bigaarch64 | | | Disassembly of section .text: | | 0000000000000000 <bfv1_w_r>: | 0: 52800020 mov w0, #0x1 // #1 | 4: d65f03c0 ret | | 0000000000000008 <bfv2_w_r>: | 8: 52800020 mov w0, #0x1 // #1 | c: d65f03c0 ret | | 0000000000000010 <bfv1_w_bfv2_r>: | 10: 12001c00 and w0, w0, #0xff | 14: 12000c01 and w1, w0, #0xf | 18: 531e0c00 ubfiz w0, w0, #2, #4 | 1c: 6b01001f cmp w0, w1 | 20: 1a9f17e0 cset w0, eq // eq = none | 24: d65f03c0 ret If we add the following: | void write_bfv1_c(union bf *bf, unsigned char v) | { | bf->v1.c = v; | } | | void write_bfv2_c(union bf *bf, unsigned char v) | { | bf->v2.c = v; | } | | unsigned char read_bfv1_c(union bf *bf) | { | return bf->v1.c; | } | | unsigned char read_bfv2_c(union bf *bf) | { | return bf->v2.c; | } For LE we get: | 0000000000000018 <write_bfv1_c>: | 18: 39401402 ldrb w2, [x0, #5] | 1c: 33000c22 bfxil w2, w1, #0, #4 | 20: 39001402 strb w2, [x0, #5] | 24: d65f03c0 ret | | 0000000000000028 <write_bfv2_c>: | 28: 39401402 ldrb w2, [x0, #5] | 2c: 33001422 bfxil w2, w1, #0, #6 | 30: 39001402 strb w2, [x0, #5] | 34: d65f03c0 ret | | 0000000000000038 <read_bfv1_c>: | 38: 39401400 ldrb w0, [x0, #5] | 3c: 92400c00 and x0, x0, #0xf | 40: d65f03c0 ret | 44: d503201f nop | | 0000000000000048 <read_bfv2_c>: | 48: 39401400 ldrb w0, [x0, #5] | 4c: 92401400 and x0, x0, #0x3f | 50: d65f03c0 ret ... where the low bits of the field stay in the same place, so for all existing values this happens to work. For BE we get: | 0000000000000028 <write_bfv1_c>: | 28: 39401402 ldrb w2, [x0, #5] | 2c: 331c0c22 bfi w2, w1, #4, #4 | 30: 39001402 strb w2, [x0, #5] | 34: d65f03c0 ret | | 0000000000000038 <write_bfv2_c>: | 38: 39401402 ldrb w2, [x0, #5] | 3c: 331e1422 bfi w2, w1, #2, #6 | 40: 39001402 strb w2, [x0, #5] | 44: d65f03c0 ret | | 0000000000000048 <read_bfv1_c>: | 48: 39401400 ldrb w0, [x0, #5] | 4c: 53041c00 ubfx w0, w0, #4, #4 | 50: d65f03c0 ret | 54: d503201f nop | | 0000000000000058 <read_bfv2_c>: | 58: 39401400 ldrb w0, [x0, #5] | 5c: 53021c00 ubfx w0, w0, #2, #6 | 60: d65f03c0 ret ... where the low bits of the field have moved, and so this is broken even for existing values! Thanks, Mark. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-02-04 16:02 ` Mark Rutland 0 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-04 16:02 UTC (permalink / raw) To: Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, robh, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > On 2/2/22 5:27 PM, Mark Rutland wrote: > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > >> in_tx:1, /* in transaction */ > >> abort:1, /* transaction abort */ > >> cycles:16, /* cycle count to last branch */ > >> - type:4, /* branch type */ > >> - reserved:40; > >> + type:6, /* branch type */ > > > > As above, is this a safe-change ABI-wise? > > If the bit fields here cannot be expanded without breaking ABI, then > there is a fundamental problem. Only remaining option will be to add > new fields (with new width value) which could accommodate these new > required branch types. Unfortunately, I think expanding this does break ABI, and is a fundamental problem, as: (a) Any new values in the expanded field will be truncated when read by old userspace, and so those may be mis-reported. Maybe we're not too worried about this case. (b) Depending on how the field is placed, existing values might get stored differently. This could break any mismatched combination of {old,new}-kernel and {old,new}-userspace. In practice, I think this means that this is broken for BE, and happens to work for LE, but I don't know how bitfields are defined for each architecture, so there could be other brokenness. Consider the test case below: -------- #include <stdbool.h> struct bfv1 { unsigned long a:20, b:20, c:4, d:20; }; struct bfv2 { unsigned long a:20, b:20, c:6, d:18; }; union bf { struct bfv1 v1; struct bfv2 v2; }; bool bfv1_w_r(unsigned char v) { unsigned char v_old = v & ((1UL << 4) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v1.c = v_old; v_new = bf.v1.c; return v_old == v_new; } bool bfv2_w_r(unsigned char v) { unsigned char v_old = v & ((1UL << 6) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v2.c = v_old; v_new = bf.v2.c; return v_old == v_new; } bool bfv1_w_bfv2_r(unsigned char v) { unsigned char v_old = v & ((1UL << 4) - 1); unsigned char v_new; union bf bf = { 0 }; bf.v1.c = v_old; v_new = bf.v2.c; return v_old == v_new; } -------- When compiled for little-endian AArch64, GCC thinks all old values will be interpreted the same, and optimizes all the round-trip tests to return true: | [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mlittle-endian | [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o | | bitfield-test.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <bfv1_w_r>: | 0: 52800020 mov w0, #0x1 // #1 | 4: d65f03c0 ret | | 0000000000000008 <bfv2_w_r>: | 8: 52800020 mov w0, #0x1 // #1 | c: d65f03c0 ret | | 0000000000000010 <bfv1_w_bfv2_r>: | 10: 52800020 mov w0, #0x1 // #1 | 14: d65f03c0 ret But when compiled for big-endian AArch64, it doesn't believe that at all: | [mark@gravadlaks:~]% aarch64-linux-gnu-gcc -c bitfield-test.c -O2 -mbig-endian | [mark@gravadlaks:~]% aarch64-linux-gnu-objdump -d bitfield-test.o | | bitfield-test.o: file format elf64-bigaarch64 | | | Disassembly of section .text: | | 0000000000000000 <bfv1_w_r>: | 0: 52800020 mov w0, #0x1 // #1 | 4: d65f03c0 ret | | 0000000000000008 <bfv2_w_r>: | 8: 52800020 mov w0, #0x1 // #1 | c: d65f03c0 ret | | 0000000000000010 <bfv1_w_bfv2_r>: | 10: 12001c00 and w0, w0, #0xff | 14: 12000c01 and w1, w0, #0xf | 18: 531e0c00 ubfiz w0, w0, #2, #4 | 1c: 6b01001f cmp w0, w1 | 20: 1a9f17e0 cset w0, eq // eq = none | 24: d65f03c0 ret If we add the following: | void write_bfv1_c(union bf *bf, unsigned char v) | { | bf->v1.c = v; | } | | void write_bfv2_c(union bf *bf, unsigned char v) | { | bf->v2.c = v; | } | | unsigned char read_bfv1_c(union bf *bf) | { | return bf->v1.c; | } | | unsigned char read_bfv2_c(union bf *bf) | { | return bf->v2.c; | } For LE we get: | 0000000000000018 <write_bfv1_c>: | 18: 39401402 ldrb w2, [x0, #5] | 1c: 33000c22 bfxil w2, w1, #0, #4 | 20: 39001402 strb w2, [x0, #5] | 24: d65f03c0 ret | | 0000000000000028 <write_bfv2_c>: | 28: 39401402 ldrb w2, [x0, #5] | 2c: 33001422 bfxil w2, w1, #0, #6 | 30: 39001402 strb w2, [x0, #5] | 34: d65f03c0 ret | | 0000000000000038 <read_bfv1_c>: | 38: 39401400 ldrb w0, [x0, #5] | 3c: 92400c00 and x0, x0, #0xf | 40: d65f03c0 ret | 44: d503201f nop | | 0000000000000048 <read_bfv2_c>: | 48: 39401400 ldrb w0, [x0, #5] | 4c: 92401400 and x0, x0, #0x3f | 50: d65f03c0 ret ... where the low bits of the field stay in the same place, so for all existing values this happens to work. For BE we get: | 0000000000000028 <write_bfv1_c>: | 28: 39401402 ldrb w2, [x0, #5] | 2c: 331c0c22 bfi w2, w1, #4, #4 | 30: 39001402 strb w2, [x0, #5] | 34: d65f03c0 ret | | 0000000000000038 <write_bfv2_c>: | 38: 39401402 ldrb w2, [x0, #5] | 3c: 331e1422 bfi w2, w1, #2, #6 | 40: 39001402 strb w2, [x0, #5] | 44: d65f03c0 ret | | 0000000000000048 <read_bfv1_c>: | 48: 39401400 ldrb w0, [x0, #5] | 4c: 53041c00 ubfx w0, w0, #4, #4 | 50: d65f03c0 ret | 54: d503201f nop | | 0000000000000058 <read_bfv2_c>: | 58: 39401400 ldrb w0, [x0, #5] | 5c: 53021c00 ubfx w0, w0, #2, #6 | 60: d65f03c0 ret ... where the low bits of the field have moved, and so this is broken even for existing values! Thanks, Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-02-04 16:02 ` Mark Rutland @ 2022-02-15 19:01 ` Rob Herring -1 siblings, 0 replies; 24+ messages in thread From: Rob Herring @ 2022-02-15 19:01 UTC (permalink / raw) To: Mark Rutland, Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote: > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > > On 2/2/22 5:27 PM, Mark Rutland wrote: > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > > >> in_tx:1, /* in transaction */ > > >> abort:1, /* transaction abort */ > > >> cycles:16, /* cycle count to last branch */ > > >> - type:4, /* branch type */ > > >> - reserved:40; > > >> + type:6, /* branch type */ > > > > > > As above, is this a safe-change ABI-wise? > > > > If the bit fields here cannot be expanded without breaking ABI, then > > there is a fundamental problem. Only remaining option will be to add > > new fields (with new width value) which could accommodate these new > > required branch types. > > Unfortunately, I think expanding this does break ABI, and is a fundamental > problem, as: > > (a) Any new values in the expanded field will be truncated when read by old > userspace, and so those may be mis-reported. Maybe we're not too worried > about this case. 'type' or specfically branch stack is not currently supported on arm64. Do we expect an old userspace which this didn't work on to start working with a new kernel? Given at least some of the new types are arch specific, perhaps the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a new 'arch_type' field. Another option is maybe some of these additional types just shouldn't be exposed to userspace? For example, are branches to FIQ useful or leaking any info about secure world? Debug mode branches also seem minimally useful to me (though I'm no expert in how this is used). > (b) Depending on how the field is placed, existing values might get stored > differently. This could break any mismatched combination of > {old,new}-kernel and {old,new}-userspace. > > In practice, I think this means that this is broken for BE, and happens to > work for LE, but I don't know how bitfields are defined for each > architecture, so there could be other brokenness. > > Consider the test case below: [...] > ... where the low bits of the field have moved, and so this is broken even for > existing values! So that is a separate issue to be fixed and not directly related to the size of 'type'. Looks like it needs similar '#if defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct bitfields. Though somehow BE PPC hasn't had issues? Rob ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-02-15 19:01 ` Rob Herring 0 siblings, 0 replies; 24+ messages in thread From: Rob Herring @ 2022-02-15 19:01 UTC (permalink / raw) To: Mark Rutland, Anshuman Khandual Cc: linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote: > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > > On 2/2/22 5:27 PM, Mark Rutland wrote: > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > > >> in_tx:1, /* in transaction */ > > >> abort:1, /* transaction abort */ > > >> cycles:16, /* cycle count to last branch */ > > >> - type:4, /* branch type */ > > >> - reserved:40; > > >> + type:6, /* branch type */ > > > > > > As above, is this a safe-change ABI-wise? > > > > If the bit fields here cannot be expanded without breaking ABI, then > > there is a fundamental problem. Only remaining option will be to add > > new fields (with new width value) which could accommodate these new > > required branch types. > > Unfortunately, I think expanding this does break ABI, and is a fundamental > problem, as: > > (a) Any new values in the expanded field will be truncated when read by old > userspace, and so those may be mis-reported. Maybe we're not too worried > about this case. 'type' or specfically branch stack is not currently supported on arm64. Do we expect an old userspace which this didn't work on to start working with a new kernel? Given at least some of the new types are arch specific, perhaps the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a new 'arch_type' field. Another option is maybe some of these additional types just shouldn't be exposed to userspace? For example, are branches to FIQ useful or leaking any info about secure world? Debug mode branches also seem minimally useful to me (though I'm no expert in how this is used). > (b) Depending on how the field is placed, existing values might get stored > differently. This could break any mismatched combination of > {old,new}-kernel and {old,new}-userspace. > > In practice, I think this means that this is broken for BE, and happens to > work for LE, but I don't know how bitfields are defined for each > architecture, so there could be other brokenness. > > Consider the test case below: [...] > ... where the low bits of the field have moved, and so this is broken even for > existing values! So that is a separate issue to be fixed and not directly related to the size of 'type'. Looks like it needs similar '#if defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct bitfields. Though somehow BE PPC hasn't had issues? Rob _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type 2022-02-15 19:01 ` Rob Herring @ 2022-02-15 19:45 ` Mark Rutland -1 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-15 19:45 UTC (permalink / raw) To: Rob Herring Cc: Anshuman Khandual, linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Tue, Feb 15, 2022 at 01:01:06PM -0600, Rob Herring wrote: > On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote: > > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > > > On 2/2/22 5:27 PM, Mark Rutland wrote: > > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > > > >> in_tx:1, /* in transaction */ > > > >> abort:1, /* transaction abort */ > > > >> cycles:16, /* cycle count to last branch */ > > > >> - type:4, /* branch type */ > > > >> - reserved:40; > > > >> + type:6, /* branch type */ > > > > > > > > As above, is this a safe-change ABI-wise? > > > > > > If the bit fields here cannot be expanded without breaking ABI, then > > > there is a fundamental problem. Only remaining option will be to add > > > new fields (with new width value) which could accommodate these new > > > required branch types. > > > > Unfortunately, I think expanding this does break ABI, and is a fundamental > > problem, as: > > > > (a) Any new values in the expanded field will be truncated when read by old > > userspace, and so those may be mis-reported. Maybe we're not too worried > > about this case. > > 'type' or specfically branch stack is not currently supported on arm64. > Do we expect an old userspace which this didn't work on to start working > with a new kernel? I agree for arm64 specifically this probably doesn't matter; I just wanted to have a clear explanation of why this *could* be a problem, since this could affect other architectures. > Given at least some of the new types are arch specific, perhaps > the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or > 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a > new 'arch_type' field. Yup; something of that shape sounds good to me -- that was roughly what I had suggested elsewhere. > Another option is maybe some of these additional types just shouldn't be > exposed to userspace? For example, are branches to FIQ useful or leaking > any info about secure world? Debug mode branches also seem minimally > useful to me (though I'm no expert in how this is used). I agree; this wasn't clear to me, and regardless I think many of the types added in the prior patch should not be generic since they're very specific to the Arm architecture. > > (b) Depending on how the field is placed, existing values might get stored > > differently. This could break any mismatched combination of > > {old,new}-kernel and {old,new}-userspace. > > > > In practice, I think this means that this is broken for BE, and happens to > > work for LE, but I don't know how bitfields are defined for each > > architecture, so there could be other brokenness. > > > > Consider the test case below: > > [...] > > > ... where the low bits of the field have moved, and so this is broken even for > > existing values! > > So that is a separate issue to be fixed and not directly related to the > size of 'type'. I agree if you moved the entire field that's broken everywhere, but in this case it *is* directly related to the size changing. In my example the meaning of specific bits changed *because* the size of the field changed and in BE that meant the low bits of the field moved, even though the field started at the same position. > Looks like it needs similar '#if > defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct > bitfields. Though somehow BE PPC hasn't had issues? IIRC there were recent problems in this area, and I think historically we've broken ABI and people only noticed much later. Thanks, Mark. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] perf: Expand perf_branch_entry.type @ 2022-02-15 19:45 ` Mark Rutland 0 siblings, 0 replies; 24+ messages in thread From: Mark Rutland @ 2022-02-15 19:45 UTC (permalink / raw) To: Rob Herring Cc: Anshuman Khandual, linux-kernel, linux-perf-users, linux-arm-kernel, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Will Deacon On Tue, Feb 15, 2022 at 01:01:06PM -0600, Rob Herring wrote: > On Fri, Feb 04, 2022 at 04:02:04PM +0000, Mark Rutland wrote: > > On Fri, Feb 04, 2022 at 10:25:24AM +0530, Anshuman Khandual wrote: > > > On 2/2/22 5:27 PM, Mark Rutland wrote: > > > > On Fri, Jan 28, 2022 at 11:14:13AM +0530, Anshuman Khandual wrote: > > > >> @@ -1370,8 +1376,8 @@ struct perf_branch_entry { > > > >> in_tx:1, /* in transaction */ > > > >> abort:1, /* transaction abort */ > > > >> cycles:16, /* cycle count to last branch */ > > > >> - type:4, /* branch type */ > > > >> - reserved:40; > > > >> + type:6, /* branch type */ > > > > > > > > As above, is this a safe-change ABI-wise? > > > > > > If the bit fields here cannot be expanded without breaking ABI, then > > > there is a fundamental problem. Only remaining option will be to add > > > new fields (with new width value) which could accommodate these new > > > required branch types. > > > > Unfortunately, I think expanding this does break ABI, and is a fundamental > > problem, as: > > > > (a) Any new values in the expanded field will be truncated when read by old > > userspace, and so those may be mis-reported. Maybe we're not too worried > > about this case. > > 'type' or specfically branch stack is not currently supported on arm64. > Do we expect an old userspace which this didn't work on to start working > with a new kernel? I agree for arm64 specifically this probably doesn't matter; I just wanted to have a clear explanation of why this *could* be a problem, since this could affect other architectures. > Given at least some of the new types are arch specific, perhaps > the existing type field should get a new 'PERF_BR_ARCH_SPECIFIC' or > 'PERF_BR_EXTENDED' value (or use PERF_BR_UNKNOWN?) which means read a > new 'arch_type' field. Yup; something of that shape sounds good to me -- that was roughly what I had suggested elsewhere. > Another option is maybe some of these additional types just shouldn't be > exposed to userspace? For example, are branches to FIQ useful or leaking > any info about secure world? Debug mode branches also seem minimally > useful to me (though I'm no expert in how this is used). I agree; this wasn't clear to me, and regardless I think many of the types added in the prior patch should not be generic since they're very specific to the Arm architecture. > > (b) Depending on how the field is placed, existing values might get stored > > differently. This could break any mismatched combination of > > {old,new}-kernel and {old,new}-userspace. > > > > In practice, I think this means that this is broken for BE, and happens to > > work for LE, but I don't know how bitfields are defined for each > > architecture, so there could be other brokenness. > > > > Consider the test case below: > > [...] > > > ... where the low bits of the field have moved, and so this is broken even for > > existing values! > > So that is a separate issue to be fixed and not directly related to the > size of 'type'. I agree if you moved the entire field that's broken everywhere, but in this case it *is* directly related to the size changing. In my example the meaning of specific bits changed *because* the size of the field changed and in BE that meant the low bits of the field moved, even though the field started at the same position. > Looks like it needs similar '#if > defined(__LITTLE_ENDIAN_BITFIELD)' treatment as some of the other struct > bitfields. Though somehow BE PPC hasn't had issues? IIRC there were recent problems in this area, and I think historically we've broken ABI and people only noticed much later. Thanks, Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2022-02-24 7:11 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-01-28 5:44 [PATCH 0/2] perf: Expand captured branch types Anshuman Khandual 2022-01-28 5:44 ` Anshuman Khandual 2022-01-28 5:44 ` [PATCH 1/2] perf: Add more generic " Anshuman Khandual 2022-01-28 5:44 ` Anshuman Khandual 2022-02-02 11:58 ` Mark Rutland 2022-02-02 11:58 ` Mark Rutland 2022-02-04 4:56 ` Anshuman Khandual 2022-02-04 4:56 ` Anshuman Khandual 2022-02-24 5:51 ` Anshuman Khandual 2022-02-24 5:51 ` Anshuman Khandual 2022-02-24 7:10 ` Anshuman Khandual 2022-02-24 7:10 ` Anshuman Khandual 2022-01-28 5:44 ` [PATCH 2/2] perf: Expand perf_branch_entry.type Anshuman Khandual 2022-01-28 5:44 ` Anshuman Khandual 2022-02-02 11:57 ` Mark Rutland 2022-02-02 11:57 ` Mark Rutland 2022-02-04 4:55 ` Anshuman Khandual 2022-02-04 4:55 ` Anshuman Khandual 2022-02-04 16:02 ` Mark Rutland 2022-02-04 16:02 ` Mark Rutland 2022-02-15 19:01 ` Rob Herring 2022-02-15 19:01 ` Rob Herring 2022-02-15 19:45 ` Mark Rutland 2022-02-15 19:45 ` Mark Rutland
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.