From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754609AbdDDPw7 (ORCPT ); Tue, 4 Apr 2017 11:52:59 -0400 Received: from mga05.intel.com ([192.55.52.43]:31901 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754345AbdDDPw5 (ORCPT ); Tue, 4 Apr 2017 11:52:57 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.36,275,1486454400"; d="scan'208";a="1150811759" Subject: Re: [PATCH v1 1/5] perf/core: Define the common branch type classification To: Arnaldo Carvalho de Melo References: <1490973522-5499-1-git-send-email-yao.jin@linux.intel.com> <1490973522-5499-2-git-send-email-yao.jin@linux.intel.com> <20170404141805.GA12903@kernel.org> Cc: Jiri Olsa , linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com, Peter Zijlstra , Alexander Shishkin , Ingo Molnar From: "Jin, Yao" Message-ID: Date: Tue, 4 Apr 2017 23:52:53 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <20170404141805.GA12903@kernel.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/4/2017 10:18 PM, Arnaldo Carvalho de Melo wrote: > Adding the perf kernel maintainers to the CC list. > > Em Fri, Mar 31, 2017 at 11:18:38PM +0800, Jin Yao escreveu: >> It is often useful to know the branch types while analyzing branch >> data. For example, a call is very different from a conditional branch. >> >> Currently we have to look it up in binary while the binary may later >> not be available and even the binary is available but user has to take >> some time. It is very useful for user to check it directly in perf >> report. >> >> Perf already has support for disassembling the branch instruction >> to get the branch type. The branch type is defined in lbr.c. >> >> To keep consistent on kernel and userspace and make the classification >> more common, the patch adds the common branch type classification >> in perf_event.h. >> >> Since the disassembling of branch instruction needs some overhead, >> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it >> needs to disassemble the branch instruction and record the branch >> type. >> >> Signed-off-by: Jin Yao >> --- >> include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++- >> tools/include/uapi/linux/perf_event.h | 24 +++++++++++++++++++++++- >> 2 files changed, 46 insertions(+), 2 deletions(-) >> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h >> index d09a9cd..4d731fd 100644 >> --- a/include/uapi/linux/perf_event.h >> +++ b/include/uapi/linux/perf_event.h >> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift { >> PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */ >> PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */ >> >> + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */ >> + >> PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */ >> }; >> >> @@ -198,9 +200,27 @@ enum perf_branch_sample_type { >> PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT, >> PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT, >> >> + PERF_SAMPLE_BRANCH_TYPE_SAVE = >> + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT, >> + >> PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT, >> }; >> >> +/* >> + * Common flow change classification >> + */ >> +enum { >> + PERF_BR_NONE = 0, /* unknown */ >> + PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */ >> + PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */ >> + PERF_BR_JMP = 1 << 3, /* jump */ >> + PERF_BR_IND_JMP = 1 << 4, /* indirect jump */ >> + PERF_BR_CALL = 1 << 5, /* call */ >> + PERF_BR_IND_CALL = 1 << 6, /* indirect call */ >> + PERF_BR_RET = 1 << 7, /* return */ >> + PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */ > Humm, wouldn't be better to have those in separate buckets? I.e. > > PERF_BR_SYSCALL > PERF_BR_SYSRET, > PERF_BR_IRQ > > etc? There are also other types. I.e. abort, ..... I use FAR_BRANCH is because I just want to differentiate between basic branch types and others. (others may be too much and platform specific). > > And why a bitmask? /me reads a bit more... couldn't find a reason for > this: > > + type:9, /* branch type */ > > Do you have a reason to use 9 bits? Why not just: > > enum { > PERF_BR_NONE = 0, /* unknown */ > PERF_BR_JCC_FWD = 1, /* conditional forward jump */ > PERF_BR_JCC_BWD = 2, /* conditional backward jump */ > PERF_BR_JMP = 3, /* jump */ > PERF_BR_IND_JMP = 4, /* indirect jump */ > PERF_BR_CALL = 5, /* call */ > PERF_BR_IND_CALL = 6, /* indirect call */ > PERF_BR_RET = 7, /* return */ > PERF_BR_FAR_BRANCH = 8, /* SYSCALL,SYSRET,IRQ,... */ > > And then use, say, 4 or 5 bits for that type field? > > I must be missing something trivial ;-\ > > - Arnaldo You are right. I made things more complicated. Yes, the definitions should be clear and simple. I will redefine them in v2. Thanks Jin Yao > >> +}; >> + >> #define PERF_SAMPLE_BRANCH_PLM_ALL \ >> (PERF_SAMPLE_BRANCH_USER|\ >> PERF_SAMPLE_BRANCH_KERNEL|\ >> @@ -999,6 +1019,7 @@ union perf_mem_data_src { >> * in_tx: running in a hardware transaction >> * abort: aborting a hardware transaction >> * cycles: cycles from last branch (or 0 if not supported) >> + * type: branch type >> */ >> struct perf_branch_entry { >> __u64 from; >> @@ -1008,7 +1029,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - reserved:44; >> + type:9, /* branch type */ >> + reserved:35; >> }; >> >> #endif /* _UAPI_LINUX_PERF_EVENT_H */ >> diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h >> index d09a9cd..4d731fd 100644 >> --- a/tools/include/uapi/linux/perf_event.h >> +++ b/tools/include/uapi/linux/perf_event.h >> @@ -174,6 +174,8 @@ enum perf_branch_sample_type_shift { >> PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT = 14, /* no flags */ >> PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT = 15, /* no cycles */ >> >> + PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT = 16, /* save branch type */ >> + >> PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */ >> }; >> >> @@ -198,9 +200,27 @@ enum perf_branch_sample_type { >> PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << PERF_SAMPLE_BRANCH_NO_FLAGS_SHIFT, >> PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << PERF_SAMPLE_BRANCH_NO_CYCLES_SHIFT, >> >> + PERF_SAMPLE_BRANCH_TYPE_SAVE = >> + 1U << PERF_SAMPLE_BRANCH_TYPE_SAVE_SHIFT, >> + >> PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT, >> }; >> >> +/* >> + * Common flow change classification >> + */ >> +enum { >> + PERF_BR_NONE = 0, /* unknown */ >> + PERF_BR_JCC_FWD = 1 << 1, /* conditional forward jump */ >> + PERF_BR_JCC_BWD = 1 << 2, /* conditional backward jump */ >> + PERF_BR_JMP = 1 << 3, /* jump */ >> + PERF_BR_IND_JMP = 1 << 4, /* indirect jump */ >> + PERF_BR_CALL = 1 << 5, /* call */ >> + PERF_BR_IND_CALL = 1 << 6, /* indirect call */ >> + PERF_BR_RET = 1 << 7, /* return */ >> + PERF_BR_FAR_BRANCH = 1 << 8, /* SYSCALL,SYSRET,IRQ,... */ >> +}; >> + >> #define PERF_SAMPLE_BRANCH_PLM_ALL \ >> (PERF_SAMPLE_BRANCH_USER|\ >> PERF_SAMPLE_BRANCH_KERNEL|\ >> @@ -999,6 +1019,7 @@ union perf_mem_data_src { >> * in_tx: running in a hardware transaction >> * abort: aborting a hardware transaction >> * cycles: cycles from last branch (or 0 if not supported) >> + * type: branch type >> */ >> struct perf_branch_entry { >> __u64 from; >> @@ -1008,7 +1029,8 @@ struct perf_branch_entry { >> in_tx:1, /* in transaction */ >> abort:1, /* transaction abort */ >> cycles:16, /* cycle count to last branch */ >> - reserved:44; >> + type:9, /* branch type */ >> + reserved:35; >> }; >> >> #endif /* _UAPI_LINUX_PERF_EVENT_H */ >> -- >> 2.7.4