[V2,3/3] perf regs x86: Add X86 specific arch__intr_reg_mask()

Message ID 1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
State New
Series
  • [V2,1/3] perf parse-regs: Split parse_regs

Commit Message

Liang, Kan May 14, 2019, 8:19 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com>

XMM registers can be collected on Icelake and later platforms.

Add an x86-specific arch__intr_reg_mask(), which creates an event to check
whether the kernel and hardware can collect XMM registers.

Tested on Skylake, which doesn't support XMM register collection. Nothing
changes there.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
   R10 R11 R12 R13 R14 R15

   Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xff0fff

Tested on Icelake, which supports XMM register collection.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
   R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
   XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---

Changes since V1:
- Add specific arch__intr_reg_mask() support
  Drop specific has_non_gprs_support() and non_gprs_mask()

 tools/perf/arch/x86/include/perf_regs.h |  1 +
 tools/perf/arch/x86/util/perf_regs.c    | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)
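
The two sample_regs_intr values in the test output above (0xff0fff on Skylake, 0xffffffff00ff0fff on Icelake) can be reproduced from the register indices. The sketch below is illustrative only: the enum values are copied by hand from the x86 uapi perf_regs.h as I understand it (DS..GS at bits 12-15, R15 at bit 23, XMM0 at bit 32, two bits per XMM register), and the helper names gpr_mask(), xmm_mask() and xmm_reg_count() are hypothetical, not part of perf.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed register indices, mirrored by hand from the x86 uapi perf_regs.h. */
enum {
	REG_DS = 12, REG_ES, REG_FS, REG_GS, /* left out of the x86-64 mask */
	REG_64_MAX = 24,                     /* one past R15 */
	REG_XMM0 = 32,                       /* first XMM bit */
	REG_XMM_MAX = 64,                    /* one past the last XMM bit */
};

/* Hypothetical helper: equivalent of PERF_REGS_MASK on x86-64. */
static uint64_t gpr_mask(void)
{
	uint64_t all = (1ULL << REG_64_MAX) - 1;
	uint64_t segs = (1ULL << REG_DS) | (1ULL << REG_ES) |
			(1ULL << REG_FS) | (1ULL << REG_GS);

	return all & ~segs;
}

/* Hypothetical helper: equivalent of PERF_XMM_REGS_MASK from this patch. */
static uint64_t xmm_mask(void)
{
	return ~((1ULL << REG_XMM0) - 1);
}

/* Each 128-bit XMM register occupies two u64 slots, so two mask bits. */
static int xmm_reg_count(uint64_t mask)
{
	int bits = 0;

	for (int i = REG_XMM0; i < REG_XMM_MAX; i++)
		bits += (int)((mask >> i) & 1);
	return bits / 2;
}
```

Under these assumptions, gpr_mask() corresponds to the Skylake value and gpr_mask() | xmm_mask() to the Icelake value, with the upper 32 bits covering XMM0 through XMM15.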

Comments

Arnaldo Carvalho de Melo May 15, 2019, 7:28 p.m. UTC | #1
On Tue, May 14, 2019 at 01:19:34PM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> XMM registers can be collected on Icelake and later platforms.
> 
> Add an x86-specific arch__intr_reg_mask(), which creates an event to check
> whether the kernel and hardware can collect XMM registers.
> 
> Tested on Skylake, which doesn't support XMM register collection. Nothing
> changes there.

Thanks a lot for doing this and for testing it on both a machine without
these registers and one with them.

Applied, together with Ravi's tested-by for the first two and the change
in the --user-regs doc,

Regards,

- Arnaldo
 

Patch

diff --git a/tools/perf/arch/x86/include/perf_regs.h b/tools/perf/arch/x86/include/perf_regs.h
index b732133..b7cd91a 100644
--- a/tools/perf/arch/x86/include/perf_regs.h
+++ b/tools/perf/arch/x86/include/perf_regs.h
@@ -9,6 +9,7 @@ 
 void perf_regs_load(u64 *regs);
 
 #define PERF_REGS_MAX PERF_REG_X86_XMM_MAX
+#define PERF_XMM_REGS_MASK	(~((1ULL << PERF_REG_X86_XMM0) - 1))
 #ifndef HAVE_ARCH_X86_64_SUPPORT
 #define PERF_REGS_MASK ((1ULL << PERF_REG_X86_32_MAX) - 1)
 #define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_32
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 71d7604..c3d7479 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -270,3 +270,28 @@  int arch_sdt_arg_parse_op(char *old_op, char **new_op)
 
 	return SDT_ARG_VALID;
 }
+
+uint64_t arch__intr_reg_mask(void)
+{
+	struct perf_event_attr attr = {
+		.type			= PERF_TYPE_HARDWARE,
+		.config			= PERF_COUNT_HW_CPU_CYCLES,
+		.sample_period		= 1,
+		.sample_type		= PERF_SAMPLE_REGS_INTR,
+		.sample_regs_intr	= PERF_XMM_REGS_MASK,
+		.precise_ip		= 1,
+		.disabled 		= 1,
+		.exclude_kernel		= 1,
+	};
+	int fd;
+
+	event_attr_init(&attr);
+
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd != -1) {
+		close(fd);
+		return (PERF_XMM_REGS_MASK | PERF_REGS_MASK);
+	}
+
+	return PERF_REGS_MASK;
+}
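
The probe in arch__intr_reg_mask() can also be tried outside the perf tree. The following is a minimal standalone sketch, not the perf implementation: it calls the perf_event_open syscall directly instead of perf's sys_perf_event_open() wrapper, skips event_attr_init(), and hard-codes the two masks (GPR_MASK and XMM_MASK are names made up here, with values taken from the test output in the commit message). The function name probe_intr_reg_mask() is likewise hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values, taken from the sample_regs_intr output shown earlier. */
#define GPR_MASK 0xff0fffULL
#define XMM_MASK 0xffffffff00000000ULL

/*
 * Ask the kernel for a cycles event that samples only the XMM bits.
 * If the open succeeds, kernel and hardware accept XMM collection;
 * any failure (no support, no permission) falls back to the GPRs.
 */
static uint64_t probe_intr_reg_mask(void)
{
	struct perf_event_attr attr;
	long fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 1;
	attr.sample_type = PERF_SAMPLE_REGS_INTR;
	attr.sample_regs_intr = XMM_MASK;
	attr.precise_ip = 1;
	attr.disabled = 1;	/* never actually counts */
	attr.exclude_kernel = 1;

	/* pid 0, cpu -1: current task on any CPU, no group, no flags. */
	fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd == -1)
		return GPR_MASK;

	close((int)fd);
	return GPR_MASK | XMM_MASK;
}
```

Because the event is opened disabled and closed immediately, the probe has no lasting effect. Note that a failed open does not distinguish "XMM unsupported" from "perf_event_open not permitted", so in a restricted environment this sketch reports the GPR-only mask even on capable hardware.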