linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] csky: perf callchain dwarf support
@ 2019-04-10  8:16 Mao Han
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Mao Han @ 2019-04-10  8:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mao Han, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Guo Ren

This patch set adds perf DWARF unwinding support for C-SKY, including
the user registers/stack dump API and libdw support.

CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Arnaldo Carvalho de Melo <acme@kernel.org>
CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
CC: Jiri Olsa <jolsa@redhat.com>
CC: Namhyung Kim <namhyung@kernel.org>
CC: Guo Ren <guoren@kernel.org>

Changes since v2:
  - use same registers name as struct pt_regs
  - code convention

Changes since v1:
  - separate the callchain support using frame pointer

Mao Han (3):
  perf: use hweight64 instead of hweight_long
  csky: Add support for perf registers sampling
  csky: add support for libdw

 arch/csky/Kconfig                            |   2 +
 arch/csky/include/uapi/asm/perf_regs.h       |  51 ++++++++++++++
 arch/csky/kernel/Makefile                    |   1 +
 arch/csky/kernel/perf_regs.c                 |  40 +++++++++++
 tools/arch/csky/include/uapi/asm/perf_regs.h |  51 ++++++++++++++
 tools/perf/Makefile.config                   |   6 +-
 tools/perf/arch/csky/Build                   |   1 +
 tools/perf/arch/csky/Makefile                |   3 +
 tools/perf/arch/csky/include/perf_regs.h     | 100 +++++++++++++++++++++++++++
 tools/perf/arch/csky/util/Build              |   2 +
 tools/perf/arch/csky/util/dwarf-regs.c       |  49 +++++++++++++
 tools/perf/arch/csky/util/unwind-libdw.c     |  78 +++++++++++++++++++++
 tools/perf/util/evsel.c                      |   2 +-
 13 files changed, 384 insertions(+), 2 deletions(-)
 create mode 100644 arch/csky/include/uapi/asm/perf_regs.h
 create mode 100644 arch/csky/kernel/perf_regs.c
 create mode 100644 tools/arch/csky/include/uapi/asm/perf_regs.h
 create mode 100644 tools/perf/arch/csky/Build
 create mode 100644 tools/perf/arch/csky/Makefile
 create mode 100644 tools/perf/arch/csky/include/perf_regs.h
 create mode 100644 tools/perf/arch/csky/util/Build
 create mode 100644 tools/perf/arch/csky/util/dwarf-regs.c
 create mode 100644 tools/perf/arch/csky/util/unwind-libdw.c

-- 
2.7.4



* [PATCH v3 1/3] perf: use hweight64 instead of hweight_long
  2019-04-10  8:16 [PATCH v3 0/3] csky: perf callchain dwarf support Mao Han
@ 2019-04-10  8:16 ` Mao Han
  2019-04-10 13:08   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  2019-04-10  8:16 ` [PATCH v3 2/3] csky: Add support for perf registers sampling Mao Han
  2019-04-10  8:16 ` [PATCH v3 3/3] csky: add support for libdw Mao Han
  2 siblings, 3 replies; 10+ messages in thread
From: Mao Han @ 2019-04-10  8:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mao Han, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim

On a 32-bit platform with more than 32 registers, the 64-bit mask is
truncated to its lower 32 bits, so the return value of hweight_long() can
never exceed 32. When the kernel outputs more than 32 registers but the
user-space perf program only counts 32, the resulting data mismatch makes
the overflow check fail.
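
The truncation can be reproduced outside of perf with a minimal user-space
sketch. This is not the perf implementation: sketch_hweight_long32() and
sketch_hweight64() are hypothetical stand-ins for what hweight_long()
effectively does when long is 32 bits wide and for hweight64(), and the
38-register mask is an assumed example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for hweight_long() where long is 32 bits wide:
 * the u64 mask is truncated to its low 32 bits before bits are counted.
 */
static unsigned int sketch_hweight_long32(uint64_t mask)
{
	uint32_t w = (uint32_t)mask;	/* the truncation, made explicit */
	unsigned int n = 0;

	for (; w; w &= w - 1)		/* clear the lowest set bit */
		n++;
	return n;
}

/* Hypothetical stand-in for hweight64(): all 64 bits are counted. */
static unsigned int sketch_hweight64(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Assumed example: a mask selecting 38 registers (bits 0..37). */
static void sketch_truncation_demo(void)
{
	uint64_t mask = (1ULL << 38) - 1;

	assert(sketch_hweight_long32(mask) == 32);	/* six registers lost */
	assert(sketch_hweight64(mask) == 38);
}
```

With such a mask the kernel would emit 38 u64 values while a parser using
the truncated count would expect only 32, so the sizes no longer agree.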

CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Arnaldo Carvalho de Melo <acme@kernel.org>
CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
CC: Jiri Olsa <jolsa@redhat.com>
CC: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Mao Han <han_mao@c-sky.com>
---
 tools/perf/util/evsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7835e05..73c78be 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2322,7 +2322,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->user_regs.abi) {
 			u64 mask = evsel->attr.sample_regs_user;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->user_regs.mask = mask;
 			data->user_regs.regs = (u64 *)array;
-- 
2.7.4



* [PATCH v3 2/3] csky: Add support for perf registers sampling
  2019-04-10  8:16 [PATCH v3 0/3] csky: perf callchain dwarf support Mao Han
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
@ 2019-04-10  8:16 ` Mao Han
  2019-04-10  8:16 ` [PATCH v3 3/3] csky: add support for libdw Mao Han
  2 siblings, 0 replies; 10+ messages in thread
From: Mao Han @ 2019-04-10  8:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mao Han, Guo Ren

This patch implements the perf registers sampling and validation API
for the csky arch. The valid registers and their register IDs are defined
in perf_regs.h. With the registers and user stack dump support, the perf
tool can backtrace in userspace using an unwind library.
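
As a rough illustration of the validation step, the sketch below mirrors
the reserved-bits check in user space. It is only a sketch: SKETCH_REG_MAX
and sketch_reg_validate() are hypothetical names, and the value 38 assumes
the ABIv2 layout (20 base entries plus 18 ABIv2-only entries).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed register count for the ABIv2 layout: 20 base + 18 extra. */
#define SKETCH_REG_MAX		38
/* Every bit at or above SKETCH_REG_MAX names no register. */
#define SKETCH_REG_RESERVED	(~((1ULL << SKETCH_REG_MAX) - 1))

/* Reject an empty mask and any mask that selects a reserved bit. */
static int sketch_reg_validate(uint64_t mask)
{
	if (!mask || (mask & SKETCH_REG_RESERVED))
		return -EINVAL;

	return 0;
}

static void sketch_validate_demo(void)
{
	/* Bits 2 and 4 are the PC and SP indexes in the enum order. */
	assert(sketch_reg_validate((1ULL << 2) | (1ULL << 4)) == 0);
	assert(sketch_reg_validate(0) == -EINVAL);
	assert(sketch_reg_validate(1ULL << SKETCH_REG_MAX) == -EINVAL);
}
```

A request for PC and SP only passes, while an empty mask or a mask with a
bit beyond the last register index is rejected with -EINVAL.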

CC: Guo Ren <guoren@kernel.org>

Signed-off-by: Mao Han <han_mao@c-sky.com>
---
 arch/csky/Kconfig                      |  2 ++
 arch/csky/include/uapi/asm/perf_regs.h | 51 ++++++++++++++++++++++++++++++++++
 arch/csky/kernel/Makefile              |  1 +
 arch/csky/kernel/perf_regs.c           | 40 ++++++++++++++++++++++++++
 4 files changed, 94 insertions(+)
 create mode 100644 arch/csky/include/uapi/asm/perf_regs.h
 create mode 100644 arch/csky/kernel/perf_regs.c

diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index c4974cf..8e45c7a 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -38,6 +38,8 @@ config CSKY
 	select HAVE_KERNEL_LZO
 	select HAVE_KERNEL_LZMA
 	select HAVE_PERF_EVENTS
+	select HAVE_PERF_REGS
+	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_DMA_API_DEBUG
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_SYSCALL_TRACEPOINTS
diff --git a/arch/csky/include/uapi/asm/perf_regs.h b/arch/csky/include/uapi/asm/perf_regs.h
new file mode 100644
index 0000000..ee323d8
--- /dev/null
+++ b/arch/csky/include/uapi/asm/perf_regs.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+
+#ifndef _ASM_CSKY_PERF_REGS_H
+#define _ASM_CSKY_PERF_REGS_H
+
+/* Index of struct pt_regs */
+enum perf_event_csky_regs {
+	PERF_REG_CSKY_TLS,
+	PERF_REG_CSKY_LR,
+	PERF_REG_CSKY_PC,
+	PERF_REG_CSKY_SR,
+	PERF_REG_CSKY_SP,
+	PERF_REG_CSKY_ORIG_A0,
+	PERF_REG_CSKY_A0,
+	PERF_REG_CSKY_A1,
+	PERF_REG_CSKY_A2,
+	PERF_REG_CSKY_A3,
+	PERF_REG_CSKY_REGS0,
+	PERF_REG_CSKY_REGS1,
+	PERF_REG_CSKY_REGS2,
+	PERF_REG_CSKY_REGS3,
+	PERF_REG_CSKY_REGS4,
+	PERF_REG_CSKY_REGS5,
+	PERF_REG_CSKY_REGS6,
+	PERF_REG_CSKY_REGS7,
+	PERF_REG_CSKY_REGS8,
+	PERF_REG_CSKY_REGS9,
+#if defined(__CSKYABIV2__)
+	PERF_REG_CSKY_EXREGS0,
+	PERF_REG_CSKY_EXREGS1,
+	PERF_REG_CSKY_EXREGS2,
+	PERF_REG_CSKY_EXREGS3,
+	PERF_REG_CSKY_EXREGS4,
+	PERF_REG_CSKY_EXREGS5,
+	PERF_REG_CSKY_EXREGS6,
+	PERF_REG_CSKY_EXREGS7,
+	PERF_REG_CSKY_EXREGS8,
+	PERF_REG_CSKY_EXREGS9,
+	PERF_REG_CSKY_EXREGS10,
+	PERF_REG_CSKY_EXREGS11,
+	PERF_REG_CSKY_EXREGS12,
+	PERF_REG_CSKY_EXREGS13,
+	PERF_REG_CSKY_EXREGS14,
+	PERF_REG_CSKY_HI,
+	PERF_REG_CSKY_LO,
+	PERF_REG_CSKY_DCSR,
+#endif
+	PERF_REG_CSKY_MAX,
+};
+#endif /* _ASM_CSKY_PERF_REGS_H */
diff --git a/arch/csky/kernel/Makefile b/arch/csky/kernel/Makefile
index 4c462f5..1624b04 100644
--- a/arch/csky/kernel/Makefile
+++ b/arch/csky/kernel/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_FUNCTION_TRACER)		+= ftrace.o
 obj-$(CONFIG_STACKTRACE)		+= stacktrace.o
 obj-$(CONFIG_CSKY_PMU_V1)		+= perf_event.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_callchain.o
+obj-$(CONFIG_HAVE_PERF_REGS)            += perf_regs.o
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/csky/kernel/perf_regs.c b/arch/csky/kernel/perf_regs.c
new file mode 100644
index 0000000..88f1875
--- /dev/null
+++ b/arch/csky/kernel/perf_regs.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/perf_event.h>
+#include <linux/bug.h>
+#include <asm/perf_regs.h>
+#include <asm/ptrace.h>
+
+u64 perf_reg_value(struct pt_regs *regs, int idx)
+{
+	if (WARN_ON_ONCE((u32)idx >= PERF_REG_CSKY_MAX))
+		return 0;
+
+	return ((unsigned long *)regs)[idx];
+}
+
+#define REG_RESERVED (~((1ULL << PERF_REG_CSKY_MAX) - 1))
+
+int perf_reg_validate(u64 mask)
+{
+	if (!mask || mask & REG_RESERVED)
+		return -EINVAL;
+
+	return 0;
+}
+
+u64 perf_reg_abi(struct task_struct *task)
+{
+	return PERF_SAMPLE_REGS_ABI_32;
+}
+
+void perf_get_regs_user(struct perf_regs *regs_user,
+			struct pt_regs *regs,
+			struct pt_regs *regs_user_copy)
+{
+	regs_user->regs = task_pt_regs(current);
+	regs_user->abi = perf_reg_abi(current);
+}
-- 
2.7.4



* [PATCH v3 3/3] csky: add support for libdw
  2019-04-10  8:16 [PATCH v3 0/3] csky: perf callchain dwarf support Mao Han
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
  2019-04-10  8:16 ` [PATCH v3 2/3] csky: Add support for perf registers sampling Mao Han
@ 2019-04-10  8:16 ` Mao Han
  2 siblings, 0 replies; 10+ messages in thread
From: Mao Han @ 2019-04-10  8:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mao Han, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim

This patch adds support for DWARF register mappings and libdw register
initialization, which are used by perf callchain analysis when
--call-graph=dwarf is given.
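
The ABIv2 mapping from DWARF register numbers to perf register indexes can
be summarized with the sketch below. It is an illustration only: the SK_*
constants and sketch_dwarf_to_perf_abiv2() are hypothetical names that
assume the enum order of perf_regs.h (TLS, LR, PC, SR, SP, ORIG_A0, A0..A3,
REGS0..REGS9, EXREGS0..EXREGS14, ...).

```c
#include <assert.h>

/* Assumed perf register indexes, following the enum order in perf_regs.h. */
enum {
	SK_TLS = 0, SK_LR = 1, SK_PC = 2, SK_SR = 3, SK_SP = 4,
	SK_ORIG_A0 = 5, SK_A0 = 6, SK_REGS0 = 10, SK_EXREGS0 = 20,
};

/* Map an ABIv2 DWARF register number to a perf index, or -1 if unmapped. */
static int sketch_dwarf_to_perf_abiv2(unsigned int dwarf)
{
	if (dwarf <= 13)	/* a0..a3 and regs0..regs9 are contiguous */
		return SK_A0 + (int)dwarf;
	if (dwarf == 14)
		return SK_SP;
	if (dwarf == 15)
		return SK_LR;
	if (dwarf <= 30)	/* exregs0..exregs14 */
		return SK_EXREGS0 + (int)(dwarf - 16);
	if (dwarf == 31)
		return SK_TLS;
	if (dwarf == 32)
		return SK_PC;

	return -1;		/* cc, hi, lo, vr* and others: not handled here */
}

static void sketch_mapping_demo(void)
{
	assert(sketch_dwarf_to_perf_abiv2(4) == SK_REGS0);
	assert(sketch_dwarf_to_perf_abiv2(14) == SK_SP);
	assert(sketch_dwarf_to_perf_abiv2(32) == SK_PC);
}
```

The unwinder only needs the general-purpose registers plus PC, so every
DWARF number above 32 is reported as unmapped in this sketch.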

CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: Arnaldo Carvalho de Melo <acme@kernel.org>
CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
CC: Jiri Olsa <jolsa@redhat.com>
CC: Namhyung Kim <namhyung@kernel.org>

Signed-off-by: Mao Han <han_mao@c-sky.com>
---
 tools/arch/csky/include/uapi/asm/perf_regs.h |  51 ++++++++++++++
 tools/perf/Makefile.config                   |   6 +-
 tools/perf/arch/csky/Build                   |   1 +
 tools/perf/arch/csky/Makefile                |   3 +
 tools/perf/arch/csky/include/perf_regs.h     | 100 +++++++++++++++++++++++++++
 tools/perf/arch/csky/util/Build              |   2 +
 tools/perf/arch/csky/util/dwarf-regs.c       |  49 +++++++++++++
 tools/perf/arch/csky/util/unwind-libdw.c     |  78 +++++++++++++++++++++
 8 files changed, 289 insertions(+), 1 deletion(-)
 create mode 100644 tools/arch/csky/include/uapi/asm/perf_regs.h
 create mode 100644 tools/perf/arch/csky/Build
 create mode 100644 tools/perf/arch/csky/Makefile
 create mode 100644 tools/perf/arch/csky/include/perf_regs.h
 create mode 100644 tools/perf/arch/csky/util/Build
 create mode 100644 tools/perf/arch/csky/util/dwarf-regs.c
 create mode 100644 tools/perf/arch/csky/util/unwind-libdw.c

diff --git a/tools/arch/csky/include/uapi/asm/perf_regs.h b/tools/arch/csky/include/uapi/asm/perf_regs.h
new file mode 100644
index 0000000..ee323d8
--- /dev/null
+++ b/tools/arch/csky/include/uapi/asm/perf_regs.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+
+#ifndef _ASM_CSKY_PERF_REGS_H
+#define _ASM_CSKY_PERF_REGS_H
+
+/* Index of struct pt_regs */
+enum perf_event_csky_regs {
+	PERF_REG_CSKY_TLS,
+	PERF_REG_CSKY_LR,
+	PERF_REG_CSKY_PC,
+	PERF_REG_CSKY_SR,
+	PERF_REG_CSKY_SP,
+	PERF_REG_CSKY_ORIG_A0,
+	PERF_REG_CSKY_A0,
+	PERF_REG_CSKY_A1,
+	PERF_REG_CSKY_A2,
+	PERF_REG_CSKY_A3,
+	PERF_REG_CSKY_REGS0,
+	PERF_REG_CSKY_REGS1,
+	PERF_REG_CSKY_REGS2,
+	PERF_REG_CSKY_REGS3,
+	PERF_REG_CSKY_REGS4,
+	PERF_REG_CSKY_REGS5,
+	PERF_REG_CSKY_REGS6,
+	PERF_REG_CSKY_REGS7,
+	PERF_REG_CSKY_REGS8,
+	PERF_REG_CSKY_REGS9,
+#if defined(__CSKYABIV2__)
+	PERF_REG_CSKY_EXREGS0,
+	PERF_REG_CSKY_EXREGS1,
+	PERF_REG_CSKY_EXREGS2,
+	PERF_REG_CSKY_EXREGS3,
+	PERF_REG_CSKY_EXREGS4,
+	PERF_REG_CSKY_EXREGS5,
+	PERF_REG_CSKY_EXREGS6,
+	PERF_REG_CSKY_EXREGS7,
+	PERF_REG_CSKY_EXREGS8,
+	PERF_REG_CSKY_EXREGS9,
+	PERF_REG_CSKY_EXREGS10,
+	PERF_REG_CSKY_EXREGS11,
+	PERF_REG_CSKY_EXREGS12,
+	PERF_REG_CSKY_EXREGS13,
+	PERF_REG_CSKY_EXREGS14,
+	PERF_REG_CSKY_HI,
+	PERF_REG_CSKY_LO,
+	PERF_REG_CSKY_DCSR,
+#endif
+	PERF_REG_CSKY_MAX,
+};
+#endif /* _ASM_CSKY_PERF_REGS_H */
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index fe3f97e..42985ae 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -59,6 +59,10 @@ ifeq ($(SRCARCH),arm64)
   LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
 endif
 
+ifeq ($(SRCARCH),csky)
+  NO_PERF_REGS := 0
+endif
+
 ifeq ($(ARCH),s390)
   NO_PERF_REGS := 0
   NO_SYSCALL_TABLE := 0
@@ -77,7 +81,7 @@ endif
 # Disable it on all other architectures in case libdw unwind
 # support is detected in system. Add supported architectures
 # to the check.
-ifneq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc s390))
+ifneq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc s390 csky))
   NO_LIBDW_DWARF_UNWIND := 1
 endif
 
diff --git a/tools/perf/arch/csky/Build b/tools/perf/arch/csky/Build
new file mode 100644
index 0000000..e4e5f33
--- /dev/null
+++ b/tools/perf/arch/csky/Build
@@ -0,0 +1 @@
+perf-y += util/
diff --git a/tools/perf/arch/csky/Makefile b/tools/perf/arch/csky/Makefile
new file mode 100644
index 0000000..7fbca17
--- /dev/null
+++ b/tools/perf/arch/csky/Makefile
@@ -0,0 +1,3 @@
+ifndef NO_DWARF
+PERF_HAVE_DWARF_REGS := 1
+endif
diff --git a/tools/perf/arch/csky/include/perf_regs.h b/tools/perf/arch/csky/include/perf_regs.h
new file mode 100644
index 0000000..8f336ea
--- /dev/null
+++ b/tools/perf/arch/csky/include/perf_regs.h
@@ -0,0 +1,100 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+
+#ifndef ARCH_PERF_REGS_H
+#define ARCH_PERF_REGS_H
+
+#include <stdlib.h>
+#include <linux/types.h>
+#include <asm/perf_regs.h>
+
+#define PERF_REGS_MASK	((1ULL << PERF_REG_CSKY_MAX) - 1)
+#define PERF_REGS_MAX	PERF_REG_CSKY_MAX
+#define PERF_SAMPLE_REGS_ABI	PERF_SAMPLE_REGS_ABI_32
+
+#define PERF_REG_IP	PERF_REG_CSKY_PC
+#define PERF_REG_SP	PERF_REG_CSKY_SP
+
+static inline const char *perf_reg_name(int id)
+{
+	switch (id) {
+	case PERF_REG_CSKY_A0:
+		return "a0";
+	case PERF_REG_CSKY_A1:
+		return "a1";
+	case PERF_REG_CSKY_A2:
+		return "a2";
+	case PERF_REG_CSKY_A3:
+		return "a3";
+	case PERF_REG_CSKY_REGS0:
+		return "regs0";
+	case PERF_REG_CSKY_REGS1:
+		return "regs1";
+	case PERF_REG_CSKY_REGS2:
+		return "regs2";
+	case PERF_REG_CSKY_REGS3:
+		return "regs3";
+	case PERF_REG_CSKY_REGS4:
+		return "regs4";
+	case PERF_REG_CSKY_REGS5:
+		return "regs5";
+	case PERF_REG_CSKY_REGS6:
+		return "regs6";
+	case PERF_REG_CSKY_REGS7:
+		return "regs7";
+	case PERF_REG_CSKY_REGS8:
+		return "regs8";
+	case PERF_REG_CSKY_REGS9:
+		return "regs9";
+	case PERF_REG_CSKY_SP:
+		return "sp";
+	case PERF_REG_CSKY_LR:
+		return "lr";
+	case PERF_REG_CSKY_PC:
+		return "pc";
+#if defined(__CSKYABIV2__)
+	case PERF_REG_CSKY_EXREGS0:
+		return "exregs0";
+	case PERF_REG_CSKY_EXREGS1:
+		return "exregs1";
+	case PERF_REG_CSKY_EXREGS2:
+		return "exregs2";
+	case PERF_REG_CSKY_EXREGS3:
+		return "exregs3";
+	case PERF_REG_CSKY_EXREGS4:
+		return "exregs4";
+	case PERF_REG_CSKY_EXREGS5:
+		return "exregs5";
+	case PERF_REG_CSKY_EXREGS6:
+		return "exregs6";
+	case PERF_REG_CSKY_EXREGS7:
+		return "exregs7";
+	case PERF_REG_CSKY_EXREGS8:
+		return "exregs8";
+	case PERF_REG_CSKY_EXREGS9:
+		return "exregs9";
+	case PERF_REG_CSKY_EXREGS10:
+		return "exregs10";
+	case PERF_REG_CSKY_EXREGS11:
+		return "exregs11";
+	case PERF_REG_CSKY_EXREGS12:
+		return "exregs12";
+	case PERF_REG_CSKY_EXREGS13:
+		return "exregs13";
+	case PERF_REG_CSKY_EXREGS14:
+		return "exregs14";
+	case PERF_REG_CSKY_TLS:
+		return "tls";
+	case PERF_REG_CSKY_HI:
+		return "hi";
+	case PERF_REG_CSKY_LO:
+		return "lo";
+#endif
+	default:
+		return NULL;
+	}
+
+	return NULL;
+}
+
+#endif /* ARCH_PERF_REGS_H */
diff --git a/tools/perf/arch/csky/util/Build b/tools/perf/arch/csky/util/Build
new file mode 100644
index 0000000..1160bb2
--- /dev/null
+++ b/tools/perf/arch/csky/util/Build
@@ -0,0 +1,2 @@
+perf-$(CONFIG_DWARF) += dwarf-regs.o
+perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/csky/util/dwarf-regs.c b/tools/perf/arch/csky/util/dwarf-regs.c
new file mode 100644
index 0000000..3591ca1
--- /dev/null
+++ b/tools/perf/arch/csky/util/dwarf-regs.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+// Mapping of DWARF debug register numbers into register names.
+
+#include <stddef.h>
+#include <dwarf-regs.h>
+
+#if defined(__CSKYABIV2__)
+#define CSKY_MAX_REGS 71
+const char *csky_dwarf_regs_table[CSKY_MAX_REGS] = {
+	/* r0 ~ r7 */
+	"%a0", "%a1", "%a2", "%a3", "%regs0", "%regs1", "%regs2", "%regs3",
+	/* r8 ~ r15 */
+	"%regs4", "%regs5", "%regs6", "%regs7", "%regs8", "%regs9", "%sp",
+	"%lr",
+	/* r16 ~ r23 */
+	"%exregs0", "%exregs1", "%exregs2", "%exregs3", "%exregs4",
+	"%exregs5", "%exregs6", "%exregs7",
+	/* r24 ~ r31 */
+	"%exregs8", "%exregs9", "%exregs10", "%exregs11", "%exregs12",
+	"%exregs13", "%exregs14", "%tls",
+	"%pc", "%cc", "%hi", "%lo", NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL, "%vr0", "%vr1", "%vr2", "%vr3",
+	"%vr4", "%vr5", "%vr6", "%vr7", "%vr8", "%vr9", "%vr10", "%vr11",
+	"%vr12", "%vr13", "%vr14", "%vr15", NULL, NULL, "%epc",
+};
+#else
+#define CSKY_MAX_REGS 70
+const char *csky_dwarf_regs_table[CSKY_MAX_REGS] = {
+	/* r0 ~ r7 */
+	"%sp", "%regs9", "%a0", "%a1", "%a2", "%a3", "%regs0", "%regs1",
+	/* r8 ~ r15 */
+	"%regs2", "%regs3", "%regs4", "%regs5", "%regs6", "%regs7", "%regs8",
+	"%lr",
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
+	"%vap", "%cc", "%vfp", "%epc", "%hi", "%lo", "%fr0", "%fr1",
+	"%fr2", "%fr3", "%fr4", "%fr5", "%fr6", "%fr7", "%fr8", "%fr9",
+	"%fr10", "%fr11", "%fr12", "%fr13", "%fr14", "%fr15", "%fr16", "%fr17",
+	"%fr18", "%fr19", "%fr20", "%fr21", "%fr22", "%fr23", "%fr24", "%fr25",
+	"%fr26", "%fr27", "%fr28", "%fr29", "%fr30", "%fr31"
+};
+#endif
+
+const char *get_arch_regstr(unsigned int n)
+{
+	return (n < CSKY_MAX_REGS) ? csky_dwarf_regs_table[n] : NULL;
+}
diff --git a/tools/perf/arch/csky/util/unwind-libdw.c b/tools/perf/arch/csky/util/unwind-libdw.c
new file mode 100644
index 0000000..078951b
--- /dev/null
+++ b/tools/perf/arch/csky/util/unwind-libdw.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2019 Hangzhou C-SKY Microsystems co.,ltd.
+
+#include <elfutils/libdwfl.h>
+#include "../../util/unwind-libdw.h"
+#include "../../util/perf_regs.h"
+#include "../../util/event.h"
+
+bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
+{
+	struct unwind_info *ui = arg;
+	struct regs_dump *user_regs = &ui->sample->user_regs;
+	Dwarf_Word dwarf_regs[PERF_REG_CSKY_MAX];
+
+#define REG(r) ({						\
+	Dwarf_Word val = 0;					\
+	perf_reg_value(&val, user_regs, PERF_REG_CSKY_##r);	\
+	val;							\
+})
+
+#if defined(__CSKYABIV2__)
+	dwarf_regs[0]  = REG(A0);
+	dwarf_regs[1]  = REG(A1);
+	dwarf_regs[2]  = REG(A2);
+	dwarf_regs[3]  = REG(A3);
+	dwarf_regs[4]  = REG(REGS0);
+	dwarf_regs[5]  = REG(REGS1);
+	dwarf_regs[6]  = REG(REGS2);
+	dwarf_regs[7]  = REG(REGS3);
+	dwarf_regs[8]  = REG(REGS4);
+	dwarf_regs[9]  = REG(REGS5);
+	dwarf_regs[10] = REG(REGS6);
+	dwarf_regs[11] = REG(REGS7);
+	dwarf_regs[12] = REG(REGS8);
+	dwarf_regs[13] = REG(REGS9);
+	dwarf_regs[14] = REG(SP);
+	dwarf_regs[15] = REG(LR);
+	dwarf_regs[16] = REG(EXREGS0);
+	dwarf_regs[17] = REG(EXREGS1);
+	dwarf_regs[18] = REG(EXREGS2);
+	dwarf_regs[19] = REG(EXREGS3);
+	dwarf_regs[20] = REG(EXREGS4);
+	dwarf_regs[21] = REG(EXREGS5);
+	dwarf_regs[22] = REG(EXREGS6);
+	dwarf_regs[23] = REG(EXREGS7);
+	dwarf_regs[24] = REG(EXREGS8);
+	dwarf_regs[25] = REG(EXREGS9);
+	dwarf_regs[26] = REG(EXREGS10);
+	dwarf_regs[27] = REG(EXREGS11);
+	dwarf_regs[28] = REG(EXREGS12);
+	dwarf_regs[29] = REG(EXREGS13);
+	dwarf_regs[30] = REG(EXREGS14);
+	dwarf_regs[31] = REG(TLS);
+	dwarf_regs[32] = REG(PC);
+#else
+	dwarf_regs[0]  = REG(SP);
+	dwarf_regs[1]  = REG(REGS9);
+	dwarf_regs[2]  = REG(A0);
+	dwarf_regs[3]  = REG(A1);
+	dwarf_regs[4]  = REG(A2);
+	dwarf_regs[5]  = REG(A3);
+	dwarf_regs[6]  = REG(REGS0);
+	dwarf_regs[7]  = REG(REGS1);
+	dwarf_regs[8]  = REG(REGS2);
+	dwarf_regs[9]  = REG(REGS3);
+	dwarf_regs[10] = REG(REGS4);
+	dwarf_regs[11] = REG(REGS5);
+	dwarf_regs[12] = REG(REGS6);
+	dwarf_regs[13] = REG(REGS7);
+	dwarf_regs[14] = REG(REGS8);
+	dwarf_regs[15] = REG(LR);
+	/* PC is registered below; index 32 would overflow dwarf_regs[] here */
+#endif
+	dwfl_thread_state_register_pc(thread, REG(PC));
+
+	return dwfl_thread_state_registers(thread, 0, PERF_REG_CSKY_MAX,
+					   dwarf_regs);
+}
-- 
2.7.4



* Re: [PATCH v3 1/3] perf: use hweight64 instead of hweight_long
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
@ 2019-04-10 13:08   ` Arnaldo Carvalho de Melo
  2019-04-10 13:10     ` Arnaldo Carvalho de Melo
  2019-04-12 16:40   ` [tip:perf/urgent] perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user) tip-bot for Mao Han
  2019-04-16 15:30   ` tip-bot for Mao Han
  2 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-04-10 13:08 UTC (permalink / raw)
  To: Mao Han
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim

Em Wed, Apr 10, 2019 at 04:16:43PM +0800, Mao Han escreveu:
> On a 32-bit platform with more than 32 registers, the 64-bit mask is
> truncated to its lower 32 bits, so the return value of hweight_long() can
> never exceed 32. When the kernel outputs more than 32 registers but the
> user-space perf program only counts 32, the resulting data mismatch makes
> the overflow check fail.
> 
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Arnaldo Carvalho de Melo <acme@kernel.org>
> CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> CC: Jiri Olsa <jolsa@redhat.com>
> CC: Namhyung Kim <namhyung@kernel.org>
> 
> Signed-off-by: Mao Han <han_mao@c-sky.com>
> ---
>  tools/perf/util/evsel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 7835e05..73c78be 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2322,7 +2322,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
>  		if (data->user_regs.abi) {
>  			u64 mask = evsel->attr.sample_regs_user;
>  
> -			sz = hweight_long(mask) * sizeof(u64);
> +			sz = hweight64(mask) * sizeof(u64);
>  			OVERFLOW_CHECK(array, sz, max_size);
>  			data->user_regs.mask = mask;
>  			data->user_regs.regs = (u64 *)array;

Later on, in the same function, perf_evsel__parse_sample() we have:

        data->intr_regs.abi = PERF_SAMPLE_REGS_ABI_NONE;
        if (type & PERF_SAMPLE_REGS_INTR) {
                OVERFLOW_CHECK_u64(array);
                data->intr_regs.abi = *array;
                array++;

                if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
                        u64 mask = evsel->attr.sample_regs_intr;

                        sz = hweight_long(mask) * sizeof(u64);
                        OVERFLOW_CHECK(array, sz, max_size);
                        data->intr_regs.mask = mask;
                        data->intr_regs.regs = (u64 *)array;
                        array = (void *)array + sz;
                }
        }

You forgot to convert that one, doing it for you,

Thanks,

- Arnaldo


* Re: [PATCH v3 1/3] perf: use hweight64 instead of hweight_long
  2019-04-10 13:08   ` Arnaldo Carvalho de Melo
@ 2019-04-10 13:10     ` Arnaldo Carvalho de Melo
  2019-04-10 13:28       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-04-10 13:10 UTC (permalink / raw)
  To: Mao Han
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim

Em Wed, Apr 10, 2019 at 10:08:41AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Apr 10, 2019 at 04:16:43PM +0800, Mao Han escreveu:
> > On a 32-bit platform with more than 32 registers, the 64-bit mask is
> > truncated to its lower 32 bits, so the return value of hweight_long() can
> > never exceed 32. When the kernel outputs more than 32 registers but the
> > user-space perf program only counts 32, the resulting data mismatch makes
> > the overflow check fail.
> > 
> > CC: Peter Zijlstra <peterz@infradead.org>
> > CC: Ingo Molnar <mingo@redhat.com>
> > CC: Arnaldo Carvalho de Melo <acme@kernel.org>
> > CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > CC: Jiri Olsa <jolsa@redhat.com>
> > CC: Namhyung Kim <namhyung@kernel.org>
> > 
> > Signed-off-by: Mao Han <han_mao@c-sky.com>
> > ---
> >  tools/perf/util/evsel.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 7835e05..73c78be 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -2322,7 +2322,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> >  		if (data->user_regs.abi) {
> >  			u64 mask = evsel->attr.sample_regs_user;
> >  
> > -			sz = hweight_long(mask) * sizeof(u64);
> > +			sz = hweight64(mask) * sizeof(u64);
> >  			OVERFLOW_CHECK(array, sz, max_size);
> >  			data->user_regs.mask = mask;
> >  			data->user_regs.regs = (u64 *)array;
> 
> Later on, in the same function, perf_evsel__parse_sample() we have:
> 
>         data->intr_regs.abi = PERF_SAMPLE_REGS_ABI_NONE;
>         if (type & PERF_SAMPLE_REGS_INTR) {
>                 OVERFLOW_CHECK_u64(array);
>                 data->intr_regs.abi = *array;
>                 array++;
> 
>                 if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
>                         u64 mask = evsel->attr.sample_regs_intr;
> 
>                         sz = hweight_long(mask) * sizeof(u64);
>                         OVERFLOW_CHECK(array, sz, max_size);
>                         data->intr_regs.mask = mask;
>                         data->intr_regs.regs = (u64 *)array;
>                         array = (void *)array + sz;
>                 }
>         }
> 
> You forgot to convert that one, doing it for you,

Also in perf_event__sample_event_size() we need to do the same thing,
right?

- Arnaldo


* Re: [PATCH v3 1/3] perf: use hweight64 instead of hweight_long
  2019-04-10 13:10     ` Arnaldo Carvalho de Melo
@ 2019-04-10 13:28       ` Arnaldo Carvalho de Melo
  2019-04-11  7:40         ` Mao Han
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-04-10 13:28 UTC (permalink / raw)
  To: Mao Han
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Adrian Hunter, Stephane Eranian

Em Wed, Apr 10, 2019 at 10:10:42AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Apr 10, 2019 at 10:08:41AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Apr 10, 2019 at 04:16:43PM +0800, Mao Han escreveu:
> > > On a 32-bit platform with more than 32 registers, the 64-bit mask is
> > > truncated to its lower 32 bits, so the return value of hweight_long() can
> > > never exceed 32. When the kernel outputs more than 32 registers but the
> > > user-space perf program only counts 32, the resulting data mismatch makes
> > > the overflow check fail.
> > > 
> > > CC: Peter Zijlstra <peterz@infradead.org>
> > > CC: Ingo Molnar <mingo@redhat.com>
> > > CC: Arnaldo Carvalho de Melo <acme@kernel.org>
> > > CC: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > > CC: Jiri Olsa <jolsa@redhat.com>
> > > CC: Namhyung Kim <namhyung@kernel.org>
> > > 
> > > Signed-off-by: Mao Han <han_mao@c-sky.com>
> > > ---
> > >  tools/perf/util/evsel.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > > index 7835e05..73c78be 100644
> > > --- a/tools/perf/util/evsel.c
> > > +++ b/tools/perf/util/evsel.c
> > > @@ -2322,7 +2322,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> > >  		if (data->user_regs.abi) {
> > >  			u64 mask = evsel->attr.sample_regs_user;
> > >  
> > > -			sz = hweight_long(mask) * sizeof(u64);
> > > +			sz = hweight64(mask) * sizeof(u64);
> > >  			OVERFLOW_CHECK(array, sz, max_size);
> > >  			data->user_regs.mask = mask;
> > >  			data->user_regs.regs = (u64 *)array;
> > 
> > Later on, in the same function, perf_evsel__parse_sample() we have:
> > 
> >         data->intr_regs.abi = PERF_SAMPLE_REGS_ABI_NONE;
> >         if (type & PERF_SAMPLE_REGS_INTR) {
> >                 OVERFLOW_CHECK_u64(array);
> >                 data->intr_regs.abi = *array;
> >                 array++;
> > 
> >                 if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
> >                         u64 mask = evsel->attr.sample_regs_intr;
> > 
> >                         sz = hweight_long(mask) * sizeof(u64);
> >                         OVERFLOW_CHECK(array, sz, max_size);
> >                         data->intr_regs.mask = mask;
> >                         data->intr_regs.regs = (u64 *)array;
> >                         array = (void *)array + sz;
> >                 }
> >         }
> > 
> > You forgot to convert that one, doing it for you,
> 
> Also in perf_event__sample_event_size() we need to do the same thing,
> right?

and perf_event__synthesize_sample()

Done, the resulting patch is at the end of this message, and it matches
the kernel, which uses only hweight64().
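
The parsing, sizing and synthesizing paths all have to agree on how many
bytes a register dump occupies. A minimal sketch of that accounting,
assuming the layout visible in the hunks (one u64 ABI word, followed by one
u64 per set mask bit only when the ABI is not "none"); sketch_popcount64()
and sketch_regs_dump_size() are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hweight64(): count all set bits of a u64. */
static unsigned int sketch_popcount64(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit */
		n++;
	return n;
}

/* Bytes taken by one register dump inside a sample record. */
static size_t sketch_regs_dump_size(uint64_t abi, uint64_t mask)
{
	size_t result = sizeof(uint64_t);	/* ABI word, always present */

	if (abi)	/* registers follow only when an ABI is reported */
		result += sketch_popcount64(mask) * sizeof(uint64_t);

	return result;
}

static void sketch_size_demo(void)
{
	uint64_t mask = (1ULL << 38) - 1;	/* assumed: 38 registers */

	assert(sketch_regs_dump_size(0, mask) == 8);
	assert(sketch_regs_dump_size(1, mask) == 8 + 38 * 8);
}
```

If any one of the three paths counted only the low 32 bits, a 38-register
dump would be sized as 264 bytes in one place and 312 bytes in another.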

I've also added Fixes tags to the patches that used hweight_long() in
various places, to help with the stable trees backporting process,
please consider doing it next time.

- Arnaldo

commit 21e6dfe04861c2c1b529f2759850bc62a80ca050
Author: Mao Han <han_mao@c-sky.com>
Date:   Wed Apr 10 16:16:43 2019 +0800

    perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)
    
    On a 32-bit platform with more than 32 registers, the 64-bit mask is
    truncated to its lower 32 bits, so the return value of hweight_long() can
    never exceed 32. When the kernel outputs more than 32 registers but the
    user-space perf program only counts 32, the resulting data mismatch makes
    the overflow check fail.
    
    Signed-off-by: Mao Han <han_mao@c-sky.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Fixes: 6a21c0b5c2ab ("perf tools: Add core support for sampling intr machine state regs")
    Fixes: d03f2170546d ("perf tools: Expand perf_event__synthesize_sample()")
    Fixes: 0f6a30150ca2 ("perf tools: Support user regs and stack in sample parsing")
    Link: http://lkml.kernel.org/r/29ad7947dc8fd1ff0abd2093a72cc27a2446be9f.1554883878.git.han_mao@c-sky.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 66d066f18b5b..966360844fff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2368,7 +2368,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->user_regs.abi) {
 			u64 mask = evsel->attr.sample_regs_user;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->user_regs.mask = mask;
 			data->user_regs.regs = (u64 *)array;
@@ -2424,7 +2424,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
 			u64 mask = evsel->attr.sample_regs_intr;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->intr_regs.mask = mask;
 			data->intr_regs.regs = (u64 *)array;
@@ -2552,7 +2552,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2580,7 +2580,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2710,7 +2710,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			*array++ = sample->user_regs.abi;
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			memcpy(array, sample->user_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {
@@ -2746,7 +2746,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			*array++ = sample->intr_regs.abi;
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			memcpy(array, sample->intr_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {

* Re: [PATCH v3 1/3] perf: use hweight64 instead of hweight_long
  2019-04-10 13:28       ` Arnaldo Carvalho de Melo
@ 2019-04-11  7:40         ` Mao Han
  0 siblings, 0 replies; 10+ messages in thread
From: Mao Han @ 2019-04-11  7:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-kernel

On Wed, Apr 10, 2019 at 10:28:37AM -0300, Arnaldo Carvalho de Melo wrote:
> > > You forgot to convert that one, doing it for you,
> > 
> > Also in perf_event__sample_event_size() we need to do the same thing,
> > right?
> 
> and perf_event__synthesize_sample()
> 
> Done, the resulting patch is at the end of this message, and matches the
> kernel, which uses only hweight64().
> 
> I've also added Fixes tags to the patches that used hweight_long() in
> various places, to help with the stable trees backporting process,
> please consider doing it next time.
> 
> - Arnaldo
>

Thanks for helping to improve the patch, and for the suggestion.
I tested the new patch on C-SKY and it seems to work fine.
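
For reference, the failure mode described in the commit message can be
modelled roughly as below. This is a simplified sketch under stated
assumptions, not the code in tools/perf/util/evsel.c: parse_user_regs() and
the payload layout (one abi word, one u64 per sampled register, then a stack
size followed by the stack bytes) are hypothetical stand-ins for the real
sample format.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified sample payload, in u64 words:
 *   [abi][nregs register values][stack_size][stack bytes ...]
 *
 * nregs is the register count the parser derived from the mask.
 * Returns 0 on success, or -1 if a field would run past the buffer,
 * which stands in for a failing overflow check.
 */
static int parse_user_regs(const uint64_t *buf, size_t nwords,
			   unsigned int nregs, uint64_t *stack_size)
{
	size_t pos = 0;

	if (nwords < 1)
		return -1;
	pos++;				/* skip the abi word */

	if (nregs > nwords - pos)
		return -1;
	pos += nregs;			/* skip the register values */

	if (pos >= nwords)
		return -1;
	*stack_size = buf[pos++];	/* next field: stack dump size in bytes */

	/* The stack dump must fit in what is left of the buffer. */
	if (*stack_size > (nwords - pos) * sizeof(uint64_t))
		return -1;
	return 0;
}
```

If the writer emitted 34 registers but the parser is told 32, the word read
as stack_size is really a register value. Unless that value happens to be
small, the final bounds check fails, which matches the symptom reported
above.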

Mao Han

* [tip:perf/urgent] perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
  2019-04-10 13:08   ` Arnaldo Carvalho de Melo
@ 2019-04-12 16:40   ` tip-bot for Mao Han
  2019-04-16 15:30   ` tip-bot for Mao Han
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Mao Han @ 2019-04-12 16:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, adrian.hunter, peterz, namhyung, tglx, han_mao,
	alexander.shishkin, hpa, eranian, mingo, jolsa, linux-kernel

Commit-ID:  21e6dfe04861c2c1b529f2759850bc62a80ca050
Gitweb:     https://git.kernel.org/tip/21e6dfe04861c2c1b529f2759850bc62a80ca050
Author:     Mao Han <han_mao@c-sky.com>
AuthorDate: Wed, 10 Apr 2019 16:16:43 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 10 Apr 2019 10:25:28 -0300

perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)

On 32-bit platforms with more than 32 registers, the 64-bit mask is
truncated to the lower 32 bits, so the return value of hweight_long()
can never exceed 32. When the kernel outputs more than 32 registers but
the user-space perf program counts at most 32, the resulting data
mismatch makes the overflow check fail.

Signed-off-by: Mao Han <han_mao@c-sky.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 6a21c0b5c2ab ("perf tools: Add core support for sampling intr machine state regs")
Fixes: d03f2170546d ("perf tools: Expand perf_event__synthesize_sample()")
Fixes: 0f6a30150ca2 ("perf tools: Support user regs and stack in sample parsing")
Link: http://lkml.kernel.org/r/29ad7947dc8fd1ff0abd2093a72cc27a2446be9f.1554883878.git.han_mao@c-sky.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 66d066f18b5b..966360844fff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2368,7 +2368,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->user_regs.abi) {
 			u64 mask = evsel->attr.sample_regs_user;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->user_regs.mask = mask;
 			data->user_regs.regs = (u64 *)array;
@@ -2424,7 +2424,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
 			u64 mask = evsel->attr.sample_regs_intr;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->intr_regs.mask = mask;
 			data->intr_regs.regs = (u64 *)array;
@@ -2552,7 +2552,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2580,7 +2580,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2710,7 +2710,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			*array++ = sample->user_regs.abi;
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			memcpy(array, sample->user_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {
@@ -2746,7 +2746,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			*array++ = sample->intr_regs.abi;
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			memcpy(array, sample->intr_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {

* [tip:perf/urgent] perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)
  2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
  2019-04-10 13:08   ` Arnaldo Carvalho de Melo
  2019-04-12 16:40   ` [tip:perf/urgent] perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user) tip-bot for Mao Han
@ 2019-04-16 15:30   ` tip-bot for Mao Han
  2 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Mao Han @ 2019-04-16 15:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, eranian, tglx, han_mao, adrian.hunter, alexander.shishkin,
	peterz, hpa, namhyung, mingo, linux-kernel, acme

Commit-ID:  3a5b64f05d7fe36dea0dde26423e3044fbacd482
Gitweb:     https://git.kernel.org/tip/3a5b64f05d7fe36dea0dde26423e3044fbacd482
Author:     Mao Han <han_mao@c-sky.com>
AuthorDate: Wed, 10 Apr 2019 16:16:43 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 16 Apr 2019 11:27:53 -0300

perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user)

On 32-bit platforms with more than 32 registers, the 64-bit mask is
truncated to the lower 32 bits, so the return value of hweight_long()
can never exceed 32. When the kernel outputs more than 32 registers but
the user-space perf program counts at most 32, the resulting data
mismatch makes the overflow check fail.

Signed-off-by: Mao Han <han_mao@c-sky.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 6a21c0b5c2ab ("perf tools: Add core support for sampling intr machine state regs")
Fixes: d03f2170546d ("perf tools: Expand perf_event__synthesize_sample()")
Fixes: 0f6a30150ca2 ("perf tools: Support user regs and stack in sample parsing")
Link: http://lkml.kernel.org/r/29ad7947dc8fd1ff0abd2093a72cc27a2446be9f.1554883878.git.han_mao@c-sky.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 66d066f18b5b..966360844fff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2368,7 +2368,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->user_regs.abi) {
 			u64 mask = evsel->attr.sample_regs_user;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->user_regs.mask = mask;
 			data->user_regs.regs = (u64 *)array;
@@ -2424,7 +2424,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
 			u64 mask = evsel->attr.sample_regs_intr;
 
-			sz = hweight_long(mask) * sizeof(u64);
+			sz = hweight64(mask) * sizeof(u64);
 			OVERFLOW_CHECK(array, sz, max_size);
 			data->intr_regs.mask = mask;
 			data->intr_regs.regs = (u64 *)array;
@@ -2552,7 +2552,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2580,7 +2580,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			result += sizeof(u64);
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			result += sz;
 		} else {
 			result += sizeof(u64);
@@ -2710,7 +2710,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_USER) {
 		if (sample->user_regs.abi) {
 			*array++ = sample->user_regs.abi;
-			sz = hweight_long(sample->user_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->user_regs.mask) * sizeof(u64);
 			memcpy(array, sample->user_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {
@@ -2746,7 +2746,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_REGS_INTR) {
 		if (sample->intr_regs.abi) {
 			*array++ = sample->intr_regs.abi;
-			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			sz = hweight64(sample->intr_regs.mask) * sizeof(u64);
 			memcpy(array, sample->intr_regs.regs, sz);
 			array = (void *)array + sz;
 		} else {

Thread overview: 10+ messages
2019-04-10  8:16 [PATCH v3 0/3] csky: perf callchain dwarf support Mao Han
2019-04-10  8:16 ` [PATCH v3 1/3] perf: use hweight64 instead of hweight_long Mao Han
2019-04-10 13:08   ` Arnaldo Carvalho de Melo
2019-04-10 13:10     ` Arnaldo Carvalho de Melo
2019-04-10 13:28       ` Arnaldo Carvalho de Melo
2019-04-11  7:40         ` Mao Han
2019-04-12 16:40   ` [tip:perf/urgent] perf evsel: Use hweight64() instead of hweight_long(attr.sample_regs_user) tip-bot for Mao Han
2019-04-16 15:30   ` tip-bot for Mao Han
2019-04-10  8:16 ` [PATCH v3 2/3] csky: Add support for perf registers sampling Mao Han
2019-04-10  8:16 ` [PATCH v3 3/3] csky: add support for libdw Mao Han
