linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM
@ 2014-03-03  9:53 Jean Pihet
  2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-03  9:53 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa
  Cc: Will Deacon, steve.capper, patches, Jean Pihet

Adding libdw DWARF post unwind support, which is part
of elfutils-devel/libdw-dev package from version 0.158.

Also includes the test suite for dwarf unwinding, by adding the
arch specific test code and the perf_regs_load function.

Jean Pihet (3):
  perf tests: Introduce perf_regs_load function on ARM
  perf tests: Add dwarf unwind test on ARM
  perf tools: Add libdw DWARF post unwind support for ARM

 tools/perf/Makefile.perf                 |  2 +-
 tools/perf/arch/arm/Makefile             |  7 ++++
 tools/perf/arch/arm/include/perf_regs.h  |  5 +++
 tools/perf/arch/arm/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++++
 tools/perf/arch/arm/tests/regs_load.S    | 51 +++++++++++++++++++++++++++
 tools/perf/arch/arm/util/unwind-libdw.c  | 36 +++++++++++++++++++
 tools/perf/tests/builtin-test.c          |  2 +-
 tools/perf/tests/tests.h                 |  2 +-
 8 files changed, 161 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm/tests/dwarf-unwind.c
 create mode 100644 tools/perf/arch/arm/tests/regs_load.S
 create mode 100644 tools/perf/arch/arm/util/unwind-libdw.c

---

- Rebased on latest acme/perf/core git tree,
- Tested on quad-core ARMv7 machine

-- 
1.7.11.7


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-03  9:53 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
@ 2014-03-03  9:53 ` Jean Pihet
  2014-03-04  7:50   ` Jean Pihet
  2014-03-04 11:00   ` Will Deacon
  2014-03-03  9:53 ` [PATCH 2/3] perf tests: Add dwarf unwind test " Jean Pihet
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-03  9:53 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa
  Cc: Will Deacon, steve.capper, patches, Jean Pihet, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Introducing perf_regs_load function, which is going
to be used for dwarf unwind test in following patches.

It takes single argument as a pointer to the regs dump
buffer and populates it with current registers values.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
---
 tools/perf/arch/arm/Makefile            |  1 +
 tools/perf/arch/arm/include/perf_regs.h |  2 ++
 tools/perf/arch/arm/tests/regs_load.S   | 51 +++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+)
 create mode 100644 tools/perf/arch/arm/tests/regs_load.S

diff --git a/tools/perf/arch/arm/Makefile b/tools/perf/arch/arm/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm/Makefile
+++ b/tools/perf/arch/arm/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm/include/perf_regs.h b/tools/perf/arch/arm/include/perf_regs.h
index 2a1cfde..1476ae7 100644
--- a/tools/perf/arch/arm/include/perf_regs.h
+++ b/tools/perf/arch/arm/include/perf_regs.h
@@ -5,6 +5,8 @@
 #include "../../util/types.h"
 #include <asm/perf_regs.h>
 
+void perf_regs_load(u64 *regs);
+
 #define PERF_REGS_MASK	((1ULL << PERF_REG_ARM_MAX) - 1)
 #define PERF_REG_IP	PERF_REG_ARM_PC
 #define PERF_REG_SP	PERF_REG_ARM_SP
diff --git a/tools/perf/arch/arm/tests/regs_load.S b/tools/perf/arch/arm/tests/regs_load.S
new file mode 100644
index 0000000..241c6df
--- /dev/null
+++ b/tools/perf/arch/arm/tests/regs_load.S
@@ -0,0 +1,51 @@
+#include <linux/linkage.h>
+
+#define R0 0x00
+#define R1 0x08
+#define R2 0x10
+#define R3 0x18
+#define R4 0x20
+#define R5 0x28
+#define R6 0x30
+#define R7 0x38
+#define R8 0x40
+#define R9 0x48
+#define SL 0x50
+#define FP 0x58
+#define IP 0x60
+#define SP 0x68
+#define LR 0x70
+#define PC 0x78
+
+@ Implementation of void perf_regs_load(u64 *regs);
+@
+@ This functions fills in the 'regs' buffer from the actual registers values.
+@ Note that the return values (i.e. caller values) of sp and lr
+@ are retrieved and stored, in order to skip the call to this function.
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	push {r1}
+
+	str r0, [r0, #R0]
+	str r1, [r0, #R1]
+	str r2, [r0, #R2]
+	str r3, [r0, #R3]
+	str r4, [r0, #R4]
+	str r5, [r0, #R5]
+	str r6, [r0, #R6]
+	str r7, [r0, #R7]
+	str r8, [r0, #R8]
+	str r9, [r0, #R9]
+	str sl, [r0, #SL]
+	str fp, [r0, #FP]
+	str ip, [r0, #IP]
+	add r1, sp, #4		@ Retrieve and save sp at entry time
+	str r1, [r0, #SP]
+	str lr, [r0, #LR]
+	str lr, [r0, #PC]	@ Save caller PC
+
+	pop {r1}
+	bx lr
+ENDPROC(perf_regs_load)
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] perf tests: Add dwarf unwind test on ARM
  2014-03-03  9:53 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
  2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
@ 2014-03-03  9:53 ` Jean Pihet
  2014-03-03  9:53 ` [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
  2014-03-04  1:24 ` [PATCH 0/3] " Jean Pihet
  3 siblings, 0 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-03  9:53 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa
  Cc: Will Deacon, steve.capper, patches, Jean Pihet, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Adding dwarf unwind test, that setups live machine data over
the perf test thread and does the remote unwind.

Need to use -fno-optimize-sibling-calls for test compilation,
otherwise 'krava_*' function calls are optimized into jumps
and ommited from the stack unwind.

So far it was enabled only for x86.

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
---
 tools/perf/Makefile.perf                 |  2 +-
 tools/perf/arch/arm/Makefile             |  1 +
 tools/perf/arch/arm/include/perf_regs.h  |  3 ++
 tools/perf/arch/arm/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c          |  2 +-
 tools/perf/tests/tests.h                 |  2 +-
 6 files changed, 66 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm/tests/dwarf-unwind.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 50d875d..f281c4f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -410,7 +410,7 @@ LIB_OBJS += $(OUTPUT)tests/code-reading.o
 LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
 LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
 ifndef NO_DWARF_UNWIND
-ifeq ($(ARCH),x86)
+ifeq ($(ARCH),$(filter $(ARCH),x86 arm))
 LIB_OBJS += $(OUTPUT)tests/dwarf-unwind.o
 endif
 endif
diff --git a/tools/perf/arch/arm/Makefile b/tools/perf/arch/arm/Makefile
index 9b8f87e..221f21d 100644
--- a/tools/perf/arch/arm/Makefile
+++ b/tools/perf/arch/arm/Makefile
@@ -5,4 +5,5 @@ endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm/include/perf_regs.h b/tools/perf/arch/arm/include/perf_regs.h
index 1476ae7..c2cefff 100644
--- a/tools/perf/arch/arm/include/perf_regs.h
+++ b/tools/perf/arch/arm/include/perf_regs.h
@@ -8,6 +8,9 @@
 void perf_regs_load(u64 *regs);
 
 #define PERF_REGS_MASK	((1ULL << PERF_REG_ARM_MAX) - 1)
+#define PERF_REGS_MAX	PERF_REG_ARM_MAX
+#define PERF_SAMPLE_REGS_ABI	PERF_SAMPLE_REGS_ABI_32
+
 #define PERF_REG_IP	PERF_REG_ARM_PC
 #define PERF_REG_SP	PERF_REG_ARM_SP
 
diff --git a/tools/perf/arch/arm/tests/dwarf-unwind.c b/tools/perf/arch/arm/tests/dwarf-unwind.c
new file mode 100644
index 0000000..87362f7
--- /dev/null
+++ b/tools/perf/arch/arm/tests/dwarf-unwind.c
@@ -0,0 +1,59 @@
+#include <string.h>
+#include "perf_regs.h"
+#include "thread.h"
+#include "map.h"
+#include "event.h"
+#include "tests/tests.h"
+
+#define STACK_SIZE 8192
+
+static int sample_ustack(struct perf_sample *sample,
+			 struct thread *thread, u64 *regs)
+{
+	struct stack_dump *stack = &sample->user_stack;
+	struct map *map;
+	unsigned long sp;
+	u64 stack_size, *buf;
+
+	buf = malloc(STACK_SIZE);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	sp = (unsigned long) regs[PERF_REG_ARM_SP];
+
+	map = map_groups__find(&thread->mg, MAP__FUNCTION, (u64) sp);
+	if (!map) {
+		pr_debug("failed to get stack map\n");
+		return -1;
+	}
+
+	stack_size = map->end - sp;
+	stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
+
+	memcpy(buf, (void *) sp, stack_size);
+	stack->data = (char *) buf;
+	stack->size = stack_size;
+	return 0;
+}
+
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread)
+{
+	struct regs_dump *regs = &sample->user_regs;
+	u64 *buf;
+
+	buf = calloc(1, sizeof(u64) * PERF_REGS_MAX);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	perf_regs_load(buf);
+	regs->abi  = PERF_SAMPLE_REGS_ABI;
+	regs->regs = buf;
+	regs->mask = PERF_REGS_MASK;
+
+	return sample_ustack(sample, thread, buf);
+}
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index b11bf8a..167d527 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -115,7 +115,7 @@ static struct test {
 		.desc = "Test parsing with no sample_id_all bit set",
 		.func = test__parse_no_sample_id_all,
 	},
-#if defined(__x86_64__) || defined(__i386__)
+#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 	{
 		.desc = "Test dwarf unwind",
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index a24795c..0c89cfe 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -42,7 +42,7 @@ int test__keep_tracking(void);
 int test__parse_no_sample_id_all(void);
 int test__dwarf_unwind(void);
 
-#if defined(__x86_64__) || defined(__i386__)
+#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM
  2014-03-03  9:53 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
  2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
  2014-03-03  9:53 ` [PATCH 2/3] perf tests: Add dwarf unwind test " Jean Pihet
@ 2014-03-03  9:53 ` Jean Pihet
  2014-03-04  1:24 ` [PATCH 0/3] " Jean Pihet
  3 siblings, 0 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-03  9:53 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa
  Cc: Will Deacon, steve.capper, patches, Jean Pihet, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Adding libdw DWARF post unwind support, which is part
of elfutils-devel/libdw-dev package from version 0.158.

The new code is contained in unwin-libdw.c object, and
implements unwind__get_entries unwind interface function.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
---
 tools/perf/arch/arm/Makefile            |  5 +++++
 tools/perf/arch/arm/util/unwind-libdw.c | 36 +++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)
 create mode 100644 tools/perf/arch/arm/util/unwind-libdw.c

diff --git a/tools/perf/arch/arm/Makefile b/tools/perf/arch/arm/Makefile
index 221f21d..09d6215 100644
--- a/tools/perf/arch/arm/Makefile
+++ b/tools/perf/arch/arm/Makefile
@@ -4,6 +4,11 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+endif
+ifndef NO_LIBDW_DWARF_UNWIND
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libdw.o
+endif
+ifndef NO_DWARF_UNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm/util/unwind-libdw.c b/tools/perf/arch/arm/util/unwind-libdw.c
new file mode 100644
index 0000000..b4176c6
--- /dev/null
+++ b/tools/perf/arch/arm/util/unwind-libdw.c
@@ -0,0 +1,36 @@
+#include <elfutils/libdwfl.h>
+#include "../../util/unwind-libdw.h"
+#include "../../util/perf_regs.h"
+
+bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
+{
+	struct unwind_info *ui = arg;
+	struct regs_dump *user_regs = &ui->sample->user_regs;
+	Dwarf_Word dwarf_regs[PERF_REG_ARM_MAX];
+
+#define REG(r) ({						\
+	Dwarf_Word val = 0;					\
+	perf_reg_value(&val, user_regs, PERF_REG_ARM_##r);	\
+	val;							\
+})
+
+	dwarf_regs[0]  = REG(R0);
+	dwarf_regs[1]  = REG(R1);
+	dwarf_regs[2]  = REG(R2);
+	dwarf_regs[3]  = REG(R3);
+	dwarf_regs[4]  = REG(R4);
+	dwarf_regs[5]  = REG(R5);
+	dwarf_regs[6]  = REG(R6);
+	dwarf_regs[7]  = REG(R7);
+	dwarf_regs[8]  = REG(R8);
+	dwarf_regs[9]  = REG(R9);
+	dwarf_regs[10] = REG(R10);
+	dwarf_regs[11] = REG(FP);
+	dwarf_regs[12] = REG(IP);
+	dwarf_regs[13] = REG(SP);
+	dwarf_regs[14] = REG(LR);
+	dwarf_regs[15] = REG(PC);
+
+	return dwfl_thread_state_registers(thread, 0, PERF_REG_ARM_MAX,
+					   dwarf_regs);
+}
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM
  2014-03-03  9:53 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
                   ` (2 preceding siblings ...)
  2014-03-03  9:53 ` [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
@ 2014-03-04  1:24 ` Jean Pihet
  2014-03-04 10:37   ` Jiri Olsa
  2014-03-04 11:58   ` Jiri Olsa
  3 siblings, 2 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-04  1:24 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa
  Cc: Will Deacon, Steve Capper, Patch Tracking, Jean Pihet

Hi,

Here are some notes about the performance improvement and the test usage.

Jiri, do you need these notes in the one of commit description?

1. Using libdw vs libunwind on ARMv7

The performance gain is (more than) significant: >800%.

I did not profile perf itself ;-p just did a timing measurement.
The benchmark has been run multiple times with different perf.data
sizes. The results are consistent across the tests runs.

Usage:
./tools/perf/perf record --call-graph dwarf --
../../libunwind/test_app/stress_bt
time ./tools/perf/perf report --stdio > /dev/null 2>&1

Platform:
Quad-core marvell XP370. perf runs on 1 cpu

Perf data size:
304MB    libunwind    libdw        improvement
real    9m31.577s    1m13.052s    782%
user    5m33.020s    1m2.910s    529%
sys        3m57.770s    0m10.090s    2356%

2. unwind test usage
perf test list gives the list of supported tests. In this case the
test #23 is dwarf unwinding:

./tools/perf/perf test 23
23: Test dwarf unwind                                      : Ok


Regards,
Jean


On 3 March 2014 10:53, Jean Pihet <jean.pihet@linaro.org> wrote:
> Adding libdw DWARF post unwind support, which is part
> of elfutils-devel/libdw-dev package from version 0.158.
>
> Also includes the test suite for dwarf unwinding, by adding the
> arch specific test code and the perf_regs_load function.
>
> Jean Pihet (3):
>   perf tests: Introduce perf_regs_load function on ARM
>   perf tests: Add dwarf unwind test on ARM
>   perf tools: Add libdw DWARF post unwind support for ARM
>
>  tools/perf/Makefile.perf                 |  2 +-
>  tools/perf/arch/arm/Makefile             |  7 ++++
>  tools/perf/arch/arm/include/perf_regs.h  |  5 +++
>  tools/perf/arch/arm/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++++
>  tools/perf/arch/arm/tests/regs_load.S    | 51 +++++++++++++++++++++++++++
>  tools/perf/arch/arm/util/unwind-libdw.c  | 36 +++++++++++++++++++
>  tools/perf/tests/builtin-test.c          |  2 +-
>  tools/perf/tests/tests.h                 |  2 +-
>  8 files changed, 161 insertions(+), 3 deletions(-)
>  create mode 100644 tools/perf/arch/arm/tests/dwarf-unwind.c
>  create mode 100644 tools/perf/arch/arm/tests/regs_load.S
>  create mode 100644 tools/perf/arch/arm/util/unwind-libdw.c
>
> ---
>
> - Rebased on latest acme/perf/core git tree,
> - Tested on quad-core ARMv7 machine
>
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
@ 2014-03-04  7:50   ` Jean Pihet
  2014-03-04  7:57     ` Jean Pihet
  2014-03-04 11:00   ` Will Deacon
  1 sibling, 1 reply; 16+ messages in thread
From: Jean Pihet @ 2014-03-04  7:50 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper

A new version is coming asap after a review of the code with Steve.

Thx,
Jean

On 3 March 2014 10:53, Jean Pihet <jean.pihet@linaro.org> wrote:
> Introducing perf_regs_load function, which is going
> to be used for dwarf unwind test in following patches.
>
> It takes single argument as a pointer to the regs dump
> buffer and populates it with current registers values.
>
> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> ---
>  tools/perf/arch/arm/Makefile            |  1 +
>  tools/perf/arch/arm/include/perf_regs.h |  2 ++
>  tools/perf/arch/arm/tests/regs_load.S   | 51 +++++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+)
>  create mode 100644 tools/perf/arch/arm/tests/regs_load.S
>
> diff --git a/tools/perf/arch/arm/Makefile b/tools/perf/arch/arm/Makefile
> index 67e9b3d..9b8f87e 100644
> --- a/tools/perf/arch/arm/Makefile
> +++ b/tools/perf/arch/arm/Makefile
> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>  endif
>  ifndef NO_LIBUNWIND
>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>  endif
> diff --git a/tools/perf/arch/arm/include/perf_regs.h b/tools/perf/arch/arm/include/perf_regs.h
> index 2a1cfde..1476ae7 100644
> --- a/tools/perf/arch/arm/include/perf_regs.h
> +++ b/tools/perf/arch/arm/include/perf_regs.h
> @@ -5,6 +5,8 @@
>  #include "../../util/types.h"
>  #include <asm/perf_regs.h>
>
> +void perf_regs_load(u64 *regs);
> +
>  #define PERF_REGS_MASK ((1ULL << PERF_REG_ARM_MAX) - 1)
>  #define PERF_REG_IP    PERF_REG_ARM_PC
>  #define PERF_REG_SP    PERF_REG_ARM_SP
> diff --git a/tools/perf/arch/arm/tests/regs_load.S b/tools/perf/arch/arm/tests/regs_load.S
> new file mode 100644
> index 0000000..241c6df
> --- /dev/null
> +++ b/tools/perf/arch/arm/tests/regs_load.S
> @@ -0,0 +1,51 @@
> +#include <linux/linkage.h>
> +
> +#define R0 0x00
> +#define R1 0x08
> +#define R2 0x10
> +#define R3 0x18
> +#define R4 0x20
> +#define R5 0x28
> +#define R6 0x30
> +#define R7 0x38
> +#define R8 0x40
> +#define R9 0x48
> +#define SL 0x50
> +#define FP 0x58
> +#define IP 0x60
> +#define SP 0x68
> +#define LR 0x70
> +#define PC 0x78
> +
> +@ Implementation of void perf_regs_load(u64 *regs);
> +@
> +@ This functions fills in the 'regs' buffer from the actual registers values.
> +@ Note that the return values (i.e. caller values) of sp and lr
> +@ are retrieved and stored, in order to skip the call to this function.
> +
> +.text
> +.type perf_regs_load,%function
> +ENTRY(perf_regs_load)
> +       push {r1}
> +
> +       str r0, [r0, #R0]
> +       str r1, [r0, #R1]
> +       str r2, [r0, #R2]
> +       str r3, [r0, #R3]
> +       str r4, [r0, #R4]
> +       str r5, [r0, #R5]
> +       str r6, [r0, #R6]
> +       str r7, [r0, #R7]
> +       str r8, [r0, #R8]
> +       str r9, [r0, #R9]
> +       str sl, [r0, #SL]
> +       str fp, [r0, #FP]
> +       str ip, [r0, #IP]
> +       add r1, sp, #4          @ Retrieve and save sp at entry time
> +       str r1, [r0, #SP]
> +       str lr, [r0, #LR]
> +       str lr, [r0, #PC]       @ Save caller PC
> +
> +       pop {r1}
> +       bx lr
> +ENDPROC(perf_regs_load)
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-04  7:50   ` Jean Pihet
@ 2014-03-04  7:57     ` Jean Pihet
  0 siblings, 0 replies; 16+ messages in thread
From: Jean Pihet @ 2014-03-04  7:57 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper
  Cc: Will Deacon, patches, Jean Pihet, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Introducing perf_regs_load function, which is going
to be used for dwarf unwind test in following patches.

It takes single argument as a pointer to the regs dump
buffer and populates it with current registers values.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
---
 tools/perf/arch/arm/Makefile            |  1 +
 tools/perf/arch/arm/include/perf_regs.h |  2 ++
 tools/perf/arch/arm/tests/regs_load.S   | 47 +++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+)
 create mode 100644 tools/perf/arch/arm/tests/regs_load.S

diff --git a/tools/perf/arch/arm/Makefile b/tools/perf/arch/arm/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm/Makefile
+++ b/tools/perf/arch/arm/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm/include/perf_regs.h b/tools/perf/arch/arm/include/perf_regs.h
index 2a1cfde..1476ae7 100644
--- a/tools/perf/arch/arm/include/perf_regs.h
+++ b/tools/perf/arch/arm/include/perf_regs.h
@@ -5,6 +5,8 @@
 #include "../../util/types.h"
 #include <asm/perf_regs.h>
 
+void perf_regs_load(u64 *regs);
+
 #define PERF_REGS_MASK	((1ULL << PERF_REG_ARM_MAX) - 1)
 #define PERF_REG_IP	PERF_REG_ARM_PC
 #define PERF_REG_SP	PERF_REG_ARM_SP
diff --git a/tools/perf/arch/arm/tests/regs_load.S b/tools/perf/arch/arm/tests/regs_load.S
new file mode 100644
index 0000000..916d268
--- /dev/null
+++ b/tools/perf/arch/arm/tests/regs_load.S
@@ -0,0 +1,47 @@
+#include <linux/linkage.h>
+
+#define R0 0x00
+#define R1 0x08
+#define R2 0x10
+#define R3 0x18
+#define R4 0x20
+#define R5 0x28
+#define R6 0x30
+#define R7 0x38
+#define R8 0x40
+#define R9 0x48
+#define SL 0x50
+#define FP 0x58
+#define IP 0x60
+#define SP 0x68
+#define LR 0x70
+#define PC 0x78
+
+@ Implementation of void perf_regs_load(u64 *regs);
+@
+@ This functions fills in the 'regs' buffer from the actual registers values.
+@ Note that the return value (i.e. caller values) of pc is retrieved
+@ and stored, in order to skip the call to this function.
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	str r0, [r0, #R0]
+	str r1, [r0, #R1]
+	str r2, [r0, #R2]
+	str r3, [r0, #R3]
+	str r4, [r0, #R4]
+	str r5, [r0, #R5]
+	str r6, [r0, #R6]
+	str r7, [r0, #R7]
+	str r8, [r0, #R8]
+	str r9, [r0, #R9]
+	str sl, [r0, #SL]
+	str fp, [r0, #FP]
+	str ip, [r0, #IP]
+	str sp, [r0, #SP]
+	str lr, [r0, #LR]
+	str lr, [r0, #PC]	@ Save caller PC
+
+	bx lr
+ENDPROC(perf_regs_load)
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM
  2014-03-04  1:24 ` [PATCH 0/3] " Jean Pihet
@ 2014-03-04 10:37   ` Jiri Olsa
  2014-03-04 11:58   ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2014-03-04 10:37 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Will Deacon, Steve Capper, Patch Tracking

On Tue, Mar 04, 2014 at 02:24:37AM +0100, Jean Pihet wrote:
> Hi,
> 
> Here are some notes about the performance improvement and the test usage.
> 
> Jiri, do you need these notes in the one of commit description?
> 
> 1. Using libdw vs libunwind on ARMv7
> 
> The performance gain is (more than) significant: >800%.

I dont mind.. but 800% would look just great in changelog ;-)

jirka

> 
> I did not profile perf itself ;-p just did a timing measurement.
> The benchmark has been run multiple times with different perf.data
> sizes. The results are consistent across the tests runs.
> 
> Usage:
> ./tools/perf/perf record --call-graph dwarf --
> ../../libunwind/test_app/stress_bt
> time ./tools/perf/perf report --stdio > /dev/null 2>&1
> 
> Platform:
> Quad-core marvell XP370. perf runs on 1 cpu
> 
> Perf data size:
> 304MB    libunwind    libdw        improvement
> real    9m31.577s    1m13.052s    782%
> user    5m33.020s    1m2.910s    529%
> sys        3m57.770s    0m10.090s    2356%
> 
> 2. unwind test usage
> perf test list gives the list of supported tests. In this case the
> test #23 is dwarf unwinding:
> 
> ./tools/perf/perf test 23
> 23: Test dwarf unwind                                      : Ok
> 
> 
> Regards,
> Jean
> 
> 
> On 3 March 2014 10:53, Jean Pihet <jean.pihet@linaro.org> wrote:
> > Adding libdw DWARF post unwind support, which is part
> > of elfutils-devel/libdw-dev package from version 0.158.
> >
> > Also includes the test suite for dwarf unwinding, by adding the
> > arch specific test code and the perf_regs_load function.
> >
> > Jean Pihet (3):
> >   perf tests: Introduce perf_regs_load function on ARM
> >   perf tests: Add dwarf unwind test on ARM
> >   perf tools: Add libdw DWARF post unwind support for ARM
> >
> >  tools/perf/Makefile.perf                 |  2 +-
> >  tools/perf/arch/arm/Makefile             |  7 ++++
> >  tools/perf/arch/arm/include/perf_regs.h  |  5 +++
> >  tools/perf/arch/arm/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++++
> >  tools/perf/arch/arm/tests/regs_load.S    | 51 +++++++++++++++++++++++++++
> >  tools/perf/arch/arm/util/unwind-libdw.c  | 36 +++++++++++++++++++
> >  tools/perf/tests/builtin-test.c          |  2 +-
> >  tools/perf/tests/tests.h                 |  2 +-
> >  8 files changed, 161 insertions(+), 3 deletions(-)
> >  create mode 100644 tools/perf/arch/arm/tests/dwarf-unwind.c
> >  create mode 100644 tools/perf/arch/arm/tests/regs_load.S
> >  create mode 100644 tools/perf/arch/arm/util/unwind-libdw.c
> >
> > ---
> >
> > - Rebased on latest acme/perf/core git tree,
> > - Tested on quad-core ARMv7 machine
> >
> > --
> > 1.7.11.7
> >

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
  2014-03-04  7:50   ` Jean Pihet
@ 2014-03-04 11:00   ` Will Deacon
  2014-03-05  2:17     ` Jean Pihet
  1 sibling, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-03-04 11:00 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, steve.capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Jean,

On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
> Introducing perf_regs_load function, which is going
> to be used for dwarf unwind test in following patches.
> 
> It takes single argument as a pointer to the regs dump
> buffer and populates it with current registers values.

[...]

> diff --git a/tools/perf/arch/arm/tests/regs_load.S b/tools/perf/arch/arm/tests/regs_load.S
> new file mode 100644
> index 0000000..241c6df
> --- /dev/null
> +++ b/tools/perf/arch/arm/tests/regs_load.S
> @@ -0,0 +1,51 @@
> +#include <linux/linkage.h>
> +
> +#define R0 0x00
> +#define R1 0x08

Why are you using a 64-bit stride for 32-bit registers? (which prevents you
from using stm later on).

> +.text
> +.type perf_regs_load,%function
> +ENTRY(perf_regs_load)
> +	push {r1}

Do you only push r1 here so that you can do the stack arithmetic later? That
doesn't make sense to me -- can't you str sp directly?

> +	str r0, [r0, #R0]
> +	str r1, [r0, #R1]
> +	str r2, [r0, #R2]
> +	str r3, [r0, #R3]
> +	str r4, [r0, #R4]
> +	str r5, [r0, #R5]
> +	str r6, [r0, #R6]
> +	str r7, [r0, #R7]
> +	str r8, [r0, #R8]
> +	str r9, [r0, #R9]
> +	str sl, [r0, #SL]
> +	str fp, [r0, #FP]
> +	str ip, [r0, #IP]
> +	add r1, sp, #4		@ Retrieve and save sp at entry time
> +	str r1, [r0, #SP]
> +	str lr, [r0, #LR]
> +	str lr, [r0, #PC]	@ Save caller PC

This isn't necessarily the `caller PC' (depending on how you define it).
It's the return address, which is probably (but not always) the instruction
following the branch to this function.

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM
  2014-03-04  1:24 ` [PATCH 0/3] " Jean Pihet
  2014-03-04 10:37   ` Jiri Olsa
@ 2014-03-04 11:58   ` Jiri Olsa
  1 sibling, 0 replies; 16+ messages in thread
From: Jiri Olsa @ 2014-03-04 11:58 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Will Deacon, Steve Capper, Patch Tracking

On Tue, Mar 04, 2014 at 02:24:37AM +0100, Jean Pihet wrote:
> Hi,
> 
> Here are some notes about the performance improvement and the test usage.
> 
> Jiri, do you need these notes in the one of commit description?
> 
> 1. Using libdw vs libunwind on ARMv7
> 
> The performance gain is (more than) significant: >800%.
> 
> I did not profile perf itself ;-p just did a timing measurement.
> The benchmark has been run multiple times with different perf.data
> sizes. The results are consistent across the tests runs.
> 
> Usage:
> ./tools/perf/perf record --call-graph dwarf --
> ../../libunwind/test_app/stress_bt
> time ./tools/perf/perf report --stdio > /dev/null 2>&1
> 
> Platform:
> Quad-core marvell XP370. perf runs on 1 cpu
> 
> Perf data size:
> 304MB    libunwind    libdw        improvement
> real    9m31.577s    1m13.052s    782%
> user    5m33.020s    1m2.910s    529%
> sys        3m57.770s    0m10.090s    2356%
> 
> 2. unwind test usage
> perf test list gives the list of supported tests. In this case the
> test #23 is dwarf unwinding:
> 
> ./tools/perf/perf test 23
> 23: Test dwarf unwind                                      : Ok
> 
> 
> Regards,
> Jean
> 
> 

I can't speak for the arm specific stuff, but generally it fits
and looks ok (and does not break the x86 build)

also.. who wouldn't ack 800% speed improvement ;-)

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-04 11:00   ` Will Deacon
@ 2014-03-05  2:17     ` Jean Pihet
  2014-03-06 11:33       ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: Jean Pihet @ 2014-03-05  2:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, steve.capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Will,

On 4 March 2014 12:00, Will Deacon <will.deacon@arm.com> wrote:
> Hi Jean,
>
> On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
>> Introducing perf_regs_load function, which is going
>> to be used for dwarf unwind test in following patches.
>>
>> It takes single argument as a pointer to the regs dump
>> buffer and populates it with current registers values.
>
> [...]
>
>> diff --git a/tools/perf/arch/arm/tests/regs_load.S b/tools/perf/arch/arm/tests/regs_load.S
>> new file mode 100644
>> index 0000000..241c6df
>> --- /dev/null
>> +++ b/tools/perf/arch/arm/tests/regs_load.S
>> @@ -0,0 +1,51 @@
>> +#include <linux/linkage.h>
>> +
>> +#define R0 0x00
>> +#define R1 0x08
>
> Why are you using a 64-bit stride for 32-bit registers? (which prevents you
> from using stm later on).
This is because perf internally uses a 64-bit array for the registers,
cf. the function prototype in the comment above.

>
>> +.text
>> +.type perf_regs_load,%function
>> +ENTRY(perf_regs_load)
>> +     push {r1}
>
> Do you only push r1 here so that you can do the stack arithmetic later? That
> doesn't make sense to me -- can't you str sp directly?
Yes indeed! I submitted a new version yesterday with that change and a
fix in the comments.

>
>> +     str r0, [r0, #R0]
>> +     str r1, [r0, #R1]
>> +     str r2, [r0, #R2]
>> +     str r3, [r0, #R3]
>> +     str r4, [r0, #R4]
>> +     str r5, [r0, #R5]
>> +     str r6, [r0, #R6]
>> +     str r7, [r0, #R7]
>> +     str r8, [r0, #R8]
>> +     str r9, [r0, #R9]
>> +     str sl, [r0, #SL]
>> +     str fp, [r0, #FP]
>> +     str ip, [r0, #IP]
>> +     add r1, sp, #4          @ Retrieve and save sp at entry time
>> +     str r1, [r0, #SP]
>> +     str lr, [r0, #LR]
>> +     str lr, [r0, #PC]       @ Save caller PC
>
> This isn't necessarily the `caller PC' (depending on how you define it).
> It's the return address, which is probably (but not always) the instruction
> following the branch to this function.
Agreed. However the perf test code expects a registers buffer filled
in with the caller's values.
I can change the comment here, is that needed?

Thanks for reviewing!
Jean

>
> Will
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-05  2:17     ` Jean Pihet
@ 2014-03-06 11:33       ` Will Deacon
  2014-03-06 17:22         ` Jiri Olsa
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2014-03-06 11:33 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, steve.capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

On Wed, Mar 05, 2014 at 02:17:00AM +0000, Jean Pihet wrote:
> On 4 March 2014 12:00, Will Deacon <will.deacon@arm.com> wrote:
> > On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
> >> +     str lr, [r0, #PC]       @ Save caller PC
> >
> > This isn't necessarily the `caller PC' (depending on how you define it).
> > It's the return address, which is probably (but not always) the instruction
> > following the branch to this function.
> Agreed. However the perf test code expects a registers buffer filled
> in with the caller's values.
> I can change the comment here, is that needed?

It depends what the perf test code really expects. At the moment, you're not
providing it with anything consistent which doesn't sound correct.

What does it use the caller PC for?

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-06 11:33       ` Will Deacon
@ 2014-03-06 17:22         ` Jiri Olsa
  2014-03-11 18:25           ` Jean Pihet
  0 siblings, 1 reply; 16+ messages in thread
From: Jiri Olsa @ 2014-03-06 17:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jean Pihet, linux-kernel, linaro-kernel, linux-arm-kernel,
	Arnaldo, Ingo Molnar, steve.capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

On Thu, Mar 06, 2014 at 11:33:15AM +0000, Will Deacon wrote:
> On Wed, Mar 05, 2014 at 02:17:00AM +0000, Jean Pihet wrote:
> > On 4 March 2014 12:00, Will Deacon <will.deacon@arm.com> wrote:
> > > On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
> > >> +     str lr, [r0, #PC]       @ Save caller PC
> > >
> > > This isn't necessarily the `caller PC' (depending on how you define it).
> > > It's the return address, which is probably (but not always) the instruction
> > > following the branch to this function.
> > Agreed. However the perf test code expects a registers buffer filled
> > in with the caller's values.
> > I can change the comment here, is that needed?
> 
> It depends what the perf test code really expects. At the moment, you're not
> providing it with anything consistent which doesn't sound correct.

the code expects caller's PC. That is what the x86 test code
expects from perf_regs_load. We take the return IP saved by
call instruction:

ENTRY(perf_regs_load)
	...
	movq 0(%rsp), %rax
	movq %rax, IP(%rdi)
	...

jirka

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-06 17:22         ` Jiri Olsa
@ 2014-03-11 18:25           ` Jean Pihet
  2014-03-12 16:13             ` Will Deacon
  0 siblings, 1 reply; 16+ messages in thread
From: Jean Pihet @ 2014-03-11 18:25 UTC (permalink / raw)
  To: Will Deacon, steve.capper
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, patches, Corey Ashford, Frederic Weisbecker,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, David Ahern,
	Jiri Olsa

HI Will, Steve,

On 6 March 2014 18:22, Jiri Olsa <jolsa@redhat.com> wrote:
> On Thu, Mar 06, 2014 at 11:33:15AM +0000, Will Deacon wrote:
>> On Wed, Mar 05, 2014 at 02:17:00AM +0000, Jean Pihet wrote:
>> > On 4 March 2014 12:00, Will Deacon <will.deacon@arm.com> wrote:
>> > > On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
>> > >> +     str lr, [r0, #PC]       @ Save caller PC
>> > >
>> > > This isn't necessarily the `caller PC' (depending on how you define it).
>> > > It's the return address, which is probably (but not always) the instruction
>> > > following the branch to this function.
>> > Agreed. However the perf test code expects a registers buffer filled
>> > in with the caller's values.
>> > I can change the comment here, is that needed?
>>
>> It depends what the perf test code really expects. At the moment, you're not
>> providing it with anything consistent which doesn't sound correct.
>
> the code expects caller's PC. That is what the x86 test code
> expects from perf_regs_load. We take the return IP saved by
> call instruction:
>
> ENTRY(perf_regs_load)
>         ...
>         movq 0(%rsp), %rax
>         movq %rax, IP(%rdi)
>         ...
>
> jirka

The perf built-in test code expects the caller PC to do the unwinding
from. Does that sound correct to you?

Do you want me to add a note like this one (as done in the upcoming
aarch64 patches, to be submitted asap after testing):
"
/*
 * Implementation of void perf_regs_load(u64 *regs);
 *
 * This functions fills in the 'regs' buffer from the actual registers values.
 *
 * Notes:
 *  1. the return value of the pc is retrieved from lr and stored, in order
 *  to skip the call to this function,
 *  2. the current value of lr is merely retrieved and stored because the
 *  value before the call to this function is unknown at this time; it will
 *  be unwound from the dwarf information in unwind__get_entries.
 */
"

Please let me know how to make this better.

Regards,
Jean

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM
  2014-03-11 18:25           ` Jean Pihet
@ 2014-03-12 16:13             ` Will Deacon
  0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2014-03-12 16:13 UTC (permalink / raw)
  To: Jean Pihet
  Cc: steve.capper, linux-kernel, linaro-kernel, linux-arm-kernel,
	Arnaldo, Ingo Molnar, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern, Jiri Olsa

On Tue, Mar 11, 2014 at 06:25:11PM +0000, Jean Pihet wrote:
> HI Will, Steve,

Hi Jean,

> On 6 March 2014 18:22, Jiri Olsa <jolsa@redhat.com> wrote:
> > On Thu, Mar 06, 2014 at 11:33:15AM +0000, Will Deacon wrote:
> >> On Wed, Mar 05, 2014 at 02:17:00AM +0000, Jean Pihet wrote:
> >> > On 4 March 2014 12:00, Will Deacon <will.deacon@arm.com> wrote:
> >> > > On Mon, Mar 03, 2014 at 09:53:21AM +0000, Jean Pihet wrote:
> >> > >> +     str lr, [r0, #PC]       @ Save caller PC
> >> > >
> >> > > This isn't necessarily the `caller PC' (depending on how you define it).
> >> > > It's the return address, which is probably (but not always) the instruction
> >> > > following the branch to this function.
> >> > Agreed. However the perf test code expects a registers buffer filled
> >> > in with the caller's values.
> >> > I can change the comment here, is that needed?
> >>
> >> It depends what the perf test code really expects. At the moment, you're not
> >> providing it with anything consistent which doesn't sound correct.
> >
> > the code expects caller's PC. That is what the x86 test code
> > expects from perf_regs_load. We take the return IP saved by
> > call instruction:
> >
> > ENTRY(perf_regs_load)
> >         ...
> >         movq 0(%rsp), %rax
> >         movq %rax, IP(%rdi)
> >         ...
> >
> > jirka
> 
> The perf built-in test code expects the caller PC to do the unwinding
> from. Does that sound correct to you?

Yes, but we're not necessarily providing that in your code. We're providing
the return address, which could be different (e.g., if this was a tail
call).

> Do you want me to add a note like this one (as done in the upcoming
> aarch64 patches, to be submitted asap after testing):
> "
> /*
>  * Implementation of void perf_regs_load(u64 *regs);
>  *
>  * This functions fills in the 'regs' buffer from the actual registers values.
>  *
>  * Notes:
>  *  1. the return value of the pc is retrieved from lr and stored, in order
>  *  to skip the call to this function,
>  *  2. the current value of lr is merely retrieved and stored because the
>  *  value before the call to this function is unknown at this time; it will
>  *  be unwound from the dwarf information in unwind__get_entries.
>  */
> "

I think the main thing is that your code may well work for this specific
test case (and perhaps that's good enough) but we should be clear about the
assumptions you're making.

Will

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM
@ 2014-05-06 15:26 Jean Pihet
  0 siblings, 0 replies; 16+ messages in thread
From: Jean Pihet @ 2014-05-06 15:26 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, will.deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Jean Pihet

Adding libdw DWARF post unwind support, which is part
of elfutils-devel/libdw-dev package from version 0.158.

Also includes the test suite for dwarf unwinding, by adding the
arch specific test code and the perf_regs_load function.

Jean Pihet (3):
  perf tests: Introduce perf_regs_load function on ARM
  perf tests: Add dwarf unwind test on ARM
  perf tools: Add libdw DWARF post unwind support for ARM

 tools/perf/Makefile.perf                 |  2 +-
 tools/perf/arch/arm/Makefile             |  7 ++++
 tools/perf/arch/arm/include/perf_regs.h  |  5 +++
 tools/perf/arch/arm/tests/dwarf-unwind.c | 59 +++++++++++++++++++++++++++++++
 tools/perf/arch/arm/tests/regs_load.S    | 60 ++++++++++++++++++++++++++++++++
 tools/perf/arch/arm/util/unwind-libdw.c  | 36 +++++++++++++++++++
 tools/perf/tests/builtin-test.c          |  2 +-
 tools/perf/tests/tests.h                 |  2 +-
 8 files changed, 170 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm/tests/dwarf-unwind.c
 create mode 100644 tools/perf/arch/arm/tests/regs_load.S
 create mode 100644 tools/perf/arch/arm/util/unwind-libdw.c

---
Rebased on the latest jolsa/perf/core

-- 
1.7.11.7


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-05-06 15:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-03  9:53 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
2014-03-03  9:53 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM Jean Pihet
2014-03-04  7:50   ` Jean Pihet
2014-03-04  7:57     ` Jean Pihet
2014-03-04 11:00   ` Will Deacon
2014-03-05  2:17     ` Jean Pihet
2014-03-06 11:33       ` Will Deacon
2014-03-06 17:22         ` Jiri Olsa
2014-03-11 18:25           ` Jean Pihet
2014-03-12 16:13             ` Will Deacon
2014-03-03  9:53 ` [PATCH 2/3] perf tests: Add dwarf unwind test " Jean Pihet
2014-03-03  9:53 ` [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM Jean Pihet
2014-03-04  1:24 ` [PATCH 0/3] " Jean Pihet
2014-03-04 10:37   ` Jiri Olsa
2014-03-04 11:58   ` Jiri Olsa
2014-05-06 15:26 Jean Pihet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).