* [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM64
@ 2014-05-06 15:55 ` Jean Pihet
  0 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-06 15:55 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, will.deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Jean Pihet

Add libdw DWARF post unwind support for ARM64. The libdw unwinder is part
of the elfutils-devel/libdw-dev package from version 0.158 onward.

Also includes the test suite for DWARF unwinding, by adding the
arch-specific test code and the perf_regs_load function.

This series depends on the following kernel patch series:
- AARCH64 unwinding support [1]. Already mainlined.
- ARM libdw integration [2],
and on the changes from the branch for:
- libdw AARCH64 unwinding support [3].

[1] http://www.spinics.net/lists/arm-kernel/msg304483.html
[2] https://lkml.org/lkml/2014/5/6/366
[3] https://git.fedorahosted.org/cgit/elfutils.git/log/?h=mjw/aarch64-unwind

ToDo: investigate the libdw unwinding problem with compat binaries (i.e.
ARMv7 binaries running on ARMv8). Since this works with libunwind, the
problem is most likely in the libdw compat support [3].


Jean Pihet (3):
  perf tests: Introduce perf_regs_load function on ARM64
  perf tests: Add dwarf unwind test on ARM64
  perf tools: Add libdw DWARF post unwind support for ARM64

 tools/perf/Makefile.perf                   |  2 +-
 tools/perf/arch/arm64/Makefile             |  7 ++++
 tools/perf/arch/arm64/include/perf_regs.h  |  5 +++
 tools/perf/arch/arm64/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++
 tools/perf/arch/arm64/tests/regs_load.S    | 39 ++++++++++++++++++++
 tools/perf/arch/arm64/util/unwind-libdw.c  | 53 +++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c            |  3 +-
 tools/perf/tests/tests.h                   |  3 +-
 8 files changed, 168 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm64/tests/dwarf-unwind.c
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
 create mode 100644 tools/perf/arch/arm64/util/unwind-libdw.c

---
Rebased on the latest jolsa/perf/core

-- 
1.7.11.7


* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-05-06 15:55 ` Jean Pihet
@ 2014-05-06 15:55   ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-06 15:55 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, will.deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Jean Pihet,
	Steve Capper, Corey Ashford, Frederic Weisbecker, Ingo Molnar,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, David Ahern

Introduce the perf_regs_load function, which is going
to be used by the dwarf unwind test in the following patches.

It takes a single argument, a pointer to the regs dump
buffer, and populates it with the current register values, as
expected by the perf built-in unwinding test.
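
For readers who prefer C to assembly, the resulting buffer layout can be
modelled roughly as below. This is only a sketch: model_regs_load, the
REG_* indices and the gpr input array are hypothetical names used for
illustration, and the index order (x0..x29, LR, SP, PC) is assumed from
the store sequence in regs_load.S.

```c
#include <stdint.h>

/* Hypothetical indices, assumed to follow the store order in regs_load.S. */
enum {
	REG_X0 = 0,
	REG_X29 = 29,
	REG_LR = 30,
	REG_SP = 31,
	REG_PC = 32,
	REG_MAX = 33
};

/*
 * C model of the register dump: gpr[0..30] holds x0..x30 as seen on entry
 * to the function, sp is the stack pointer at that time.
 */
void model_regs_load(uint64_t *regs, const uint64_t gpr[31], uint64_t sp)
{
	for (int i = REG_X0; i <= REG_X29; i++)
		regs[i] = gpr[i];	/* x0..x29 */
	regs[REG_LR] = gpr[30];		/* LR as it is on entry */
	regs[REG_SP] = sp;		/* SP is not modified by the function */
	regs[REG_PC] = gpr[30];		/* PC := LR, i.e. the return address */
}
```

Storing LR in the PC slot makes the unwind start at the caller, so the
helper itself does not show up as a frame.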

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
---
 tools/perf/arch/arm64/Makefile            |  1 +
 tools/perf/arch/arm64/include/perf_regs.h |  2 ++
 tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S

diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 2359546..1e052f1 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,8 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+void perf_regs_load(u64 *regs);
+
 static inline const char *perf_reg_name(int id)
 {
 	switch (id) {
diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
new file mode 100644
index 0000000..92ab968
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/regs_load.S
@@ -0,0 +1,39 @@
+#include <linux/linkage.h>
+
+/*
+ * Implementation of void perf_regs_load(u64 *regs);
+ *
+ * This function fills in the 'regs' buffer from the actual register values,
+ * in the way the perf built-in unwinding test expects them:
+ * - the PC at the time of the call to this function. Since this function
+ *   is called using a bl instruction, the PC value is taken from LR,
+ * - the current SP (not touched by this function),
+ * - the current value of LR is merely retrieved and stored because the
+ *   value before the call to this function is unknown at this time; it will
+ *   be unwound from the dwarf information in unwind__get_entries.
+ */
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	stp x0,  x1,  [x0], #16	// store x0..x29
+	stp x2,  x3,  [x0], #16
+	stp x4,  x5,  [x0], #16
+	stp x6,  x7,  [x0], #16
+	stp x8,  x9,  [x0], #16
+	stp x10, x11, [x0], #16
+	stp x12, x13, [x0], #16
+	stp x14, x15, [x0], #16
+	stp x16, x17, [x0], #16
+	stp x18, x19, [x0], #16
+	stp x20, x21, [x0], #16
+	stp x22, x23, [x0], #16
+	stp x24, x25, [x0], #16
+	stp x26, x27, [x0], #16
+	stp x28, x29, [x0], #16
+	mov x1,  sp
+	stp x30, x1,  [x0], #16	// store lr and sp
+	str x30, [x0]		// store pc as lr in order to skip the call
+				//  to this function
+	ret
+ENDPROC(perf_regs_load)
-- 
1.7.11.7


* [PATCH 2/3] perf tests: Add dwarf unwind test on ARM64
  2014-05-06 15:55 ` Jean Pihet
@ 2014-05-06 15:55   ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-06 15:55 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, will.deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Jean Pihet,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, David Ahern

Add a dwarf unwind test that sets up live machine data over
the perf test thread and does the remote unwind.
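
The stack half of that live data is a bounded copy starting at the
current stack pointer. A minimal sketch of the size computation, assuming
the end address of the stack mapping is known: snapshot_size and
SNAPSHOT_MAX are hypothetical names, 8192 matches the STACK_SIZE used by
the test, and the guard for a stack pointer at or above the mapping end
is an addition of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define SNAPSHOT_MAX 8192	/* hypothetical cap, same value as STACK_SIZE */

/*
 * Number of bytes to copy from sp upward: everything up to the end of the
 * stack mapping, capped at SNAPSHOT_MAX. Returns 0 if sp is not below
 * map_end.
 */
size_t snapshot_size(uint64_t sp, uint64_t map_end)
{
	uint64_t avail;

	if (sp >= map_end)
		return 0;
	avail = map_end - sp;
	return avail > SNAPSHOT_MAX ? SNAPSHOT_MAX : (size_t)avail;
}
```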

The test needs to be compiled with -fno-optimize-sibling-calls,
otherwise the 'krava_*' function calls are optimized into jumps
and omitted from the stack unwind.
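
To illustrate why the flag matters, consider a chain of functions that
each end in a call to the next one. The names below are hypothetical
stand-ins, not the actual 'krava_*' test functions, and the noinline
attribute assumes GCC or Clang. With sibling-call optimization enabled,
the compiler may turn each trailing call into a plain branch, so the
caller's frame is already gone when the innermost function runs and an
unwinder started there cannot report it.

```c
#define NOINLINE __attribute__((noinline))

/* Innermost frame: stands in for the point where the unwinder would run. */
NOINLINE int chain_3(int depth)
{
	return depth + 1;
}

/* The call is the last action here, so it is a sibling (tail) call. */
NOINLINE int chain_2(int depth)
{
	return chain_3(depth + 1);
}

/* Same pattern one level up. */
NOINLINE int chain_1(int depth)
{
	return chain_2(depth + 1);
}
```

The return value is the same either way; only the number of frames
visible to an unwinder at the innermost call differs.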

Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
---
 tools/perf/Makefile.perf                   |  2 +-
 tools/perf/arch/arm64/Makefile             |  1 +
 tools/perf/arch/arm64/include/perf_regs.h  |  3 ++
 tools/perf/arch/arm64/tests/dwarf-unwind.c | 59 ++++++++++++++++++++++++++++++
 tools/perf/tests/builtin-test.c            |  3 +-
 tools/perf/tests/tests.h                   |  3 +-
 6 files changed, 68 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/arm64/tests/dwarf-unwind.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index dea2d633..6cde50f 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -411,7 +411,7 @@ LIB_OBJS += $(OUTPUT)tests/code-reading.o
 LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
 LIB_OBJS += $(OUTPUT)tests/parse-no-sample-id-all.o
 ifndef NO_DWARF_UNWIND
-ifeq ($(ARCH),$(filter $(ARCH),x86 arm))
+ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
 LIB_OBJS += $(OUTPUT)tests/dwarf-unwind.o
 endif
 endif
diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 9b8f87e..221f21d 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -5,4 +5,5 @@ endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 1e052f1..e74df99 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,9 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+#define PERF_REGS_MAX	PERF_REG_ARM64_MAX
+#define PERF_SAMPLE_REGS_ABI	PERF_SAMPLE_REGS_ABI_64
+
 void perf_regs_load(u64 *regs);
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/arm64/tests/dwarf-unwind.c b/tools/perf/arch/arm64/tests/dwarf-unwind.c
new file mode 100644
index 0000000..0aa64f3
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/dwarf-unwind.c
@@ -0,0 +1,59 @@
+#include <string.h>
+#include "perf_regs.h"
+#include "thread.h"
+#include "map.h"
+#include "event.h"
+#include "tests/tests.h"
+
+#define STACK_SIZE 8192
+
+static int sample_ustack(struct perf_sample *sample,
+			 struct thread *thread, u64 *regs)
+{
+	struct stack_dump *stack = &sample->user_stack;
+	struct map *map;
+	unsigned long sp;
+	u64 stack_size, *buf;
+
+	buf = malloc(STACK_SIZE);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	sp = (unsigned long) regs[PERF_REG_ARM64_SP];
+
+	map = map_groups__find(&thread->mg, MAP__FUNCTION, (u64) sp);
+	if (!map) {
+		pr_debug("failed to get stack map\n");
+		return -1;
+	}
+
+	stack_size = map->end - sp;
+	stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
+
+	memcpy(buf, (void *) sp, stack_size);
+	stack->data = (char *) buf;
+	stack->size = stack_size;
+	return 0;
+}
+
+int test__arch_unwind_sample(struct perf_sample *sample,
+			     struct thread *thread)
+{
+	struct regs_dump *regs = &sample->user_regs;
+	u64 *buf;
+
+	buf = malloc(sizeof(u64) * PERF_REGS_MAX);
+	if (!buf) {
+		pr_debug("failed to allocate sample uregs data\n");
+		return -1;
+	}
+
+	perf_regs_load(buf);
+	regs->abi  = PERF_SAMPLE_REGS_ABI;
+	regs->regs = buf;
+	regs->mask = PERF_REGS_MASK;
+
+	return sample_ustack(sample, thread, buf);
+}
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 5e0764b..7921aa0 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -115,7 +115,8 @@ static struct test {
 		.desc = "Test parsing with no sample_id_all bit set",
 		.func = test__parse_no_sample_id_all,
 	},
-#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
+#if defined(__x86_64__) || defined(__i386__) || \
+    defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 	{
 		.desc = "Test dwarf unwind",
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 8f91fb0..426680e 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -45,7 +45,8 @@ int test__hists_filter(void);
 int test__mmap_thread_lookup(void);
 int test__thread_mg_share(void);
 
-#if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
+#if defined(__x86_64__) || defined(__i386__) || \
+    defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
 struct thread;
 struct perf_sample;
-- 
1.7.11.7


* [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64
  2014-05-06 15:55 ` Jean Pihet
@ 2014-05-06 15:55   ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-06 15:55 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, will.deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Jean Pihet,
	Corey Ashford, Frederic Weisbecker, Ingo Molnar, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, David Ahern

Add libdw DWARF post unwind support for ARM64. The libdw unwinder is part
of the elfutils-devel/libdw-dev package from version 0.158 onward.

Note: the libdw code needs some support for dwarf unwinding
on ARM64; this code is submitted separately on the elfutils
ML.

The new code is contained in the unwind-libdw.c object and
implements libdw__arch_set_initial_registers, the ARM64 hook behind
the unwind__get_entries unwind interface function.
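
The REG() helper in the new file reads each register through
perf_reg_value(). As a rough sketch of how such a lookup can work,
assuming the sampled registers are stored densely in ascending bit order
of a 64-bit mask: the struct and function names here are hypothetical
and are not perf's own definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a register dump: one value per bit set in mask. */
struct regs_snapshot {
	uint64_t mask;		/* bit i set => register i was sampled */
	const uint64_t *regs;	/* sampled values, in ascending bit order */
};

/* Look up register id; returns 0 on success, -1 if it was not sampled. */
int snapshot_reg_value(uint64_t *val, const struct regs_snapshot *s, int id)
{
	int idx = 0;

	if (id < 0 || id >= 64 || !(s->mask & (1ULL << id)))
		return -1;

	/* Count the sampled registers that come before id. */
	for (int i = 0; i < id; i++)
		if (s->mask & (1ULL << i))
			idx++;

	*val = s->regs[idx];
	return 0;
}
```

Initialising the destination to 0 before the lookup, as REG() does,
gives unsampled registers a defined value.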

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: David Ahern <dsahern@gmail.com>
---
 tools/perf/arch/arm64/Makefile            |  5 +++
 tools/perf/arch/arm64/util/unwind-libdw.c | 53 +++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 tools/perf/arch/arm64/util/unwind-libdw.c

diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 221f21d..09d6215 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,6 +4,11 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+endif
+ifndef NO_LIBDW_DWARF_UNWIND
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libdw.o
+endif
+ifndef NO_DWARF_UNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/dwarf-unwind.o
 endif
diff --git a/tools/perf/arch/arm64/util/unwind-libdw.c b/tools/perf/arch/arm64/util/unwind-libdw.c
new file mode 100644
index 0000000..8d24958
--- /dev/null
+++ b/tools/perf/arch/arm64/util/unwind-libdw.c
@@ -0,0 +1,53 @@
+#include <elfutils/libdwfl.h>
+#include "../../util/unwind-libdw.h"
+#include "../../util/perf_regs.h"
+
+bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
+{
+	struct unwind_info *ui = arg;
+	struct regs_dump *user_regs = &ui->sample->user_regs;
+	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
+
+#define REG(r) ({						\
+	Dwarf_Word val = 0;					\
+	perf_reg_value(&val, user_regs, PERF_REG_ARM64_##r);	\
+	val;							\
+})
+
+	dwarf_regs[0]  = REG(X0);
+	dwarf_regs[1]  = REG(X1);
+	dwarf_regs[2]  = REG(X2);
+	dwarf_regs[3]  = REG(X3);
+	dwarf_regs[4]  = REG(X4);
+	dwarf_regs[5]  = REG(X5);
+	dwarf_regs[6]  = REG(X6);
+	dwarf_regs[7]  = REG(X7);
+	dwarf_regs[8]  = REG(X8);
+	dwarf_regs[9]  = REG(X9);
+	dwarf_regs[10] = REG(X10);
+	dwarf_regs[11] = REG(X11);
+	dwarf_regs[12] = REG(X12);
+	dwarf_regs[13] = REG(X13);
+	dwarf_regs[14] = REG(X14);
+	dwarf_regs[15] = REG(X15);
+	dwarf_regs[16] = REG(X16);
+	dwarf_regs[17] = REG(X17);
+	dwarf_regs[18] = REG(X18);
+	dwarf_regs[19] = REG(X19);
+	dwarf_regs[20] = REG(X20);
+	dwarf_regs[21] = REG(X21);
+	dwarf_regs[22] = REG(X22);
+	dwarf_regs[23] = REG(X23);
+	dwarf_regs[24] = REG(X24);
+	dwarf_regs[25] = REG(X25);
+	dwarf_regs[26] = REG(X26);
+	dwarf_regs[27] = REG(X27);
+	dwarf_regs[28] = REG(X28);
+	dwarf_regs[29] = REG(X29);
+	dwarf_regs[30] = REG(LR);
+	dwarf_regs[31] = REG(SP);
+	dwarf_regs[32] = REG(PC);
+
+	return dwfl_thread_state_registers(thread, 0, PERF_REG_ARM64_MAX,
+					   dwarf_regs);
+}
-- 
1.7.11.7


* Re: [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64
  2014-05-06 15:55   ` Jean Pihet
@ 2014-05-06 17:00     ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2014-05-06 17:00 UTC (permalink / raw)
  To: Jean Pihet
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, linux-kernel, linaro-kernel,
	linux-arm-kernel, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	David Ahern

Hi Jean,

On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> Adding libdw DWARF post unwind support, which is part
> of elfutils-devel/libdw-dev package from version 0.158.
> 
> Note: the libdw code needs some support for dwarf unwinding
> on ARM64, this code is submitted seperately on the elfutils
> ML.
> 
> The new code is contained in unwin-libdw.c object, and
> implements unwind__get_entries unwind interface function.

Are you planning to implement support for 32-bit ARM too? If so, we'll need
compat handling here again (your favourite!).

> +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> +{
> +	struct unwind_info *ui = arg;
> +	struct regs_dump *user_regs = &ui->sample->user_regs;
> +	Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];

Shouldn't this be PERF_REG_ARM64_MAX - 1?

Will

* Re: [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64
  2014-05-06 17:00     ` Will Deacon
@ 2014-05-06 17:41       ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-06 17:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, linux-kernel, linaro-kernel,
	linux-arm-kernel, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	David Ahern

Hi Will,

On 6 May 2014 19:00, Will Deacon <will.deacon@arm.com> wrote:
> Hi Jean,
>
> On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
>> Adding libdw DWARF post unwind support, which is part
>> of elfutils-devel/libdw-dev package from version 0.158.
>>
>> Note: the libdw code needs some support for dwarf unwinding
>> on ARM64; this code is submitted separately on the elfutils
>> ML.
>>
>> The new code is contained in the unwind-libdw.c object, and
>> implements the unwind__get_entries unwind interface function.
>
> Are you planning to implement support for 32-bit ARM too? If so, we'll need
> compat handling here again (your favourite!).
Yes! Another patch set (sent just before this one) targets ARM. There
is a nice ToDo in the cover letter: handle compat mode correctly. In
fact I sent a patch to libdw, so it supports it already but is
somewhat broken for compat mode. This is on my preferred ToDo list ;-)

>
>> +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
>> +{
>> +     struct unwind_info *ui = arg;
>> +     struct regs_dump *user_regs = &ui->sample->user_regs;
>> +     Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
>
> Shouldn't this be PERF_REG_ARM64_MAX - 1?
Ah, well spotted! I will change it, although it shouldn't do any harm, right?

>
> Will

Thx for reviewing,
Jean

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64
  2014-05-06 17:41       ` Jean Pihet
@ 2014-05-06 17:52         ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2014-05-06 17:52 UTC (permalink / raw)
  To: Jean Pihet
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, linux-kernel, linaro-kernel,
	linux-arm-kernel, Corey Ashford, Frederic Weisbecker,
	Ingo Molnar, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	David Ahern

On Tue, May 06, 2014 at 06:41:55PM +0100, Jean Pihet wrote:
> Hi Will,
> 
> On 6 May 2014 19:00, Will Deacon <will.deacon@arm.com> wrote:
> > Hi Jean,
> >
> > On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
> >> Adding libdw DWARF post unwind support, which is part
> >> of elfutils-devel/libdw-dev package from version 0.158.
> >>
> >> Note: the libdw code needs some support for dwarf unwinding
> >> on ARM64; this code is submitted separately on the elfutils
> >> ML.
> >>
> >> The new code is contained in the unwind-libdw.c object, and
> >> implements the unwind__get_entries unwind interface function.
> >
> > Are you planning to implement support for 32-bit ARM too? If so, we'll need
> > compat handling here again (your favourite!).
> Yes! Another patch set (sent just before this one) targets ARM. There
> is a nice ToDo in the cover letter: handle compat mode correctly. In
> fact I sent a patch to libdw, so it supports it already but is
> somewhat broken for compat mode. This is on my preferred ToDo list ;-)
> 
> >
> >> +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
> >> +{
> >> +     struct unwind_info *ui = arg;
> >> +     struct regs_dump *user_regs = &ui->sample->user_regs;
> >> +     Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
> >
> > Shouldn't this be PERF_REG_ARM64_MAX - 1?
> Ah, well spotted! I will change it, although it shouldn't do any harm, right?

Actually, looking again, I think I'm wrong and your code was right first
time! It looks like dwfl_thread_state_registers takes the limit too, so I
don't think you need to change anything (except for adding compat support).

Sorry about that,

Will
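
For reference, a minimal standalone sketch of why the array size and the
count handed to dwfl_thread_state_registers agree. It does not link
against libdw: set_registers() is a hypothetical stand-in for the real
call, and the enum is a reduced copy of the register indices. The only
fact taken from the patch is that PC is stored at index 32, so the array
and the count both have to be one past that.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced register indices; LR, SP and PC match dwarf_regs[30..32]
 * in the patch. REG_MAX is a count, one past the last valid index. */
enum {
	REG_LR  = 30,
	REG_SP  = 31,
	REG_PC  = 32,
	REG_MAX = 33,
};

/* Hypothetical stand-in for dwfl_thread_state_registers(): copies
 * 'nregs' registers, starting at 'firstreg', into the thread state. */
static bool set_registers(uint64_t *state, size_t state_len,
			  int firstreg, unsigned int nregs,
			  const uint64_t *regs)
{
	if (firstreg < 0 || (size_t)firstreg + nregs > state_len)
		return false;
	for (unsigned int i = 0; i < nregs; i++)
		state[firstreg + i] = regs[i];
	return true;
}

/* Fill a REG_MAX-sized array, pass 'nregs' entries on, and return the
 * PC value that reached the consumer (0 if it never arrived). */
static uint64_t pc_after_copy(unsigned int nregs)
{
	uint64_t dwarf_regs[REG_MAX] = { 0 };
	uint64_t state[REG_MAX] = { 0 };

	dwarf_regs[REG_PC] = 0x400123;	/* index 32 needs 33 elements */

	if (!set_registers(state, REG_MAX, 0, nregs, dwarf_regs))
		return 0;
	return state[REG_PC];
}
```

With a count of REG_MAX the PC reaches the consumer; with REG_MAX - 1
the last entry (the PC) is silently dropped in this model.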

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64
  2014-05-06 17:52         ` Will Deacon
@ 2014-05-07  7:40           ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-07  7:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jean Pihet, Arnaldo Carvalho de Melo, linaro-kernel,
	Peter Zijlstra, Corey Ashford, Frederic Weisbecker, linux-kernel,
	Paul Mackerras, David Ahern, Namhyung Kim, Jiri Olsa,
	Ingo Molnar, linux-arm-kernel

Hi Will,

On Tue, May 6, 2014 at 7:52 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, May 06, 2014 at 06:41:55PM +0100, Jean Pihet wrote:
>> Hi Will,
>>
>> On 6 May 2014 19:00, Will Deacon <will.deacon@arm.com> wrote:
>> > Hi Jean,
>> >
>> > On Tue, May 06, 2014 at 04:55:33PM +0100, Jean Pihet wrote:
>> >> Adding libdw DWARF post unwind support, which is part
>> >> of elfutils-devel/libdw-dev package from version 0.158.
>> >>
>> >> Note: the libdw code needs some support for dwarf unwinding
>> >> on ARM64; this code is submitted separately on the elfutils
>> >> ML.
>> >>
>> >> The new code is contained in the unwind-libdw.c object, and
>> >> implements the unwind__get_entries unwind interface function.
>> >
>> > Are you planning to implement support for 32-bit ARM too? If so, we'll need
>> > compat handling here again (your favourite!).
>> Yes! Another patch set (sent just before this one) targets ARM. There
>> is a nice ToDo in the cover letter: handle compat mode correctly. In
>> fact I sent a patch to libdw, so it supports it already but is
>> somewhat broken for compat mode. This is on my preferred ToDo list ;-)
>>
>> >
>> >> +bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
>> >> +{
>> >> +     struct unwind_info *ui = arg;
>> >> +     struct regs_dump *user_regs = &ui->sample->user_regs;
>> >> +     Dwarf_Word dwarf_regs[PERF_REG_ARM64_MAX];
>> >
>> > Shouldn't this be PERF_REG_ARM64_MAX - 1?
>> Ah, well spotted! I will change it, although it shouldn't do any harm, right?
>
> Actually, looking again, I think I'm wrong and your code was right first
> time! It looks like dwfl_thread_state_registers takes the limit too, so I
> don't think you need to change anything (except for adding compat support).
>
> Sorry about that,
My bad, I hadn't checked carefully enough before replying.

Thx!
Jean
>
> Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/3] perf tests: Add dwarf unwind test on ARM64
  2014-05-06 15:55   ` Jean Pihet
@ 2014-05-07 12:05     ` Jiri Olsa
  -1 siblings, 0 replies; 46+ messages in thread
From: Jiri Olsa @ 2014-05-07 12:05 UTC (permalink / raw)
  To: Jean Pihet
  Cc: Arnaldo Carvalho de Melo, will.deacon, linux-kernel,
	linaro-kernel, linux-arm-kernel, Corey Ashford,
	Frederic Weisbecker, Ingo Molnar, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

On Tue, May 06, 2014 at 05:55:32PM +0200, Jean Pihet wrote:

SNIP

> +#include "tests/tests.h"
> +
> +#define STACK_SIZE 8192
> +
> +static int sample_ustack(struct perf_sample *sample,
> +			 struct thread *thread, u64 *regs)
> +{
> +	struct stack_dump *stack = &sample->user_stack;
> +	struct map *map;
> +	unsigned long sp;
> +	u64 stack_size, *buf;
> +
> +	buf = malloc(STACK_SIZE);
> +	if (!buf) {
> +		pr_debug("failed to allocate sample uregs data\n");
> +		return -1;
> +	}
> +
> +	sp = (unsigned long) regs[PERF_REG_ARM64_SP];
> +
> +	map = map_groups__find(&thread->mg, MAP__FUNCTION, (u64) sp);
> +	if (!map) {
> +		pr_debug("failed to get stack map\n");
> +		return -1;
> +	}

there's a memory leak of 'buf' here, already fixed for x86:

  perf tests x86: Fix memory leak in sample_ustack()
  commit 763d7f5f2718f085bab5a9e63308349728f3ad12
  Author: Masanari Iida <standby24x7@gmail.com>
  Date:   Sun Apr 20 00:16:41 2014 +0900

jirka
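
For illustration, a minimal standalone sketch of the kind of fix meant
here, modelled on the shape of the function quoted above. It is not the
x86 commit itself: find_stack_map() is a hypothetical stand-in for
map_groups__find(), and the signature is simplified. The point is only
that 'buf' is released on the early-return path.

```c
#include <stdio.h>
#include <stdlib.h>

#define STACK_SIZE 8192

/* Hypothetical stand-in for map_groups__find(): non-zero means the
 * address falls inside a known mapping. */
static int find_stack_map(unsigned long sp)
{
	return sp != 0;
}

/* Returns 0 and hands the buffer to the caller on success, -1 on
 * failure. Every failure path after malloc() frees 'buf'. */
static int sample_ustack_sketch(unsigned long sp, void **data)
{
	void *buf = malloc(STACK_SIZE);

	if (!buf) {
		fprintf(stderr, "failed to allocate sample uregs data\n");
		return -1;
	}

	if (!find_stack_map(sp)) {
		fprintf(stderr, "failed to get stack map\n");
		free(buf);	/* without this, 'buf' leaks */
		return -1;
	}

	*data = buf;	/* ownership moves to the caller */
	return 0;
}
```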

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-05-05  7:07                   ` Jean Pihet
@ 2014-05-06  8:51                     ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2014-05-06  8:51 UTC (permalink / raw)
  To: Jean Pihet
  Cc: Jean Pihet, Arnaldo Carvalho de Melo, Mark Rutland,
	linaro-kernel, Steve Capper, Peter Zijlstra, Patch Tracking,
	Frederic Weisbecker, Corey Ashford, linux-kernel, Paul Mackerras,
	Arnaldo Carvalho de Melo, David Ahern, Namhyung Kim, Jiri Olsa,
	Ingo Molnar, linux-arm-kernel

On Mon, May 05, 2014 at 08:07:11AM +0100, Jean Pihet wrote:
> Hi Will,
> 
> On Fri, May 2, 2014 at 6:51 PM, Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, May 02, 2014 at 10:19:38AM +0100, Jean Pihet wrote:
> >> Hi Will, Mark,
> >>
> >> Ping on this series. Can you please check? Is the refreshed patch OK?
> >
> > Fine by me, as long as it's not reused outside of this test :)
> Great! Is there some precaution to avoid a mis-reuse?

Hopefully your comment will be enough.

> Other than that, which tree are the patches going to?

They're all under tools/perf/, so they should go via that tree.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-05-02 16:51                 ` Will Deacon
@ 2014-05-05  7:07                   ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-05  7:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jean Pihet, Arnaldo Carvalho de Melo, Mark Rutland,
	linaro-kernel, Steve Capper, Peter Zijlstra, Patch Tracking,
	Frederic Weisbecker, Corey Ashford, linux-kernel, Paul Mackerras,
	Arnaldo Carvalho de Melo, David Ahern, Namhyung Kim, Jiri Olsa,
	Ingo Molnar, linux-arm-kernel

Hi Will,

On Fri, May 2, 2014 at 6:51 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, May 02, 2014 at 10:19:38AM +0100, Jean Pihet wrote:
>> Hi Will, Mark,
>>
>> Ping on this series. Can you please check? Is the refreshed patch OK?
>
> Fine by me, as long as it's not reused outside of this test :)
Great! Is there some precaution to avoid a mis-reuse?

Other than that, which tree are the patches going to?

Cheers,
Jean

>
> Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-05-02  9:19               ` Jean Pihet
@ 2014-05-02 16:51                 ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2014-05-02 16:51 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Ingo Molnar,
	Jiri Olsa, Patch Tracking, Arnaldo Carvalho de Melo,
	Steve Capper, Corey Ashford, Frederic Weisbecker, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Ahern, Mark Rutland

On Fri, May 02, 2014 at 10:19:38AM +0100, Jean Pihet wrote:
> Hi Will, Mark,
> 
> Ping on this series. Can you please check? Is the refreshed patch OK?

Fine by me, as long as it's not reused outside of this test :)

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-04-28 13:12             ` Jean Pihet
@ 2014-05-02  9:19               ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-05-02  9:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Ingo Molnar,
	Jiri Olsa, Patch Tracking, Arnaldo Carvalho de Melo, Jean Pihet,
	Steve Capper, Corey Ashford, Frederic Weisbecker, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Ahern, Mark Rutland

Hi Will, Mark,

Ping on this series. Can you please check? Is the refreshed patch OK?

Cheers,
Jean


On 28 April 2014 15:12, Jean Pihet <jean.pihet@linaro.org> wrote:
> Introducing the perf_regs_load function, which is going
> to be used for the dwarf unwind test in the following patches.
>
> It takes a single argument, a pointer to the regs dump
> buffer, and populates it with the current register values, as
> expected by the perf built-in unwinding test.
>
> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> Cc: Steve Capper <steve.capper@linaro.org>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> ---
>  tools/perf/arch/arm64/Makefile            |  1 +
>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>  tools/perf/arch/arm64/tests/regs_load.S   | 40 +++++++++++++++++++++++++++++++
>  3 files changed, 43 insertions(+)
>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
>
> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
> index 67e9b3d..9b8f87e 100644
> --- a/tools/perf/arch/arm64/Makefile
> +++ b/tools/perf/arch/arm64/Makefile
> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>  endif
>  ifndef NO_LIBUNWIND
>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>  endif
> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
> index 2359546..1e052f1 100644
> --- a/tools/perf/arch/arm64/include/perf_regs.h
> +++ b/tools/perf/arch/arm64/include/perf_regs.h
> @@ -9,6 +9,8 @@
>  #define PERF_REG_IP    PERF_REG_ARM64_PC
>  #define PERF_REG_SP    PERF_REG_ARM64_SP
>
> +void perf_regs_load(u64 *regs);
> +
>  static inline const char *perf_reg_name(int id)
>  {
>         switch (id) {
> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
> new file mode 100644
> index 0000000..40b8b99
> --- /dev/null
> +++ b/tools/perf/arch/arm64/tests/regs_load.S
> @@ -0,0 +1,40 @@
> +#include <linux/linkage.h>
> +
> +/*
> + * Implementation of void perf_regs_load(u64 *regs);
> + *
> + * This function fills in the 'regs' buffer from the actual register values,
> + * in the way the perf built-in unwinding test expects them, e.g. an address
> + * within the caller:
> + * - the PC at the time of the call to this function. Since this function
> + *   is called using a bl instruction, the PC value is taken from LR,
> + * - the current SP (not touched by this function),
> + * - the current value of LR is merely retrieved and stored because the
> + *   value before the call to this function is unknown at this time; it will
> + *   be unwound from the dwarf information in unwind__get_entries.
> + */
> +
> +.text
> +.type perf_regs_load,%function
> +ENTRY(perf_regs_load)
> +       stp x0,  x1,  [x0], #16 // store x0..x29
> +       stp x2,  x3,  [x0], #16
> +       stp x4,  x5,  [x0], #16
> +       stp x6,  x7,  [x0], #16
> +       stp x8,  x9,  [x0], #16
> +       stp x10, x11, [x0], #16
> +       stp x12, x13, [x0], #16
> +       stp x14, x15, [x0], #16
> +       stp x16, x17, [x0], #16
> +       stp x18, x19, [x0], #16
> +       stp x20, x21, [x0], #16
> +       stp x22, x23, [x0], #16
> +       stp x24, x25, [x0], #16
> +       stp x26, x27, [x0], #16
> +       stp x28, x29, [x0], #16
> +       mov x1,  sp
> +       stp x30, x1,  [x0], #16 // store lr and sp
> +       str x30, [x0]           // store lr as the PC in order to skip the call
> +                               //  to this function
> +       ret
> +ENDPROC(perf_regs_load)
> --
> 1.7.11.7
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-04-28 13:10           ` Jean Pihet
@ 2014-04-28 13:12             ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-04-28 13:12 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Ingo Molnar,
	Jiri Olsa, patches, Arnaldo Carvalho de Melo, Jean Pihet,
	Steve Capper, Corey Ashford, Frederic Weisbecker, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Ahern

Introducing the perf_regs_load function, which is going
to be used for the dwarf unwind test in the following patches.

It takes a single argument, a pointer to the regs dump
buffer, and populates it with the current register values, as
expected by the perf built-in unwinding test.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
---
 tools/perf/arch/arm64/Makefile            |  1 +
 tools/perf/arch/arm64/include/perf_regs.h |  2 ++
 tools/perf/arch/arm64/tests/regs_load.S   | 40 +++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S

diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 2359546..1e052f1 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,8 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+void perf_regs_load(u64 *regs);
+
 static inline const char *perf_reg_name(int id)
 {
 	switch (id) {
diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
new file mode 100644
index 0000000..40b8b99
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/regs_load.S
@@ -0,0 +1,40 @@
+#include <linux/linkage.h>
+
+/*
+ * Implementation of void perf_regs_load(u64 *regs);
+ *
+ * This function fills in the 'regs' buffer from the actual register values,
+ * in the way the perf built-in unwinding test expects them:
+ * - the PC is reported as an address within the caller, not the exact PC at
+ *   the time of the call. Since this function is called using a bl
+ *   instruction, the value is taken from LR (the call address + 4),
+ * - the current SP (not touched by this function),
+ * - the current value of LR is merely retrieved and stored because the
+ *   value before the call to this function is unknown at this time; it will
+ *   be unwound from the dwarf information in unwind__get_entries.
+ */
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	stp x0,  x1,  [x0], #16	// store x0..x29
+	stp x2,  x3,  [x0], #16
+	stp x4,  x5,  [x0], #16
+	stp x6,  x7,  [x0], #16
+	stp x8,  x9,  [x0], #16
+	stp x10, x11, [x0], #16
+	stp x12, x13, [x0], #16
+	stp x14, x15, [x0], #16
+	stp x16, x17, [x0], #16
+	stp x18, x19, [x0], #16
+	stp x20, x21, [x0], #16
+	stp x22, x23, [x0], #16
+	stp x24, x25, [x0], #16
+	stp x26, x27, [x0], #16
+	stp x28, x29, [x0], #16
+	mov x1,  sp
+	stp x30, x1,  [x0], #16	// store lr and sp
+	str x30, [x0]		// store lr as the PC in order to skip the call
+				//  to this function
+	ret
+ENDPROC(perf_regs_load)
-- 
1.7.11.7
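
As a rough illustration of the buffer layout this function produces, the
following C sketch models it portably. It is not part of the patch: the
sketch_* names, the struct sketch_cpu snapshot and the index enum are
assumptions made up for illustration, with the indices chosen to match the
x0..x30, SP, PC order of the stores above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the slot order used above: x0..x29, LR, SP, PC. */
enum sketch_reg {
	SKETCH_REG_X0  = 0,
	SKETCH_REG_X29 = 29,
	SKETCH_REG_LR  = 30,
	SKETCH_REG_SP  = 31,
	SKETCH_REG_PC  = 32,
	SKETCH_REG_MAX = 33,
};

/* Hypothetical snapshot of the CPU state on entry to perf_regs_load. */
struct sketch_cpu {
	uint64_t x[31];	/* x0..x30; x[30] is the link register */
	uint64_t sp;
};

/*
 * Fill 'regs' in the same order as the assembly: x0..x30 first, then SP,
 * then LR once more in the PC slot so that unwinding starts in the caller.
 */
static inline void sketch_regs_load(uint64_t *regs, const struct sketch_cpu *cpu)
{
	int i;

	assert(regs != NULL && cpu != NULL);

	for (i = 0; i <= SKETCH_REG_LR; i++)
		regs[i] = cpu->x[i];

	regs[SKETCH_REG_SP] = cpu->sp;
	regs[SKETCH_REG_PC] = cpu->x[SKETCH_REG_LR];	/* LR stored in place of the PC */
}
```

In the real routine slot 0 receives x0, which is the buffer pointer itself,
and the PC slot is a copy of LR, so the unwinder starts from an address in
the caller rather than inside perf_regs_load. One thing the sketch does not
model: the first stp uses x0 both as a stored register and as the
written-back base, a combination the Arm architecture treats as constrained
unpredictable, so the value in slot 0 is probably best not relied on.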


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-04-22 13:42         ` Mark Rutland
@ 2014-04-28 13:10           ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-04-28 13:10 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

On 22 April 2014 15:42, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Jean,
>
> Apologies for the delay on this.
>
> On Tue, Mar 25, 2014 at 03:23:26PM +0000, Jean Pihet wrote:
>> Hi Mark,
>>
>> On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
>> > Hi Jean,
>> >
>> > On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
>> >> Introduce the perf_regs_load function, which is going
>> >> to be used for the dwarf unwind test in the following patches.
>> >>
>> >> It takes a single argument, a pointer to the regs dump
>> >> buffer, and populates it with the current register values, as
>> >> expected by the perf built-in unwinding test.
>> >>
>> >> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
>> >> Cc: Steve Capper <steve.capper@linaro.org>
>> >> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
>> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> >> Cc: Ingo Molnar <mingo@kernel.org>
>> >> Cc: Namhyung Kim <namhyung@kernel.org>
>> >> Cc: Paul Mackerras <paulus@samba.org>
>> >> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> >> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
>> >> Cc: David Ahern <dsahern@gmail.com>
>> >> Cc: Jiri Olsa <jolsa@redhat.com>
>> >> ---
>> >>  tools/perf/arch/arm64/Makefile            |  1 +
>> >>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>> >>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
>> >>  3 files changed, 42 insertions(+)
>> >>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
>> >>
>> >> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
>> >> index 67e9b3d..9b8f87e 100644
>> >> --- a/tools/perf/arch/arm64/Makefile
>> >> +++ b/tools/perf/arch/arm64/Makefile
>> >> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>> >>  endif
>> >>  ifndef NO_LIBUNWIND
>> >>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
>> >> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>> >>  endif
>> >> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
>> >> index 2359546..1e052f1 100644
>> >> --- a/tools/perf/arch/arm64/include/perf_regs.h
>> >> +++ b/tools/perf/arch/arm64/include/perf_regs.h
>> >> @@ -9,6 +9,8 @@
>> >>  #define PERF_REG_IP  PERF_REG_ARM64_PC
>> >>  #define PERF_REG_SP  PERF_REG_ARM64_SP
>> >>
>> >> +void perf_regs_load(u64 *regs);
>> >> +
>> >>  static inline const char *perf_reg_name(int id)
>> >>  {
>> >>       switch (id) {
>> >> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
>> >> new file mode 100644
>> >> index 0000000..92ab968
>> >> --- /dev/null
>> >> +++ b/tools/perf/arch/arm64/tests/regs_load.S
>> >> @@ -0,0 +1,39 @@
>> >> +#include <linux/linkage.h>
>> >> +
>> >> +/*
>> >> + * Implementation of void perf_regs_load(u64 *regs);
>> >> + *
>> >> + * This function fills in the 'regs' buffer from the actual register values,
>> >> + * in the way the perf built-in unwinding test expects them:
>> >> + * - the PC at the time of the call to this function. Since this function
>> >> + *   is called using a bl instruction, the PC value is taken from LR,
>> >
>> > Is it guaranteed that this function is always invoked with a branch with
>> > link instruction, or is that just what current compiler versions are
>> > doing? I couldn't see where we would get that guarantee from.
>> The current compiler implements the call as a bl instruction.
>
> While I don't think we can rely on the compiler using a bl to call the
> function, it shouldn't matter here if we only care about the LR value
> being an address within the caller, as it doesn't look amenable to tail
> call optimization.
>
>> > If it is called with a branch with link, then the LR value will be the
>> > PC at call time + 4, rather than just the exact PC at call time. If not
>> > then we don't have a guaranteed relationship between the PC at call time
>> > and the current LR value.
>> >
>> > If the only place that perf_regs_load is used is a single test which
>> > doesn't care about the precise PC at the time of the call, then it's
>> > probably OK to use the LR value, but we should be careful to document
>> > what the faked-up PC actually is and how we expect it to be used.
>> The code is only used by an unwinding test. The unwinding code
>> resolves the function name from an address range found in the dwarf
>> information so in principle it is ok to use the PC/LR at the time of
>> the call to a function.
>>
>> Is the comment above OK or do you want an update of the code as well?
>
> If we just need an (arbitrary) address within the caller, a comment
> update should be fine.
Yes, that is the idea.

>
>> >> + * - the current SP (not touched by this function),
>> >> + * - the current value of LR is merely retrieved and stored because the
>> >> + *   value before the call to this function is unknown at this time; it will
>> >> + *   be unwound from the dwarf information in unwind__get_entries.
>> >> + */
>> >> +
>> >> +.text
>> >> +.type perf_regs_load,%function
>> >> +ENTRY(perf_regs_load)
>> >> +     stp x0,  x1,  [x0], #16 // store x0..x29
>> >> +     stp x2,  x3,  [x0], #16
>> >> +     stp x4,  x5,  [x0], #16
>> >> +     stp x6,  x7,  [x0], #16
>> >> +     stp x8,  x9,  [x0], #16
>> >> +     stp x10, x11, [x0], #16
>> >> +     stp x12, x13, [x0], #16
>> >> +     stp x14, x15, [x0], #16
>> >> +     stp x16, x17, [x0], #16
>> >> +     stp x18, x19, [x0], #16
>> >> +     stp x20, x21, [x0], #16
>> >> +     stp x22, x23, [x0], #16
>> >> +     stp x24, x25, [x0], #16
>> >> +     stp x26, x27, [x0], #16
>> >> +     stp x28, x29, [x0], #16
>> >> +     mov x1,  sp
>> >> +     stp x30, x1,  [x0], #16 // store lr and sp
>> >> +     str x30, [x0]           // store pc as lr in order to skip the call
>> >> +                             //  to this function
>> >
>> > It might be better to word this as "store the lr in place of the pc". To
>> > me at least the current wording implies the opposite of what the code
>> > seems to be doing.
>> Ok the last comment can be updated.
>
> Ok, cheers.
>
> With those changes I think this looks fine.
OK, let me send a refreshed version in a bit. If the wording is OK, I
will refresh the ARM patches for the same topic and resubmit them.

>
> Thanks,
> Mark.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-25 15:23       ` Jean Pihet
@ 2014-04-22 13:42         ` Mark Rutland
  -1 siblings, 0 replies; 46+ messages in thread
From: Mark Rutland @ 2014-04-22 13:42 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Jean,

Apologies for the delay on this.

On Tue, Mar 25, 2014 at 03:23:26PM +0000, Jean Pihet wrote:
> Hi Mark,
> 
> On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
> > Hi Jean,
> >
> > On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
> >> Introduce the perf_regs_load function, which is going
> >> to be used for the dwarf unwind test in the following patches.
> >>
> >> It takes a single argument, a pointer to the regs dump
> >> buffer, and populates it with the current register values, as
> >> expected by the perf built-in unwinding test.
> >>
> >> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> >> Cc: Steve Capper <steve.capper@linaro.org>
> >> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >> Cc: Ingo Molnar <mingo@kernel.org>
> >> Cc: Namhyung Kim <namhyung@kernel.org>
> >> Cc: Paul Mackerras <paulus@samba.org>
> >> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> >> Cc: David Ahern <dsahern@gmail.com>
> >> Cc: Jiri Olsa <jolsa@redhat.com>
> >> ---
> >>  tools/perf/arch/arm64/Makefile            |  1 +
> >>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
> >>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
> >>  3 files changed, 42 insertions(+)
> >>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
> >>
> >> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
> >> index 67e9b3d..9b8f87e 100644
> >> --- a/tools/perf/arch/arm64/Makefile
> >> +++ b/tools/perf/arch/arm64/Makefile
> >> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
> >>  endif
> >>  ifndef NO_LIBUNWIND
> >>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
> >> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
> >>  endif
> >> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
> >> index 2359546..1e052f1 100644
> >> --- a/tools/perf/arch/arm64/include/perf_regs.h
> >> +++ b/tools/perf/arch/arm64/include/perf_regs.h
> >> @@ -9,6 +9,8 @@
> >>  #define PERF_REG_IP  PERF_REG_ARM64_PC
> >>  #define PERF_REG_SP  PERF_REG_ARM64_SP
> >>
> >> +void perf_regs_load(u64 *regs);
> >> +
> >>  static inline const char *perf_reg_name(int id)
> >>  {
> >>       switch (id) {
> >> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
> >> new file mode 100644
> >> index 0000000..92ab968
> >> --- /dev/null
> >> +++ b/tools/perf/arch/arm64/tests/regs_load.S
> >> @@ -0,0 +1,39 @@
> >> +#include <linux/linkage.h>
> >> +
> >> +/*
> >> + * Implementation of void perf_regs_load(u64 *regs);
> >> + *
> >> + * This function fills in the 'regs' buffer from the actual register values,
> >> + * in the way the perf built-in unwinding test expects them:
> >> + * - the PC at the time of the call to this function. Since this function
> >> + *   is called using a bl instruction, the PC value is taken from LR,
> >
> > Is it guaranteed that this function is always invoked with a branch with
> > link instruction, or is that just what current compiler versions are
> > doing? I couldn't see where we would get that guarantee from.
> The current compiler implements the call as a bl instruction.

While I don't think we can rely on the compiler using a bl to call the
function, it shouldn't matter here if we only care about the LR value
being an address within the caller, as it doesn't look amenable to tail
call optimization.

> > If it is called with a branch with link, then the LR value will be the
> > PC at call time + 4, rather than just the exact PC at call time. If not
> > then we don't have a guaranteed relationship between the PC at call time
> > and the current LR value.
> >
> > If the only place that perf_regs_load is used is a single test which
> > doesn't care about the precise PC at the time of the call, then it's
> > probably OK to use the LR value, but we should be careful to document
> > what the faked-up PC actually is and how we expect it to be used.
> The code is only used by an unwinding test. The unwinding code
> resolves the function name from an address range found in the dwarf
> information so in principle it is ok to use the PC/LR at the time of
> the call to a function.
> 
> Is the comment above OK or do you want an update of the code as well?

If we just need an (arbitrary) address within the caller, a comment
update should be fine.
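
For illustration, the "address within the caller" argument can be sketched
in C. This is only a model under stated assumptions: struct sketch_func,
sketch_resolve() and sketch_lr_after_bl() are hypothetical names, and the
linear table lookup stands in for the address-range resolution the unwinder
does from the dwarf information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range of one function, as dwarf info would give it. */
struct sketch_func {
	const char *name;
	uint64_t start;	/* first address of the function */
	uint64_t end;	/* one past the last address */
};

/* Return the name of the function whose range contains 'addr', or NULL. */
static inline const char *sketch_resolve(const struct sketch_func *tbl,
					 size_t n, uint64_t addr)
{
	size_t i;

	assert(tbl != NULL);

	for (i = 0; i < n; i++)
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return tbl[i].name;
	return NULL;
}

/* Value a bl at 'call_pc' leaves in LR: the address of the next instruction. */
static inline uint64_t sketch_lr_after_bl(uint64_t call_pc)
{
	return call_pc + 4;	/* A64 instructions are 4 bytes wide */
}
```

With a bl at 0x400a10 inside a caller spanning 0x400a00..0x400a40, both the
call address and the resulting LR (0x400a14) resolve to the same function.
The two would only differ if the bl were the very last instruction of the
caller, which should not be the case for the call in the test.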

> >> + * - the current SP (not touched by this function),
> >> + * - the current value of LR is merely retrieved and stored because the
> >> + *   value before the call to this function is unknown at this time; it will
> >> + *   be unwound from the dwarf information in unwind__get_entries.
> >> + */
> >> +
> >> +.text
> >> +.type perf_regs_load,%function
> >> +ENTRY(perf_regs_load)
> >> +     stp x0,  x1,  [x0], #16 // store x0..x29
> >> +     stp x2,  x3,  [x0], #16
> >> +     stp x4,  x5,  [x0], #16
> >> +     stp x6,  x7,  [x0], #16
> >> +     stp x8,  x9,  [x0], #16
> >> +     stp x10, x11, [x0], #16
> >> +     stp x12, x13, [x0], #16
> >> +     stp x14, x15, [x0], #16
> >> +     stp x16, x17, [x0], #16
> >> +     stp x18, x19, [x0], #16
> >> +     stp x20, x21, [x0], #16
> >> +     stp x22, x23, [x0], #16
> >> +     stp x24, x25, [x0], #16
> >> +     stp x26, x27, [x0], #16
> >> +     stp x28, x29, [x0], #16
> >> +     mov x1,  sp
> >> +     stp x30, x1,  [x0], #16 // store lr and sp
> >> +     str x30, [x0]           // store pc as lr in order to skip the call
> >> +                             //  to this function
> >
> > It might be better to word this as "store the lr in place of the pc". To
> > me at least the current wording implies the opposite of what the code
> > seems to be doing.
> Ok the last comment can be updated.

Ok, cheers.

With those changes I think this looks fine.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
@ 2014-04-22 13:42         ` Mark Rutland
  0 siblings, 0 replies; 46+ messages in thread
From: Mark Rutland @ 2014-04-22 13:42 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jean,

Apologies for the delay on this.

On Tue, Mar 25, 2014 at 03:23:26PM +0000, Jean Pihet wrote:
> Hi Mark,
> 
> On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
> > Hi Jean,
> >
> > On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
> >> Introducing perf_regs_load function, which is going
> >> to be used for dwarf unwind test in following patches.
> >>
> >> It takes single argument as a pointer to the regs dump
> >> buffer and populates it with current registers values, as
> >> expected by the perf built-in unwinding test.
> >>
> >> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> >> Cc: Steve Capper <steve.capper@linaro.org>
> >> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >> Cc: Ingo Molnar <mingo@kernel.org>
> >> Cc: Namhyung Kim <namhyung@kernel.org>
> >> Cc: Paul Mackerras <paulus@samba.org>
> >> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> >> Cc: David Ahern <dsahern@gmail.com>
> >> Cc: Jiri Olsa <jolsa@redhat.com>
> >> ---
> >>  tools/perf/arch/arm64/Makefile            |  1 +
> >>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
> >>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
> >>  3 files changed, 42 insertions(+)
> >>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
> >>
> >> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
> >> index 67e9b3d..9b8f87e 100644
> >> --- a/tools/perf/arch/arm64/Makefile
> >> +++ b/tools/perf/arch/arm64/Makefile
> >> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
> >>  endif
> >>  ifndef NO_LIBUNWIND
> >>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
> >> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
> >>  endif
> >> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
> >> index 2359546..1e052f1 100644
> >> --- a/tools/perf/arch/arm64/include/perf_regs.h
> >> +++ b/tools/perf/arch/arm64/include/perf_regs.h
> >> @@ -9,6 +9,8 @@
> >>  #define PERF_REG_IP  PERF_REG_ARM64_PC
> >>  #define PERF_REG_SP  PERF_REG_ARM64_SP
> >>
> >> +void perf_regs_load(u64 *regs);
> >> +
> >>  static inline const char *perf_reg_name(int id)
> >>  {
> >>       switch (id) {
> >> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
> >> new file mode 100644
> >> index 0000000..92ab968
> >> --- /dev/null
> >> +++ b/tools/perf/arch/arm64/tests/regs_load.S
> >> @@ -0,0 +1,39 @@
> >> +#include <linux/linkage.h>
> >> +
> >> +/*
> >> + * Implementation of void perf_regs_load(u64 *regs);
> >> + *
> >> + * This functions fills in the 'regs' buffer from the actual registers values,
> >> + * in the way the perf built-in unwinding test expects them:
> >> + * - the PC at the time at the call to this function. Since this function
> >> + *   is called using a bl instruction, the PC value is taken from LR,
> >
> > Is it guaranteed that this function is always invoked with a branch with
> > link instruction, or is that just what current compiler versions are
> > doing? I couldn't see where we would get that guarantee from.
> The current compiler implements the call as a bl instruction.

While I don't think we can rely on the compiler using a bl to call the
function it shouldn't matter here if we only care about the LR value
being an address within the caller, as it doesn't look amenable to tail
call optimization.

> > If it is called with a branch with link, then the LR value will be the
> > PC at call time + 4, rather than just the exact PC at call time. If not
> > then we don't have a guaranteed relationship between the PC at call time
> > and the current LR value.
> >
> > If the only place that perf_regs_load is used is a single test which
> > doesn't care about the precise PC at the time of the call, then it's
> > probably OK to use the LR value, but we should be careful to document
> > what the faked-up PC actually is and how we expect it to be used.
> The code is only used by an unwinding test. The unwinding code
> resolves the function name from an address range found in the dwarf
> information so in principle it is ok to use the PC/LR at the time of
> the call to a function.
> 
> Is the comment above OK or do you want an update of the code as well?

If we just need an (arbitrary) address within the caller, a comment
update should be fine.

> >> + * - the current SP (not touched by this function),
> >> + * - the current value of LR is merely retrieved and stored because the
> >> + *   value before the call to this function is unknown at this time; it will
> >> + *   be unwound from the dwarf information in unwind__get_entries.
> >> + */
> >> +
> >> +.text
> >> +.type perf_regs_load,%function
> >> +ENTRY(perf_regs_load)
> >> +     stp x0,  x1,  [x0], #16 // store x0..x29
> >> +     stp x2,  x3,  [x0], #16
> >> +     stp x4,  x5,  [x0], #16
> >> +     stp x6,  x7,  [x0], #16
> >> +     stp x8,  x9,  [x0], #16
> >> +     stp x10, x11, [x0], #16
> >> +     stp x12, x13, [x0], #16
> >> +     stp x14, x15, [x0], #16
> >> +     stp x16, x17, [x0], #16
> >> +     stp x18, x19, [x0], #16
> >> +     stp x20, x21, [x0], #16
> >> +     stp x22, x23, [x0], #16
> >> +     stp x24, x25, [x0], #16
> >> +     stp x26, x27, [x0], #16
> >> +     stp x28, x29, [x0], #16
> >> +     mov x1,  sp
> >> +     stp x30, x1,  [x0], #16 // store lr and sp
> >> +     str x30, [x0]           // store pc as lr in order to skip the call
> >> +                             //  to this function
> >
> > It might be better to word this as "store the lr in place of the pc". To
> > me at least the current wording implies the opposite of what the code
> > seems to be doing.
> Ok the last comment can be updated.

Ok, cheers.
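
For illustration, the buffer layout that the quoted stores are meant to
produce can be modelled in C. This is only a sketch of the intended result
as read from the assembly above: the REGS_* names, regs_offset and
model_regs_load are hypothetical helpers, not perf identifiers, and the
model says nothing about how the instructions themselves behave.

```c
#include <stdint.h>

/* Hypothetical slot indices, following the store order in the assembly. */
enum {
	REGS_X0  = 0,	/* x0..x29 occupy slots 0..29 */
	REGS_LR  = 30,	/* x30 */
	REGS_SP  = 31,
	REGS_PC  = 32,	/* holds a copy of LR, not the real PC */
	REGS_MAX = 33
};

/* Byte offset of a slot in the u64 buffer. */
static inline uint64_t regs_offset(int idx)
{
	return (uint64_t)idx * sizeof(uint64_t);
}

/*
 * Model of what the routine leaves in 'regs', given the values of x0..x30
 * and sp at entry. The PC slot receives the LR value, as discussed above.
 */
static void model_regs_load(uint64_t *regs, const uint64_t x[31], uint64_t sp)
{
	for (int i = 0; i <= REGS_LR; i++)
		regs[i] = x[i];
	regs[REGS_SP] = sp;
	regs[REGS_PC] = x[REGS_LR];
}
```

Fifteen stp instructions advance the pointer by 240 bytes, so LR lands at
offset 240, SP at 248 and the faked-up PC at 256, which is what
regs_offset(REGS_PC) returns.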

With those changes I think this looks fine.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-04-22 10:37           ` Will Deacon
@ 2014-04-22 13:24             ` Mark Rutland
  -1 siblings, 0 replies; 46+ messages in thread
From: Mark Rutland @ 2014-04-22 13:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jean Pihet, linux-kernel, linaro-kernel, linux-arm-kernel,
	Arnaldo, Ingo Molnar, Jiri Olsa, Steve Capper, patches,
	Corey Ashford, Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern, Arnaldo Carvalho de Melo

On Tue, Apr 22, 2014 at 11:37:44AM +0100, Will Deacon wrote:
> On Tue, Apr 22, 2014 at 09:13:33AM +0100, Jean Pihet wrote:
> > Hi Mark, Will,
> 
> Hi Jean,
> 
> > Ping on this series. Can you please check?
> 
> Do you have a pointer to the latest version of your code please? The email
> backlog I have seems to end with MarkR saying he would take a look.

The last posting I saw was [1-6]. The lack of reply is my fault, as I
lost track of the thread.

Cheers,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241470.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241467.html
[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242396.html
[4] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242998.html
[5] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241466.html
[6] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241468.html

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-04-22  8:13         ` Jean Pihet
@ 2014-04-22 10:37           ` Will Deacon
  -1 siblings, 0 replies; 46+ messages in thread
From: Will Deacon @ 2014-04-22 10:37 UTC (permalink / raw)
  To: Jean Pihet
  Cc: Mark Rutland, linux-kernel, linaro-kernel, linux-arm-kernel,
	Arnaldo, Ingo Molnar, Jiri Olsa, Steve Capper, patches,
	Corey Ashford, Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern, Arnaldo Carvalho de Melo

On Tue, Apr 22, 2014 at 09:13:33AM +0100, Jean Pihet wrote:
> Hi Mark, Will,

Hi Jean,

> Ping on this series. Can you please check?

Do you have a pointer to the latest version of your code please? The email
backlog I have seems to end with MarkR saying he would take a look.

Will

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-25 15:23       ` Jean Pihet
@ 2014-04-22  8:13         ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-04-22  8:13 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern, Arnaldo Carvalho de Melo

Hi Mark, Will,

Ping on this series. Can you please check?

Regards,
Jean


On 25 March 2014 16:23, Jean Pihet <jean.pihet@linaro.org> wrote:
> Hi Mark,
>
> On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
>> Hi Jean,
>>
>> On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
>>> Introduce the perf_regs_load function, which is going
>>> to be used for the dwarf unwind test in the following patches.
>>>
>>> It takes a single argument, a pointer to the regs dump
>>> buffer, and populates it with the current register values, as
>>> expected by the perf built-in unwinding test.
>>>
>>> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
>>> Cc: Steve Capper <steve.capper@linaro.org>
>>> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
>>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>>> Cc: Ingo Molnar <mingo@kernel.org>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
>>> Cc: David Ahern <dsahern@gmail.com>
>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>> ---
>>>  tools/perf/arch/arm64/Makefile            |  1 +
>>>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>>>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
>>>  3 files changed, 42 insertions(+)
>>>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
>>>
>>> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
>>> index 67e9b3d..9b8f87e 100644
>>> --- a/tools/perf/arch/arm64/Makefile
>>> +++ b/tools/perf/arch/arm64/Makefile
>>> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>>>  endif
>>>  ifndef NO_LIBUNWIND
>>>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
>>> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>>>  endif
>>> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
>>> index 2359546..1e052f1 100644
>>> --- a/tools/perf/arch/arm64/include/perf_regs.h
>>> +++ b/tools/perf/arch/arm64/include/perf_regs.h
>>> @@ -9,6 +9,8 @@
>>>  #define PERF_REG_IP  PERF_REG_ARM64_PC
>>>  #define PERF_REG_SP  PERF_REG_ARM64_SP
>>>
>>> +void perf_regs_load(u64 *regs);
>>> +
>>>  static inline const char *perf_reg_name(int id)
>>>  {
>>>       switch (id) {
>>> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
>>> new file mode 100644
>>> index 0000000..92ab968
>>> --- /dev/null
>>> +++ b/tools/perf/arch/arm64/tests/regs_load.S
>>> @@ -0,0 +1,39 @@
>>> +#include <linux/linkage.h>
>>> +
>>> +/*
>>> + * Implementation of void perf_regs_load(u64 *regs);
>>> + *
>>> + * This function fills in the 'regs' buffer from the actual register values,
>>> + * in the way the perf built-in unwinding test expects them:
>>> + * - the PC at the time of the call to this function. Since this function
>>> + *   is called using a bl instruction, the PC value is taken from LR,
>>
>> Is it guaranteed that this function is always invoked with a branch with
>> link instruction, or is that just what current compiler versions are
>> doing? I couldn't see where we would get that guarantee from.
> The current compiler implements the call as a bl instruction.
>
>> If it is called with a branch with link, then the LR value will be the
>> PC at call time + 4, rather than just the exact PC at call time. If not
>> then we don't have a guaranteed relationship between the PC at call time
>> and the current LR value.
>>
>> If the only place that perf_regs_load is used is a single test which
>> doesn't care about the precise PC at the time of the call, then it's
>> probably OK to use the LR value, but we should be careful to document
>> what the faked-up PC actually is and how we expect it to be used.
> The code is only used by an unwinding test. The unwinding code
> resolves the function name from an address range found in the dwarf
> information so in principle it is ok to use the PC/LR at the time of
> the call to a function.
>
> Is the comment above OK or do you want an update of the code as well?
>
>>
>>> + * - the current SP (not touched by this function),
>>> + * - the current value of LR is merely retrieved and stored because the
>>> + *   value before the call to this function is unknown at this time; it will
>>> + *   be unwound from the dwarf information in unwind__get_entries.
>>> + */
>>> +
>>> +.text
>>> +.type perf_regs_load,%function
>>> +ENTRY(perf_regs_load)
>>> +     stp x0,  x1,  [x0], #16 // store x0..x29
>>> +     stp x2,  x3,  [x0], #16
>>> +     stp x4,  x5,  [x0], #16
>>> +     stp x6,  x7,  [x0], #16
>>> +     stp x8,  x9,  [x0], #16
>>> +     stp x10, x11, [x0], #16
>>> +     stp x12, x13, [x0], #16
>>> +     stp x14, x15, [x0], #16
>>> +     stp x16, x17, [x0], #16
>>> +     stp x18, x19, [x0], #16
>>> +     stp x20, x21, [x0], #16
>>> +     stp x22, x23, [x0], #16
>>> +     stp x24, x25, [x0], #16
>>> +     stp x26, x27, [x0], #16
>>> +     stp x28, x29, [x0], #16
>>> +     mov x1,  sp
>>> +     stp x30, x1,  [x0], #16 // store lr and sp
>>> +     str x30, [x0]           // store pc as lr in order to skip the call
>>> +                             //  to this function
>>
>> It might be better to word this as "store the lr in place of the pc". To
>> me at least the current wording implies the opposite of what the code
>> seems to be doing.
> Ok the last comment can be updated.
>
> Thanks!
> Jean
>
>>
>> Cheers,
>> Mark.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-25 15:23       ` Jean Pihet
@ 2014-04-04  7:51         ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-04-04  7:51 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Mark,

Ping on this series; see my question below.

On 25 March 2014 16:23, Jean Pihet <jean.pihet@linaro.org> wrote:
> Hi Mark,
>
> On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
>> Hi Jean,
>>
>> On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
>>> Introduce the perf_regs_load function, which is going
>>> to be used for the dwarf unwind test in the following patches.
>>>
>>> It takes a single argument, a pointer to the regs dump
>>> buffer, and populates it with the current register values, as
>>> expected by the perf built-in unwinding test.
>>>
>>> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
>>> Cc: Steve Capper <steve.capper@linaro.org>
>>> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
>>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>>> Cc: Ingo Molnar <mingo@kernel.org>
>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
>>> Cc: David Ahern <dsahern@gmail.com>
>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>> ---
>>>  tools/perf/arch/arm64/Makefile            |  1 +
>>>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>>>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
>>>  3 files changed, 42 insertions(+)
>>>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
>>>
>>> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
>>> index 67e9b3d..9b8f87e 100644
>>> --- a/tools/perf/arch/arm64/Makefile
>>> +++ b/tools/perf/arch/arm64/Makefile
>>> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>>>  endif
>>>  ifndef NO_LIBUNWIND
>>>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
>>> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>>>  endif
>>> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
>>> index 2359546..1e052f1 100644
>>> --- a/tools/perf/arch/arm64/include/perf_regs.h
>>> +++ b/tools/perf/arch/arm64/include/perf_regs.h
>>> @@ -9,6 +9,8 @@
>>>  #define PERF_REG_IP  PERF_REG_ARM64_PC
>>>  #define PERF_REG_SP  PERF_REG_ARM64_SP
>>>
>>> +void perf_regs_load(u64 *regs);
>>> +
>>>  static inline const char *perf_reg_name(int id)
>>>  {
>>>       switch (id) {
>>> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
>>> new file mode 100644
>>> index 0000000..92ab968
>>> --- /dev/null
>>> +++ b/tools/perf/arch/arm64/tests/regs_load.S
>>> @@ -0,0 +1,39 @@
>>> +#include <linux/linkage.h>
>>> +
>>> +/*
>>> + * Implementation of void perf_regs_load(u64 *regs);
>>> + *
>>> + * This function fills in the 'regs' buffer from the actual register values,
>>> + * in the way the perf built-in unwinding test expects them:
>>> + * - the PC at the time of the call to this function. Since this function
>>> + *   is called using a bl instruction, the PC value is taken from LR,
>>
>> Is it guaranteed that this function is always invoked with a branch with
>> link instruction, or is that just what current compiler versions are
>> doing? I couldn't see where we would get that guarantee from.
> The current compiler implements the call as a bl instruction.
>
>> If it is called with a branch with link, then the LR value will be the
>> PC at call time + 4, rather than just the exact PC at call time. If not
>> then we don't have a guaranteed relationship between the PC at call time
>> and the current LR value.
>>
>> If the only place that perf_regs_load is used is a single test which
>> doesn't care about the precise PC at the time of the call, then it's
>> probably OK to use the LR value, but we should be careful to document
>> what the faked-up PC actually is and how we expect it to be used.
> The code is only used by an unwinding test. The unwinding code
> resolves the function name from an address range found in the dwarf
> information so in principle it is ok to use the PC/LR at the time of
> the call to a function.
>
> Is the comment above OK or do you want an update of the code as well?
What do you think?

Regards,
Jean

>
>>
>>> + * - the current SP (not touched by this function),
>>> + * - the current value of LR is merely retrieved and stored because the
>>> + *   value before the call to this function is unknown at this time; it will
>>> + *   be unwound from the dwarf information in unwind__get_entries.
>>> + */
>>> +
>>> +.text
>>> +.type perf_regs_load,%function
>>> +ENTRY(perf_regs_load)
>>> +     stp x0,  x1,  [x0], #16 // store x0..x29
>>> +     stp x2,  x3,  [x0], #16
>>> +     stp x4,  x5,  [x0], #16
>>> +     stp x6,  x7,  [x0], #16
>>> +     stp x8,  x9,  [x0], #16
>>> +     stp x10, x11, [x0], #16
>>> +     stp x12, x13, [x0], #16
>>> +     stp x14, x15, [x0], #16
>>> +     stp x16, x17, [x0], #16
>>> +     stp x18, x19, [x0], #16
>>> +     stp x20, x21, [x0], #16
>>> +     stp x22, x23, [x0], #16
>>> +     stp x24, x25, [x0], #16
>>> +     stp x26, x27, [x0], #16
>>> +     stp x28, x29, [x0], #16
>>> +     mov x1,  sp
>>> +     stp x30, x1,  [x0], #16 // store lr and sp
>>> +     str x30, [x0]           // store pc as lr in order to skip the call
>>> +                             //  to this function
>>
>> It might be better to word this as "store the lr in place of the pc". To
>> me at least the current wording implies the opposite of what the code
>> seems to be doing.
> Ok the last comment can be updated.
>
> Thanks!
> Jean
>
>>
>> Cheers,
>> Mark.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-21 15:11     ` Mark Rutland
@ 2014-03-25 15:23       ` Jean Pihet
  -1 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-03-25 15:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Mark,

On 21 March 2014 16:11, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Jean,
>
> On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
>> Introduce the perf_regs_load function, which is going
>> to be used for the dwarf unwind test in the following patches.
>>
>> It takes a single argument, a pointer to the regs dump
>> buffer, and populates it with the current register values, as
>> expected by the perf built-in unwinding test.
>>
>> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
>> Cc: Steve Capper <steve.capper@linaro.org>
>> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
>> Cc: David Ahern <dsahern@gmail.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> ---
>>  tools/perf/arch/arm64/Makefile            |  1 +
>>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
>>  3 files changed, 42 insertions(+)
>>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
>>
>> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
>> index 67e9b3d..9b8f87e 100644
>> --- a/tools/perf/arch/arm64/Makefile
>> +++ b/tools/perf/arch/arm64/Makefile
>> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>>  endif
>>  ifndef NO_LIBUNWIND
>>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
>> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>>  endif
>> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
>> index 2359546..1e052f1 100644
>> --- a/tools/perf/arch/arm64/include/perf_regs.h
>> +++ b/tools/perf/arch/arm64/include/perf_regs.h
>> @@ -9,6 +9,8 @@
>>  #define PERF_REG_IP  PERF_REG_ARM64_PC
>>  #define PERF_REG_SP  PERF_REG_ARM64_SP
>>
>> +void perf_regs_load(u64 *regs);
>> +
>>  static inline const char *perf_reg_name(int id)
>>  {
>>       switch (id) {
>> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
>> new file mode 100644
>> index 0000000..92ab968
>> --- /dev/null
>> +++ b/tools/perf/arch/arm64/tests/regs_load.S
>> @@ -0,0 +1,39 @@
>> +#include <linux/linkage.h>
>> +
>> +/*
>> + * Implementation of void perf_regs_load(u64 *regs);
>> + *
>> + * This function fills in the 'regs' buffer from the actual register values,
>> + * in the way the perf built-in unwinding test expects them:
>> + * - the PC at the time of the call to this function. Since this function
>> + *   is called using a bl instruction, the PC value is taken from LR,
>
> Is it guaranteed that this function is always invoked with a branch with
> link instruction, or is that just what current compiler versions are
> doing? I couldn't see where we would get that guarantee from.
The current compiler implements the call as a bl instruction.

> If it is called with a branch with link, then the LR value will be the
> PC at call time + 4, rather than just the exact PC at call time. If not
> then we don't have a guaranteed relationship between the PC at call time
> and the current LR value.
>
> If the only place that perf_regs_load is used is a single test which
> doesn't care about the precise PC at the time of the call, then it's
> probably OK to use the LR value, but we should be careful to document
> what the faked-up PC actually is and how we expect it to be used.
The code is only used by an unwinding test. The unwinding code
resolves the function name from an address range found in the dwarf
information, so in principle it is OK to use the PC/LR at the time of
the call to a function.

Is the comment above OK or do you want an update of the code as well?
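
A minimal sketch of the range-based lookup described above, assuming each
function is described by a half-open [start, end) address range, as the
DWARF/ELF symbol information would provide. struct sym_range and
resolve_addr() are hypothetical names for illustration only; they are not
perf or libdw APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record: the function covers [start, end). */
struct sym_range {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/*
 * Return the entry whose address range contains addr, or NULL if no
 * range matches. A linear scan is enough for an illustration.
 */
static const struct sym_range *resolve_addr(const struct sym_range *syms,
					    size_t n, uint64_t addr)
{
	assert(syms != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (addr >= syms[i].start && addr < syms[i].end)
			return &syms[i];
	}
	return NULL;
}
```

With a caller occupying [0x1000, 0x1040) and a bl at 0x1010, both 0x1010
and the LR value 0x1014 resolve to the same entry, which is why the test
can tolerate the +4. If the bl were the last instruction of its function,
LR would land on the start of the next range, so the substitution is not
safe in general.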

>
>> + * - the current SP (not touched by this function),
>> + * - the current value of LR is merely retrieved and stored because the
>> + *   value before the call to this function is unknown at this time; it will
>> + *   be unwound from the dwarf information in unwind__get_entries.
>> + */
>> +
>> +.text
>> +.type perf_regs_load,%function
>> +ENTRY(perf_regs_load)
>> +     stp x0,  x1,  [x0], #16 // store x0..x29
>> +     stp x2,  x3,  [x0], #16
>> +     stp x4,  x5,  [x0], #16
>> +     stp x6,  x7,  [x0], #16
>> +     stp x8,  x9,  [x0], #16
>> +     stp x10, x11, [x0], #16
>> +     stp x12, x13, [x0], #16
>> +     stp x14, x15, [x0], #16
>> +     stp x16, x17, [x0], #16
>> +     stp x18, x19, [x0], #16
>> +     stp x20, x21, [x0], #16
>> +     stp x22, x23, [x0], #16
>> +     stp x24, x25, [x0], #16
>> +     stp x26, x27, [x0], #16
>> +     stp x28, x29, [x0], #16
>> +     mov x1,  sp
>> +     stp x30, x1,  [x0], #16 // store lr and sp
>> +     str x30, [x0]           // store pc as lr in order to skip the call
>> +                             //  to this function
>
> It might be better to word this as "store the lr in place of the pc". To
> me at least the current wording implies the opposite of what the code
> seems to be doing.
OK, the last comment can be updated.

Thanks!
Jean

>
> Cheers,
> Mark.


* Re: [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-19  9:42   ` Jean Pihet
@ 2014-03-21 15:11     ` Mark Rutland
  -1 siblings, 0 replies; 46+ messages in thread
From: Mark Rutland @ 2014-03-21 15:11 UTC (permalink / raw)
  To: Jean Pihet
  Cc: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, patches, Corey Ashford,
	Frederic Weisbecker, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, David Ahern

Hi Jean,

On Wed, Mar 19, 2014 at 09:42:33AM +0000, Jean Pihet wrote:
> Introduce the perf_regs_load function, which is going
> to be used by the dwarf unwind test in the following patches.
> 
> It takes a single argument, a pointer to the regs dump
> buffer, and populates it with the current register values, as
> expected by the perf built-in unwinding test.
> 
> Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
> Cc: Steve Capper <steve.capper@linaro.org>
> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
> Cc: David Ahern <dsahern@gmail.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> ---
>  tools/perf/arch/arm64/Makefile            |  1 +
>  tools/perf/arch/arm64/include/perf_regs.h |  2 ++
>  tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
>  3 files changed, 42 insertions(+)
>  create mode 100644 tools/perf/arch/arm64/tests/regs_load.S
> 
> diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
> index 67e9b3d..9b8f87e 100644
> --- a/tools/perf/arch/arm64/Makefile
> +++ b/tools/perf/arch/arm64/Makefile
> @@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
>  endif
>  ifndef NO_LIBUNWIND
>  LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
> +LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
>  endif
> diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
> index 2359546..1e052f1 100644
> --- a/tools/perf/arch/arm64/include/perf_regs.h
> +++ b/tools/perf/arch/arm64/include/perf_regs.h
> @@ -9,6 +9,8 @@
>  #define PERF_REG_IP	PERF_REG_ARM64_PC
>  #define PERF_REG_SP	PERF_REG_ARM64_SP
>  
> +void perf_regs_load(u64 *regs);
> +
>  static inline const char *perf_reg_name(int id)
>  {
>  	switch (id) {
> diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
> new file mode 100644
> index 0000000..92ab968
> --- /dev/null
> +++ b/tools/perf/arch/arm64/tests/regs_load.S
> @@ -0,0 +1,39 @@
> +#include <linux/linkage.h>
> +
> +/*
> + * Implementation of void perf_regs_load(u64 *regs);
> + *
> + * This function fills in the 'regs' buffer from the actual register values,
> + * in the way the perf built-in unwinding test expects them:
> + * - the PC at the time of the call to this function. Since this function
> + *   is called using a bl instruction, the PC value is taken from LR,

Is it guaranteed that this function is always invoked with a branch with
link instruction, or is that just what current compiler versions are
doing? I couldn't see where we would get that guarantee from.

If it is called with a branch with link, then the LR value will be the
PC at call time + 4, rather than just the exact PC at call time. If not
then we don't have a guaranteed relationship between the PC at call time
and the current LR value.

If the only place that perf_regs_load is used is a single test which
doesn't care about the precise PC at the time of the call, then it's
probably OK to use the LR value, but we should be careful to document
what the faked-up PC actually is and how we expect it to be used.
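
To make the PC/LR relationship above concrete, here is a minimal C sketch
of the arithmetic. It assumes the callee is entered through BL or BLR,
which write the address of the following instruction to x30, and relies on
A64 instructions being a fixed 4 bytes wide. The helper names and the
example address are made up for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* A64 instructions have a fixed width of 4 bytes. */
#define A64_INSN_SIZE 4u

/* Value that a BL/BLR located at bl_addr leaves in x30 (the LR). */
static uint64_t lr_after_bl(uint64_t bl_addr)
{
	return bl_addr + A64_INSN_SIZE;
}

/*
 * Recover the call-site address from a captured LR. This is only
 * meaningful if the function really was entered via BL/BLR; after a
 * plain branch (e.g. a tail call) LR has no fixed relation to the PC.
 */
static uint64_t call_site_from_lr(uint64_t lr)
{
	assert(lr >= A64_INSN_SIZE);
	return lr - A64_INSN_SIZE;
}
```

For a hypothetical bl at 0x400a10, lr_after_bl() gives 0x400a14, and
call_site_from_lr() maps that back to 0x400a10. Storing LR unchanged as
the PC therefore reports the instruction after the call, not the call
itself.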

> + * - the current SP (not touched by this function),
> + * - the current value of LR is merely retrieved and stored because the
> + *   value before the call to this function is unknown at this time; it will
> + *   be unwound from the dwarf information in unwind__get_entries.
> + */
> +
> +.text
> +.type perf_regs_load,%function
> +ENTRY(perf_regs_load)
> +	stp x0,  x1,  [x0], #16	// store x0..x29
> +	stp x2,  x3,  [x0], #16
> +	stp x4,  x5,  [x0], #16
> +	stp x6,  x7,  [x0], #16
> +	stp x8,  x9,  [x0], #16
> +	stp x10, x11, [x0], #16
> +	stp x12, x13, [x0], #16
> +	stp x14, x15, [x0], #16
> +	stp x16, x17, [x0], #16
> +	stp x18, x19, [x0], #16
> +	stp x20, x21, [x0], #16
> +	stp x22, x23, [x0], #16
> +	stp x24, x25, [x0], #16
> +	stp x26, x27, [x0], #16
> +	stp x28, x29, [x0], #16
> +	mov x1,  sp
> +	stp x30, x1,  [x0], #16	// store lr and sp
> +	str x30, [x0]		// store pc as lr in order to skip the call
> +				//  to this function

It might be better to word this as "store the lr in place of the pc". To
me at least the current wording implies the opposite of what the code
seems to be doing.

Cheers,
Mark.


* [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64
  2014-03-19  9:42 [PATCH 0/3] " Jean Pihet
@ 2014-03-19  9:42   ` Jean Pihet
  0 siblings, 0 replies; 46+ messages in thread
From: Jean Pihet @ 2014-03-19  9:42 UTC (permalink / raw)
  To: linux-kernel, linaro-kernel, linux-arm-kernel, Arnaldo,
	Ingo Molnar, Jiri Olsa, Steve Capper, Mark Rutland
  Cc: patches, Jean Pihet, Corey Ashford, Frederic Weisbecker,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, David Ahern

Introduce the perf_regs_load function, which is going
to be used by the dwarf unwind test in the following patches.

It takes a single argument, a pointer to the regs dump
buffer, and populates it with the current register values, as
expected by the perf built-in unwinding test.

Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
---
 tools/perf/arch/arm64/Makefile            |  1 +
 tools/perf/arch/arm64/include/perf_regs.h |  2 ++
 tools/perf/arch/arm64/tests/regs_load.S   | 39 +++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)
 create mode 100644 tools/perf/arch/arm64/tests/regs_load.S

diff --git a/tools/perf/arch/arm64/Makefile b/tools/perf/arch/arm64/Makefile
index 67e9b3d..9b8f87e 100644
--- a/tools/perf/arch/arm64/Makefile
+++ b/tools/perf/arch/arm64/Makefile
@@ -4,4 +4,5 @@ LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/dwarf-regs.o
 endif
 ifndef NO_LIBUNWIND
 LIB_OBJS += $(OUTPUT)arch/$(ARCH)/util/unwind-libunwind.o
+LIB_OBJS += $(OUTPUT)arch/$(ARCH)/tests/regs_load.o
 endif
diff --git a/tools/perf/arch/arm64/include/perf_regs.h b/tools/perf/arch/arm64/include/perf_regs.h
index 2359546..1e052f1 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -9,6 +9,8 @@
 #define PERF_REG_IP	PERF_REG_ARM64_PC
 #define PERF_REG_SP	PERF_REG_ARM64_SP
 
+void perf_regs_load(u64 *regs);
+
 static inline const char *perf_reg_name(int id)
 {
 	switch (id) {
diff --git a/tools/perf/arch/arm64/tests/regs_load.S b/tools/perf/arch/arm64/tests/regs_load.S
new file mode 100644
index 0000000..92ab968
--- /dev/null
+++ b/tools/perf/arch/arm64/tests/regs_load.S
@@ -0,0 +1,39 @@
+#include <linux/linkage.h>
+
+/*
+ * Implementation of void perf_regs_load(u64 *regs);
+ *
+ * This function fills in the 'regs' buffer from the actual register values,
+ * in the way the perf built-in unwinding test expects them:
+ * - the PC at the time of the call to this function. Since this function
+ *   is called using a bl instruction, the PC value is taken from LR,
+ * - the current SP (not touched by this function),
+ * - the current value of LR is merely retrieved and stored because the
+ *   value before the call to this function is unknown at this time; it will
+ *   be unwound from the dwarf information in unwind__get_entries.
+ */
+
+.text
+.type perf_regs_load,%function
+ENTRY(perf_regs_load)
+	stp x0,  x1,  [x0], #16	// store x0..x29
+	stp x2,  x3,  [x0], #16
+	stp x4,  x5,  [x0], #16
+	stp x6,  x7,  [x0], #16
+	stp x8,  x9,  [x0], #16
+	stp x10, x11, [x0], #16
+	stp x12, x13, [x0], #16
+	stp x14, x15, [x0], #16
+	stp x16, x17, [x0], #16
+	stp x18, x19, [x0], #16
+	stp x20, x21, [x0], #16
+	stp x22, x23, [x0], #16
+	stp x24, x25, [x0], #16
+	stp x26, x27, [x0], #16
+	stp x28, x29, [x0], #16
+	mov x1,  sp
+	stp x30, x1,  [x0], #16	// store lr and sp
+	str x30, [x0]		// store pc as lr in order to skip the call
+				//  to this function
+	ret
+ENDPROC(perf_regs_load)
-- 
1.7.11.7



end of thread, other threads:[~2014-05-07 12:06 UTC | newest]

Thread overview: 46+ messages
2014-05-06 15:55 [PATCH 0/3] perf tools: Add libdw DWARF post unwind support for ARM64 Jean Pihet
2014-05-06 15:55 ` Jean Pihet
2014-05-06 15:55 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64 Jean Pihet
2014-05-06 15:55   ` Jean Pihet
2014-05-06 15:55 ` [PATCH 2/3] perf tests: Add dwarf unwind test " Jean Pihet
2014-05-06 15:55   ` Jean Pihet
2014-05-07 12:05   ` Jiri Olsa
2014-05-07 12:05     ` Jiri Olsa
2014-05-06 15:55 ` [PATCH 3/3] perf tools: Add libdw DWARF post unwind support for ARM64 Jean Pihet
2014-05-06 15:55   ` Jean Pihet
2014-05-06 17:00   ` Will Deacon
2014-05-06 17:00     ` Will Deacon
2014-05-06 17:41     ` Jean Pihet
2014-05-06 17:41       ` Jean Pihet
2014-05-06 17:52       ` Will Deacon
2014-05-06 17:52         ` Will Deacon
2014-05-07  7:40         ` Jean Pihet
2014-05-07  7:40           ` Jean Pihet
  -- strict thread matches above, loose matches on Subject: below --
2014-03-19  9:42 [PATCH 0/3] " Jean Pihet
2014-03-19  9:42 ` [PATCH 1/3] perf tests: Introduce perf_regs_load function on ARM64 Jean Pihet
2014-03-19  9:42   ` Jean Pihet
2014-03-21 15:11   ` Mark Rutland
2014-03-21 15:11     ` Mark Rutland
2014-03-25 15:23     ` Jean Pihet
2014-03-25 15:23       ` Jean Pihet
2014-04-04  7:51       ` Jean Pihet
2014-04-04  7:51         ` Jean Pihet
2014-04-22  8:13       ` Jean Pihet
2014-04-22  8:13         ` Jean Pihet
2014-04-22 10:37         ` Will Deacon
2014-04-22 10:37           ` Will Deacon
2014-04-22 13:24           ` Mark Rutland
2014-04-22 13:24             ` Mark Rutland
2014-04-22 13:42       ` Mark Rutland
2014-04-22 13:42         ` Mark Rutland
2014-04-28 13:10         ` Jean Pihet
2014-04-28 13:10           ` Jean Pihet
2014-04-28 13:12           ` Jean Pihet
2014-04-28 13:12             ` Jean Pihet
2014-05-02  9:19             ` Jean Pihet
2014-05-02  9:19               ` Jean Pihet
2014-05-02 16:51               ` Will Deacon
2014-05-02 16:51                 ` Will Deacon
2014-05-05  7:07                 ` Jean Pihet
2014-05-05  7:07                   ` Jean Pihet
2014-05-06  8:51                   ` Will Deacon
2014-05-06  8:51                     ` Will Deacon
