linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [GIT PULL 0/4] perf/annotate loop detection V2, fixes
@ 2012-04-26 15:06 Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
	Frederic Weisbecker, Hagen Paul Pfeifer, Mike Galbraith,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	arnaldo.melo, Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling into tip/perf/core.

	Just flushing out the annotate queue before it gets too big.

	I'll work on the suggestions made on G+ and on lkml.

- Arnaldo

The following changes since commit a3f895be1f1ed17f66e6e71adeef0cc7f937512c:

  perf annotate browser: Initial loop detection (2012-04-24 14:24:28 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-annotate-for-mingo

for you to fetch changes up to 38b31bd0cefbb0e69a182d9a94b09a7e648549dc:

  perf annotate browser: Don't draw jump connectors for out of function jumps (2012-04-25 14:18:42 -0300)

----------------------------------------------------------------
Fixes on top of the previous perf/annotate pull request

. Sometimes a jump points to an offset with no instructions, make the
  mark jump targets function handle that, for now just ignoring such
  jump targets, more investigation is needed to figure out how to cope
  with that.

. Handle jump targets that are outside the function, for now just don't
  try to draw the connector arrow, right thing seems to be to mark this
  jump with a -> (right arrow) and handle it like a callq.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (4):
      perf annotate browser: Handle NULL jump targets
      perf annotate: Disambiguage offsets and addresses in operands
      perf annotate: Mark jump instructions with no offset
      perf annotate browser: Don't draw jump connectors for out of function jumps

 tools/perf/ui/browsers/annotate.c |   27 ++++++++++++++++++---------
 tools/perf/util/annotate.c        |   27 +++++++++++++++------------
 tools/perf/util/annotate.h        |   13 +++++++++++--
 3 files changed, 44 insertions(+), 23 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] perf annotate browser: Handle NULL jump targets
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
@ 2012-04-26 15:06 ` Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands Arnaldo Carvalho de Melo
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

In annotate_browser__mark_jump_targets

702                     dlt = browser->offsets[dl->ops.target];
703                     bdlt = disasm_line__browser(dlt);
704                     bdlt->jump_target = true;
705             }
706
707     }

(gdb) p size
$5 = 2415
(gdb) p offset
$6 = 140
(gdb) p dl->ops.target
$7 = 143
(gdb) p browser->offsets[143]
$8 = (struct disasm_line *) 0x0
(gdb) p dl->name
$9 = 0x2363bd0 "je"
(gdb)

Really strange, the code assumed that at the jump target we would have
an assembly line, but only in the previous instruction offset we have a
'lock':

(gdb) p browser->offsets[144]
$10 = (struct disasm_line *) 0x0
(gdb) p browser->offsets[142]
$11 = (struct disasm_line *) 0x27bd620
(gdb) p browser->offsets[142]->name
$12 = 0x237a8a0 "lock"
(gdb)

I'll study this more, but for now I'll just check if there is a
disasm_line at dl->ops.target, i.e. a valid jump target.

Reported-by: Hagen Paul Pfeifer <hagen@jauu.net>
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-inzjrzyqhkzyv78met2vula6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/annotate.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 9e3310c..4c83fe3 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -700,6 +700,13 @@ static void annotate_browser__mark_jump_targets(struct annotate_browser *browser
 		}
 
 		dlt = browser->offsets[dl->ops.target];
+		/*
+ 		 * FIXME: Oops, no jump target? Buggy disassembler? Or do we
+ 		 * have to adjust to the previous offset?
+ 		 */
+		if (dlt == NULL)
+			continue;
+
 		bdlt = disasm_line__browser(dlt);
 		bdlt->jump_target = true;
 	}
-- 
1.7.9.2.358.g22243


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
@ 2012-04-26 15:06 ` Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 3/4] perf annotate: Mark jump instructions with no offset Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We were using ins_ops->target for callq addresses and jump offsets,
disambiguate by having ins_ops->target.addr and ins_ops->target.offset.

For jumps we'll need both to fixup lines that don't have an offset on
the <> part.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3nlcmstua75u07ao7wja1rwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/annotate.c |   14 +++++++-------
 tools/perf/util/annotate.c        |   20 ++++++++++----------
 tools/perf/util/annotate.h        |    7 +++++--
 3 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 4c83fe3..73e1ef0 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -112,7 +112,7 @@ static void annotate_browser__write(struct ui_browser *self, void *entry, int ro
 			ui_browser__set_color(self, color);
 		if (dl->ins && dl->ins->ops->scnprintf) {
 			if (ins__is_jump(dl->ins)) {
-				bool fwd = dl->ops.target > (u64)dl->offset;
+				bool fwd = dl->ops.target.offset > (u64)dl->offset;
 
 				ui_browser__write_graph(self, fwd ? SLSMG_DARROW_CHAR :
 								    SLSMG_UARROW_CHAR);
@@ -156,7 +156,7 @@ static void annotate_browser__draw_current_loop(struct ui_browser *browser)
 		if (!pos->ins || !ins__is_jump(pos->ins))
 			continue;
 
-		target = ab->offsets[pos->ops.target];
+		target = ab->offsets[pos->ops.target.offset];
 		if (!target)
 			continue;
 
@@ -360,7 +360,7 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
 	if (!ins__is_call(dl->ins))
 		return false;
 
-	ip = ms->map->map_ip(ms->map, dl->ops.target);
+	ip = ms->map->map_ip(ms->map, dl->ops.target.addr);
 	target = map__find_symbol(ms->map, ip, NULL);
 	if (target == NULL) {
 		ui_helpline__puts("The called function was not found.");
@@ -411,7 +411,7 @@ static bool annotate_browser__jump(struct annotate_browser *browser)
 	if (!ins__is_jump(dl->ins))
 		return false;
 
-	dl = annotate_browser__find_offset(browser, dl->ops.target, &idx);
+	dl = annotate_browser__find_offset(browser, dl->ops.target.offset, &idx);
 	if (dl == NULL) {
 		ui_helpline__puts("Invallid jump offset");
 		return true;
@@ -692,14 +692,14 @@ static void annotate_browser__mark_jump_targets(struct annotate_browser *browser
 		if (!dl || !dl->ins || !ins__is_jump(dl->ins))
 			continue;
 
-		if (dl->ops.target >= size) {
+		if (dl->ops.target.offset >= size) {
 			ui__error("jump to after symbol!\n"
 				  "size: %zx, jump target: %" PRIx64,
-				  size, dl->ops.target);
+				  size, dl->ops.target.offset);
 			continue;
 		}
 
-		dlt = browser->offsets[dl->ops.target];
+		dlt = browser->offsets[dl->ops.target.offset];
 		/*
  		 * FIXME: Oops, no jump target? Buggy disassembler? Or do we
  		 * have to adjust to the previous offset?
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index b07d7d1..e1e7d0e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -22,7 +22,7 @@ static int call__parse(struct ins_operands *ops)
 {
 	char *endptr, *tok, *name;
 
-	ops->target = strtoull(ops->raw, &endptr, 16);
+	ops->target.addr = strtoull(ops->raw, &endptr, 16);
 
 	name = strchr(endptr, '<');
 	if (name == NULL)
@@ -35,17 +35,17 @@ static int call__parse(struct ins_operands *ops)
 		return -1;
 
 	*tok = '\0';
-	ops->target_name = strdup(name);
+	ops->target.name = strdup(name);
 	*tok = '>';
 
-	return ops->target_name == NULL ? -1 : 0;
+	return ops->target.name == NULL ? -1 : 0;
 
 indirect_call:
 	tok = strchr(endptr, '*');
 	if (tok == NULL)
 		return -1;
 
-	ops->target = strtoull(tok + 1, NULL, 16);
+	ops->target.addr = strtoull(tok + 1, NULL, 16);
 	return 0;
 }
 
@@ -55,10 +55,10 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
 	if (addrs)
 		return scnprintf(bf, size, "%-6.6s %s", ins->name, ops->raw);
 
-	if (ops->target_name)
-		return scnprintf(bf, size, "%-6.6s %s", ins->name, ops->target_name);
+	if (ops->target.name)
+		return scnprintf(bf, size, "%-6.6s %s", ins->name, ops->target.name);
 
-	return scnprintf(bf, size, "%-6.6s *%" PRIx64, ins->name, ops->target);
+	return scnprintf(bf, size, "%-6.6s *%" PRIx64, ins->name, ops->target.addr);
 }
 
 static struct ins_ops call_ops = {
@@ -78,7 +78,7 @@ static int jump__parse(struct ins_operands *ops)
 	if (s++ == NULL)
 		return -1;
 
-	ops->target = strtoll(s, NULL, 16);
+	ops->target.offset = strtoll(s, NULL, 16);
 	return 0;
 }
 
@@ -88,7 +88,7 @@ static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
 	if (addrs)
 		return scnprintf(bf, size, "%-6.6s %s", ins->name, ops->raw);
 
-	return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target);
+	return scnprintf(bf, size, "%-6.6s %" PRIx64, ins->name, ops->target.offset);
 }
 
 static struct ins_ops jump_ops = {
@@ -289,7 +289,7 @@ void disasm_line__free(struct disasm_line *dl)
 {
 	free(dl->line);
 	free(dl->name);
-	free(dl->ops.target_name);
+	free(dl->ops.target.name);
 	free(dl);
 }
 
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 8a8af0d..2b9e3e0 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -11,8 +11,11 @@ struct ins;
 
 struct ins_operands {
 	char	*raw;
-	char	*target_name;
-	u64	target;
+	struct {
+		char	*name;
+		u64	offset;
+		u64	addr;
+	} target;
 };
 
 struct ins_ops {
-- 
1.7.9.2.358.g22243


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/4] perf annotate: Mark jump instructions with no offset
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands Arnaldo Carvalho de Melo
@ 2012-04-26 15:06 ` Arnaldo Carvalho de Melo
  2012-04-26 15:06 ` [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps Arnaldo Carvalho de Melo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

I.e. jumps that go to code outside the current function, that is denoted
in objdump -dS as:

   399f877a9f: jne    399f87bcf4 <_L_lock_5154>

I.e. without the + after the name of the current function, like in:

   399f877aa5: jmp    399f877ab2 <_int_free+0x412>

The browser will use that info to avoid drawing connectors to the start
of the function, since ops.target.addr was zero.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xrn35g2mlawz1ydo1p73w3q6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/annotate.c |    9 ++++++---
 tools/perf/util/annotate.h |    6 ++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e1e7d0e..5eb3412 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -75,10 +75,13 @@ static int jump__parse(struct ins_operands *ops)
 {
 	const char *s = strchr(ops->raw, '+');
 
-	if (s++ == NULL)
-		return -1;
+	ops->target.addr = strtoll(ops->raw, NULL, 16);
+
+	if (s++ != NULL)
+		ops->target.offset = strtoll(s, NULL, 16);
+	else
+		ops->target.offset = UINT64_MAX;
 
-	ops->target.offset = strtoll(s, NULL, 16);
 	return 0;
 }
 
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 2b9e3e0..13a21f1 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -2,6 +2,7 @@
 #define __PERF_ANNOTATE_H
 
 #include <stdbool.h>
+#include <stdint.h>
 #include "types.h"
 #include "symbol.h"
 #include <linux/list.h>
@@ -41,6 +42,11 @@ struct disasm_line {
 	struct ins_operands ops;
 };
 
+static inline bool disasm_line__has_offset(const struct disasm_line *dl)
+{
+	return dl->ops.target.offset != UINT64_MAX;
+}
+
 void disasm_line__free(struct disasm_line *dl);
 struct disasm_line *disasm__get_next_ip_line(struct list_head *head, struct disasm_line *pos);
 size_t disasm__fprintf(struct list_head *head, FILE *fp);
-- 
1.7.9.2.358.g22243


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
                   ` (2 preceding siblings ...)
  2012-04-26 15:06 ` [PATCH 3/4] perf annotate: Mark jump instructions with no offset Arnaldo Carvalho de Melo
@ 2012-04-26 15:06 ` Arnaldo Carvalho de Melo
  2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
  2012-04-27 16:35 ` Linus Torvalds
  5 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
	Frederic Weisbecker, Mike Galbraith, Namhyung Kim,
	Paul Mackerras, Peter Zijlstra, Stephane Eranian

From: Arnaldo Carvalho de Melo <acme@redhat.com>

As described in the previous patch. Next step is to properly label those
jumps by using a -> arrow, i.e. not backwards/forwards, and allow the
user to navigate to this other function when enter or -> is pressed.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ax2sss463eu88wgl9ee8a6b6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/ui/browsers/annotate.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 73e1ef0..077380b 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -153,7 +153,8 @@ static void annotate_browser__draw_current_loop(struct ui_browser *browser)
 	unsigned int from, to, start_width = 2;
 
 	list_for_each_entry_from(pos, &notes->src->source, node) {
-		if (!pos->ins || !ins__is_jump(pos->ins))
+		if (!pos->ins || !ins__is_jump(pos->ins) ||
+		    !disasm_line__has_offset(pos))
 			continue;
 
 		target = ab->offsets[pos->ops.target.offset];
@@ -689,7 +690,8 @@ static void annotate_browser__mark_jump_targets(struct annotate_browser *browser
 		struct disasm_line *dl = browser->offsets[offset], *dlt;
 		struct browser_disasm_line *bdlt;
 
-		if (!dl || !dl->ins || !ins__is_jump(dl->ins))
+		if (!dl || !dl->ins || !ins__is_jump(dl->ins) ||
+		    !disasm_line__has_offset(dl))
 			continue;
 
 		if (dl->ops.target.offset >= size) {
-- 
1.7.9.2.358.g22243


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
                   ` (3 preceding siblings ...)
  2012-04-26 15:06 ` [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps Arnaldo Carvalho de Melo
@ 2012-04-27  7:21 ` Ingo Molnar
  2012-04-27  8:06   ` Ingo Molnar
  2012-04-27 15:16   ` Arnaldo Carvalho de Melo
  2012-04-27 16:35 ` Linus Torvalds
  5 siblings, 2 replies; 20+ messages in thread
From: Ingo Molnar @ 2012-04-27  7:21 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, arnaldo.melo,
	Arnaldo Carvalho de Melo, Linus Torvalds


* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> Hi Ingo,
> 
> 	Please consider pulling into tip/perf/core.
> 
> 	Just flushing out the annotate queue before it gets too big.
> 
> 	I'll work on the suggestions made on G+ and on lkml.
> 
> - Arnaldo
> 
> The following changes since commit a3f895be1f1ed17f66e6e71adeef0cc7f937512c:
> 
>   perf annotate browser: Initial loop detection (2012-04-24 14:24:28 -0300)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-annotate-for-mingo
> 
> for you to fetch changes up to 38b31bd0cefbb0e69a182d9a94b09a7e648549dc:
> 
>   perf annotate browser: Don't draw jump connectors for out of function jumps (2012-04-25 14:18:42 -0300)
> 
> ----------------------------------------------------------------
> Fixes on top of the previous perf/annotate pull request
> 
> . Sometimes a jump points to an offset with no instructions, make the
>   mark jump targets function handle that, for now just ignoring such
>   jump targets, more investigation is needed to figure out how to cope
>   with that.
> 
> . Handle jump targets that are outside the function, for now just don't
>   try to draw the connector arrow, right thing seems to be to mark this
>   jump with a -> (right arrow) and handle it like a callq.
> 
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (4):
>       perf annotate browser: Handle NULL jump targets
>       perf annotate: Disambiguage offsets and addresses in operands
>       perf annotate: Mark jump instructions with no offset
>       perf annotate browser: Don't draw jump connectors for out of function jumps
> 
>  tools/perf/ui/browsers/annotate.c |   27 ++++++++++++++++++---------
>  tools/perf/util/annotate.c        |   27 +++++++++++++++------------
>  tools/perf/util/annotate.h        |   13 +++++++++++--
>  3 files changed, 44 insertions(+), 23 deletions(-)

Pulled, thanks a lot Arnaldo!

Works pretty well here, the segfaults are gone. I think we can 
use it as a starting point so I've pulled it into perf/core.

One thing that we need to fix with this new scheme is the visual 
dynamism/unpredictability of the loop drawing:

As I move the cursor line up and down the current loop is shown 
and it flips in and out of view depending on where I am. In one 
way it's a feature (it shows the currently interesting loop) - 
but in another way that kind of 'active' UI element draws my eye 
all the time and it gets tiring and hard to ignore when I'm not 
interested in the current loop but look at other elements of the 
UI output.

It would be better to have a stable image of loops that is 
static and scrolls up and down with the rest of the image. When 
I press 'o' to see the raw disassembly it should probably 
disappear completely.

This would require for us to draw all loops that are 
interesting: probably all backwards jumping ones, with some 
nesting limit of 5 or so. I think we really need this loop graph 
for this UI to have low visual overhead :-/

Beyond improving visual stability of the output, this would also 
obviously be (much) more informative as for reasonably sized 
functions it would show all the current loop contexts - which is 
very useful when one tries to make sense of a function in 
assembly.

Doing that will take up some screen real estate on the left, 
because with increasing nesting levels there will be parallel 
lines and crossing lines - I did a quick ASCII mockup and I 
think it will be fine, as long as we have a hotkey that makes it 
easy to show/hide the loop graph.

If it's all concentrated within a narrow vertical column it also 
does not clutter the output and is easy to ignore when it's not 
needed.

How and whether labels should be mixed with this graph output 
remains to be seen - my intuition is that the two should be 
integrated, like the current code does it.

( Once we have that done and gather some experience with it can 
  we decide whether to show it by default. )

We can also save some screen real estate on the left side of the 
screen, the instruction overhead percentage column. Right now 
the largest entry possible is:

  100.00 ││        lea    (%rbx,%r12,1),%r14

This could be trimmed by two characters to:

 100.0 ││        lea    (%rbx,%r12,1),%r14

This saves a space before the percent value and reduces the 
width of the output - I think 0.1% granularity ought to be 
enough in practice. We should also probably hide entries below 
0.1% overhead as if they got no hits at all.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
@ 2012-04-27  8:06   ` Ingo Molnar
  2012-04-27 15:31     ` Arnaldo Carvalho de Melo
  2012-04-27 15:16   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 20+ messages in thread
From: Ingo Molnar @ 2012-04-27  8:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, arnaldo.melo,
	Arnaldo Carvalho de Melo, Linus Torvalds


* Ingo Molnar <mingo@kernel.org> wrote:

> [...]
> 
> This would require for us to draw all loops that are 
> interesting: probably all backwards jumping ones, with some 
> nesting limit of 5 or so. I think we really need this loop 
> graph for this UI to have low visual overhead :-/
> 
> Beyond improving visual stability of the output, this would 
> also obviously be (much) more informative as for reasonably 
> sized functions it would show all the current loop contexts - 
> which is very useful when one tries to make sense of a 
> function in assembly.

My (silent) hope would be that if we make assembly output more 
obvious, more structured visually then more people will read 
assembly code.

Note that once we have loop nesting there are countless other 
possibilities as well to augment the output.

( Note that I pile on a couple of visual suggestions below - so 
  the output becomes more and more complex - ideally we want to 
  pick the ones from below that we like and skip the ones that 
  are not so good. )

1) Loop grouping

We could slightly group the assembly instructions themselves 
as well:

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
    0.00 │┌─→ c0:  cmp    %rax,%rbx
    0.00 ││        mov    %rax,%rcx
    0.00 ││      ↓ je     5dd
    0.00 ││   cc:  test   %rcx,%rcx
    0.00 ││      ↓ je     da
    0.00 ││        mov    0x8(%rcx),%esi
    0.00 ││        shr    $0x4,%esi
    0.00 ││        sub    $0x2,%esi
    0.00 ││   da:  mov    %rcx,0x10(%rbx)
    0.00 ││        mov    %rcx,%rax
    0.00 ││        cmpl   $0x0,%fs:0x18
    0.00 ││      ↓ je     ed
    0.00 ││        lock   cmpxchg %rbx,(%rdx)
    0.00 ││        cmp    %rax,%rcx
    0.00 │└──────↑ jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ je     104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ jne    719

See that by using a newline how much clearer the loop stands 
out, visually?

2) Nesting

The classic C method to show nesting is to use curly braces and 
slight indentation:

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   je     5dd
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   je     da
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   je     ed
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ je     104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ jne    719

3) Separating out forward conditional jumps

    0.00 │       ↓ jmp    cc
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    5dd
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   .je    da
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    ed
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    c0
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    104
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   719

So the forward conditional jumps get moved by one character to 
the right and get prefixed with '.', which has the following 
effects:

 - the flow of all the 'other', register modifying and loop 
   instructions can be seen more clearly

 - the 'exception' forward conditional jumps get moved into the 
   visual background, slightly.

 - blocks of instructions with no branches amongst them are more 
   clearly visible.

If '.' is too aggressive or too confusing then some other 
(possibly graphical) character could be used as well?

4) Marking labels

I'd argue we bring in <ab> as label markers:

    0.00 │       ↓ jmp    <cc>
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ c0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    <5dd>
    0.00 ││   cc:    test   %rcx,%rcx
    0.00 ││      ↓   .je    <da>
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   da:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    <ed>
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    <c0>
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    <104>
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   <719>

Note how much easier it becomes to read the body of the loop: 
the <> labels are clearly separated from constants/offsets 
within the other assembly instructions. In the current output 
the two easily 'mix', which is bad.

5)

Another trick we could use to separate labels from constants 
more clearly is capitalization:

    0.00 │       ↓ jmp    <CC>
    0.00 │         nopl   0x0(%rax)
         │
         │         {
    0.00 │┌─→ C0:    cmp    %rax,%rbx
    0.00 ││          mov    %rax,%rcx
    0.00 ││      ↓   .je    <5DD>
    0.00 ││   CC:    test   %rcx,%rcx
    0.00 ││      ↓   .je    <DA>
    0.00 ││          mov    0x8(%rcx),%esi
    0.00 ││          shr    $0x4,%esi
    0.00 ││          sub    $0x2,%esi
    0.00 ││   DA:    mov    %rcx,0x10(%rbx)
    0.00 ││          mov    %rcx,%rax
    0.00 ││          cmpl   $0x0,%fs:0x18
    0.00 ││      ↓   .je    <ED>
    0.00 ││          lock   cmpxchg %rbx,(%rdx)
    0.00 ││          cmp    %rax,%rcx
         ││
    0.00 │└──────↑ } jne    <C0>
         │
    0.00 │         test   %rcx,%rcx
    0.00 │       ↓ .je    <104>
    0.00 │         cmp    %r12d,%esi
    0.00 │       ↓ .jne   <719>

I'm not entirely convinced we want this one.

6) Making callq's to function symbols more C-alike

Before:

    0.00 ││  70:  mov    %rbx,%rdi
    0.00 ││       callq  _IO_switch_to_main_get_area
    0.00 ││       mov    0x8(%rbx),%rdx
    0.00 ││       mov    0x10(%rbx),%rsi
    0.00 ││       cmp    %rsi,%rdx

After:

    0.00 ││  70:  mov    %rbx,%rdi
    0.00 ││       callq  _IO_switch_to_main_get_area()
    0.00 ││       mov    0x8(%rbx),%rdx
    0.00 ││       mov    0x10(%rbx),%rsi
    0.00 ││       cmp    %rsi,%rdx

The () makes it easier to separate the 'function' purpose of the 
symbol.

This would be especially useful once we start looking into 
debuginfo data and start symbolising stack frame offsets and 
data references:

Before:

    0.00 │   1b3:  mov    -0x38(%rbp),%rdx
    0.00 │         mov    -0x34(%rbp),%rcx
    0.00 │         mov    0x335fb4(%rip),%eax        # 392a5b0b60 <perturb_byte>

After:

  #
  # PARAM:idx          : 0x38
  # PARAM:addr         : 0x34
  # DATA:perturb_byte  : 0x392a5b0b60
  #

  ...

    0.00 │   1b3:  mov    P:idx....(%rbp),%rdx
    0.00 │         mov    P:addr...(%rbp),%rcx
    0.00 │         mov    D:pertu..(%rip),%eax

Where 'P:' stands for parameter, 'D:' for data, and the symbol 
names are length-culled to a fixed width 6 chars, to make the 
registers align better and to not interfere with other textual 
output. There's a summary at the top, to still have all the raw 
binary data available - but 'o' can be pressed as well to see 
the raw values.

This way it still looks like assembly - but is *much* more 
informative.

Or so.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
  2012-04-27  8:06   ` Ingo Molnar
@ 2012-04-27 15:16   ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 15:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Linus Torvalds

Em Fri, Apr 27, 2012 at 09:21:53AM +0200, Ingo Molnar escreveu:
> * Arnaldo Carvalho de Melo <acme@infradead.org> wrote:
> > Arnaldo Carvalho de Melo (4):
> >       perf annotate browser: Handle NULL jump targets
> >       perf annotate: Disambiguage offsets and addresses in operands
> >       perf annotate: Mark jump instructions with no offset
> >       perf annotate browser: Don't draw jump connectors for out of function jumps

> Pulled, thanks a lot Arnaldo!

> Works pretty well here, the segfaults are gone. I think we can 
> use it as a starting point so I've pulled it into perf/core.

> One thing that we need to fix with this new scheme is the visual 
> dynamism/unpredictability of the loop drawing:

Yeah, something to experiment, when Linus suggested it he was afraid
that we would get way too many lines, but Arjan has since requested that
we have a key to toggle showing all jumps at the same time, so that is
what I'm going to do.

Then we can check how that goes.
 
> As I move the cursor line up and down the current loop is shown 
> and it flips in and out of view depending on where I am. In one 
> way it's a feature (it shows the currently interesting loop) - 

That is why probably we want to get a
show-me-just-the-current-loop/show-me-all-loops toggle key.

> but in another way that kind of 'active' UI element draws my eye 
> all the time and it gets tiring and hard to ignore when I'm not 
> interested in the current loop but look at other elements of the 
> UI output.
> 
> It would be better to have a stable image of loops that is 
> static and scrolls up and down with the rest of the image. When 
> I press 'o' to see the raw disassembly it should probably 
> disappear completely.

Right, that can be done as well, but perhaps its better to have a 'j'
key that cycles thru:

. Show me just the current loop
. Show me all loops
. No loops please

That can be used in both raw and augmented mode.
 
> This would require for us to draw all loops that are 
> interesting: probably all backwards jumping ones, with some 
> nesting limit of 5 or so. I think we really need this loop graph 
> for this UI to have low visual overhead :-/

Well, when first marking the jump targets we'll notice how many loops
there are, reserve N columns for them, then just have a loop where right
now we have the current loop arrow drawing code.
 
> Beyond improving visual stability of the output, this would also 
> obviously be (much) more informative as for reasonably sized 
> functions it would show all the current loop contexts - which is 
> very useful when one tries to make sense of a function in 
> assembly.
> 
> Doing that will take up some screen real estate on the left, 
> because with increasing nesting levels there will be parallel 
> lines and crossing lines - I did a quick ASCII mockup and I 
> think it will be fine, as long as we have a hotkey that makes it 
> easy to show/hide the loop graph.

'j' or 'l', I'll check which one is available
 
> If it's all concentrated within a narrow vertical column it also 
> does not clutter the output and is easy to ignore when it's not 
> needed.
> 
> How and whether labels should be mixed with this graph output 
> remains to be seen - my intuition is that the two should be 
> integrated, like the current code does it.
> 
> ( Once we have that done and gather some experience with it can 
>   we decide whether to show it by default. )

Right, hence all the toggle keys, to experiment with combinations of
output formats.

> We can also save some screen real estate on the left side of the 
> screen, the instruction overhead percentage column. Right now 
> the largest entry possible is:
> 
>   100.00 ││        lea    (%rbx,%r12,1),%r14
> 
> This could be trimmed by two characters to:
> 
>  100.0 ││        lea    (%rbx,%r12,1),%r14
> 
> This saves a space before the percent value and reduces the 
> width of the output - I think 0.1% granularity ought to be 
> enough in practice. We should also probably hide entries below 
> 0.1% overhead as if they got no hits at all.

Right, that will be used as well when I implement a suggestin Stephane
made while in San Francisco: To show multiple overhead columns for
sessions with multiple events, helping to correlate them.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27  8:06   ` Ingo Molnar
@ 2012-04-27 15:31     ` Arnaldo Carvalho de Melo
  2012-04-27 15:48       ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 15:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Linus Torvalds

Em Fri, Apr 27, 2012 at 10:06:13AM +0200, Ingo Molnar escreveu:
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > [...]
> > 
> > This would require for us to draw all loops that are 
> > interesting: probably all backwards jumping ones, with some 
> > nesting limit of 5 or so. I think we really need this loop 
> > graph for this UI to have low visual overhead :-/
> > 
> > Beyond improving visual stability of the output, this would 
> > also obviously be (much) more informative as for reasonably 
> > sized functions it would show all the current loop contexts - 
> > which is very useful when one tries to make sense of a 
> > function in assembly.
> 
> My (silent) hope would be that if we make assembly output more 
> obvious, more structured visually then more people will read 
> assembly code.

Well, the idea is to improve the experience of people who _already_ look
at assembly output so that they can become slightly (or a lot, hey, one
can dream) more productive, as we get older we need better eye lenses
8-)

Doing that we would eventually have something that would make more
people join this existing assembly lovers club.
 
> Note that once we have loop nesting there are countless other 
> possibilities as well to augment the output.

Yeah, ponies!
 
> ( Note that I pile on a couple of visual suggestions below - so 
>   the output becomes more and more complex - ideally we want to 
>   pick the ones from below that we like and skip the ones that 
>   are not so good. )
> 
> 1) Loop grouping
> 
> We could slightly group the assembly instructions themselves 
> as well:
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>     0.00 │┌─→ c0:  cmp    %rax,%rbx
>     0.00 ││        mov    %rax,%rcx
>     0.00 ││      ↓ je     5dd
>     0.00 ││   cc:  test   %rcx,%rcx
>     0.00 ││      ↓ je     da
>     0.00 ││        mov    0x8(%rcx),%esi
>     0.00 ││        shr    $0x4,%esi
>     0.00 ││        sub    $0x2,%esi
>     0.00 ││   da:  mov    %rcx,0x10(%rbx)
>     0.00 ││        mov    %rcx,%rax
>     0.00 ││        cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓ je     ed
>     0.00 ││        lock   cmpxchg %rbx,(%rdx)
>     0.00 ││        cmp    %rax,%rcx
>     0.00 │└──────↑ jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ je     104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ jne    719
> 
> See that by using a newline how much clearer the loop stands 
> out, visually?

Linus also made that suggestion, agreed.
 
> 2) Nesting
> 
> The classic C method to show nesting is to use curly braces and 
> slight indentation:
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   je     5dd
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   je     da
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   je     ed
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ je     104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ jne    719

I guess we could do away with the {, the spaces before/after the loop
+ the indentation (that Arjan also suggested on G+) should be enough,
right?
 
> 3) Separating out forward conditional jumps
> 
>     0.00 │       ↓ jmp    cc
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    5dd
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    da
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    ed
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    c0
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    104
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   719
> 
> So the forward conditional jumps get moved by one character to 
> the right and get prefixed with '.', which has the following 
> effects:
> 
>  - the flow of all the 'other', register modifying and loop 
>    instructions can be seen more clearly
> 
>  - the 'exception' forward conditional jumps get moved into the 
>    visual background, slightly.
> 
>  - blocks of instructions with no branches amongst them are more 
>    clearly visible.
> 
> If '.' is too aggressive or too confusing then some other 
> (possibly graphical) character could be used as well?

Yeah, there are some unused graphical chars. But perhaps we should just
use ↓ again?

     0.00 │┌─→ c0:    cmp    %rax,%rbx
     0.00 ││          mov    %rax,%rcx
     0.00 ││      ↓   ↓je    5dd
     0.00 ││   cc:    test   %rcx,%rcx
     0.00 ││      ↓   ↓je    da
     0.00 ││          mov    0x8(%rcx),%esi
     0.00 ││          shr    $0x4,%esi
     0.00 ││          sub    $0x2,%esi
     0.00 ││   da:    mov    %rcx,0x10(%rbx)
     0.00 ││          mov    %rcx,%rax
     0.00 ││          cmpl   $0x0,%fs:0x18
     0.00 ││      ↓   ↓je    ed
     0.00 ││          lock   cmpxchg %rbx,(%rdx)
     0.00 ││          cmp    %rax,%rcx
          ││
     0.00 │└──────↑ } jne    c0

Does it still stands out? I think so, we expect a letter there,
something very different is there, matching the other down arrowsome
columns to the left.

> 4) Marking labels
> 
> I'd argue we bring in <ab> as label markers:
> 
>     0.00 │       ↓ jmp    <cc>
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ c0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    <5dd>
>     0.00 ││   cc:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    <da>
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   da:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    <ed>
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    <c0>
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    <104>
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   <719>
> 
> Note how much easier it becomes to read the body of the loop: 
> the <> labels are clearly separated from constants/offsets 
> within the other assembly instructions. In the current output 
> the two easily 'mix', which is bad.

I think the <> is enough, i.e. no need for capitalization, as a bonus
point, it is already what is used in raw assembly mode, i.e. a subset of
users are already used to that visual cue.
 
> 5)
> 
> Another trick we could use to separate labels from constants 
> more clearly is capitalization:
> 
>     0.00 │       ↓ jmp    <CC>
>     0.00 │         nopl   0x0(%rax)
>          │
>          │         {
>     0.00 │┌─→ C0:    cmp    %rax,%rbx
>     0.00 ││          mov    %rax,%rcx
>     0.00 ││      ↓   .je    <5DD>
>     0.00 ││   CC:    test   %rcx,%rcx
>     0.00 ││      ↓   .je    <DA>
>     0.00 ││          mov    0x8(%rcx),%esi
>     0.00 ││          shr    $0x4,%esi
>     0.00 ││          sub    $0x2,%esi
>     0.00 ││   DA:    mov    %rcx,0x10(%rbx)
>     0.00 ││          mov    %rcx,%rax
>     0.00 ││          cmpl   $0x0,%fs:0x18
>     0.00 ││      ↓   .je    <ED>
>     0.00 ││          lock   cmpxchg %rbx,(%rdx)
>     0.00 ││          cmp    %rax,%rcx
>          ││
>     0.00 │└──────↑ } jne    <C0>
>          │
>     0.00 │         test   %rcx,%rcx
>     0.00 │       ↓ .je    <104>
>     0.00 │         cmp    %r12d,%esi
>     0.00 │       ↓ .jne   <719>
> 
> I'm not entirely convinced we want this one.
> 
> 6) Making callq's to function symbols more C-alike
> 
> Before:
> 
>     0.00 ││  70:  mov    %rbx,%rdi
>     0.00 ││       callq  _IO_switch_to_main_get_area
>     0.00 ││       mov    0x8(%rbx),%rdx
>     0.00 ││       mov    0x10(%rbx),%rsi
>     0.00 ││       cmp    %rsi,%rdx
> 
> After:
> 
>     0.00 ││  70:  mov    %rbx,%rdi
>     0.00 ││       callq  _IO_switch_to_main_get_area()
>     0.00 ││       mov    0x8(%rbx),%rdx
>     0.00 ││       mov    0x10(%rbx),%rsi
>     0.00 ││       cmp    %rsi,%rdx
> 
> The () makes it easier to separate the 'function' purpose of the 
> symbol.

Yes, I agreed already with that one, just was waiting a bit more to see
if somebody else would disagree (hint, hint)
 
> This would be especially useful once we start looking into 
> debuginfo data and start symbolising stack frame offsets and 
> data references:
> 
> Before:
> 
>     0.00 │   1b3:  mov    -0x38(%rbp),%rdx
>     0.00 │         mov    -0x34(%rbp),%rcx
>     0.00 │         mov    0x335fb4(%rip),%eax        # 392a5b0b60 <perturb_byte>
> 
> After:
> 
>   #
>   # PARAM:idx          : 0x38
>   # PARAM:addr         : 0x34
>   # DATA:perturb_byte  : 0x392a5b0b60
>   #
> 
>   ...
> 
>     0.00 │   1b3:  mov    P:idx....(%rbp),%rdx
>     0.00 │         mov    P:addr...(%rbp),%rcx
>     0.00 │         mov    D:pertu..(%rip),%eax
> 
> Where 'P:' stands for parameter, 'D:' for data, and the symbol 
> names are length-culled to a fixed width 6 chars, to make the 
> registers align better and to not interfere with other textual 
> output. There's a summary at the top, to still have all the raw 
> binary data available - but 'o' can be pressed as well to see 
> the raw values.
> 
> This way it still looks like assembly - but is *much* more 
> informative.

Yeah, there were some ideas related to figuring out what values are in
what registers at each line, easy for things like constants, requires a
bit more thought for other cases.
 
- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 15:31     ` Arnaldo Carvalho de Melo
@ 2012-04-27 15:48       ` Peter Zijlstra
  2012-04-27 16:07         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2012-04-27 15:48 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, Linus Torvalds

On Fri, 2012-04-27 at 12:31 -0300, Arnaldo Carvalho de Melo wrote:
> > So the forward conditional jumps get moved by one character to 
> > the right and get prefixed with '.', which has the following 
> > effects:
> > 
> >  - the flow of all the 'other', register modifying and loop 
> >    instructions can be seen more clearly
> > 
> >  - the 'exception' forward conditional jumps get moved into the 
> >    visual background, slightly.
> > 
> >  - blocks of instructions with no branches amongst them are more 
> >    clearly visible.
> > 
> > If '.' is too aggressive or too confusing then some other 
> > (possibly graphical) character could be used as well?
> 
> Yeah, there are some unused graphical chars. But perhaps we should just
> use ↓ again?
> 
>      0.00 │┌─→ c0:    cmp    %rax,%rbx
>      0.00 ││          mov    %rax,%rcx
>      0.00 ││      ↓   ↓je    5dd
>      0.00 ││   cc:    test   %rcx,%rcx
>      0.00 ││      ↓   ↓je    da
>      0.00 ││          mov    0x8(%rcx),%esi
>      0.00 ││          shr    $0x4,%esi
>      0.00 ││          sub    $0x2,%esi
>      0.00 ││   da:    mov    %rcx,0x10(%rbx)
>      0.00 ││          mov    %rcx,%rax
>      0.00 ││          cmpl   $0x0,%fs:0x18
>      0.00 ││      ↓   ↓je    ed
>      0.00 ││          lock   cmpxchg %rbx,(%rdx)
>      0.00 ││          cmp    %rax,%rcx
>           ││
>      0.00 │└──────↑ } jne    c0
> 
> Does it still stands out? I think so, we expect a letter there,
> something very different is there, matching the other down arrowsome
> columns to the left. 

That just looks daft.. what's wrong with the single up/down arrow?

Also, can we keep all this on the list please?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 15:48       ` Peter Zijlstra
@ 2012-04-27 16:07         ` Arnaldo Carvalho de Melo
  2012-04-27 16:20           ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 16:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, Linus Torvalds

Em Fri, Apr 27, 2012 at 05:48:57PM +0200, Peter Zijlstra escreveu:
> On Fri, 2012-04-27 at 12:31 -0300, Arnaldo Carvalho de Melo wrote:
> > Yeah, there are some unused graphical chars. But perhaps we should just
> > use ↓ again?
> > 
> >      0.00 │┌─→ c0:    cmp    %rax,%rbx
> >      0.00 ││          mov    %rax,%rcx
> >      0.00 ││      ↓   ↓je    5dd
> >      0.00 ││   cc:    test   %rcx,%rcx
> >      0.00 ││      ↓   ↓je    da
> >      0.00 ││          mov    0x8(%rcx),%esi
> >      0.00 ││          shr    $0x4,%esi
> >      0.00 ││          sub    $0x2,%esi
> >      0.00 ││   da:    mov    %rcx,0x10(%rbx)
> >      0.00 ││          mov    %rcx,%rax
> >      0.00 ││          cmpl   $0x0,%fs:0x18
> >      0.00 ││      ↓   ↓je    ed
> >      0.00 ││          lock   cmpxchg %rbx,(%rdx)
> >      0.00 ││          cmp    %rax,%rcx
> >           ││
> >      0.00 │└──────↑ } jne    c0
> > 
> > Does it still stands out? I think so, we expect a letter there,
> > something very different is there, matching the other down arrowsome
> > columns to the left. 
> 
> That just looks daft.. what's wrong with the single up/down arrow?

Yeah, that too, or perhaps we can bring the instruction column closer to
the jump column and then that makes it stand out better?
 
> Also, can we keep all this on the list please?

 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: 8bit

Is doing the trick with keeping the arrows, etc, readable here, I'll
check what is wrong with git-send-email that corrupted the messages when
I tried using it to post patches.

- Arnaldo


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 16:07         ` Arnaldo Carvalho de Melo
@ 2012-04-27 16:20           ` Peter Zijlstra
  2012-04-27 18:09             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2012-04-27 16:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, Linus Torvalds

On Fri, 2012-04-27 at 13:07 -0300, Arnaldo Carvalho de Melo wrote:
> > >      0.00 │┌─→ c0:    cmp    %rax,%rbx
> > >      0.00 ││          mov    %rax,%rcx
> > >      0.00 ││      ↓   ↓je    5dd
> > >      0.00 ││   cc:    test   %rcx,%rcx
> > >      0.00 ││      ↓   ↓je    da
> > >      0.00 ││          mov    0x8(%rcx),%esi
> > >      0.00 ││          shr    $0x4,%esi
> > >      0.00 ││          sub    $0x2,%esi
> > >      0.00 ││   da:    mov    %rcx,0x10(%rbx)
> > >      0.00 ││          mov    %rcx,%rax
> > >      0.00 ││          cmpl   $0x0,%fs:0x18
> > >      0.00 ││      ↓   ↓je    ed
> > >      0.00 ││          lock   cmpxchg %rbx,(%rdx)
> > >      0.00 ││          cmp    %rax,%rcx
> > >           ││
> > >      0.00 │└──────↑ } jne    c0
> > > 
> > > Does it still stands out? I think so, we expect a letter there,
> > > something very different is there, matching the other down arrowsome
> > > columns to the left. 
> > 
> > That just looks daft.. what's wrong with the single up/down arrow?
> 
> Yeah, that too, or perhaps we can bring the instruction column closer to
> the jump column and then that makes it stand out better? 

Well, there should be at least one space between the arrows and the
instruction, otherwise it reads funny. And looking at the above that's
already the case, except for that indenting you did.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
                   ` (4 preceding siblings ...)
  2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
@ 2012-04-27 16:35 ` Linus Torvalds
  2012-04-27 16:46   ` Arnaldo Carvalho de Melo
  5 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2012-04-27 16:35 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, arnaldo.melo,
	Arnaldo Carvalho de Melo

On Thu, Apr 26, 2012 at 8:06 AM, Arnaldo Carvalho de Melo
<acme@infradead.org> wrote:
>
> Fixes on top of the previous perf/annotate pull request
>
> . Sometimes a jump points to an offset with no instructions, make the
>  mark jump targets function handle that, for now just ignoring such
>  jump targets, more investigation is needed to figure out how to cope
>  with that.
>
> . Handle jump targets that are outside the function, for now just don't
>  try to draw the connector arrow, right thing seems to be to mark this
>  jump with a -> (right arrow) and handle it like a callq.

Ok, this looks pretty good, but I do have a few comments:

 - the "jumping around" of the loop view is distracting. I think it's
a very useful concept and generally well done, but quite frankly,
scrolling around a function and having the loop detection jump around
as you just want to scroll down is not a good interface. It just makes
the screen flicker annoyingly.

   I think it would be better to just make it static. I'd suggest
perhaps to show the loop when you explicitly ask for it (by pressing
"space" on the branch or the target or whatever), and *perhaps* by
default just statically show any non-overlapping loops (I don't care
what the priority is: inner-most, outer-most or "first branch seen" or
whatever).

 - the instruction decode is now broken. For example, the instruction

        callq  *0x10(%rax)

   now shows up as

        callq  *10

   which is obviously completely bogus - it's even broken the
constant. Your address simplification does something horribly bad.

Btw, since "retq" has a back arrow, maybe "callq" should have a
forward-pointing arrow on it too?

Talking about instruction decode, the instructions that should really
be simplified are things like this:

 - nopl   0x0(%rax,%rax,1)

   That's a totally worthless instruction. It should either be hidden
entirely, or perhaps just called "nop5" (for 5-byte nop)

 - mov    (%r9,%rsi,1),%rax

   That ",1" is completely bogus, and I don't understand why the tools
show that idiotic format to begin with. It's pure garbage. It adds
zero information.

 - mov    0x0(%r13),%r13

   That "0x0" is more useless garbage in the same vein. It is useless,
and just shows instruction encoding details that don't matter.

 - shl    $0x3,%ecx

   What does the "0x" part here really tell us? I'd suggest dropping
it for any constant < 9, because there really is no semantic value to
it.

 - mov    0xb897d9(%rip),%ecx        # ffffffff81c7beb8 <d_hash_shift>

   I think you should use the same simplification that you use for
call (once it is fixed), and just show this as

   mov    d_hash_shift,%ecx

because that's what both a human and a compiler would have written.
The "0xb897d9(%rip)" is not adding any information outside of (again)
pointless instruction encoding details.

Give the encoding details in the 'O'ffset view, by all means.

Btw, don't get me wrong. I really like the changes. It's just that now
that the asm is almost readable, the remaining stupid default decode
format details just show up so much more clearly.

Oh, and I think the long vertical line should just be removed. It's
not really adding anything, and with the loop lines it's just
distracting. Also, it's not even a solid line: it gets colorized by
percentage, and generally doesn't look good.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 16:35 ` Linus Torvalds
@ 2012-04-27 16:46   ` Arnaldo Carvalho de Melo
  2012-04-27 17:12     ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 16:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Fri, Apr 27, 2012 at 09:35:58AM -0700, Linus Torvalds escreveu:
> Btw, don't get me wrong. I really like the changes. It's just that now
> that the asm is almost readable, the remaining stupid default decode
> format details just show up so much more clearly.

Hey, I love the comments and suggestions, keep them coming when you feel
like doing it.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 16:46   ` Arnaldo Carvalho de Melo
@ 2012-04-27 17:12     ` Linus Torvalds
  2012-04-27 18:03       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2012-04-27 17:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Fri, Apr 27, 2012 at 9:46 AM, Arnaldo Carvalho de Melo
<acme@infradead.org> wrote:
> Em Fri, Apr 27, 2012 at 09:35:58AM -0700, Linus Torvalds escreveu:
>> Btw, don't get me wrong. I really like the changes. It's just that now
>> that the asm is almost readable, the remaining stupid default decode
>> format details just show up so much more clearly.
>
> Hey, I love the comments and suggestions, keep them coming when you feel
> like doing it.

I found another problem, and I think this one is more fundamental.

The "loop detection" is completely and utterly broken.

It seems to think that a backwards jump implies a loop. But that's not
at all true.

In fact, many backwards jumps are the *reverse* of loops. They are due
to *cold* code, that is totally uninteresting, and that was done
out-of-line. The backwards jump is not a loop at all, it's a jump back
to the hot code. In fact, it's often a jump back to the *exit* of a
function, when the cold code returns an error value (but the actual
code to do the "return" part was generated earlier as part of the hot
normal case code).

So making a big deal out of it as if it was a loop can be actively
wrong and misleading.

(And yes, I'm looking at an example of that right now -
__d_lookup_rcu() has this, for example)

Now, it's often nice to see the line to find the branch target
(whether it's a loop or not), but you don't show them for forwards
branches, you only show them for backwards branches, as if the
backwards branches were somehow more important. But they really really
aren't.

                    Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 17:12     ` Linus Torvalds
@ 2012-04-27 18:03       ` Arnaldo Carvalho de Melo
  2012-04-27 18:23         ` Linus Torvalds
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 18:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Fri, Apr 27, 2012 at 10:12:45AM -0700, Linus Torvalds escreveu:
> On Fri, Apr 27, 2012 at 9:46 AM, Arnaldo Carvalho de Melo
> <acme@infradead.org> wrote:
> > Em Fri, Apr 27, 2012 at 09:35:58AM -0700, Linus Torvalds escreveu:
> >> Btw, don't get me wrong. I really like the changes. It's just that now
> >> that the asm is almost readable, the remaining stupid default decode
> >> format details just show up so much more clearly.
> >
> > Hey, I love the comments and suggestions, keep them coming when you feel
> > like doing it.
> 
> I found another problem, and I think this one is more fundamental.
> 
> The "loop detection" is completely and utterly broken.

Arjan noticed that too ;-)
 
> It seems to think that a backwards jump implies a loop. But that's not
> at all true.

Yeah, the jump has to be conditional.

I should have reworded the "loop detection" with "basic jump arrows" in
the first place.
 
> In fact, many backwards jumps are the *reverse* of loops. They are due
> to *cold* code, that is totally uninteresting, and that was done
> out-of-line. The backwards jump is not a loop at all, it's a jump back
> to the hot code. In fact, it's often a jump back to the *exit* of a
> function, when the cold code returns an error value (but the actual
> code to do the "return" part was generated earlier as part of the hot
> normal case code).

That makes me think if another thing to add to the TODO isn't to detect
those cold code blocks from hot ones, even without profiling, just by
looking at these well known patterns.

Profiling would just prove that when done.

Such cold blocks could start folded, so that what appeared on the screen
at first would be just the hot path.
 
> So making a big deal out of it as if it was a loop can be actively
> wrong and misleading.
> 
> (And yes, I'm looking at an example of that right now -
> __d_lookup_rcu() has this, for example)
> 
> Now, it's often nice to see the line to find the branch target
> (whether it's a loop or not), but you don't show them for forwards
> branches, you only show them for backwards branches, as if the
> backwards branches were somehow more important. But they really really
> aren't.

Yeah, there has to be a key to press over jumps to ask for it to point
to its target.

In addition another key to press anywhere to ask for the current loop,
if any, and this key, if pressed twice, should show a capped number of
loops, something like that.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 16:20           ` Peter Zijlstra
@ 2012-04-27 18:09             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 18:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, Linus Torvalds

Em Fri, Apr 27, 2012 at 06:20:13PM +0200, Peter Zijlstra escreveu:
> On Fri, 2012-04-27 at 13:07 -0300, Arnaldo Carvalho de Melo wrote:
> > Yeah, that too, or perhaps we can bring the instruction column closer to
> > the jump column and then that makes it stand out better? 
> 
> Well, there should be at least one space between the arrows and the
> instruction, otherwise it reads funny. And looking at the above that's
> already the case, except for that indenting you did.

Right, I just looked again at then unindented output, nevermind then :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 18:03       ` Arnaldo Carvalho de Melo
@ 2012-04-27 18:23         ` Linus Torvalds
  2012-04-27 19:17           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2012-04-27 18:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Fri, Apr 27, 2012 at 11:03 AM, Arnaldo Carvalho de Melo
<acme@infradead.org> wrote:
>
>> It seems to think that a backwards jump implies a loop. But that's not
>> at all true.
>
> Yeah, the jump has to be conditional.

Not at all.

Unconditional backwards jumps can easily be parts of loops. It's a
very valid loop that does basically

   for (;;) {
   }

with a few exit cases in the *middle* of the loop. Sometimes that is
the right way to write things, and sometimes gcc rewrites things that
way for other reasons.

And conditional backwards jumps are *not* automatically loops either. Doing a

   if (error)
      return error;

is perfectly normal - and that "return error" may well be a backwards
jump to the "return" code that was generated earlier.

Seriously: backwards jumps are not loops. Not unconditional ones, not
conditional ones.

The only way to find a loop is to follow the flow control and notice
that it closes a loop.

> I should have reworded the "loop detection" with "basic jump arrows" in
> the first place.

.. and that is fine. But then you need to do it for *forwards* jumps
too. There is no difference between backwards and forwards jumps
*unless* you are looking for loops, and if you are looking for loops
you need to actually find the cycle.

                            Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 18:23         ` Linus Torvalds
@ 2012-04-27 19:17           ` Arnaldo Carvalho de Melo
  2012-04-28 13:38             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-27 19:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Fri, Apr 27, 2012 at 11:23:24AM -0700, Linus Torvalds escreveu:
> On Fri, Apr 27, 2012 at 11:03 AM, Arnaldo Carvalho de Melo
> <acme@infradead.org> wrote:
> >
> >> It seems to think that a backwards jump implies a loop. But that's not
> >> at all true.
> >
> > Yeah, the jump has to be conditional.
> 
> Not at all.
> 
> Unconditional backwards jumps can easily be parts of loops. It's a
> very valid loop that does basically
> 
>    for (;;) {
>    }
> 
> with a few exit cases in the *middle* of the loop. Sometimes that is
> the right way to write things, and sometimes gcc rewrites things that
> way for other reasons.
> 
> And conditional backwards jumps are *not* automatically loops either. Doing a
> 
>    if (error)
>       return error;
> 
> is perfectly normal - and that "return error" may well be a backwards
> jump to the "return" code that was generated earlier.
> 
> Seriously: backwards jumps are not loops. Not unconditional ones, not
> conditional ones.
> 
> The only way to find a loop is to follow the flow control and notice
> that it closes a loop.

Ok, reading it now it all seems so obvious, stoopid me, thanks for the
explanation.
 
> > I should have reworded the "loop detection" with "basic jump arrows" in
> > the first place.
> 
> .. and that is fine. But then you need to do it for *forwards* jumps
> too. There is no difference between backwards and forwards jumps
> *unless* you are looking for loops, and if you are looking for loops
> you need to actually find the cycle.

Ok, so I changed things to not try to detect loops at all, for now, and
instead just start with 'show jumps' on, hotkey 'j', that will draw
arrows from jumps to its targets, backwards or forwards, when the cursor
is on a jump instruction, take a look:

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index 32ac116..f4b2530 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -600,8 +600,9 @@ void ui_browser__write_graph(struct ui_browser *browser __used, int graph)
 	SLsmg_set_char_set(0);
 }
 
-void __ui_browser__line_arrow_up(struct ui_browser *browser, unsigned int column,
-				 u64 start, u64 end, int start_width)
+static void __ui_browser__line_arrow_up(struct ui_browser *browser,
+					unsigned int column,
+					u64 start, u64 end, int start_width)
 {
 	unsigned int row, end_row;
 
@@ -639,6 +640,55 @@ out:
 	SLsmg_set_char_set(0);
 }
 
+static void __ui_browser__line_arrow_down(struct ui_browser *browser,
+					  unsigned int column,
+					  u64 start, u64 end, int start_width)
+{
+	unsigned int row, end_row;
+
+	SLsmg_set_char_set(1);
+
+	if (start >= browser->top_idx) {
+		row = start - browser->top_idx;
+		ui_browser__gotorc(browser, row, column);
+		SLsmg_write_char(SLSMG_ULCORN_CHAR);
+		ui_browser__gotorc(browser, row, column + 1);
+		SLsmg_draw_hline(start_width);
+
+		if (row++ == 0)
+			goto out;
+	} else
+		row = 0;
+
+	if (end >= browser->top_idx + browser->height)
+		end_row = browser->height - 1;
+	else
+		end_row = end - browser->top_idx;;
+
+	ui_browser__gotorc(browser, row, column);
+	SLsmg_draw_vline(end_row - row + 1);
+
+	ui_browser__gotorc(browser, end_row, column);
+	if (end < browser->top_idx + browser->height) {
+		SLsmg_write_char(SLSMG_LLCORN_CHAR);
+		ui_browser__gotorc(browser, end_row, column + 1);
+		SLsmg_write_char(SLSMG_HLINE_CHAR);
+		ui_browser__gotorc(browser, end_row, column + 2);
+		SLsmg_write_char(SLSMG_RARROW_CHAR);
+	}
+out:
+	SLsmg_set_char_set(0);
+}
+
+void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column,
+			      u64 start, u64 end, int start_width)
+{
+	if (start > end)
+		__ui_browser__line_arrow_up(browser, column, start, end, start_width);
+	else
+		__ui_browser__line_arrow_down(browser, column, start, end, start_width);
+}
+
 void ui_browser__init(void)
 {
 	int i = 0;
diff --git a/tools/perf/ui/browser.h b/tools/perf/ui/browser.h
index 2f226cb..059764b 100644
--- a/tools/perf/ui/browser.h
+++ b/tools/perf/ui/browser.h
@@ -38,8 +38,8 @@ void ui_browser__reset_index(struct ui_browser *self);
 
 void ui_browser__gotorc(struct ui_browser *self, int y, int x);
 void ui_browser__write_graph(struct ui_browser *browser, int graph);
-void __ui_browser__line_arrow_up(struct ui_browser *browser, unsigned int column,
-				 u64 start, u64 end, int start_width);
+void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column,
+			      u64 start, u64 end, int start_width);
 void __ui_browser__show_title(struct ui_browser *browser, const char *title);
 void ui_browser__show_title(struct ui_browser *browser, const char *title);
 int ui_browser__show(struct ui_browser *self, const char *title,
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 4778172..d203daf 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -30,6 +30,7 @@ struct annotate_browser {
 	int		    nr_entries;
 	bool		    hide_src_code;
 	bool		    use_offset;
+	bool		    jump_arrows;
 	bool		    searching_backwards;
 	u8		    offset_width;
 	char		    search_bf[128];
@@ -144,56 +145,47 @@ static void annotate_browser__write(struct ui_browser *self, void *entry, int ro
 		ab->selection = dl;
 }
 
-static void annotate_browser__draw_current_loop(struct ui_browser *browser)
+static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
-	struct map_symbol *ms = browser->priv;
-	struct symbol *sym = ms->sym;
-	struct annotation *notes = symbol__annotation(sym);
-	struct disasm_line *cursor = ab->selection, *pos = cursor, *target;
-	struct browser_disasm_line *bcursor = disasm_line__browser(cursor),
-				   *btarget, *bpos;
+	struct disasm_line *cursor = ab->selection, *target;
+	struct browser_disasm_line *btarget, *bcursor;
 	unsigned int from, to, start_width = 2;
 
-	list_for_each_entry_from(pos, &notes->src->source, node) {
-		if (!pos->ins || !ins__is_jump(pos->ins) ||
-		    !disasm_line__has_offset(pos))
-			continue;
-
-		target = ab->offsets[pos->ops.target.offset];
-		if (!target)
-			continue;
+	if (!cursor->ins || !ins__is_jump(cursor->ins) ||
+	    !disasm_line__has_offset(cursor))
+		return;
 
-		btarget = disasm_line__browser(target);
-		if (btarget->idx <= bcursor->idx)
-			goto found;
-	}
+	target = ab->offsets[cursor->ops.target.offset];
+	if (!target)
+		return;
 
-	return;
+	bcursor = disasm_line__browser(cursor);
+	btarget = disasm_line__browser(target);
 
-found:
-	bpos = disasm_line__browser(pos);
 	if (ab->hide_src_code) {
-		from = bpos->idx_asm;
+		from = bcursor->idx_asm;
 		to = btarget->idx_asm;
 	} else {
-		from = (u64)bpos->idx;
+		from = (u64)bcursor->idx;
 		to = (u64)btarget->idx;
 	}
 
 	ui_browser__set_color(browser, HE_COLORSET_CODE);
 
-	if (!bpos->jump_target)
+	if (!bcursor->jump_target)
 		start_width += ab->offset_width + 1;
 
-	__ui_browser__line_arrow_up(browser, 10, from, to, start_width);
+	__ui_browser__line_arrow(browser, 10, from, to, start_width);
 }
 
 static unsigned int annotate_browser__refresh(struct ui_browser *browser)
 {
+	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
 	int ret = ui_browser__list_head_refresh(browser);
 
-	annotate_browser__draw_current_loop(browser);
+	if (ab->jump_arrows)
+		annotate_browser__draw_current_jump(browser);
 
 	return ret;
 }
@@ -628,6 +620,9 @@ static int annotate_browser__run(struct annotate_browser *self, int evidx,
 		case 'o':
 			self->use_offset = !self->use_offset;
 			continue;
+		case 'j':
+			self->jump_arrows = !self->jump_arrows;
+			continue;
 		case '/':
 			if (annotate_browser__search(self, delay_secs)) {
 show_help:
@@ -739,6 +734,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map, int evidx,
 			.use_navkeypressed = true,
 		},
 		.use_offset = true,
+		.jump_arrows = true,
 	};
 	int ret = -1;
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [GIT PULL 0/4] perf/annotate loop detection V2, fixes
  2012-04-27 19:17           ` Arnaldo Carvalho de Melo
@ 2012-04-28 13:38             ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 20+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-04-28 13:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, linux-kernel, David Ahern, Frederic Weisbecker,
	Hagen Paul Pfeifer, Mike Galbraith, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Em Fri, Apr 27, 2012 at 04:17:09PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Apr 27, 2012 at 11:23:24AM -0700, Linus Torvalds escreveu:
> > On Fri, Apr 27, 2012 at 11:03 AM, Arnaldo Carvalho de Melo
> > > I should have reworded the "loop detection" with "basic jump arrows" in
> > > the first place.
> > 
> > .. and that is fine. But then you need to do it for *forwards* jumps
> > too. There is no difference between backwards and forwards jumps
> > *unless* you are looking for loops, and if you are looking for loops
> > you need to actually find the cycle.
> 
> Ok, so I changed things to not try to detect loops at all, for now, and
> instead just start with 'show jumps' on, hotkey 'j', that will draw
> arrows from jumps to its targets, backwards or forwards, when the cursor
> is on a jump instruction, take a look:

I pushed this one plus some more changes to my perf/annotate branch.

- Arnaldo

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-04-28 13:38 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-26 15:06 [GIT PULL 0/4] perf/annotate loop detection V2, fixes Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 1/4] perf annotate browser: Handle NULL jump targets Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 2/4] perf annotate: Disambiguage offsets and addresses in operands Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 3/4] perf annotate: Mark jump instructions with no offset Arnaldo Carvalho de Melo
2012-04-26 15:06 ` [PATCH 4/4] perf annotate browser: Don't draw jump connectors for out of function jumps Arnaldo Carvalho de Melo
2012-04-27  7:21 ` [GIT PULL 0/4] perf/annotate loop detection V2, fixes Ingo Molnar
2012-04-27  8:06   ` Ingo Molnar
2012-04-27 15:31     ` Arnaldo Carvalho de Melo
2012-04-27 15:48       ` Peter Zijlstra
2012-04-27 16:07         ` Arnaldo Carvalho de Melo
2012-04-27 16:20           ` Peter Zijlstra
2012-04-27 18:09             ` Arnaldo Carvalho de Melo
2012-04-27 15:16   ` Arnaldo Carvalho de Melo
2012-04-27 16:35 ` Linus Torvalds
2012-04-27 16:46   ` Arnaldo Carvalho de Melo
2012-04-27 17:12     ` Linus Torvalds
2012-04-27 18:03       ` Arnaldo Carvalho de Melo
2012-04-27 18:23         ` Linus Torvalds
2012-04-27 19:17           ` Arnaldo Carvalho de Melo
2012-04-28 13:38             ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).