* [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate
@ 2017-06-14  2:53 Jin Yao
  2017-06-14  2:53 ` [PATCH v1 1/2] perf report: Check for fused instruction pair Jin Yao
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jin Yao @ 2017-06-14  2:53 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Macro fusion merges two instructions into a single micro-op. Intel
Core platforms perform this hardware optimization under limited
circumstances. For example, CMP + JCC can be "fused" and
executed/retired together. With sampling, this can result in the
sample landing sometimes on the JCC and sometimes on the CMP, so
the two instructions of a fused pair should be considered
together.

In general, the fused instruction pairs are:

cmp/test/add/sub/and/inc/dec + jcc.
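
A minimal sketch of how such a pair check could look, assuming
AT&T-syntax mnemonics as printed by objdump. The helper names
(mnemonic_is, is_cond_jump, fused_pair_strict) are hypothetical and not
part of this series. It compares whole mnemonics instead of substrings,
so names such as cmpxchg or addsd are not mistaken for cmp or add:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if insn is exactly base, or base followed by
 * a single AT&T operand-size suffix (b, w, l or q).
 */
static bool mnemonic_is(const char *insn, const char *base)
{
	size_t n = strlen(base);

	if (strncmp(insn, base, n) != 0)
		return false;
	if (insn[n] == '\0')
		return true;
	return strchr("bwlq", insn[n]) != NULL && insn[n + 1] == '\0';
}

/* Hypothetical helper: any 'j' mnemonic except the unconditional jmp forms. */
static bool is_cond_jump(const char *insn)
{
	return insn[0] == 'j' && strncmp(insn, "jmp", 3) != 0;
}

/* Sketch only: first insn from the list above, second a conditional jump. */
static bool fused_pair_strict(const char *insn1, const char *insn2)
{
	static const char *const first[] = {
		"cmp", "test", "add", "sub", "and", "inc", "dec",
	};
	size_t i;

	if (!is_cond_jump(insn2))
		return false;

	for (i = 0; i < sizeof(first) / sizeof(first[0]); i++) {
		if (mnemonic_is(insn1, first[i]))
			return true;
	}
	return false;
}
```

With this sketch, fused_pair_strict("cmpl", "je") is true, while
("cmpxchg", "jne") and ("cmpl", "jmp") are both false. Which pairs the
hardware really fuses depends on the microarchitecture, so the list is
only a heuristic.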

This patch series marks this case clearly by joining the fused
instruction pair to the arrow of the jump.

For example:

       │   ┌──cmpl   $0x0,argp_program_version_hook
 81.93 │   │──je     20
       │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
       │   │↓ jne    29
       │   │↓ jmp    43
 11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Jin Yao (2):
  perf report: Check for fused instruction pair
  perf report: Implement visual marker for macro fusion in annotate

 tools/perf/arch/x86/util/Build    |  1 +
 tools/perf/arch/x86/util/fused.c  | 20 ++++++++++++++++++++
 tools/perf/ui/browser.c           | 27 +++++++++++++++++++++++++++
 tools/perf/ui/browser.h           |  2 ++
 tools/perf/ui/browsers/annotate.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/Build             |  1 +
 tools/perf/util/annotate.c        |  5 +++++
 tools/perf/util/annotate.h        |  1 +
 tools/perf/util/fused.c           | 11 +++++++++++
 tools/perf/util/fused.h           |  8 ++++++++
 10 files changed, 106 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/fused.c
 create mode 100644 tools/perf/util/fused.c
 create mode 100644 tools/perf/util/fused.h

-- 
2.7.4


* [PATCH v1 1/2] perf report: Check for fused instruction pair
  2017-06-14  2:53 [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate Jin Yao
@ 2017-06-14  2:53 ` Jin Yao
  2017-06-16 16:21   ` Arnaldo Carvalho de Melo
  2017-06-14  2:53 ` [PATCH v1 2/2] perf report: Implement visual marker for macro fusion in annotate Jin Yao
  2017-06-16 16:16 ` [PATCH v1 0/2] " Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 7+ messages in thread
From: Jin Yao @ 2017-06-14  2:53 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Macro fusion merges two instructions into a single micro-op. Intel
Core platforms perform this hardware optimization under limited
circumstances. For example, CMP + JCC can be "fused" and
executed/retired together. With sampling, this can result in the
sample landing sometimes on the JCC and sometimes on the CMP, so
the two instructions of a fused pair should be considered
together.

In general, the fused instruction pairs are:

cmp/test/add/sub/and/inc/dec + jcc.

This patch adds a new function that checks whether two x86
instructions form a "fused" pair. On non-x86 architectures, the
function just returns false.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/arch/x86/util/Build   |  1 +
 tools/perf/arch/x86/util/fused.c | 20 ++++++++++++++++++++
 tools/perf/util/Build            |  1 +
 tools/perf/util/fused.c          | 11 +++++++++++
 tools/perf/util/fused.h          |  8 ++++++++
 5 files changed, 41 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/fused.c
 create mode 100644 tools/perf/util/fused.c
 create mode 100644 tools/perf/util/fused.h

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index f95e6f4..3809348 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -4,6 +4,7 @@ libperf-y += pmu.o
 libperf-y += kvm-stat.o
 libperf-y += perf_regs.o
 libperf-y += group.o
+libperf-y += fused.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/fused.c b/tools/perf/arch/x86/util/fused.c
new file mode 100644
index 0000000..be28d22
--- /dev/null
+++ b/tools/perf/arch/x86/util/fused.c
@@ -0,0 +1,20 @@
+#include <string.h>
+#include "../../util/fused.h"
+
+bool fused_insn_pair(const char *insn1, const char *insn2)
+{
+	if (strstr(insn2, "jmp"))
+		return false;
+
+	if ((strstr(insn1, "cmp") && !strstr(insn1, "xchg")) ||
+	    strstr(insn1, "test") ||
+	    strstr(insn1, "add") ||
+	    strstr(insn1, "sub") ||
+	    strstr(insn1, "and") ||
+	    strstr(insn1, "inc") ||
+	    strstr(insn1, "dec")) {
+		return true;
+	}
+
+	return false;
+}
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 79dea95..b83757d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -93,6 +93,7 @@ libperf-y += drv_configs.o
 libperf-y += units.o
 libperf-y += time-utils.o
 libperf-y += expr-bison.o
+libperf-y += fused.o
 
 libperf-$(CONFIG_LIBBPF) += bpf-loader.o
 libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/fused.c b/tools/perf/util/fused.c
new file mode 100644
index 0000000..2cf56fa
--- /dev/null
+++ b/tools/perf/util/fused.c
@@ -0,0 +1,11 @@
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <string.h>
+
+#include "fused.h"
+
+bool __weak fused_insn_pair(const char *insn1 __maybe_unused,
+			    const char *insn2 __maybe_unused)
+{
+	return false;
+}
diff --git a/tools/perf/util/fused.h b/tools/perf/util/fused.h
new file mode 100644
index 0000000..fa26714
--- /dev/null
+++ b/tools/perf/util/fused.h
@@ -0,0 +1,8 @@
+#ifndef __PERF_FUSED_H
+#define __PERF_FUSED_H
+
+#include <linux/types.h>
+
+bool fused_insn_pair(const char *insn1, const char *insn2);
+
+#endif	/* __PERF_FUSED_H */
-- 
2.7.4


* [PATCH v1 2/2] perf report: Implement visual marker for macro fusion in annotate
  2017-06-14  2:53 [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate Jin Yao
  2017-06-14  2:53 ` [PATCH v1 1/2] perf report: Check for fused instruction pair Jin Yao
@ 2017-06-14  2:53 ` Jin Yao
  2017-06-16 16:16 ` [PATCH v1 0/2] " Arnaldo Carvalho de Melo
  2 siblings, 0 replies; 7+ messages in thread
From: Jin Yao @ 2017-06-14  2:53 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

To mark the fused instructions clearly, this patch adds a line
before the first instruction of the pair and joins it with the
arrow of the jump.

For example, when je is selected in annotate view, the line
before cmpl is displayed and joins the arrow of je.

       │   ┌──cmpl   $0x0,argp_program_version_hook
 81.93 │   │──je     20
       │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
       │   │↓ jne    29
       │   │↓ jmp    43
 11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

That means cmpl+je is a fused instruction pair and the two
instructions should be considered together.
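
The placement of the marker can be sketched as follows. This is an
illustration only: fused_marker_row is a hypothetical helper, and the
check for a jump at index 0 is an added assumption rather than
something this patch does. The marker belongs to the line just before
the jump and is drawn only when that line is inside the visible area:

```c
/*
 * Hypothetical helper: row, relative to the first visible line, on which
 * the marker for a jump at index jump_idx would start, or -1 if nothing
 * should be drawn. top_idx is the index of the first visible line.
 */
static int fused_marker_row(unsigned int jump_idx, unsigned int top_idx)
{
	unsigned int first;

	/* Assumption: a jump on the very first line has no partner. */
	if (jump_idx == 0)
		return -1;

	/* The first instruction of the pair sits right before the jump. */
	first = jump_idx - 1;

	/* Scrolled above the visible area: skip drawing. */
	if (first < top_idx)
		return -1;

	return (int)(first - top_idx);
}
```

For a jump at index 5 with the view starting at index 3, the marker
starts on visible row 1. With the view starting at index 5, the cmpl
line is off screen and nothing is drawn.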

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/ui/browser.c           | 27 +++++++++++++++++++++++++++
 tools/perf/ui/browser.h           |  2 ++
 tools/perf/ui/browsers/annotate.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/annotate.c        |  5 +++++
 tools/perf/util/annotate.h        |  1 +
 5 files changed, 65 insertions(+)

diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index a4d3762..acba636 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -738,6 +738,33 @@ void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column,
 		__ui_browser__line_arrow_down(browser, column, start, end);
 }
 
+void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column,
+			    unsigned int row, bool arrow_down)
+{
+	unsigned int end_row;
+
+	if (row >= browser->top_idx)
+		end_row = row - browser->top_idx;
+	else
+		return;
+
+	SLsmg_set_char_set(1);
+
+	if (arrow_down) {
+		ui_browser__gotorc(browser, end_row, column - 1);
+		SLsmg_write_char(SLSMG_ULCORN_CHAR);
+		ui_browser__gotorc(browser, end_row, column);
+		SLsmg_draw_hline(2);
+		ui_browser__gotorc(browser, end_row + 1, column - 1);
+		SLsmg_draw_vline(1);
+	} else {
+		ui_browser__gotorc(browser, end_row, column);
+		SLsmg_draw_hline(2);
+	}
+
+	SLsmg_set_char_set(0);
+}
+
 void ui_browser__init(void)
 {
 	int i = 0;
diff --git a/tools/perf/ui/browser.h b/tools/perf/ui/browser.h
index be3b70e..a12eff7 100644
--- a/tools/perf/ui/browser.h
+++ b/tools/perf/ui/browser.h
@@ -43,6 +43,8 @@ void ui_browser__printf(struct ui_browser *browser, const char *fmt, ...);
 void ui_browser__write_graph(struct ui_browser *browser, int graph);
 void __ui_browser__line_arrow(struct ui_browser *browser, unsigned int column,
 			      u64 start, u64 end);
+void ui_browser__mark_fused(struct ui_browser *browser, unsigned int column,
+			    unsigned int row, bool arrow_down);
 void __ui_browser__show_title(struct ui_browser *browser, const char *title);
 void ui_browser__show_title(struct ui_browser *browser, const char *title);
 int ui_browser__show(struct ui_browser *browser, const char *title,
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 7a03389..941fea2 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -9,6 +9,7 @@
 #include "../../util/symbol.h"
 #include "../../util/evsel.h"
 #include "../../util/config.h"
+#include "../../util/fused.h"
 #include <inttypes.h>
 #include <pthread.h>
 #include <linux/kernel.h>
@@ -269,6 +270,28 @@ static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sy
 	return true;
 }
 
+static bool is_fused(struct disasm_line *cursor)
+{
+	struct disasm_line *pos = list_prev_entry(cursor, node);
+	const char *name;
+
+	if (!pos)
+		return false;
+
+	if (ins__is_lock(&pos->ins))
+		name = pos->ops.locked.ins.name;
+	else
+		name = pos->ins.name;
+
+	if (!name || !cursor->ins.name)
+		return false;
+
+	if (fused_insn_pair(name, cursor->ins.name))
+		return true;
+
+	return false;
+}
+
 static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
@@ -304,6 +327,13 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 	ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
 	__ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
 				 from, to);
+
+	if (is_fused(cursor)) {
+		ui_browser__mark_fused(browser,
+				       pcnt_width + 3 + ab->addr_width,
+				       from - 1,
+				       to > from ? true : false);
+	}
 }
 
 static unsigned int annotate_browser__refresh(struct ui_browser *browser)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1367d7e..2dc5974 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -502,6 +502,11 @@ bool ins__is_ret(const struct ins *ins)
 	return ins->ops == &ret_ops;
 }
 
+bool ins__is_lock(const struct ins *ins)
+{
+	return ins->ops == &lock_ops;
+}
+
 static int ins__key_cmp(const void *name, const void *insp)
 {
 	const struct ins *ins = insp;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 948aa8e..9aa25af 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -52,6 +52,7 @@ struct ins_ops {
 bool ins__is_jump(const struct ins *ins);
 bool ins__is_call(const struct ins *ins);
 bool ins__is_ret(const struct ins *ins);
+bool ins__is_lock(const struct ins *ins);
 int ins__scnprintf(struct ins *ins, char *bf, size_t size, struct ins_operands *ops);
 
 struct annotation;
-- 
2.7.4


* Re: [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate
  2017-06-14  2:53 [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate Jin Yao
  2017-06-14  2:53 ` [PATCH v1 1/2] perf report: Check for fused instruction pair Jin Yao
  2017-06-14  2:53 ` [PATCH v1 2/2] perf report: Implement visual marker for macro fusion in annotate Jin Yao
@ 2017-06-16 16:16 ` Arnaldo Carvalho de Melo
  2017-06-16 16:17   ` Arnaldo Carvalho de Melo
  2 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-16 16:16 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Wed, Jun 14, 2017 at 10:53:39AM +0800, Jin Yao escreveu:
> Macro fusion merges two instructions to a single micro-op. Intel
> core platform performs this hardware optimization under limited
> circumstances. For example, CMP + JCC can be "fused" and executed
> /retired together. While with sampling this can result in the
> sample sometimes being on the JCC and sometimes on the CMP.
> So for the fused instruction pair, they could be considered
> together.
> 
> In general, the fused instruction pairs are:
> 
> cmp/test/add/sub/and/inc/dec + jcc.
> 
> This patch series marks the case clearly by joining the fused
> instruction pair in the arrow of the jump.
> 
> For example:
> 
>        │   ┌──cmpl   $0x0,argp_program_version_hook
>  81.93 │   │──je     20
>        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
>        │   │↓ jne    29
>        │   │↓ jmp    43
>  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Try to have these example outputs in the changesets, not just in the
patch series header.

- Arnaldo
 
> Jin Yao (2):
>   perf report: Check for fused instruction pair
>   perf report: Implement visual marker for macro fusion in annotate
> 
>  tools/perf/arch/x86/util/Build    |  1 +
>  tools/perf/arch/x86/util/fused.c  | 20 ++++++++++++++++++++
>  tools/perf/ui/browser.c           | 27 +++++++++++++++++++++++++++
>  tools/perf/ui/browser.h           |  2 ++
>  tools/perf/ui/browsers/annotate.c | 30 ++++++++++++++++++++++++++++++
>  tools/perf/util/Build             |  1 +
>  tools/perf/util/annotate.c        |  5 +++++
>  tools/perf/util/annotate.h        |  1 +
>  tools/perf/util/fused.c           | 11 +++++++++++
>  tools/perf/util/fused.h           |  8 ++++++++
>  10 files changed, 106 insertions(+)
>  create mode 100644 tools/perf/arch/x86/util/fused.c
>  create mode 100644 tools/perf/util/fused.c
>  create mode 100644 tools/perf/util/fused.h
> 
> -- 
> 2.7.4


* Re: [PATCH v1 0/2] perf report: Implement visual marker for macro fusion in annotate
  2017-06-16 16:16 ` [PATCH v1 0/2] " Arnaldo Carvalho de Melo
@ 2017-06-16 16:17   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-16 16:17 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Fri, Jun 16, 2017 at 01:16:55PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Jun 14, 2017 at 10:53:39AM +0800, Jin Yao escreveu:
> > Macro fusion merges two instructions to a single micro-op. Intel
> > core platform performs this hardware optimization under limited
> > circumstances. For example, CMP + JCC can be "fused" and executed
> > /retired together. While with sampling this can result in the
> > sample sometimes being on the JCC and sometimes on the CMP.
> > So for the fused instruction pair, they could be considered
> > together.
> > 
> > In general, the fused instruction pairs are:
> > 
> > cmp/test/add/sub/and/inc/dec + jcc.
> > 
> > This patch series marks the case clearly by joining the fused
> > instruction pair in the arrow of the jump.
> > 
> > For example:
> > 
> >        │   ┌──cmpl   $0x0,argp_program_version_hook
> >  81.93 │   │──je     20
> >        │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
> >        │   │↓ jne    29
> >        │   │↓ jmp    43
> >  11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
> 
> Try to have these example outputs in the changesets, not just in the
> patch series header.

Ok, I went trigger happy, sorry, it is in the second patch, I had looked
just at the first :-\

- Arnaldo
 
> - Arnaldo
>  
> > Jin Yao (2):
> >   perf report: Check for fused instruction pair
> >   perf report: Implement visual marker for macro fusion in annotate
> > 
> >  tools/perf/arch/x86/util/Build    |  1 +
> >  tools/perf/arch/x86/util/fused.c  | 20 ++++++++++++++++++++
> >  tools/perf/ui/browser.c           | 27 +++++++++++++++++++++++++++
> >  tools/perf/ui/browser.h           |  2 ++
> >  tools/perf/ui/browsers/annotate.c | 30 ++++++++++++++++++++++++++++++
> >  tools/perf/util/Build             |  1 +
> >  tools/perf/util/annotate.c        |  5 +++++
> >  tools/perf/util/annotate.h        |  1 +
> >  tools/perf/util/fused.c           | 11 +++++++++++
> >  tools/perf/util/fused.h           |  8 ++++++++
> >  10 files changed, 106 insertions(+)
> >  create mode 100644 tools/perf/arch/x86/util/fused.c
> >  create mode 100644 tools/perf/util/fused.c
> >  create mode 100644 tools/perf/util/fused.h
> > 
> > -- 
> > 2.7.4


* Re: [PATCH v1 1/2] perf report: Check for fused instruction pair
  2017-06-14  2:53 ` [PATCH v1 1/2] perf report: Check for fused instruction pair Jin Yao
@ 2017-06-16 16:21   ` Arnaldo Carvalho de Melo
  2017-06-19  2:58     ` Jin, Yao
  0 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-16 16:21 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Wed, Jun 14, 2017 at 10:53:40AM +0800, Jin Yao escreveu:
> Macro fusion merges two instructions to a single micro-op. Intel
> core platform performs this hardware optimization under limited
> circumstances. For example, CMP + JCC can be "fused" and executed
> /retired together. While with sampling this can result in the
> sample sometimes being on the JCC and sometimes on the CMP.
> So for the fused instruction pair, they could be considered
> together.

Doing it as a weak function that will be overridden by the host arch
doesn't work, as we also support cross-annotation. So you have to take
into account perf_evsel__env_arch(evsel), etc.

Please search for perf_evsel__env_arch(evsel) in the annotation source
files to see how it is used.
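
A rough sketch of the idea, not the actual v2 code: pick the checker at
run time from the architecture name recorded with the data, instead of
relying on whichever arch the tool was built for. The table, the
function names and the "x86" / "arm64" strings are assumptions made for
illustration. In perf the name would come from
perf_evsel__env_arch(evsel).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same substring logic as the x86 fused.c in this patch. */
static bool x86_fused_pair(const char *insn1, const char *insn2)
{
	if (strstr(insn2, "jmp"))
		return false;

	return (strstr(insn1, "cmp") && !strstr(insn1, "xchg")) ||
	       strstr(insn1, "test") || strstr(insn1, "add") ||
	       strstr(insn1, "sub") || strstr(insn1, "and") ||
	       strstr(insn1, "inc") || strstr(insn1, "dec");
}

/* Hypothetical per-arch table, keyed by the recorded arch name. */
struct fused_arch {
	const char *name;
	bool (*is_pair)(const char *insn1, const char *insn2);
};

static const struct fused_arch fused_archs[] = {
	{ "x86", x86_fused_pair },
};

/* Hypothetical entry point: arch names the data's arch, not the host's. */
static bool fused_insn_pair_for_arch(const char *arch,
				     const char *insn1, const char *insn2)
{
	size_t i;

	if (!arch)
		return false;

	for (i = 0; i < sizeof(fused_archs) / sizeof(fused_archs[0]); i++) {
		if (!strcmp(arch, fused_archs[i].name))
			return fused_archs[i].is_pair(insn1, insn2);
	}

	/* Unknown arch: report no fusion. */
	return false;
}
```

This way a perf.data file recorded on x86 gets the x86 check even when
it is annotated on another host, and data from other arches never hits
the x86 string matching.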

- Arnaldo
 
> In general, the fused instruction pairs are:
> 
> cmp/test/add/sub/and/inc/dec + jcc.
> 
> This patch adds a new function which checks if 2 x86 instructions
> are in a "fused" pair. For non-x86 arch, the function just returns
> false.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/arch/x86/util/Build   |  1 +
>  tools/perf/arch/x86/util/fused.c | 20 ++++++++++++++++++++
>  tools/perf/util/Build            |  1 +
>  tools/perf/util/fused.c          | 11 +++++++++++
>  tools/perf/util/fused.h          |  8 ++++++++
>  5 files changed, 41 insertions(+)
>  create mode 100644 tools/perf/arch/x86/util/fused.c
>  create mode 100644 tools/perf/util/fused.c
>  create mode 100644 tools/perf/util/fused.h
> 
> diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
> index f95e6f4..3809348 100644
> --- a/tools/perf/arch/x86/util/Build
> +++ b/tools/perf/arch/x86/util/Build
> @@ -4,6 +4,7 @@ libperf-y += pmu.o
>  libperf-y += kvm-stat.o
>  libperf-y += perf_regs.o
>  libperf-y += group.o
> +libperf-y += fused.o
>  
>  libperf-$(CONFIG_DWARF) += dwarf-regs.o
>  libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
> diff --git a/tools/perf/arch/x86/util/fused.c b/tools/perf/arch/x86/util/fused.c
> new file mode 100644
> index 0000000..be28d22
> --- /dev/null
> +++ b/tools/perf/arch/x86/util/fused.c
> @@ -0,0 +1,20 @@
> +#include <string.h>
> +#include "../../util/fused.h"
> +
> +bool fused_insn_pair(const char *insn1, const char *insn2)
> +{
> +	if (strstr(insn2, "jmp"))
> +		return false;
> +
> +	if ((strstr(insn1, "cmp") && !strstr(insn1, "xchg")) ||
> +	    strstr(insn1, "test") ||
> +	    strstr(insn1, "add") ||
> +	    strstr(insn1, "sub") ||
> +	    strstr(insn1, "and") ||
> +	    strstr(insn1, "inc") ||
> +	    strstr(insn1, "dec")) {
> +		return true;
> +	}
> +
> +	return false;
> +}
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index 79dea95..b83757d 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -93,6 +93,7 @@ libperf-y += drv_configs.o
>  libperf-y += units.o
>  libperf-y += time-utils.o
>  libperf-y += expr-bison.o
> +libperf-y += fused.o
>  
>  libperf-$(CONFIG_LIBBPF) += bpf-loader.o
>  libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
> diff --git a/tools/perf/util/fused.c b/tools/perf/util/fused.c
> new file mode 100644
> index 0000000..2cf56fa
> --- /dev/null
> +++ b/tools/perf/util/fused.c
> @@ -0,0 +1,11 @@
> +#include <linux/compiler.h>
> +#include <linux/types.h>
> +#include <string.h>
> +
> +#include "fused.h"
> +
> +bool __weak fused_insn_pair(const char *insn1 __maybe_unused,
> +			    const char *insn2 __maybe_unused)
> +{
> +	return false;
> +}
> diff --git a/tools/perf/util/fused.h b/tools/perf/util/fused.h
> new file mode 100644
> index 0000000..fa26714
> --- /dev/null
> +++ b/tools/perf/util/fused.h
> @@ -0,0 +1,8 @@
> +#ifndef __PERF_FUSED_H
> +#define __PERF_FUSED_H
> +
> +#include <linux/types.h>
> +
> +bool fused_insn_pair(const char *insn1, const char *insn2);
> +
> +#endif	/* __PERF_FUSED_H */
> -- 
> 2.7.4


* Re: [PATCH v1 1/2] perf report: Check for fused instruction pair
  2017-06-16 16:21   ` Arnaldo Carvalho de Melo
@ 2017-06-19  2:58     ` Jin, Yao
  0 siblings, 0 replies; 7+ messages in thread
From: Jin, Yao @ 2017-06-19  2:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 6/17/2017 12:21 AM, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jun 14, 2017 at 10:53:40AM +0800, Jin Yao escreveu:
>> Macro fusion merges two instructions to a single micro-op. Intel
>> core platform performs this hardware optimization under limited
>> circumstances. For example, CMP + JCC can be "fused" and executed
>> /retired together. While with sampling this can result in the
>> sample sometimes being on the JCC and sometimes on the CMP.
>> So for the fused instruction pair, they could be considered
>> together.
> doing it as a weak function that will be overriden by the host arch
> doesn't work, as we also support cross-annotation. So you have to take
> into account perf_evsel__env_arch(evsel), etc.
>
> Please search for perf_evsel__env_arch(evsel) in the annotation source
> files to see how it is used.
>
> - Arnaldo
>   
Hi Arnaldo,

Thanks so much for pointing out that the weak function doesn't work.

I have changed it to an arch-specific function and just sent out the
v2 series for review.

Thanks
Jin Yao
