From: Jin Yao
To: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
    mingo@redhat.com, alexander.shishkin@linux.intel.com
Cc: Linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com,
    yao.jin@intel.com, Jin Yao
Subject: [PATCH v4 0/2] perf report: Implement visual marker for macro fusion in annotate
Date: Fri, 7 Jul 2017 13:06:33 +0800
Message-Id: <1499403995-19857-1-git-send-email-yao.jin@linux.intel.com>

Macro fusion merges two instructions into a single micro-op. Intel Core
platforms perform this hardware optimization under limited circumstances.
For example, CMP + JCC can be "fused" and executed/retired together. With
sampling, this means the sample for a fused pair sometimes lands on the
JCC and sometimes on the CMP, so the two instructions of a fused pair
should be considered together.

Fused instruction pairs:

  Nehalem:     cmp/test + jcc
  Newer CPUs:  cmp/test/add/sub/and/inc/dec + jcc

This patch series marks the case clearly by joining the fused instruction
pair in the arrow of the jump.

For example:

       │   ┌──cmpl   $0x0,argp_program_version_hook
 81.93 │   ├──je     20
       │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
       │   │↓ jne    29
       │   │↓ jmp    43
 11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

Change-log:
-----------
v4: Move the CPU model check to symbol__disassemble and save the
    family/model in the arch structure. This avoids checking every
    time the jump arrows are displayed.

    The patch set still performs the fused check when the user moves
    the cursor onto the jump instruction.

v3: 1. Add a check for Nehalem (CMP, TEST). For other, newer Intel
       CPUs, check by default (CMP, TEST, ADD, SUB, AND, INC, DEC).

    2. Use Arnaldo's fix to improve the display.

v2: Following Arnaldo's comments, remove the weak function and use an
    arch-specific function instead to check for a fused instruction
    pair.

v1: Initial post

Jin Yao (2):
  perf util: Check for fused instruction
  perf report: Implement visual marker for macro fusion in annotate

 tools/perf/arch/x86/annotate/instructions.c | 46 +++++++++++++++++++++++++++++
 tools/perf/builtin-top.c                    |  2 +-
 tools/perf/ui/browser.c                     | 29 ++++++++++++++++++
 tools/perf/ui/browser.h                     |  2 ++
 tools/perf/ui/browsers/annotate.c           | 30 ++++++++++++++++++-
 tools/perf/ui/gtk/annotate.c                |  2 +-
 tools/perf/util/annotate.c                  | 27 +++++++++++++++--
 tools/perf/util/annotate.h                  |  4 ++-
 8 files changed, 136 insertions(+), 6 deletions(-)

-- 
2.7.4
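
For readers who want the pairing rule from the description in concrete
form, here is a minimal standalone sketch. It is not the code from the
patches: the helper names (mnemonic_is, is_cond_jump, is_fused_pair) and
the is_nehalem flag are hypothetical, and it follows only the instruction
lists given in this cover letter, not the full hardware fusion rules. It
assumes AT&T-style mnemonics with an optional b/w/l/q size suffix, and it
treats any 'j' mnemonic other than jmp as a conditional jump, which is a
simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if mnemonic is `base`, optionally followed by one size suffix
 * (b, w, l or q). This keeps "cmpl" but rejects "cmpxchg" or "addsd". */
static bool mnemonic_is(const char *mnemonic, const char *base)
{
	size_t n = strlen(base);

	if (strncmp(mnemonic, base, n) != 0)
		return false;
	if (mnemonic[n] == '\0')
		return true;
	return strchr("bwlq", mnemonic[n]) != NULL && mnemonic[n + 1] == '\0';
}

/* Simplified assumption: every 'j' mnemonic except jmp is a Jcc. */
static bool is_cond_jump(const char *mnemonic)
{
	return mnemonic[0] == 'j' && strncmp(mnemonic, "jmp", 3) != 0;
}

/* Hypothetical helper: can `first` fuse with the jump `second`?
 * is_nehalem selects the shorter list (cmp/test only). */
static bool is_fused_pair(bool is_nehalem, const char *first,
			  const char *second)
{
	/* The first two entries are the Nehalem subset. */
	static const char *const ops[] = {
		"cmp", "test", "add", "sub", "and", "inc", "dec",
	};
	size_t count = is_nehalem ? 2 : sizeof(ops) / sizeof(ops[0]);
	size_t i;

	if (!is_cond_jump(second))
		return false;

	for (i = 0; i < count; i++) {
		if (mnemonic_is(first, ops[i]))
			return true;
	}
	return false;
}
```

With these helpers, is_fused_pair(false, "cmpl", "je") reports a fused
pair, matching the cmpl/je pair joined in the example output, while
"cmpxchg" followed by "jne" is not treated as fused.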