linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels
@ 2021-02-18  9:57 Adrian Hunter
  2021-02-18  9:57 ` [PATCH 01/11] perf script: Add branch types for VM-Entry and VM-Exit Adrian Hunter
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Hi

Currently, only guest kernel tracing is supported, and only with "timeless"
decoding, i.e. no TSC timestamps.

Other limitations and caveats:

 - VMX controls may suppress packets needed for decoding, resulting in decoding errors
 - VMX controls may block the perf NMI to the host, potentially resulting in lost trace data
 - Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding errors
 - Guest thread information is unknown
 - Guest VCPU is unknown, but it may be possible to infer it from the host thread
 - Callchains are not supported

There is an example in the documentation added by the patch
"perf intel-pt: Add documentation for tracing virtual machines".

The patches are on top of the "Add PSB events" series.


Adrian Hunter (11):
      perf script: Add branch types for VM-Entry and VM-Exit
      perf intel_pt: Add vmlaunch and vmresume as branches
      perf intel-pt: Retain the last PIP packet payload as is
      perf intel-pt: Amend decoder to track the NR flag
      perf machine: Factor out machines__find_guest()
      perf machine: Factor out machine__idle_thread()
      perf intel-pt: Support decoding of guest kernel
      perf intel-pt: Allow for a guest kernel address filter
      perf intel-pt: Adjust sample flags for VM-Exit
      perf intel-pt: Split VM-Entry and VM-Exit branches
      perf intel-pt: Add documentation for tracing virtual machines

 tools/perf/Documentation/perf-intel-pt.txt         |  82 ++++++++++++++
 tools/perf/arch/x86/tests/insn-x86.c               |   1 +
 .../arch/x86/tests/intel-pt-pkt-decoder-test.c     |   4 +-
 tools/perf/builtin-script.c                        |   2 +
 tools/perf/util/db-export.c                        |   2 +
 tools/perf/util/event.h                            |   6 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |  61 +++++++++--
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   3 +-
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |  15 +++
 .../util/intel-pt-decoder/intel-pt-insn-decoder.h  |   1 +
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   |  12 +-
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |   2 +
 tools/perf/util/intel-pt.c                         | 122 ++++++++++++++++++---
 tools/perf/util/machine.c                          |  27 +++++
 tools/perf/util/machine.h                          |   2 +
 tools/perf/util/session.c                          |  32 +-----
 16 files changed, 307 insertions(+), 67 deletions(-)


Regards
Adrian


* [PATCH 01/11] perf script: Add branch types for VM-Entry and VM-Exit
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 02/11] perf intel_pt: Add vmlaunch and vmresume as branches Adrian Hunter
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

In preparation to support Intel PT decoding of virtual machine traces, add
branch types for VM-Entry and VM-Exit.

Note they are both treated as "calls" because the VM-Exit transfers control
to a different address.
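
As a rough illustration of how the new table entries are meant to be matched,
here is a minimal standalone sketch. The ex_ names are hypothetical and not
part of perf. Bits 11 and 12 are taken from the event.h hunk below, while the
BRANCH and CALL bit values are assumptions made for the example. The lookup is
a plain exact-match walk over a NULL-terminated table, which is simpler than
what builtin-script.c actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative copies of the flag bits. Bits 11 and 12 come from the
 * event.h hunk in this patch; the BRANCH and CALL values are assumed.
 */
#define EX_FLAG_BRANCH   (1ULL << 0)   /* assumed value */
#define EX_FLAG_CALL     (1ULL << 1)   /* assumed value */
#define EX_FLAG_VMENTRY  (1ULL << 11)
#define EX_FLAG_VMEXIT   (1ULL << 12)

/* NULL-terminated name table, in the same style as the patched tables. */
static const struct {
	uint64_t flags;
	const char *name;
} ex_flag_names[] = {
	{EX_FLAG_BRANCH | EX_FLAG_CALL, "call"},
	{EX_FLAG_BRANCH | EX_FLAG_CALL | EX_FLAG_VMENTRY, "vmentry"},
	{EX_FLAG_BRANCH | EX_FLAG_CALL | EX_FLAG_VMEXIT, "vmexit"},
	{0, NULL}
};

/* Return the name for an exact flag combination, or NULL if none matches. */
static const char *ex_flags_to_name(uint64_t flags)
{
	size_t i;

	for (i = 0; ex_flag_names[i].name; i++) {
		if (ex_flag_names[i].flags == flags)
			return ex_flag_names[i].name;
	}
	return NULL;
}
```

Because the match is exact, a VM-Entry sample has to carry BRANCH, CALL and
VMENTRY together to be named "vmentry"; a plain BRANCH | CALL sample still
resolves to "call".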

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/builtin-script.c | 2 ++
 tools/perf/util/db-export.c | 2 ++
 tools/perf/util/event.h     | 6 +++++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4143d3f6eb37..57958bb96082 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1537,6 +1537,8 @@ static struct {
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "tx abrt"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "tr strt"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "tr end"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vmentry"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vmexit"},
 	{0, NULL}
 };
 
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index db7447154622..5cd189172525 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -438,6 +438,8 @@ static struct {
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT, "transaction abort"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_BEGIN, "trace begin"},
 	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TRACE_END, "trace end"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMENTRY, "vm entry"},
+	{PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL | PERF_IP_FLAG_VMEXIT, "vm exit"},
 	{0, NULL}
 };
 
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 4a3b5b6478d8..034526288e05 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -96,6 +96,8 @@ enum {
 	PERF_IP_FLAG_TRACE_BEGIN	= 1ULL << 8,
 	PERF_IP_FLAG_TRACE_END		= 1ULL << 9,
 	PERF_IP_FLAG_IN_TX		= 1ULL << 10,
+	PERF_IP_FLAG_VMENTRY		= 1ULL << 11,
+	PERF_IP_FLAG_VMEXIT		= 1ULL << 12,
 };
 
 #define PERF_IP_FLAG_CHARS "bcrosyiABEx"
@@ -110,7 +112,9 @@ enum {
 	PERF_IP_FLAG_INTERRUPT		|\
 	PERF_IP_FLAG_TX_ABORT		|\
 	PERF_IP_FLAG_TRACE_BEGIN	|\
-	PERF_IP_FLAG_TRACE_END)
+	PERF_IP_FLAG_TRACE_END		|\
+	PERF_IP_FLAG_VMENTRY		|\
+	PERF_IP_FLAG_VMEXIT)
 
 #define MAX_INSN 16
 
-- 
2.17.1



* [PATCH 02/11] perf intel_pt: Add vmlaunch and vmresume as branches
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
  2021-02-18  9:57 ` [PATCH 01/11] perf script: Add branch types for VM-Entry and VM-Exit Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 03/11] perf intel-pt: Retain the last PIP packet payload as is Adrian Hunter
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

In preparation to support Intel PT decoding of virtual machine traces, add
vmlaunch and vmresume as branch instructions.

Note, sample flags will show "VMentry" even if the VM-Entry fails.
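
For reference, the byte patterns that the new switch cases match can be
checked with a small standalone sketch. ex_is_vmentry_insn() is a hypothetical
helper, not part of perf. It assumes the buffer starts at the opcode with no
prefixes, whereas the real code relies on the kernel instruction decoder to
fill in opcode and ModRM bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if buf holds 0f 01 c2 (vmlaunch) or 0f 01 c3 (vmresume).
 * Simplification: no prefix handling, the opcode must be at buf[0].
 */
static bool ex_is_vmentry_insn(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);

	if (len < 3)
		return false;
	if (buf[0] != 0x0f || buf[1] != 0x01)
		return false;
	/* The third byte is the ModRM byte tested by the patch. */
	return buf[2] == 0xc2 || buf[2] == 0xc3;
}
```

Any other ModRM value under opcode 0f 01 falls through to the default case in
the patch and is not treated as a branch.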

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/arch/x86/tests/insn-x86.c              |  1 +
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c | 15 +++++++++++++++
 .../util/intel-pt-decoder/intel-pt-insn-decoder.h |  1 +
 3 files changed, 17 insertions(+)

diff --git a/tools/perf/arch/x86/tests/insn-x86.c b/tools/perf/arch/x86/tests/insn-x86.c
index 745f29adb14b..f782ef8c5982 100644
--- a/tools/perf/arch/x86/tests/insn-x86.c
+++ b/tools/perf/arch/x86/tests/insn-x86.c
@@ -48,6 +48,7 @@ static int get_op(const char *op_str)
 		{"int",     INTEL_PT_OP_INT},
 		{"syscall", INTEL_PT_OP_SYSCALL},
 		{"sysret",  INTEL_PT_OP_SYSRET},
+		{"vmentry",  INTEL_PT_OP_VMENTRY},
 		{NULL, 0},
 	};
 	struct val_data *val;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index fb8a3558d3d5..2f6cc7eea251 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -43,6 +43,17 @@ static void intel_pt_insn_decoder(struct insn *insn,
 	switch (insn->opcode.bytes[0]) {
 	case 0xf:
 		switch (insn->opcode.bytes[1]) {
+		case 0x01:
+			switch (insn->modrm.bytes[0]) {
+			case 0xc2: /* vmlaunch */
+			case 0xc3: /* vmresume */
+				op = INTEL_PT_OP_VMENTRY;
+				branch = INTEL_PT_BR_INDIRECT;
+				break;
+			default:
+				break;
+			}
+			break;
 		case 0x05: /* syscall */
 		case 0x34: /* sysenter */
 			op = INTEL_PT_OP_SYSCALL;
@@ -213,6 +224,7 @@ const char *branch_name[] = {
 	[INTEL_PT_OP_INT]	= "Int",
 	[INTEL_PT_OP_SYSCALL]	= "Syscall",
 	[INTEL_PT_OP_SYSRET]	= "Sysret",
+	[INTEL_PT_OP_VMENTRY]	= "VMentry",
 };
 
 const char *intel_pt_insn_name(enum intel_pt_insn_op op)
@@ -267,6 +279,9 @@ int intel_pt_insn_type(enum intel_pt_insn_op op)
 	case INTEL_PT_OP_SYSRET:
 		return PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_RETURN |
 		       PERF_IP_FLAG_SYSCALLRET;
+	case INTEL_PT_OP_VMENTRY:
+		return PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
+		       PERF_IP_FLAG_VMENTRY;
 	default:
 		return 0;
 	}
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
index 95a1eb0141ff..c2861cfdd768 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.h
@@ -24,6 +24,7 @@ enum intel_pt_insn_op {
 	INTEL_PT_OP_INT,
 	INTEL_PT_OP_SYSCALL,
 	INTEL_PT_OP_SYSRET,
+	INTEL_PT_OP_VMENTRY,
 };
 
 enum intel_pt_insn_branch {
-- 
2.17.1



* [PATCH 03/11] perf intel-pt: Retain the last PIP packet payload as is
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
  2021-02-18  9:57 ` [PATCH 01/11] perf script: Add branch types for VM-Entry and VM-Exit Adrian Hunter
  2021-02-18  9:57 ` [PATCH 02/11] perf intel_pt: Add vmlaunch and vmresume as branches Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 04/11] perf intel-pt: Amend decoder to track the NR flag Adrian Hunter
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Retain the PIP packet payload as is, instead of just the CR3, because it
also contains the VMX NR flag, which is needed to track VM-Entry.
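
To make the new payload layout concrete, here is a minimal standalone sketch.
The ex_ helpers are hypothetical and only mirror what the hunks below do: the
6 payload bytes are read little-endian and kept unmodified, bit 0 is the NR
flag, and the value printed by the packet description is the payload with NR
cleared and shifted right by one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_VMX_NR_FLAG 1ULL	/* bit 0 of the raw payload, as in the patch */

/* Assemble the 6 payload bytes that follow the 2-byte PIP header. */
static uint64_t ex_pip_payload(const unsigned char *buf, size_t len)
{
	uint64_t payload = 0;
	int i;

	assert(buf != NULL && len >= 8);

	/* Little-endian: buf[2] is the least significant byte. */
	for (i = 5; i >= 0; i--)
		payload = (payload << 8) | buf[2 + i];
	return payload;
}

/* NR (non-root) flag from the raw payload. */
static bool ex_pip_nr(uint64_t payload)
{
	return (payload & EX_VMX_NR_FLAG) != 0;
}

/* Value shown by the packet description: NR cleared, then shifted. */
static uint64_t ex_pip_desc_value(uint64_t payload)
{
	return (payload & ~EX_VMX_NR_FLAG) >> 1;
}
```

With the second test vector from the packet decoder test, bytes
{0x02, 0x43, 3, 4, 6, 8, 10, 12}, the raw payload is 0xC0A08060403, NR is 1,
and the described value is 0x60504030201, i.e. the same number the old test
expected before the NR bit was moved to bit 63.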

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../arch/x86/tests/intel-pt-pkt-decoder-test.c     |  4 ++--
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 14 +++++++-------
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  2 +-
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.c   | 12 ++++--------
 .../util/intel-pt-decoder/intel-pt-pkt-decoder.h   |  2 ++
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c b/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c
index 901bf1f449c4..c933e3dcd0a8 100644
--- a/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c
+++ b/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c
@@ -66,8 +66,8 @@ struct test_data {
 	{7, {0x9d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_FUP, 4, 0x60504030201}, 0, 0 },
 	{9, {0xdd, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_FUP, 6, 0x807060504030201}, 0, 0 },
 	/* Paging Information Packet */
-	{8, {0x02, 0x43, 2, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201}, 0, 0 },
-	{8, {0x02, 0x43, 3, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201 | (1ULL << 63)}, 0, 0 },
+	{8, {0x02, 0x43, 2, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0xC0A08060402}, 0, 0 },
+	{8, {0x02, 0x43, 3, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0xC0A08060403}, 0, 0 },
 	/* Mode Exec Packet */
 	{2, {0x99, 0x00}, 0, {INTEL_PT_MODE_EXEC, 0, 16}, 0, 0 },
 	{2, {0x99, 0x01}, 0, {INTEL_PT_MODE_EXEC, 0, 64}, 0, 0 },
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 6df52d3c3f7e..cfaa091c935c 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -126,7 +126,7 @@ struct intel_pt_decoder {
 	uint64_t pos;
 	uint64_t last_ip;
 	uint64_t ip;
-	uint64_t cr3;
+	uint64_t pip_payload;
 	uint64_t timestamp;
 	uint64_t tsc_timestamp;
 	uint64_t ref_timestamp;
@@ -1757,7 +1757,7 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->cr3 = decoder->packet.payload & (BIT63 - 1);
+			decoder->pip_payload = decoder->packet.payload;
 			break;
 
 		case INTEL_PT_FUP:
@@ -1884,7 +1884,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 			return 0;
 
 		case INTEL_PT_PIP:
-			decoder->cr3 = decoder->packet.payload & (BIT63 - 1);
+			decoder->pip_payload = decoder->packet.payload;
 			break;
 
 		case INTEL_PT_MTC:
@@ -2297,7 +2297,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			return err;
 
 		case INTEL_PT_PIP:
-			decoder->cr3 = decoder->packet.payload & (BIT63 - 1);
+			decoder->pip_payload = decoder->packet.payload;
 			break;
 
 		case INTEL_PT_MTC:
@@ -2536,7 +2536,7 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->cr3 = decoder->packet.payload & (BIT63 - 1);
+			decoder->pip_payload = decoder->packet.payload;
 			break;
 
 		case INTEL_PT_MODE_EXEC:
@@ -2655,7 +2655,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->cr3 = decoder->packet.payload & (BIT63 - 1);
+			decoder->pip_payload = decoder->packet.payload;
 			break;
 
 		case INTEL_PT_MODE_EXEC:
@@ -2987,7 +2987,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 
 	decoder->state.timestamp = decoder->sample_timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
-	decoder->state.cr3 = decoder->cr3;
+	decoder->state.pip_payload = decoder->pip_payload;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
 	decoder->state.tot_cyc_cnt = decoder->sample_tot_cyc_cnt;
 
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index ae13f3251536..b9564c93fca7 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -204,7 +204,7 @@ struct intel_pt_state {
 	int err;
 	uint64_t from_ip;
 	uint64_t to_ip;
-	uint64_t cr3;
+	uint64_t pip_payload;
 	uint64_t tot_insn_cnt;
 	uint64_t tot_cyc_cnt;
 	uint64_t timestamp;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
index 4ce109993e74..02a3395d6ce3 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c
@@ -16,8 +16,6 @@
 
 #define BIT63		((uint64_t)1 << 63)
 
-#define NR_FLAG		BIT63
-
 #if __BYTE_ORDER == __BIG_ENDIAN
 #define le16_to_cpu bswap_16
 #define le32_to_cpu bswap_32
@@ -106,9 +104,7 @@ static int intel_pt_get_pip(const unsigned char *buf, size_t len,
 
 	packet->type = INTEL_PT_PIP;
 	memcpy_le64(&payload, buf + 2, 6);
-	packet->payload = payload >> 1;
-	if (payload & 1)
-		packet->payload |= NR_FLAG;
+	packet->payload = payload;
 
 	return 8;
 }
@@ -719,10 +715,10 @@ int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf,
 				name, (unsigned)(payload >> 1) & 1,
 				(unsigned)payload & 1);
 	case INTEL_PT_PIP:
-		nr = packet->payload & NR_FLAG ? 1 : 0;
-		payload &= ~NR_FLAG;
+		nr = packet->payload & INTEL_PT_VMX_NR_FLAG ? 1 : 0;
+		payload &= ~INTEL_PT_VMX_NR_FLAG;
 		ret = snprintf(buf, buf_len, "%s 0x%llx (NR=%d)",
-			       name, payload, nr);
+			       name, payload >> 1, nr);
 		return ret;
 	case INTEL_PT_PTWRITE:
 		return snprintf(buf, buf_len, "%s 0x%llx IP:0", name, payload);
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
index 17ca9b56d72f..996090cb84f6 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h
@@ -21,6 +21,8 @@
 
 #define INTEL_PT_PKT_MAX_SZ		16
 
+#define INTEL_PT_VMX_NR_FLAG		1
+
 enum intel_pt_pkt_type {
 	INTEL_PT_BAD,
 	INTEL_PT_PAD,
-- 
2.17.1



* [PATCH 04/11] perf intel-pt: Amend decoder to track the NR flag
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (2 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 03/11] perf intel-pt: Retain the last PIP packet payload as is Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 05/11] perf machine: Factor out machines__find_guest() Adrian Hunter
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

The PIP packet NR (non-root) flag indicates whether or not a virtual
machine is being traced (NR=1 => VM). Add support for tracking its value.

In particular, note that a PIP packet (outside of PSB+) is associated with
a TIP packet, and the new NR value takes effect from the address given by
that TIP packet. At that point, there is a branch from_ip, to_ip with
corresponding from_nr and to_nr.

In the event of VM-Entry failure, there should still be PIP and TIP packets
that can be followed in the same way.

Also note that this assumes that a host VMM is not employing VMX controls
that affect Intel PT, e.g. to hide the host from a guest using Intel PT.
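
The two-stage tracking described above can be sketched in isolation as
follows. This is a simplified model with hypothetical ex_ names, not the
decoder itself: a PIP outside PSB+ only records the payload, the associated
TIP latches the pending NR value, and emitting the branch reports from_nr and
to_nr before making the new value current.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ex_decoder {
	bool nr;		/* NR value at the branch source */
	bool next_nr;		/* NR value at the branch target */
	uint64_t pip_payload;	/* last raw PIP payload, NR in bit 0 */
};

struct ex_branch {
	bool from_nr;
	bool to_nr;
};

/* PIP outside PSB+: remember the payload, NR does not change yet. */
static void ex_update_pip(struct ex_decoder *d, uint64_t payload)
{
	d->pip_payload = payload;
}

/* TIP: the pending NR value takes effect at the branch target. */
static void ex_update_nr(struct ex_decoder *d)
{
	d->next_nr = d->pip_payload & 1;
}

/* PSB+, overflow or trace begin: no branch to attach to, set NR directly. */
static void ex_set_nr(struct ex_decoder *d)
{
	d->nr = d->pip_payload & 1;
	d->next_nr = d->nr;
}

/* Report the branch, then make the target's NR value current. */
static struct ex_branch ex_emit(struct ex_decoder *d)
{
	struct ex_branch b;

	assert(d != NULL);

	b.from_nr = d->nr;
	b.to_nr = d->next_nr;
	d->nr = d->next_nr;
	return b;
}
```

In this model a VM-Entry shows up as a single branch with from_nr == 0 and
to_nr == 1, and every later branch has both set until a PIP with NR=0 is
followed by another TIP.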

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 59 ++++++++++++++++---
 .../util/intel-pt-decoder/intel-pt-decoder.h  |  3 +-
 2 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index cfaa091c935c..8c59677bee13 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -122,6 +122,8 @@ struct intel_pt_decoder {
 	bool in_psb;
 	bool hop;
 	bool leap;
+	bool nr;
+	bool next_nr;
 	enum intel_pt_param_flags flags;
 	uint64_t pos;
 	uint64_t last_ip;
@@ -503,6 +505,28 @@ static inline void intel_pt_update_in_tx(struct intel_pt_decoder *decoder)
 	decoder->tx_flags = decoder->packet.payload & INTEL_PT_IN_TX;
 }
 
+static inline void intel_pt_update_pip(struct intel_pt_decoder *decoder)
+{
+	decoder->pip_payload = decoder->packet.payload;
+}
+
+static inline void intel_pt_update_nr(struct intel_pt_decoder *decoder)
+{
+	decoder->next_nr = decoder->pip_payload & 1;
+}
+
+static inline void intel_pt_set_nr(struct intel_pt_decoder *decoder)
+{
+	decoder->nr = decoder->pip_payload & 1;
+	decoder->next_nr = decoder->nr;
+}
+
+static inline void intel_pt_set_pip(struct intel_pt_decoder *decoder)
+{
+	intel_pt_update_pip(decoder);
+	intel_pt_set_nr(decoder);
+}
+
 static int intel_pt_bad_packet(struct intel_pt_decoder *decoder)
 {
 	intel_pt_clear_tx_flags(decoder);
@@ -1240,6 +1264,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 		decoder->continuous_period = false;
 		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 		decoder->state.type |= INTEL_PT_TRACE_END;
+		intel_pt_update_nr(decoder);
 		return 0;
 	}
 	if (err == INTEL_PT_RETURN)
@@ -1247,6 +1272,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 	if (err)
 		return err;
 
+	intel_pt_update_nr(decoder);
+
 	if (intel_pt_insn.branch == INTEL_PT_BR_INDIRECT) {
 		if (decoder->pkt_state == INTEL_PT_STATE_TIP_PGD) {
 			decoder->pge = false;
@@ -1359,6 +1386,7 @@ static int intel_pt_walk_tnt(struct intel_pt_decoder *decoder)
 			decoder->state.from_ip = decoder->ip;
 			decoder->state.to_ip = decoder->last_ip;
 			decoder->ip = decoder->last_ip;
+			intel_pt_update_nr(decoder);
 			return 0;
 		}
 
@@ -1483,6 +1511,7 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
 {
 	intel_pt_log("ERROR: Buffer overflow\n");
 	intel_pt_clear_tx_flags(decoder);
+	intel_pt_set_nr(decoder);
 	decoder->timestamp_insn_cnt = 0;
 	decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC;
 	decoder->overflow = true;
@@ -1757,7 +1786,7 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->pip_payload = decoder->packet.payload;
+			intel_pt_set_pip(decoder);
 			break;
 
 		case INTEL_PT_FUP:
@@ -1856,6 +1885,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 			decoder->pge = false;
 			decoder->continuous_period = false;
 			decoder->state.type |= INTEL_PT_TRACE_END;
+			intel_pt_update_nr(decoder);
 			return 0;
 
 		case INTEL_PT_TIP_PGE:
@@ -1871,6 +1901,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 			}
 			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
 			intel_pt_mtc_cyc_cnt_pge(decoder);
+			intel_pt_set_nr(decoder);
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1881,10 +1912,11 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				intel_pt_set_ip(decoder);
 				decoder->state.to_ip = decoder->ip;
 			}
+			intel_pt_update_nr(decoder);
 			return 0;
 
 		case INTEL_PT_PIP:
-			decoder->pip_payload = decoder->packet.payload;
+			intel_pt_update_pip(decoder);
 			break;
 
 		case INTEL_PT_MTC:
@@ -1943,21 +1975,27 @@ static int intel_pt_hop_trace(struct intel_pt_decoder *decoder, bool *no_tip, in
 		return HOP_IGNORE;
 
 	case INTEL_PT_TIP_PGD:
-		if (!decoder->packet.count)
+		if (!decoder->packet.count) {
+			intel_pt_set_nr(decoder);
 			return HOP_IGNORE;
+		}
 		intel_pt_set_ip(decoder);
 		decoder->state.type |= INTEL_PT_TRACE_END;
 		decoder->state.from_ip = 0;
 		decoder->state.to_ip = decoder->ip;
+		intel_pt_update_nr(decoder);
 		return HOP_RETURN;
 
 	case INTEL_PT_TIP:
-		if (!decoder->packet.count)
+		if (!decoder->packet.count) {
+			intel_pt_set_nr(decoder);
 			return HOP_IGNORE;
+		}
 		intel_pt_set_ip(decoder);
 		decoder->state.type = INTEL_PT_INSTRUCTION;
 		decoder->state.from_ip = decoder->ip;
 		decoder->state.to_ip = 0;
+		intel_pt_update_nr(decoder);
 		return HOP_RETURN;
 
 	case INTEL_PT_FUP:
@@ -2222,6 +2260,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 		case INTEL_PT_TIP_PGE: {
 			decoder->pge = true;
 			intel_pt_mtc_cyc_cnt_pge(decoder);
+			intel_pt_set_nr(decoder);
 			if (decoder->packet.count == 0) {
 				intel_pt_log_at("Skipping zero TIP.PGE",
 						decoder->pos);
@@ -2297,7 +2336,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			return err;
 
 		case INTEL_PT_PIP:
-			decoder->pip_payload = decoder->packet.payload;
+			intel_pt_update_pip(decoder);
 			break;
 
 		case INTEL_PT_MTC:
@@ -2536,7 +2575,7 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->pip_payload = decoder->packet.payload;
+			intel_pt_set_pip(decoder);
 			break;
 
 		case INTEL_PT_MODE_EXEC:
@@ -2655,7 +2694,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			break;
 
 		case INTEL_PT_PIP:
-			decoder->pip_payload = decoder->packet.payload;
+			intel_pt_set_pip(decoder);
 			break;
 
 		case INTEL_PT_MODE_EXEC:
@@ -2953,6 +2992,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 		decoder->state.from_ip = decoder->ip;
 		intel_pt_update_sample_time(decoder);
 		decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt;
+		intel_pt_set_nr(decoder);
 	} else {
 		decoder->state.err = 0;
 		if (decoder->cbr != decoder->cbr_seen) {
@@ -2985,9 +3025,12 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 	if ((decoder->state.type & INTEL_PT_PSB_EVT) && decoder->tsc_timestamp)
 		decoder->sample_timestamp = decoder->tsc_timestamp;
 
+	decoder->state.from_nr = decoder->nr;
+	decoder->state.to_nr = decoder->next_nr;
+	decoder->nr = decoder->next_nr;
+
 	decoder->state.timestamp = decoder->sample_timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
-	decoder->state.pip_payload = decoder->pip_payload;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
 	decoder->state.tot_cyc_cnt = decoder->sample_tot_cyc_cnt;
 
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index b9564c93fca7..d9e62a7f6f0e 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -201,10 +201,11 @@ struct intel_pt_blk_items {
 
 struct intel_pt_state {
 	enum intel_pt_sample_type type;
+	bool from_nr;
+	bool to_nr;
 	int err;
 	uint64_t from_ip;
 	uint64_t to_ip;
-	uint64_t pip_payload;
 	uint64_t tot_insn_cnt;
 	uint64_t tot_cyc_cnt;
 	uint64_t timestamp;
-- 
2.17.1



* [PATCH 05/11] perf machine: Factor out machines__find_guest()
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (3 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 04/11] perf intel-pt: Amend decoder to track the NR flag Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 06/11] perf machine: Factor out machine__idle_thread() Adrian Hunter
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Factor out machines__find_guest() so it can be re-used.
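
The lookup-with-fallback behaviour being factored out can be modelled with a
small standalone sketch. Everything here is hypothetical: the fixed-size array
stands in for perf's machines container, and EX_DEFAULT_GUEST_ID stands in for
DEFAULT_GUEST_KERNEL_ID, whose value is assumed to be 0 for the example.

```c
#include <assert.h>
#include <stddef.h>

#define EX_DEFAULT_GUEST_ID 0	/* assumed stand-in for DEFAULT_GUEST_KERNEL_ID */
#define EX_MAX_MACHINES 8

struct ex_machine {
	int pid;
};

struct ex_machines {
	struct ex_machine m[EX_MAX_MACHINES];
	size_t nr;
};

/* Find an existing machine by pid, or return NULL. */
static struct ex_machine *ex_find(struct ex_machines *ms, int pid)
{
	size_t i;

	for (i = 0; i < ms->nr; i++) {
		if (ms->m[i].pid == pid)
			return &ms->m[i];
	}
	return NULL;
}

/* Find a machine by pid, creating it if needed and space allows. */
static struct ex_machine *ex_findnew(struct ex_machines *ms, int pid)
{
	struct ex_machine *machine = ex_find(ms, pid);

	if (machine || ms->nr == EX_MAX_MACHINES)
		return machine;
	ms->m[ms->nr].pid = pid;
	return &ms->m[ms->nr++];
}

/* Same shape as the new helper: exact match first, default guest otherwise. */
static struct ex_machine *ex_find_guest(struct ex_machines *ms, int pid)
{
	struct ex_machine *machine;

	assert(ms != NULL);

	machine = ex_find(ms, pid);
	if (!machine)
		machine = ex_findnew(ms, EX_DEFAULT_GUEST_ID);
	return machine;
}
```

The point of the fallback is that a caller always gets some guest machine
back: an unknown pid resolves to the default guest rather than failing.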

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/machine.c | 9 +++++++++
 tools/perf/util/machine.h | 1 +
 tools/perf/util/session.c | 7 +------
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ab8a6b3e801d..90703b7ca6de 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -369,6 +369,15 @@ struct machine *machines__findnew(struct machines *machines, pid_t pid)
 	return machine;
 }
 
+struct machine *machines__find_guest(struct machines *machines, pid_t pid)
+{
+	struct machine *machine = machines__find(machines, pid);
+
+	if (!machine)
+		machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID);
+	return machine;
+}
+
 void machines__process_guests(struct machines *machines,
 			      machine__process_t process, void *data)
 {
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 26368d3c1754..022c19ecd287 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -162,6 +162,7 @@ struct machine *machines__add(struct machines *machines, pid_t pid,
 struct machine *machines__find_host(struct machines *machines);
 struct machine *machines__find(struct machines *machines, pid_t pid);
 struct machine *machines__findnew(struct machines *machines, pid_t pid);
+struct machine *machines__find_guest(struct machines *machines, pid_t pid);
 
 void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size);
 void machines__set_comm_exec(struct machines *machines, bool comm_exec);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f4aeb1af05d8..7b0d0c9e3dd1 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1356,8 +1356,6 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
 					       union perf_event *event,
 					       struct perf_sample *sample)
 {
-	struct machine *machine;
-
 	if (perf_guest &&
 	    ((sample->cpumode == PERF_RECORD_MISC_GUEST_KERNEL) ||
 	     (sample->cpumode == PERF_RECORD_MISC_GUEST_USER))) {
@@ -1369,10 +1367,7 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
 		else
 			pid = sample->pid;
 
-		machine = machines__find(machines, pid);
-		if (!machine)
-			machine = machines__findnew(machines, DEFAULT_GUEST_KERNEL_ID);
-		return machine;
+		return machines__find_guest(machines, pid);
 	}
 
 	return &machines->host;
-- 
2.17.1



* [PATCH 06/11] perf machine: Factor out machine__idle_thread()
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (4 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 05/11] perf machine: Factor out machines__find_guest() Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 07/11] perf intel-pt: Support decoding of guest kernel Adrian Hunter
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Factor out machine__idle_thread() so it can be re-used for guest machines.

A thread is needed to find executable code, even for the guest kernel. To
avoid possible future pid number conflicts, the idle thread can be used.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/machine.c | 18 ++++++++++++++++++
 tools/perf/util/machine.h |  1 +
 tools/perf/util/session.c | 25 +++----------------------
 3 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 90703b7ca6de..b5c2d8be4144 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -598,6 +598,24 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 	return th;
 }
 
+/*
+ * Threads are identified by pid and tid, and the idle task has pid == tid == 0.
+ * So here a single thread is created for that, but actually there is a separate
+ * idle task per cpu, so there should be one 'struct thread' per cpu, but there
+ * is only 1. That causes problems for some tools, requiring workarounds. For
+ * example get_idle_thread() in builtin-sched.c, or thread_stack__per_cpu().
+ */
+struct thread *machine__idle_thread(struct machine *machine)
+{
+	struct thread *thread = machine__findnew_thread(machine, 0, 0);
+
+	if (!thread || thread__set_comm(thread, "swapper", 0) ||
+	    thread__set_namespaces(thread, 0, NULL))
+		pr_err("problem inserting idle task for machine pid %d\n", machine->pid);
+
+	return thread;
+}
+
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread)
 {
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 022c19ecd287..7377ed6efdf1 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -106,6 +106,7 @@ u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr);
 
 struct thread *machine__find_thread(struct machine *machine, pid_t pid,
 				    pid_t tid);
+struct thread *machine__idle_thread(struct machine *machine);
 struct comm *machine__thread_exec_comm(struct machine *machine,
 				       struct thread *thread);
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7b0d0c9e3dd1..859832a82496 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1789,32 +1789,13 @@ struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
 	return machine__findnew_thread(&session->machines.host, -1, pid);
 }
 
-/*
- * Threads are identified by pid and tid, and the idle task has pid == tid == 0.
- * So here a single thread is created for that, but actually there is a separate
- * idle task per cpu, so there should be one 'struct thread' per cpu, but there
- * is only 1. That causes problems for some tools, requiring workarounds. For
- * example get_idle_thread() in builtin-sched.c, or thread_stack__per_cpu().
- */
 int perf_session__register_idle_thread(struct perf_session *session)
 {
-	struct thread *thread;
-	int err = 0;
-
-	thread = machine__findnew_thread(&session->machines.host, 0, 0);
-	if (thread == NULL || thread__set_comm(thread, "swapper", 0)) {
-		pr_err("problem inserting idle task.\n");
-		err = -1;
-	}
+	struct thread *thread = machine__idle_thread(&session->machines.host);
 
-	if (thread == NULL || thread__set_namespaces(thread, 0, NULL)) {
-		pr_err("problem inserting idle task.\n");
-		err = -1;
-	}
-
-	/* machine__findnew_thread() got the thread, so put it */
+	/* machine__idle_thread() got the thread, so put it */
 	thread__put(thread);
-	return err;
+	return thread ? 0 : -1;
 }
 
 static void
-- 
2.17.1



* [PATCH 07/11] perf intel-pt: Support decoding of guest kernel
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (5 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 06/11] perf machine: Factor out machine__idle_thread() Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 08/11] perf intel-pt: Allow for a guest kernel address filter Adrian Hunter
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

The guest kernel can be found from any guest thread belonging to the guest
machine. The guest machine is associated with the current host process pid.
An idle thread (pid=tid=0) is created as a vehicle from which to find the
guest kernel map.

Decoding guest user space is not supported.

Synthesized samples just need the cpumode set for the guest.
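
As a rough illustration of the cpumode selection described above, the
following standalone sketch classifies an address from the NR (non-root)
flag and the address itself. The enum values and function names are
stand-ins invented for this example, not the perf definitions, and the
top-bit test assumes a 64-bit guest kernel, as the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the PERF_RECORD_MISC_* cpumode values. */
enum cpumode {
	MODE_KERNEL,
	MODE_USER,
	MODE_GUEST_KERNEL,
	MODE_GUEST_USER,
};

/* Assumes a 64-bit guest kernel: kernel addresses have bit 63 set. */
static bool guest_kernel_ip(uint64_t ip)
{
	return (ip >> 63) != 0;
}

/* nr is true while the CPU is executing in the guest (VMX non-root). */
static enum cpumode nr_cpumode(uint64_t host_kernel_start, uint64_t ip, bool nr)
{
	if (nr)
		return guest_kernel_ip(ip) ? MODE_GUEST_KERNEL : MODE_GUEST_USER;

	return ip >= host_kernel_start ? MODE_KERNEL : MODE_USER;
}

/*
 * A branch sample carries one cpumode: use the "from" side when there is
 * a from_ip, otherwise fall back to the "to" side.
 */
static enum cpumode branch_cpumode(uint64_t host_kernel_start,
				   uint64_t from_ip, bool from_nr,
				   uint64_t to_ip, bool to_nr)
{
	if (from_ip)
		return nr_cpumode(host_kernel_start, from_ip, from_nr);
	return nr_cpumode(host_kernel_start, to_ip, to_nr);
}
```

With this arrangement a sample whose from_ip is 0 is classified entirely
by its destination, which is what lets a later patch in the series emit
a "0 -> guest ip" branch with a guest cpumode.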

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 81 ++++++++++++++++++++++++++++++++------
 1 file changed, 69 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index ddb8e6c3ffb0..29d871718995 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -163,6 +163,9 @@ struct intel_pt_queue {
 	int switch_state;
 	pid_t next_tid;
 	struct thread *thread;
+	struct machine *guest_machine;
+	struct thread *unknown_guest_thread;
+	pid_t guest_machine_pid;
 	bool exclude_kernel;
 	bool have_sample;
 	u64 time;
@@ -550,13 +553,59 @@ static void intel_pt_cache_invalidate(struct dso *dso, struct machine *machine,
 	auxtrace_cache__remove(dso->auxtrace_cache, offset);
 }
 
-static inline u8 intel_pt_cpumode(struct intel_pt *pt, uint64_t ip)
+static inline bool intel_pt_guest_kernel_ip(uint64_t ip)
 {
-	return ip >= pt->kernel_start ?
+	/* Assumes 64-bit kernel */
+	return ip & (1ULL << 63);
+}
+
+static inline u8 intel_pt_nr_cpumode(struct intel_pt_queue *ptq, uint64_t ip, bool nr)
+{
+	if (nr) {
+		return intel_pt_guest_kernel_ip(ip) ?
+		       PERF_RECORD_MISC_GUEST_KERNEL :
+		       PERF_RECORD_MISC_GUEST_USER;
+	}
+
+	return ip >= ptq->pt->kernel_start ?
 	       PERF_RECORD_MISC_KERNEL :
 	       PERF_RECORD_MISC_USER;
 }
 
+static inline u8 intel_pt_cpumode(struct intel_pt_queue *ptq, uint64_t from_ip, uint64_t to_ip)
+{
+	/* No support for non-zero CS base */
+	if (from_ip)
+		return intel_pt_nr_cpumode(ptq, from_ip, ptq->state->from_nr);
+	return intel_pt_nr_cpumode(ptq, to_ip, ptq->state->to_nr);
+}
+
+static int intel_pt_get_guest(struct intel_pt_queue *ptq)
+{
+	struct machines *machines = &ptq->pt->session->machines;
+	struct machine *machine;
+	pid_t pid = ptq->pid <= 0 ? DEFAULT_GUEST_KERNEL_ID : ptq->pid;
+
+	if (ptq->guest_machine && pid == ptq->guest_machine_pid)
+		return 0;
+
+	ptq->guest_machine = NULL;
+	thread__zput(ptq->unknown_guest_thread);
+
+	machine = machines__find_guest(machines, pid);
+	if (!machine)
+		return -1;
+
+	ptq->unknown_guest_thread = machine__idle_thread(machine);
+	if (!ptq->unknown_guest_thread)
+		return -1;
+
+	ptq->guest_machine = machine;
+	ptq->guest_machine_pid = pid;
+
+	return 0;
+}
+
 static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 				   uint64_t *insn_cnt_ptr, uint64_t *ip,
 				   uint64_t to_ip, uint64_t max_insn_cnt,
@@ -573,19 +622,29 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
 	u64 offset, start_offset, start_ip;
 	u64 insn_cnt = 0;
 	bool one_map = true;
+	bool nr;
 
 	intel_pt_insn->length = 0;
 
 	if (to_ip && *ip == to_ip)
 		goto out_no_cache;
 
-	cpumode = intel_pt_cpumode(ptq->pt, *ip);
+	nr = ptq->state->to_nr;
+	cpumode = intel_pt_nr_cpumode(ptq, *ip, nr);
 
-	thread = ptq->thread;
-	if (!thread) {
-		if (cpumode != PERF_RECORD_MISC_KERNEL)
+	if (nr) {
+		if (cpumode != PERF_RECORD_MISC_GUEST_KERNEL ||
+		    intel_pt_get_guest(ptq))
 			return -EINVAL;
-		thread = ptq->pt->unknown_thread;
+		machine = ptq->guest_machine;
+		thread = ptq->unknown_guest_thread;
+	} else {
+		thread = ptq->thread;
+		if (!thread) {
+			if (cpumode != PERF_RECORD_MISC_KERNEL)
+				return -EINVAL;
+			thread = ptq->pt->unknown_thread;
+		}
 	}
 
 	while (1) {
@@ -1101,6 +1160,7 @@ static void intel_pt_free_queue(void *priv)
 	if (!ptq)
 		return;
 	thread__zput(ptq->thread);
+	thread__zput(ptq->unknown_guest_thread);
 	intel_pt_decoder_free(ptq->decoder);
 	zfree(&ptq->event_buf);
 	zfree(&ptq->last_branch);
@@ -1315,8 +1375,8 @@ static void intel_pt_prep_b_sample(struct intel_pt *pt,
 		sample->time = tsc_to_perf_time(ptq->timestamp, &pt->tc);
 
 	sample->ip = ptq->state->from_ip;
-	sample->cpumode = intel_pt_cpumode(pt, sample->ip);
 	sample->addr = ptq->state->to_ip;
+	sample->cpumode = intel_pt_cpumode(ptq, sample->ip, sample->addr);
 	sample->period = 1;
 	sample->flags = ptq->flags;
 
@@ -1833,10 +1893,7 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq)
 	else
 		sample.ip = ptq->state->from_ip;
 
-	/* No support for guest mode at this time */
-	cpumode = sample.ip < ptq->pt->kernel_start ?
-		  PERF_RECORD_MISC_USER :
-		  PERF_RECORD_MISC_KERNEL;
+	cpumode = intel_pt_cpumode(ptq, sample.ip, 0);
 
 	event->sample.header.misc = cpumode | PERF_RECORD_MISC_EXACT_IP;
 
-- 
2.17.1



* [PATCH 08/11] perf intel-pt: Allow for a guest kernel address filter
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (6 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 07/11] perf intel-pt: Support decoding of guest kernel Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:57 ` [PATCH 09/11] perf intel-pt: Adjust sample flags for VM-Exit Adrian Hunter
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Handling TIP.PGD for an address filter is the same for a guest kernel as
for a host kernel. Guest user space decoding is not supported, and hence
neither are guest user space address filters.
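
The decision can be sketched as below. This is a simplified standalone
model, not the perf code: the pgd_action enum and pgd_ip_action() are
names made up for the example, and the guest kernel test again assumes a
64-bit guest kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes: what the caller should do with the address. */
enum pgd_action {
	PGD_KERNEL_FILTER,	/* match against kernel address filters */
	PGD_USER_LOOKUP,	/* resolve the host user space mapping first */
};

/*
 * Decide how to treat the ip seen at TIP.PGD. to_nr is true when the
 * destination is in the guest. Returns 0 and sets *action, or -EINVAL
 * for guest user space, which cannot be decoded.
 */
static int pgd_ip_action(uint64_t ip, bool to_nr, uint64_t host_kernel_start,
			 enum pgd_action *action)
{
	if (to_nr) {
		if (!(ip >> 63))
			return -EINVAL;	/* guest user space */
		*action = PGD_KERNEL_FILTER;
		return 0;
	}

	*action = ip >= host_kernel_start ? PGD_KERNEL_FILTER : PGD_USER_LOOKUP;
	return 0;
}
```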

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 29d871718995..546d512b300a 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -792,8 +792,14 @@ static int __intel_pt_pgd_ip(uint64_t ip, void *data)
 	u8 cpumode;
 	u64 offset;
 
-	if (ip >= ptq->pt->kernel_start)
+	if (ptq->state->to_nr) {
+		if (intel_pt_guest_kernel_ip(ip))
+			return intel_pt_match_pgd_ip(ptq->pt, ip, ip, NULL);
+		/* No support for decoding guest user space */
+		return -EINVAL;
+	} else if (ip >= ptq->pt->kernel_start) {
 		return intel_pt_match_pgd_ip(ptq->pt, ip, ip, NULL);
+	}
 
 	cpumode = PERF_RECORD_MISC_USER;
 
-- 
2.17.1



* [PATCH 09/11] perf intel-pt: Adjust sample flags for VM-Exit
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (7 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 08/11] perf intel-pt: Allow for a guest kernel address filter Adrian Hunter
@ 2021-02-18  9:57 ` Adrian Hunter
  2021-02-18  9:58 ` [PATCH 10/11] perf intel-pt: Split VM-Entry and VM-Exit branches Adrian Hunter
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:57 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Use the change of NR to detect whether an asynchronous branch is a VM-Exit.

Note that VM-Entry is determined from the vmlaunch or vmresume instruction,
so sample flags will show "VMentry" even if the VM-Entry fails.
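
A minimal standalone sketch of the flag choice for an asynchronous
branch follows. The FLAG_* names and bit positions are placeholders for
this example only and are not the PERF_IP_FLAG_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits, chosen only for this illustration. */
enum {
	FLAG_BRANCH	= 1 << 0,
	FLAG_CALL	= 1 << 1,
	FLAG_ASYNC	= 1 << 2,
	FLAG_INTERRUPT	= 1 << 3,
	FLAG_TRACE_END	= 1 << 4,
	FLAG_VMEXIT	= 1 << 5,
};

/* Flags for an asynchronous branch, given its destination and NR change. */
static unsigned int async_branch_flags(uint64_t to_ip, bool from_nr, bool to_nr)
{
	/* No destination: tracing stopped here */
	if (!to_ip)
		return FLAG_BRANCH | FLAG_TRACE_END;

	/* NR went from set (guest) to clear (host): a VM-Exit */
	if (from_nr && !to_nr)
		return FLAG_BRANCH | FLAG_CALL | FLAG_VMEXIT;

	/* Otherwise treat it as an interrupt, as before */
	return FLAG_BRANCH | FLAG_CALL | FLAG_ASYNC | FLAG_INTERRUPT;
}
```

The trace-end check comes first, so a guest-to-host transition with no
destination address is still reported as the end of the trace rather
than as a VM-Exit.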

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 546d512b300a..cafb3943d5f6 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1201,13 +1201,16 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
 	if (ptq->state->flags & INTEL_PT_ABORT_TX) {
 		ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT;
 	} else if (ptq->state->flags & INTEL_PT_ASYNC) {
-		if (ptq->state->to_ip)
+		if (!ptq->state->to_ip)
+			ptq->flags = PERF_IP_FLAG_BRANCH |
+				     PERF_IP_FLAG_TRACE_END;
+		else if (ptq->state->from_nr && !ptq->state->to_nr)
+			ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
+				     PERF_IP_FLAG_VMEXIT;
+		else
 			ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
 				     PERF_IP_FLAG_ASYNC |
 				     PERF_IP_FLAG_INTERRUPT;
-		else
-			ptq->flags = PERF_IP_FLAG_BRANCH |
-				     PERF_IP_FLAG_TRACE_END;
 		ptq->insn_len = 0;
 	} else {
 		if (ptq->state->from_ip)
-- 
2.17.1



* [PATCH 10/11] perf intel-pt: Split VM-Entry and VM-Exit branches
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (8 preceding siblings ...)
  2021-02-18  9:57 ` [PATCH 09/11] perf intel-pt: Adjust sample flags for VM-Exit Adrian Hunter
@ 2021-02-18  9:58 ` Adrian Hunter
  2021-02-18  9:58 ` [PATCH 11/11] perf intel-pt: Add documentation for tracing virtual machines Adrian Hunter
  2021-02-18 18:24 ` [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Andi Kleen
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Events record a single cpumode so the tools cannot handle a branch from
the host machine to a virtual machine, or vice versa. Split it in two so
that each branch can have a different cpumode.

  E.g.		host ip -> guest ip

  becomes:	host ip -> 0
		      0 -> guest ip
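
The split can be modelled in isolation as follows. This is only a
sketch: struct branch and split_branch() are hypothetical names for the
example, whereas the patch itself temporarily zeroes the ip fields of
the decoder state and synthesizes two samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a decoded branch. */
struct branch {
	uint64_t from_ip;
	uint64_t to_ip;
	bool from_nr;	/* source was in the guest */
	bool to_nr;	/* destination is in the guest */
};

/*
 * Split a branch that crosses between host and guest into two branches,
 * "from -> 0" and "0 -> to", so that each half has a single cpumode.
 * Returns the number of branches written to out[] (1 or 2).
 */
static int split_branch(const struct branch *b, struct branch out[2])
{
	if (b->from_nr != b->to_nr && b->from_ip && b->to_ip) {
		out[0] = *b;
		out[0].to_ip = 0;
		out[1] = *b;
		out[1].from_ip = 0;
		return 2;
	}

	out[0] = *b;
	return 1;
}
```

A branch that already has a zero from_ip or to_ip is left alone, since
it only refers to one machine anyway.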

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index cafb3943d5f6..f6e28ac231b7 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2171,7 +2171,27 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 	}
 
 	if (pt->sample_branches) {
-		err = intel_pt_synth_branch_sample(ptq);
+		if (state->from_nr != state->to_nr &&
+		    state->from_ip && state->to_ip) {
+			struct intel_pt_state *st = (struct intel_pt_state *)state;
+			u64 to_ip = st->to_ip;
+			u64 from_ip = st->from_ip;
+
+			/*
+			 * perf cannot handle having different machines for ip
+			 * and addr, so create 2 branches.
+			 */
+			st->to_ip = 0;
+			err = intel_pt_synth_branch_sample(ptq);
+			if (err)
+				return err;
+			st->from_ip = 0;
+			st->to_ip = to_ip;
+			err = intel_pt_synth_branch_sample(ptq);
+			st->from_ip = from_ip;
+		} else {
+			err = intel_pt_synth_branch_sample(ptq);
+		}
 		if (err)
 			return err;
 	}
-- 
2.17.1



* [PATCH 11/11] perf intel-pt: Add documentation for tracing virtual machines
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (9 preceding siblings ...)
  2021-02-18  9:58 ` [PATCH 10/11] perf intel-pt: Split VM-Entry and VM-Exit branches Adrian Hunter
@ 2021-02-18  9:58 ` Adrian Hunter
  2021-02-18 18:24 ` [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Andi Kleen
  11 siblings, 0 replies; 13+ messages in thread
From: Adrian Hunter @ 2021-02-18  9:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Andi Kleen
  Cc: Alexander Shishkin, linux-kernel

Add documentation to the perf-intel-pt man page for tracing virtual
machines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt | 82 ++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 0b8a339803cb..18c91b7cd937 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -1146,6 +1146,88 @@ XED
 
 include::build-xed.txt[]
 
+
+Tracing Virtual Machines
+------------------------
+
+Currently, only kernel tracing is supported and only with "timeless" decoding
+i.e. no TSC timestamps
+
+Other limitations and caveats
+
+ VMX controls may suppress packets needed for decoding resulting in decoding errors
+ VMX controls may block the perf NMI to the host potentially resulting in lost trace data
+ Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding errors
+ Guest thread information is unknown
+ Guest VCPU is unknown but may be able to be inferred from the host thread
+ Callchains are not supported
+
+Example
+
+Start VM
+
+ $ sudo virsh start kubuntu20.04
+ Domain kubuntu20.04 started
+
+Mount the guest file system.  Note sshfs needs -o direct_io to enable reading of proc files.  root access is needed to read /proc/kcore.
+
+ $ mkdir vm0
+ $ sshfs -o direct_io root@vm0:/ vm0
+
+Copy the guest /proc/kallsyms, /proc/modules and /proc/kcore
+
+ $ perf buildid-cache -v --kcore vm0/proc/kcore
+ kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306
+ $ KALLSYMS=/home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d9710538ec5f6a08/2021021807494306/kallsyms
+
+Find the VM process
+
+ $ ps -eLl | grep 'KVM\|PID'
+ F S   UID     PID    PPID     LWP  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
+ 3 S 64055    1430       1    1440  1  80   0 - 1921718 -    ?        00:02:47 CPU 0/KVM
+ 3 S 64055    1430       1    1441  1  80   0 - 1921718 -    ?        00:02:41 CPU 1/KVM
+ 3 S 64055    1430       1    1442  1  80   0 - 1921718 -    ?        00:02:38 CPU 2/KVM
+ 3 S 64055    1430       1    1443  2  80   0 - 1921718 -    ?        00:03:18 CPU 3/KVM
+
+Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to stop.
+TSC is not supported and tsc=0 must be specified.  That means mtc is useless, so add mtc=0.
+However, IPC can still be determined, hence cyc=1 can be added.
+Only kernel decoding is supported, so 'k' must be specified.
+Intel PT traces both the host and the guest so --guest and --host need to be specified.
+Without timestamps, --per-thread must be specified to distinguish threads.
+
+ $ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cyc=1/k -p 1430 --per-thread
+ ^C
+ [ perf record: Woken up 1 times to write data ]
+ [ perf record: Captured and wrote 5.829 MB ]
+
+perf script can be used to provide an instruction trace
+
+ $ perf script --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21
+       CPU 0/KVM  1440  ffffffff82133cdd __vmx_vcpu_run+0x3d ([kernel.kallsyms])                movq  0x48(%rax), %r9
+       CPU 0/KVM  1440  ffffffff82133ce1 __vmx_vcpu_run+0x41 ([kernel.kallsyms])                movq  0x50(%rax), %r10
+       CPU 0/KVM  1440  ffffffff82133ce5 __vmx_vcpu_run+0x45 ([kernel.kallsyms])                movq  0x58(%rax), %r11
+       CPU 0/KVM  1440  ffffffff82133ce9 __vmx_vcpu_run+0x49 ([kernel.kallsyms])                movq  0x60(%rax), %r12
+       CPU 0/KVM  1440  ffffffff82133ced __vmx_vcpu_run+0x4d ([kernel.kallsyms])                movq  0x68(%rax), %r13
+       CPU 0/KVM  1440  ffffffff82133cf1 __vmx_vcpu_run+0x51 ([kernel.kallsyms])                movq  0x70(%rax), %r14
+       CPU 0/KVM  1440  ffffffff82133cf5 __vmx_vcpu_run+0x55 ([kernel.kallsyms])                movq  0x78(%rax), %r15
+       CPU 0/KVM  1440  ffffffff82133cf9 __vmx_vcpu_run+0x59 ([kernel.kallsyms])                movq  (%rax), %rax
+       CPU 0/KVM  1440  ffffffff82133cfc __vmx_vcpu_run+0x5c ([kernel.kallsyms])                callq  0xffffffff82133c40
+       CPU 0/KVM  1440  ffffffff82133c40 vmx_vmenter+0x0 ([kernel.kallsyms])            jz 0xffffffff82133c46
+       CPU 0/KVM  1440  ffffffff82133c42 vmx_vmenter+0x2 ([kernel.kallsyms])            vmresume         IPC: 0.11 (50/445)
+           :1440  1440  ffffffffbb678b06 native_write_msr+0x6 ([guest.kernel.kallsyms])                 nopl  %eax, (%rax,%rax,1)
+           :1440  1440  ffffffffbb678b0b native_write_msr+0xb ([guest.kernel.kallsyms])                 retq     IPC: 0.04 (2/41)
+           :1440  1440  ffffffffbb666646 lapic_next_deadline+0x26 ([guest.kernel.kallsyms])             data16 nop
+           :1440  1440  ffffffffbb666648 lapic_next_deadline+0x28 ([guest.kernel.kallsyms])             xor %eax, %eax
+           :1440  1440  ffffffffbb66664a lapic_next_deadline+0x2a ([guest.kernel.kallsyms])             popq  %rbp
+           :1440  1440  ffffffffbb66664b lapic_next_deadline+0x2b ([guest.kernel.kallsyms])             retq     IPC: 0.16 (4/25)
+           :1440  1440  ffffffffbb74607f clockevents_program_event+0x8f ([guest.kernel.kallsyms])               test %eax, %eax
+           :1440  1440  ffffffffbb746081 clockevents_program_event+0x91 ([guest.kernel.kallsyms])               jz 0xffffffffbb74603c    IPC: 0.06 (2/30)
+           :1440  1440  ffffffffbb74603c clockevents_program_event+0x4c ([guest.kernel.kallsyms])               popq  %rbx
+           :1440  1440  ffffffffbb74603d clockevents_program_event+0x4d ([guest.kernel.kallsyms])               popq  %r12
+
+
+
 SEE ALSO
 --------
 
-- 
2.17.1



* Re: [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels
  2021-02-18  9:57 [PATCH 00/11] perf intel-pt: Add limited support for tracing guest kernels Adrian Hunter
                   ` (10 preceding siblings ...)
  2021-02-18  9:58 ` [PATCH 11/11] perf intel-pt: Add documentation for tracing virtual machines Adrian Hunter
@ 2021-02-18 18:24 ` Andi Kleen
  11 siblings, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2021-02-18 18:24 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Alexander Shishkin, linux-kernel

On Thu, Feb 18, 2021 at 11:57:50AM +0200, Adrian Hunter wrote:
> Hi
> 
> Currently, only kernel tracing is supported and only with "timeless" decoding
> i.e. no TSC timestamps

Patches look good to me. That will be quite useful.

Acked-by: Andi Kleen <ak@linux.intel.com>

Thanks,
-Andi


