All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 00/10] Implement support for external IPT monitoring
@ 2021-01-21 21:27 Andrew Cooper
  2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
                   ` (9 more replies)
  0 siblings, 10 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Ian Jackson, Anthony PERARD, Jun Nakajima, Kevin Tian,
	Michał Leszczyński, Tamas K Lengyel

This is the next iteration of the external IPT monitoring series, reworked
massively from v6 to fix a whole slew of bugs with the XENMEM_acquire_resource
interface.  It also includes some additional bugfixes to make traced VM's work
when forked.

A branch with all prerequisites is available at:
  https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/xen-ipt

This version has undergone basic testing from Michał, Tamas and myself, and
seems to work for the intended usecases.

Andrew Cooper (1):
  xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace

Michał Leszczyński (7):
  xen/domain: Add vmtrace_frames domain creation parameter
  tools/[lib]xl: Add vmtrace_buf_size parameter
  xen/memory: Add a vmtrace_buf resource type
  x86/vmx: Add Intel Processor Trace support
  xen/domctl: Add XEN_DOMCTL_vmtrace_op
  tools/libxc: Add xc_vmtrace_* functions
  tools/misc: Add xen-vmtrace tool

Tamas K Lengyel (2):
  xen/vmtrace: support for VM forks
  x86/vm_event: Carry Processor Trace buffer offset in vm_event

 docs/man/xl.cfg.5.pod.in                    |   9 ++
 tools/golang/xenlight/helpers.gen.go        |   4 +
 tools/golang/xenlight/types.gen.go          |   2 +
 tools/include/libxl.h                       |  14 ++
 tools/include/xenctrl.h                     |  73 +++++++++++
 tools/libs/ctrl/Makefile                    |   1 +
 tools/libs/ctrl/xc_vmtrace.c                | 128 ++++++++++++++++++
 tools/libs/light/libxl.c                    |   2 +
 tools/libs/light/libxl_cpuid.c              |   1 +
 tools/libs/light/libxl_create.c             |   2 +
 tools/libs/light/libxl_types.idl            |   5 +
 tools/misc/.gitignore                       |   1 +
 tools/misc/Makefile                         |   4 +
 tools/misc/xen-cpuid.c                      |   2 +-
 tools/misc/xen-vmtrace.c                    | 154 ++++++++++++++++++++++
 tools/xl/xl_info.c                          |   5 +-
 tools/xl/xl_parse.c                         |   4 +
 xen/arch/x86/domain.c                       |  23 ++++
 xen/arch/x86/domctl.c                       |  55 ++++++++
 xen/arch/x86/hvm/vmx/vmcs.c                 |  15 ++-
 xen/arch/x86/hvm/vmx/vmx.c                  | 196 +++++++++++++++++++++++++++-
 xen/arch/x86/mm/mem_sharing.c               |   3 +
 xen/arch/x86/vm_event.c                     |   3 +
 xen/common/domain.c                         |  58 ++++++++
 xen/common/memory.c                         |  27 ++++
 xen/common/sysctl.c                         |   2 +
 xen/include/asm-x86/cpufeature.h            |   1 +
 xen/include/asm-x86/hvm/hvm.h               |  72 ++++++++++
 xen/include/asm-x86/hvm/vmx/vmcs.h          |   4 +
 xen/include/asm-x86/msr.h                   |  32 +++++
 xen/include/public/arch-x86/cpufeatureset.h |   1 +
 xen/include/public/domctl.h                 |  36 +++++
 xen/include/public/memory.h                 |   1 +
 xen/include/public/sysctl.h                 |   1 +
 xen/include/public/vm_event.h               |   6 +
 xen/include/xen/domain.h                    |   2 +
 xen/include/xen/sched.h                     |   7 +
 xen/xsm/flask/hooks.c                       |   1 +
 38 files changed, 952 insertions(+), 5 deletions(-)
 create mode 100644 tools/libs/ctrl/xc_vmtrace.c
 create mode 100644 tools/misc/xen-vmtrace.c

-- 
2.11.0



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-22 15:28   ` Ian Jackson
  2021-01-26  8:58   ` Julien Grall
  2021-01-21 21:27 ` [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter Andrew Cooper
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Ian Jackson, Michał Leszczyński,
	Tamas K Lengyel

We're about to introduce support for Intel Processor Trace, but similar
functionality exists in other platforms.

Aspects of vmtrace can reasonably can be common, so start with
XEN_SYSCTL_PHYSCAP_vmtrace and plumb the signal from Xen all the way down into
`xl info`.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * New
---
 tools/golang/xenlight/helpers.gen.go | 2 ++
 tools/golang/xenlight/types.gen.go   | 1 +
 tools/include/libxl.h                | 7 +++++++
 tools/libs/light/libxl.c             | 2 ++
 tools/libs/light/libxl_types.idl     | 1 +
 tools/xl/xl_info.c                   | 5 +++--
 xen/common/domain.c                  | 2 ++
 xen/common/sysctl.c                  | 2 ++
 xen/include/public/sysctl.h          | 1 +
 xen/include/xen/domain.h             | 2 ++
 10 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index c8605994e7..62fb98ba30 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -3345,6 +3345,7 @@ x.CapHvmDirectio = bool(xc.cap_hvm_directio)
 x.CapHap = bool(xc.cap_hap)
 x.CapShadow = bool(xc.cap_shadow)
 x.CapIommuHapPtShare = bool(xc.cap_iommu_hap_pt_share)
+x.CapVmtrace = bool(xc.cap_vmtrace)
 
  return nil}
 
@@ -3375,6 +3376,7 @@ xc.cap_hvm_directio = C.bool(x.CapHvmDirectio)
 xc.cap_hap = C.bool(x.CapHap)
 xc.cap_shadow = C.bool(x.CapShadow)
 xc.cap_iommu_hap_pt_share = C.bool(x.CapIommuHapPtShare)
+xc.cap_vmtrace = C.bool(x.CapVmtrace)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index b4c5df0f2c..369da6dd1e 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -998,6 +998,7 @@ CapHvmDirectio bool
 CapHap bool
 CapShadow bool
 CapIommuHapPtShare bool
+CapVmtrace bool
 }
 
 type Connectorinfo struct {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 3433c950f9..c4d920f1e5 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -464,6 +464,13 @@
 #define LIBXL_HAVE_DEVICE_PCI_ASSIGNABLE_LIST_FREE 1
 
 /*
+ * LIBXL_HAVE_PHYSINFO_CAP_VMTRACE indicates that libxl_physinfo has a
+ * cap_vmtrace field, which indicates the availability of platform tracing
+ * functionality.
+ */
+#define LIBXL_HAVE_PHYSINFO_CAP_VMTRACE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl.c b/tools/libs/light/libxl.c
index d2a87157a2..204eb0be2d 100644
--- a/tools/libs/light/libxl.c
+++ b/tools/libs/light/libxl.c
@@ -402,6 +402,8 @@ int libxl_get_physinfo(libxl_ctx *ctx, libxl_physinfo *physinfo)
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_shadow);
     physinfo->cap_iommu_hap_pt_share =
         !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share);
+    physinfo->cap_vmtrace =
+        !!(xcphysinfo.capabilities & XEN_SYSCTL_PHYSCAP_vmtrace);
 
     GC_FREE;
     return 0;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 05324736b7..b43d5f1265 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -1050,6 +1050,7 @@ libxl_physinfo = Struct("physinfo", [
     ("cap_hap", bool),
     ("cap_shadow", bool),
     ("cap_iommu_hap_pt_share", bool),
+    ("cap_vmtrace", bool),
     ], dir=DIR_OUT)
 
 libxl_connectorinfo = Struct("connectorinfo", [
diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
index ca417df8e8..8383e4a6df 100644
--- a/tools/xl/xl_info.c
+++ b/tools/xl/xl_info.c
@@ -210,14 +210,15 @@ static void output_physinfo(void)
          info.hw_cap[4], info.hw_cap[5], info.hw_cap[6], info.hw_cap[7]
         );
 
-    maybe_printf("virt_caps              :%s%s%s%s%s%s%s\n",
+    maybe_printf("virt_caps              :%s%s%s%s%s%s%s%s\n",
          info.cap_pv ? " pv" : "",
          info.cap_hvm ? " hvm" : "",
          info.cap_hvm && info.cap_hvm_directio ? " hvm_directio" : "",
          info.cap_pv && info.cap_hvm_directio ? " pv_directio" : "",
          info.cap_hap ? " hap" : "",
          info.cap_shadow ? " shadow" : "",
-         info.cap_iommu_hap_pt_share ? " iommu_hap_pt_share" : ""
+         info.cap_iommu_hap_pt_share ? " iommu_hap_pt_share" : "",
+         info.cap_vmtrace ? " vmtrace" : ""
         );
 
     vinfo = libxl_get_version_info(ctx);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 2b461655c3..d1e94d88cf 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -82,6 +82,8 @@ struct vcpu *idle_vcpu[NR_CPUS] __read_mostly;
 
 vcpu_info_t dummy_vcpu_info;
 
+bool __read_mostly vmtrace_available;
+
 static void __domain_finalise_shutdown(struct domain *d)
 {
     struct vcpu *v;
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index ec916424e5..3558641cd9 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -277,6 +277,8 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
             if ( iommu_hap_pt_share )
                 pi->capabilities |= XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share;
         }
+        if ( vmtrace_available )
+            pi->capabilities |= XEN_SYSCTL_PHYSCAP_vmtrace;
 
         if ( copy_to_guest(u_sysctl, op, 1) )
             ret = -EFAULT;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index a073647117..d4453d2eab 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -100,6 +100,7 @@ struct xen_sysctl_tbuf_op {
 #define _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share 5
 #define XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share  \
     (1u << _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share)
+#define XEN_SYSCTL_PHYSCAP_vmtrace       (1 << 6)
 
 /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
 #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index cde0d9c7fe..1708c36964 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -131,4 +131,6 @@ void vnuma_destroy(struct vnuma_info *vnuma);
 static inline void vnuma_destroy(struct vnuma_info *vnuma) { ASSERT(!vnuma); }
 #endif
 
+extern bool vmtrace_available;
+
 #endif /* __XEN_DOMAIN_H__ */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
  2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-25 15:08   ` Jan Beulich
  2021-01-21 21:27 ` [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter Andrew Cooper
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Jan Beulich,
	Roger Pau Monné,
	Wei Liu, Anthony PERARD, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

To use vmtrace, buffers of a suitable size need allocating, and different
tasks will want different sizes.

Add a domain creation parameter, and audit it appropriately in the
{arch_,}sanitise_domain_config() functions.

For now, the x86 specific auditing is tuned to Processor Trace running in
Single Output mode, which requires a single contiguous range of memory.

The size is given an arbitrary limit of 64M which is expected to be enough for
anticipated usecases, but not large enough to get into long-running-hypercall
problems.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

When support for later generations of IPT get added, we can in principle start
to use ToTP which is a scatter list of smaller trace regions to use, if we
need to massively up the buffer size available.

v7:
 * Major chop&change within the series.
 * Use the name 'vmtrace' consistently.
 * Use the (new) common vcpu_teardown() functionality, rather than leaving a
   latent memory leak on ARM.
---
 xen/arch/x86/domain.c       | 23 +++++++++++++++++++
 xen/common/domain.c         | 56 +++++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h |  1 +
 xen/include/xen/sched.h     |  7 ++++++
 4 files changed, 87 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index b9ba04633e..3f12d68e9e 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -660,6 +660,29 @@ int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
         return -EINVAL;
     }
 
+    if ( config->vmtrace_frames )
+    {
+        unsigned int frames = config->vmtrace_frames;
+
+        ASSERT(vmtrace_available); /* Checked by common code. */
+
+        /*
+         * For now, vmtrace is restricted to HVM guests, and using a
+         * power-of-2 buffer up to 64M in size.
+         */
+        if ( !hvm )
+        {
+            dprintk(XENLOG_INFO, "vmtrace not supported for PV\n");
+            return -EINVAL;
+        }
+
+        if ( frames > (MB(64) >> PAGE_SHIFT) || (frames & (frames - 1)) )
+        {
+            dprintk(XENLOG_INFO, "Unsupported vmtrace frames: %u\n", frames);
+            return -EINVAL;
+        }
+    }
+
     return 0;
 }
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index d1e94d88cf..a844bc7b78 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -132,6 +132,48 @@ static void vcpu_info_reset(struct vcpu *v)
     v->vcpu_info_mfn = INVALID_MFN;
 }
 
+static void vmtrace_free_buffer(struct vcpu *v)
+{
+    const struct domain *d = v->domain;
+    struct page_info *pg = v->vmtrace.buf;
+    unsigned int i;
+
+    if ( !pg )
+        return;
+
+    for ( i = 0; i < d->vmtrace_frames; i++ )
+    {
+        put_page_alloc_ref(&pg[i]);
+        put_page_and_type(&pg[i]);
+    }
+
+    v->vmtrace.buf = NULL;
+}
+
+static int vmtrace_alloc_buffer(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct page_info *pg;
+    unsigned int i;
+
+    if ( !d->vmtrace_frames )
+        return 0;
+
+    pg = alloc_domheap_pages(d, get_order_from_pages(d->vmtrace_frames),
+                             MEMF_no_refcount);
+    if ( !pg )
+        return -ENOMEM;
+
+    v->vmtrace.buf = pg;
+
+    for ( i = 0; i < d->vmtrace_frames; i++ )
+        /* Domain can't know about this page yet - something fishy going on. */
+        if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
+            BUG();
+
+    return 0;
+}
+
 /*
  * Release resources held by a vcpu.  There may or may not be live references
  * to the vcpu, and it may or may not be fully constructed.
@@ -140,6 +182,8 @@ static void vcpu_info_reset(struct vcpu *v)
  */
 static int vcpu_teardown(struct vcpu *v)
 {
+    vmtrace_free_buffer(v);
+
     return 0;
 }
 
@@ -201,6 +245,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
     if ( sched_init_vcpu(v) != 0 )
         goto fail_wq;
 
+    if ( vmtrace_alloc_buffer(v) != 0 )
+        goto fail_wq;
+
     if ( arch_vcpu_create(v) != 0 )
         goto fail_sched;
 
@@ -449,6 +496,12 @@ static int sanitise_domain_config(struct xen_domctl_createdomain *config)
         }
     }
 
+    if ( config->vmtrace_frames && !vmtrace_available )
+    {
+        dprintk(XENLOG_INFO, "vmtrace requested but not available\n");
+        return -EINVAL;
+    }
+
     return arch_sanitise_domain_config(config);
 }
 
@@ -474,7 +527,10 @@ struct domain *domain_create(domid_t domid,
     ASSERT(is_system_domain(d) ? config == NULL : config != NULL);
 
     if ( config )
+    {
         d->options = config->flags;
+        d->vmtrace_frames = config->vmtrace_frames;
+    }
 
     /* Sort out our idea of is_control_domain(). */
     d->is_privileged = is_priv;
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 666aeb71bf..1585678d50 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -94,6 +94,7 @@ struct xen_domctl_createdomain {
     uint32_t max_evtchn_port;
     int32_t max_grant_frames;
     int32_t max_maptrack_frames;
+    uint32_t vmtrace_frames;
 
     struct xen_arch_domainconfig arch;
 };
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index da19f4e9f6..03905f6246 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -257,6 +257,10 @@ struct vcpu
     /* vPCI per-vCPU area, used to store data for long running operations. */
     struct vpci_vcpu vpci;
 
+    struct {
+        struct page_info *buf;
+    } vmtrace;
+
     struct arch_vcpu arch;
 };
 
@@ -470,6 +474,9 @@ struct domain
     unsigned    pbuf_idx;
     spinlock_t  pbuf_lock;
 
+    /* Used by vmtrace features */
+    uint32_t    vmtrace_frames;
+
     /* OProfile support. */
     struct xenoprof *xenoprof;
 
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
  2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
  2021-01-21 21:27 ` [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-22 15:29   ` Ian Jackson
  2021-01-21 21:27 ` [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type Andrew Cooper
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Ian Jackson,
	Wei Liu, Anthony PERARD, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Allow to specify the size of per-vCPU trace buffer upon
domain creation. This is zero by default (meaning: not enabled).

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Anthony PERARD <anthony.perard@citrix.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * Use the name 'vmtrace' consistently.
---
 docs/man/xl.cfg.5.pod.in             | 9 +++++++++
 tools/golang/xenlight/helpers.gen.go | 2 ++
 tools/golang/xenlight/types.gen.go   | 1 +
 tools/include/libxl.h                | 7 +++++++
 tools/libs/light/libxl_create.c      | 2 ++
 tools/libs/light/libxl_types.idl     | 4 ++++
 tools/xl/xl_parse.c                  | 4 ++++
 7 files changed, 29 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index c8e017f950..86963298a3 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -681,6 +681,15 @@ Windows).
 
 If this option is not specified then it will default to B<false>.
 
+=item B<vmtrace_buf_kb=KBYTES>
+
+Specifies the size of vmtrace buffer that would be allocated for each
+vCPU belonging to this domain.  Disabled (i.e.  B<vmtrace_buf_kb=0>) by
+default.
+
+B<NOTE>: Acceptable values are platform specific.  For Intel Processor
+Trace, this value must be a power of 2 between 4k and 16M.
+
 =back
 
 =head2 Devices
diff --git a/tools/golang/xenlight/helpers.gen.go b/tools/golang/xenlight/helpers.gen.go
index 62fb98ba30..1e22a407bf 100644
--- a/tools/golang/xenlight/helpers.gen.go
+++ b/tools/golang/xenlight/helpers.gen.go
@@ -1114,6 +1114,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
 x.ArchArm.GicVersion = GicVersion(xc.arch_arm.gic_version)
 x.ArchArm.Vuart = VuartType(xc.arch_arm.vuart)
 x.Altp2M = Altp2MMode(xc.altp2m)
+x.VmtraceBufKb = int(xc.vmtrace_buf_kb)
 
  return nil}
 
@@ -1589,6 +1590,7 @@ return fmt.Errorf("invalid union key '%v'", x.Type)}
 xc.arch_arm.gic_version = C.libxl_gic_version(x.ArchArm.GicVersion)
 xc.arch_arm.vuart = C.libxl_vuart_type(x.ArchArm.Vuart)
 xc.altp2m = C.libxl_altp2m_mode(x.Altp2M)
+xc.vmtrace_buf_kb = C.int(x.VmtraceBufKb)
 
  return nil
  }
diff --git a/tools/golang/xenlight/types.gen.go b/tools/golang/xenlight/types.gen.go
index 369da6dd1e..130524dce9 100644
--- a/tools/golang/xenlight/types.gen.go
+++ b/tools/golang/xenlight/types.gen.go
@@ -513,6 +513,7 @@ GicVersion GicVersion
 Vuart VuartType
 }
 Altp2M Altp2MMode
+VmtraceBufKb int
 }
 
 type domainBuildInfoTypeUnion interface {
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index c4d920f1e5..7606a21219 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -471,6 +471,13 @@
 #define LIBXL_HAVE_PHYSINFO_CAP_VMTRACE 1
 
 /*
+ * LIBXL_HAVE_VMTRACE_BUF_KB indicates that libxl_domain_create_info has a
+ * vmtrace_buf_kb parameter, which allows to enable pre-allocation of
+ * processor tracing buffers of given size.
+ */
+#define LIBXL_HAVE_VMTRACE_BUF_KB 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 86f4a8369d..d4a9380357 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -607,6 +607,8 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
             .max_evtchn_port = b_info->event_channels,
             .max_grant_frames = b_info->max_grant_frames,
             .max_maptrack_frames = b_info->max_maptrack_frames,
+            .vmtrace_frames = DIV_ROUNDUP(b_info->vmtrace_buf_kb,
+                                          XC_PAGE_SIZE >> 10),
         };
 
         if (info->type != LIBXL_DOMAIN_TYPE_PV) {
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index b43d5f1265..c092c31b5e 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -646,6 +646,10 @@ libxl_domain_build_info = Struct("domain_build_info",[
     # supported by x86 HVM and ARM support is planned.
     ("altp2m", libxl_altp2m_mode),
 
+    # Size of preallocated vmtrace trace buffers (in KBYTES).
+    # Use zero value to disable this feature.
+    ("vmtrace_buf_kb", integer),
+
     ], dir=DIR_IN,
        copy_deprecated_fn="libxl__domain_build_info_copy_deprecated",
 )
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 4ebf39620a..ca99af8d1b 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1863,6 +1863,10 @@ void parse_config_data(const char *config_source,
         }
     }
 
+    if (!xlu_cfg_get_long(config, "vmtrace_buf_kb", &l, 1) && l) {
+        b_info->vmtrace_buf_kb = l;
+    }
+
     if (!xlu_cfg_get_list(config, "ioports", &ioports, &num_ioports, 0)) {
         b_info->num_ioports = num_ioports;
         b_info->ioports = calloc(num_ioports, sizeof(*b_info->ioports));
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (2 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-25 16:31   ` Jan Beulich
  2021-01-21 21:27 ` [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support Andrew Cooper
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Jan Beulich,
	Roger Pau Monné,
	Wei Liu, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Allow to map processor trace buffer using acquire_resource().

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * Rebase over changes elsewhere in the series
---
 xen/common/memory.c         | 27 +++++++++++++++++++++++++++
 xen/include/public/memory.h |  1 +
 2 files changed, 28 insertions(+)

diff --git a/xen/common/memory.c b/xen/common/memory.c
index c89923d909..ec6a55172a 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1068,11 +1068,35 @@ static unsigned int resource_max_frames(const struct domain *d,
     case XENMEM_resource_grant_table:
         return gnttab_resource_max_frames(d, id);
 
+    case XENMEM_resource_vmtrace_buf:
+        return d->vmtrace_frames;
+
     default:
         return arch_resource_max_frames(d, type, id);
     }
 }
 
+static int acquire_vmtrace_buf(
+    struct domain *d, unsigned int id, unsigned long frame,
+    unsigned int nr_frames, xen_pfn_t mfn_list[])
+{
+    const struct vcpu *v = domain_vcpu(d, id);
+    unsigned int i;
+    mfn_t mfn;
+
+    if ( !v || !v->vmtrace.buf ||
+         nr_frames > d->vmtrace_frames ||
+         (frame + nr_frames) > d->vmtrace_frames )
+        return -EINVAL;
+
+    mfn = page_to_mfn(v->vmtrace.buf);
+
+    for ( i = 0; i < nr_frames; i++ )
+        mfn_list[i] = mfn_x(mfn) + frame + i;
+
+    return nr_frames;
+}
+
 /*
  * Returns -errno on error, or positive in the range [1, nr_frames] on
  * success.  Returning less than nr_frames contitutes a request for a
@@ -1087,6 +1111,9 @@ static int _acquire_resource(
     case XENMEM_resource_grant_table:
         return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
 
+    case XENMEM_resource_vmtrace_buf:
+        return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
+
     default:
         return arch_acquire_resource(d, type, id, frame, nr_frames, mfn_list);
     }
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index c4c47a0b38..b0378bb14b 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -625,6 +625,7 @@ struct xen_mem_acquire_resource {
 
 #define XENMEM_resource_ioreq_server 0
 #define XENMEM_resource_grant_table 1
+#define XENMEM_resource_vmtrace_buf 2
 
     /*
      * IN - a type-specific resource identifier, which must be zero
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (3 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-26 13:35   ` Jan Beulich
  2021-01-21 21:27 ` [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op Andrew Cooper
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Jan Beulich,
	Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Add CPUID/MSR enumeration details for Processor Trace.  For now, we will only
support its use inside VMX operation.  Fill in the vmtrace_available boolean
to activate the newly introduced common infrastructure for allocating trace
buffers.

For now, Processor Trace is going to be operated in Single Output mode behind
the guests back.  Add the MSRs to struct vcpu_msrs, and set up the buffer
limit in vmx_init_pt() as it is fixed for the lifetime of the domain.

Context switch the most of the MSRs in and out of vCPU context switch, but the
main control register needs to reside in the MSR load/save lists.  Explicitly
pull the msrs pointer out into a local variable, because the optimiser cannot
keep it live across the memory clobbers in the MSR accesses.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * Major chop&change within the series.
 * Move MSRs to vcpu_msrs, which is where they'll definitely want to live when
   we offer PT to VMs for their own use.
---
 tools/libs/light/libxl_cpuid.c              |  1 +
 tools/misc/xen-cpuid.c                      |  2 +-
 xen/arch/x86/hvm/vmx/vmcs.c                 | 15 ++++++++++++-
 xen/arch/x86/hvm/vmx/vmx.c                  | 34 ++++++++++++++++++++++++++++-
 xen/include/asm-x86/cpufeature.h            |  1 +
 xen/include/asm-x86/hvm/vmx/vmcs.h          |  4 ++++
 xen/include/asm-x86/msr.h                   | 32 +++++++++++++++++++++++++++
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 8 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c
index 259612834e..289c59c742 100644
--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -188,6 +188,7 @@ int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str)
         {"avx512-ifma",  0x00000007,  0, CPUID_REG_EBX, 21,  1},
         {"clflushopt",   0x00000007,  0, CPUID_REG_EBX, 23,  1},
         {"clwb",         0x00000007,  0, CPUID_REG_EBX, 24,  1},
+        {"proc-trace",   0x00000007,  0, CPUID_REG_EBX, 25,  1},
         {"avx512pf",     0x00000007,  0, CPUID_REG_EBX, 26,  1},
         {"avx512er",     0x00000007,  0, CPUID_REG_EBX, 27,  1},
         {"avx512cd",     0x00000007,  0, CPUID_REG_EBX, 28,  1},
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index c81aa93055..2d04162d8d 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -106,7 +106,7 @@ static const char *const str_7b0[32] =
     [18] = "rdseed",   [19] = "adx",
     [20] = "smap",     [21] = "avx512-ifma",
     [22] = "pcommit",  [23] = "clflushopt",
-    [24] = "clwb",     [25] = "pt",
+    [24] = "clwb",     [25] = "proc-trace",
     [26] = "avx512pf", [27] = "avx512er",
     [28] = "avx512cd", [29] = "sha",
     [30] = "avx512bw", [31] = "avx512vl",
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 164535f8f0..5576caad8e 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -291,6 +291,20 @@ static int vmx_init_vmcs_config(void)
         _vmx_cpu_based_exec_control &=
             ~(CPU_BASED_CR8_LOAD_EXITING | CPU_BASED_CR8_STORE_EXITING);
 
+    rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
+
+    /* Check whether IPT is supported in VMX operation. */
+    if ( !smp_processor_id() )
+        vmtrace_available = cpu_has_proc_trace &&
+                            (_vmx_misc_cap & VMX_MISC_PROC_TRACE);
+    else if ( vmtrace_available &&
+              !(_vmx_misc_cap & VMX_MISC_PROC_TRACE) )
+    {
+        printk("VMX: IPT capabilities differ between CPU%u and CPU0\n",
+               smp_processor_id());
+        return -EINVAL;
+    }
+
     if ( _vmx_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS )
     {
         min = 0;
@@ -305,7 +319,6 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS |
                SECONDARY_EXEC_XSAVES |
                SECONDARY_EXEC_TSC_SCALING);
-        rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
         if ( opt_vpid_enabled )
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4120234c15..93121fbf27 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -428,6 +428,20 @@ static void vmx_domain_relinquish_resources(struct domain *d)
     vmx_free_vlapic_mapping(d);
 }
 
+static void vmx_init_pt(struct vcpu *v)
+{
+    unsigned int frames = v->domain->vmtrace_frames;
+
+    if ( !frames )
+        return;
+
+    /* Checked by domain creation logic. */
+    ASSERT(v->vmtrace.buf);
+    ASSERT(frames <= (GB(4) >> PAGE_SHIFT) && (frames & (frames - 1)) == 0);
+
+    v->arch.msrs->rtit.output_limit = (frames << PAGE_SHIFT) - 1;
+}
+
 static int vmx_vcpu_initialise(struct vcpu *v)
 {
     int rc;
@@ -470,6 +484,7 @@ static int vmx_vcpu_initialise(struct vcpu *v)
     }
 
     vmx_install_vlapic_mapping(v);
+    vmx_init_pt(v);
 
     return 0;
 }
@@ -508,22 +523,39 @@ static void vmx_restore_host_msrs(void)
 
 static void vmx_save_guest_msrs(struct vcpu *v)
 {
+    struct vcpu_msrs *msrs = v->arch.msrs;
+
     /*
      * We cannot cache SHADOW_GS_BASE while the VCPU runs, as it can
      * be updated at any time via SWAPGS, which we cannot trap.
      */
     v->arch.hvm.vmx.shadow_gs = read_gs_shadow();
+
+    if ( v->arch.hvm.vmx.ipt_active )
+    {
+        rdmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
+        rdmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
+    }
 }
 
 static void vmx_restore_guest_msrs(struct vcpu *v)
 {
+    const struct vcpu_msrs *msrs = v->arch.msrs;
+
     write_gs_shadow(v->arch.hvm.vmx.shadow_gs);
     wrmsrl(MSR_STAR,           v->arch.hvm.vmx.star);
     wrmsrl(MSR_LSTAR,          v->arch.hvm.vmx.lstar);
     wrmsrl(MSR_SYSCALL_MASK,   v->arch.hvm.vmx.sfmask);
 
     if ( cpu_has_msr_tsc_aux )
-        wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
+        wrmsr_tsc_aux(msrs->tsc_aux);
+
+    if ( v->arch.hvm.vmx.ipt_active )
+    {
+        wrmsrl(MSR_RTIT_OUTPUT_BASE, page_to_maddr(v->vmtrace.buf));
+        wrmsrl(MSR_RTIT_OUTPUT_MASK, msrs->rtit.output_mask);
+        wrmsrl(MSR_RTIT_STATUS, msrs->rtit.status);
+    }
 }
 
 void vmx_update_cpu_exec_control(struct vcpu *v)
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index ad3d84bdde..8ad658be1b 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -104,6 +104,7 @@
 #define cpu_has_clwb            boot_cpu_has(X86_FEATURE_CLWB)
 #define cpu_has_avx512er        boot_cpu_has(X86_FEATURE_AVX512ER)
 #define cpu_has_avx512cd        boot_cpu_has(X86_FEATURE_AVX512CD)
+#define cpu_has_proc_trace      boot_cpu_has(X86_FEATURE_PROC_TRACE)
 #define cpu_has_sha             boot_cpu_has(X86_FEATURE_SHA)
 #define cpu_has_avx512bw        boot_cpu_has(X86_FEATURE_AVX512BW)
 #define cpu_has_avx512vl        boot_cpu_has(X86_FEATURE_AVX512VL)
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 906810592f..8073af323b 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -156,6 +156,9 @@ struct vmx_vcpu {
     /* Do we need to tolerate a spurious EPT_MISCONFIG VM exit? */
     bool_t               ept_spurious_misconfig;
 
+    /* Processor Trace configured and enabled for the vcpu. */
+    bool                 ipt_active;
+
     /* Is the guest in real mode? */
     uint8_t              vmx_realmode;
     /* Are we emulating rather than VMENTERing? */
@@ -283,6 +286,7 @@ extern u32 vmx_secondary_exec_control;
 #define VMX_VPID_INVVPID_SINGLE_CONTEXT_RETAINING_GLOBAL 0x80000000000ULL
 extern u64 vmx_ept_vpid_cap;
 
+#define VMX_MISC_PROC_TRACE                     0x00004000
 #define VMX_MISC_CR3_TARGET                     0x01ff0000
 #define VMX_MISC_VMWRITE_ALL                    0x20000000
 
diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h
index 16f95e7344..1d3eca9063 100644
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -306,6 +306,38 @@ struct vcpu_msrs
         };
     } misc_features_enables;
 
+    /*
+     * 0x00000560 ... 57x - MSR_RTIT_*
+     *
+     * "Real Time Instruction Trace", now called Processor Trace.
+     *
+     * These MSRs are not exposed to guests.  They are controlled by Xen
+     * behind the scenes, when vmtrace is enabled for the domain.
+     *
+     * MSR_RTIT_OUTPUT_BASE not stored here.  It is fixed per vcpu, and
+     * derived from v->vmtrace.buf.
+     */
+    struct {
+        /*
+         * Placed in the MSR load/save lists.  Only modified by hypercall in
+         * the common case.
+         */
+        uint64_t ctl;
+
+        /*
+         * Updated by hardware in non-root mode.  Synchronised here on vcpu
+         * context switch.
+         */
+        uint64_t status;
+        union {
+            uint64_t output_mask;
+            struct {
+                uint32_t output_limit;
+                uint32_t output_offset;
+            };
+        };
+    } rtit;
+
     /* 0x00000da0 - MSR_IA32_XSS */
     struct {
         uint64_t raw;
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 6f7efaad6d..a501479820 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -217,6 +217,7 @@ XEN_CPUFEATURE(SMAP,          5*32+20) /*S  Supervisor Mode Access Prevention */
 XEN_CPUFEATURE(AVX512_IFMA,   5*32+21) /*A  AVX-512 Integer Fused Multiply Add */
 XEN_CPUFEATURE(CLFLUSHOPT,    5*32+23) /*A  CLFLUSHOPT instruction */
 XEN_CPUFEATURE(CLWB,          5*32+24) /*A  CLWB instruction */
+XEN_CPUFEATURE(PROC_TRACE,    5*32+25) /*   Processor Trace */
 XEN_CPUFEATURE(AVX512PF,      5*32+26) /*A  AVX-512 Prefetch Instructions */
 XEN_CPUFEATURE(AVX512ER,      5*32+27) /*A  AVX-512 Exponent & Reciprocal Instrs */
 XEN_CPUFEATURE(AVX512CD,      5*32+28) /*A  AVX-512 Conflict Detection Instrs */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (4 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-26 14:18   ` Jan Beulich
  2021-01-21 21:27 ` [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions Andrew Cooper
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Jan Beulich,
	Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Implement an interface to configure and control tracing operations.  Reuse the
existing SETDEBUGGING flask vector rather than inventing a new one.

Userspace using this interface is going to need platform specific knowledge
anyway to interpret the contents of the trace buffer.  While some operations
(e.g. enable/disable) can reasonably be generic, others cannot.  Provide an
explicitly-platform specific pair of get/set operations to reduce API churn as
new options get added/enabled.

For the VMX specific Processor Trace implementation, tolerate reading and
modifying a safe subset of bits in CTL, STATUS and OUTPUT_MASK.  This permits
userspace to control the content which gets logged, but prevents modification
of details such as the position/size of the output buffer.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * Major chop&change within the series.
---
 xen/arch/x86/domctl.c         |  55 +++++++++++++++
 xen/arch/x86/hvm/vmx/vmx.c    | 151 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h |  63 ++++++++++++++++++
 xen/include/public/domctl.h   |  35 ++++++++++
 xen/xsm/flask/hooks.c         |   1 +
 5 files changed, 305 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index b28cfe9817..aa6dfe8eed 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -155,6 +155,55 @@ void arch_get_domain_info(const struct domain *d,
     info->arch_config.emulation_flags = d->arch.emulation_flags;
 }
 
+static int do_vmtrace_op(struct domain *d, struct xen_domctl_vmtrace_op *op,
+                         XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    struct vcpu *v;
+    int rc;
+
+    if ( !d->vmtrace_frames || d == current->domain /* No vcpu_pause() */ )
+        return -EINVAL;
+
+    ASSERT(is_hvm_domain(d)); /* Restricted by domain creation logic. */
+
+    v = domain_vcpu(d, op->vcpu);
+    if ( !v )
+        return -ENOENT;
+
+    vcpu_pause(v);
+    switch ( op->cmd )
+    {
+    case XEN_DOMCTL_vmtrace_enable:
+    case XEN_DOMCTL_vmtrace_disable:
+    case XEN_DOMCTL_vmtrace_reset_and_enable:
+        rc = hvm_vmtrace_control(
+            v, op->cmd != XEN_DOMCTL_vmtrace_disable,
+            op->cmd == XEN_DOMCTL_vmtrace_reset_and_enable);
+        break;
+
+    case XEN_DOMCTL_vmtrace_output_position:
+        rc = hvm_vmtrace_output_position(v, &op->value);
+        if ( rc >= 0 )
+            rc = 0;
+        break;
+
+    case XEN_DOMCTL_vmtrace_get_option:
+        rc = hvm_vmtrace_get_option(v, op->key, &op->value);
+        break;
+
+    case XEN_DOMCTL_vmtrace_set_option:
+        rc = hvm_vmtrace_set_option(v, op->key, op->value);
+        break;
+
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+    vcpu_unpause(v);
+
+    return rc;
+}
+
 #define MAX_IOPORTS 0x10000
 
 long arch_do_domctl(
@@ -1320,6 +1369,12 @@ long arch_do_domctl(
         domain_unpause(d);
         break;
 
+    case XEN_DOMCTL_vmtrace_op:
+        ret = do_vmtrace_op(d, &domctl->u.vmtrace_op, u_domctl);
+        if ( !ret )
+            copyback = true;
+        break;
+
     default:
         ret = iommu_do_domctl(domctl, d, u_domctl);
         break;
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 93121fbf27..d4e7b50b8a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2261,6 +2261,153 @@ static bool vmx_get_pending_event(struct vcpu *v, struct x86_event *info)
     return true;
 }
 
+static int vmtrace_get_option(struct vcpu *v, uint64_t key, uint64_t *output)
+{
+    const struct vcpu_msrs *msrs = v->arch.msrs;
+
+    /*
+     * We only let vmtrace agents see and modify a subset of bits in
+     * MSR_RTIT_CTL.  These all pertain to date emitted into the trace
+     * buffer(s).  Must not include controls pertaining to the
+     * structure/position of the trace buffer(s).
+     */
+#define RTIT_CTL_MASK                                                   \
+    (RTIT_CTL_TRACE_EN | RTIT_CTL_OS | RTIT_CTL_USR | RTIT_CTL_TSC_EN | \
+     RTIT_CTL_DIS_RETC | RTIT_CTL_BRANCH_EN)
+
+    /*
+     * Status bits restricted to the first-gen subset (i.e. no further CPUID
+     * requirements.)
+     */
+#define RTIT_STATUS_MASK \
+    (RTIT_STATUS_FILTER_EN | RTIT_STATUS_CONTEXT_EN | RTIT_STATUS_TRIGGER_EN | \
+     RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED)
+
+    switch ( key )
+    {
+    case MSR_RTIT_OUTPUT_MASK:
+        *output = msrs->rtit.output_mask;
+        break;
+
+    case MSR_RTIT_CTL:
+        *output = msrs->rtit.ctl & RTIT_CTL_MASK;
+        break;
+
+    case MSR_RTIT_STATUS:
+        *output = msrs->rtit.status & RTIT_STATUS_MASK;
+        break;
+
+    default:
+        *output = 0;
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int vmtrace_set_option(struct vcpu *v, uint64_t key, uint64_t value)
+{
+    struct vcpu_msrs *msrs = v->arch.msrs;
+    bool new_en, old_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
+
+    switch ( key )
+    {
+    case MSR_RTIT_OUTPUT_MASK:
+        /*
+         * MSR_RTIT_OUTPUT_MASK, when using Single Output mode, has a limit
+         * field in the lower 32 bits, and an offset in the upper 32 bits.
+         *
+         * Limit is fixed by the vmtrace buffer size and must not be
+         * controlled by userspace, while offset must be within the limit.
+         *
+         * Drop writes to the limit field to simply userspace wanting to reset
+         * the offset by writing 0.
+         */
+        if ( (value >> 32) > msrs->rtit.output_limit )
+            return -EINVAL;
+        msrs->rtit.output_offset = value >> 32;
+        break;
+
+    case MSR_RTIT_CTL:
+        if ( value & ~RTIT_CTL_MASK )
+            return -EINVAL;
+
+        msrs->rtit.ctl &= ~RTIT_CTL_MASK;
+        msrs->rtit.ctl |= (value & RTIT_CTL_MASK);
+        break;
+
+    case MSR_RTIT_STATUS:
+        if ( value & ~RTIT_STATUS_MASK )
+            return -EINVAL;
+
+        msrs->rtit.status &= ~RTIT_STATUS_MASK;
+        msrs->rtit.status |= (value & RTIT_STATUS_MASK);
+        break;
+
+    default:
+        return -EINVAL;
+    }
+
+    new_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
+
+    /* ctl.trace_en changed => update MSR load/save lists appropriately. */
+    if ( !old_en && new_en )
+    {
+        if ( vmx_add_guest_msr(v, MSR_RTIT_CTL, msrs->rtit.ctl) ||
+             vmx_add_host_load_msr(v, MSR_RTIT_CTL, 0) )
+        {
+            /*
+             * The only failure cases here are failing the
+             * singleton-per-domain memory allocation, or exceeding the space
+             * in the allocation.  We could unwind in principle, but there is
+             * nothing userspace can usefully do to continue using this VM.
+             */
+            domain_crash(v->domain);
+            return -ENXIO;
+        }
+    }
+    else if ( old_en && !new_en )
+    {
+        vmx_del_msr(v, MSR_RTIT_CTL, VMX_MSR_GUEST);
+        vmx_del_msr(v, MSR_RTIT_CTL, VMX_MSR_HOST);
+    }
+
+    return 0;
+}
+
+static int vmtrace_control(struct vcpu *v, bool enable, bool reset)
+{
+    struct vcpu_msrs *msrs = v->arch.msrs;
+    uint64_t new_ctl;
+    int rc;
+
+    if ( v->arch.hvm.vmx.ipt_active == enable )
+        return -EINVAL;
+
+    if ( reset )
+    {
+        msrs->rtit.status = 0;
+        msrs->rtit.output_offset = 0;
+    }
+
+    new_ctl = msrs->rtit.ctl & ~RTIT_CTL_TRACE_EN;
+    if ( enable )
+        new_ctl |= RTIT_CTL_TRACE_EN;
+
+    rc = vmtrace_set_option(v, MSR_RTIT_CTL, new_ctl);
+    if ( rc )
+        return rc;
+
+    v->arch.hvm.vmx.ipt_active = enable;
+
+    return 0;
+}
+
+static int vmtrace_output_position(struct vcpu *v, uint64_t *pos)
+{
+    *pos = v->arch.msrs->rtit.output_offset;
+    return v->arch.hvm.vmx.ipt_active;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -2316,6 +2463,10 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
     .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
+    .vmtrace_control = vmtrace_control,
+    .vmtrace_output_position = vmtrace_output_position,
+    .vmtrace_set_option = vmtrace_set_option,
+    .vmtrace_get_option = vmtrace_get_option,
     .tsc_scaling = {
         .max_ratio = VMX_TSC_MULTIPLIER_MAX,
     },
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 334bd573b9..960ec03917 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -214,6 +214,12 @@ struct hvm_function_table {
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
     int (*altp2m_vcpu_emulate_vmfunc)(const struct cpu_user_regs *regs);
 
+    /* vmtrace */
+    int (*vmtrace_control)(struct vcpu *v, bool enable, bool reset);
+    int (*vmtrace_output_position)(struct vcpu *v, uint64_t *pos);
+    int (*vmtrace_set_option)(struct vcpu *v, uint64_t key, uint64_t value);
+    int (*vmtrace_get_option)(struct vcpu *v, uint64_t key, uint64_t *value);
+
     /*
      * Parameters and callbacks for hardware-assisted TSC scaling,
      * which are valid only when the hardware feature is available.
@@ -655,6 +661,41 @@ static inline bool altp2m_vcpu_emulate_ve(struct vcpu *v)
     return false;
 }
 
+static inline int hvm_vmtrace_control(struct vcpu *v, bool enable, bool reset)
+{
+    if ( hvm_funcs.vmtrace_control )
+        return hvm_funcs.vmtrace_control(v, enable, reset);
+
+    return -EOPNOTSUPP;
+}
+
+/* Returns -errno, or a boolean of whether tracing is currently active. */
+static inline int hvm_vmtrace_output_position(struct vcpu *v, uint64_t *pos)
+{
+    if ( hvm_funcs.vmtrace_output_position )
+        return hvm_funcs.vmtrace_output_position(v, pos);
+
+    return -EOPNOTSUPP;
+}
+
+static inline int hvm_vmtrace_set_option(
+    struct vcpu *v, uint64_t key, uint64_t value)
+{
+    if ( hvm_funcs.vmtrace_set_option )
+        return hvm_funcs.vmtrace_set_option(v, key, value);
+
+    return -EOPNOTSUPP;
+}
+
+static inline int hvm_vmtrace_get_option(
+    struct vcpu *v, uint64_t key, uint64_t *value)
+{
+    if ( hvm_funcs.vmtrace_get_option )
+        return hvm_funcs.vmtrace_get_option(v, key, value);
+
+    return -EOPNOTSUPP;
+}
+
 /*
  * This must be defined as a macro instead of an inline function,
  * because it uses 'struct vcpu' and 'struct domain' which have
@@ -751,6 +792,28 @@ static inline bool hvm_has_set_descriptor_access_exiting(void)
     return false;
 }
 
+static inline int hvm_vmtrace_control(struct vcpu *v, bool enable, bool reset)
+{
+    return -EOPNOTSUPP;
+}
+
+static inline int hvm_vmtrace_output_position(struct vcpu *v, uint64_t *pos)
+{
+    return -EOPNOTSUPP;
+}
+
+static inline int hvm_vmtrace_set_option(
+    struct vcpu *v, uint64_t key, uint64_t value)
+{
+    return -EOPNOTSUPP;
+}
+
+static inline int hvm_vmtrace_get_option(
+    struct vcpu *v, uint64_t key, uint64_t *value)
+{
+    return -EOPNOTSUPP;
+}
+
 #define is_viridian_domain(d) ((void)(d), false)
 #define is_viridian_vcpu(v) ((void)(v), false)
 #define has_viridian_time_ref_count(d) ((void)(d), false)
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 1585678d50..218593e548 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1133,6 +1133,39 @@ struct xen_domctl_vuart_op {
                                  */
 };
 
+/* XEN_DOMCTL_vmtrace_op: Perform VM tracing operations. */
+struct xen_domctl_vmtrace_op {
+    uint32_t cmd;           /* IN */
+    uint32_t vcpu;          /* IN */
+    uint64_aligned_t key;   /* IN     - @cmd specific data. */
+    uint64_aligned_t value; /* IN/OUT - @cmd specific data. */
+
+    /*
+     * General enable/disable of tracing.
+     *
+     * XEN_DOMCTL_vmtrace_reset_and_enable is provided as optimisation for
+     * common usecases, which want to reset status and position information
+     * when turning tracing back on.
+     */
+#define XEN_DOMCTL_vmtrace_enable             1
+#define XEN_DOMCTL_vmtrace_disable            2
+#define XEN_DOMCTL_vmtrace_reset_and_enable   3
+
+    /* Obtain the current output position within the buffer.  Fills @value. */
+#define XEN_DOMCTL_vmtrace_output_position    4
+
+    /*
+     * Get/Set platform specific configuration.
+     *
+     * For Intel Processor Trace, @key/@value are interpreted as MSR
+     * reads/writes to MSR_RTIT_*, filtered to a safe subset.
+     */
+#define XEN_DOMCTL_vmtrace_get_option         5
+#define XEN_DOMCTL_vmtrace_set_option         6
+};
+typedef struct xen_domctl_vmtrace_op xen_domctl_vmtrace_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_vmtrace_op_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1217,6 +1250,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vuart_op                      81
 #define XEN_DOMCTL_get_cpu_policy                82
 #define XEN_DOMCTL_set_cpu_policy                83
+#define XEN_DOMCTL_vmtrace_op                    84
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1277,6 +1311,7 @@ struct xen_domctl {
         struct xen_domctl_monitor_op        monitor_op;
         struct xen_domctl_psr_alloc         psr_alloc;
         struct xen_domctl_vuart_op          vuart_op;
+        struct xen_domctl_vmtrace_op        vmtrace_op;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 19b0d9e3eb..3eba495ab3 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -703,6 +703,7 @@ static int flask_domctl(struct domain *d, int cmd)
         return current_has_perm(d, SECCLASS_DOMAIN2, DOMAIN2__VM_EVENT);
 
     case XEN_DOMCTL_debug_op:
+    case XEN_DOMCTL_vmtrace_op:
     case XEN_DOMCTL_gdbsx_guestmemio:
     case XEN_DOMCTL_gdbsx_pausevcpu:
     case XEN_DOMCTL_gdbsx_unpausevcpu:
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (5 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-22 15:29   ` Ian Jackson
  2021-01-21 21:27 ` [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool Andrew Cooper
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Ian Jackson,
	Wei Liu, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Add functions in libxc that use the new XEN_DOMCTL_vmtrace interface.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * Use the name 'vmtrace' consistently.
---
 tools/include/xenctrl.h      |  73 ++++++++++++++++++++++++
 tools/libs/ctrl/Makefile     |   1 +
 tools/libs/ctrl/xc_vmtrace.c | 128 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 202 insertions(+)
 create mode 100644 tools/libs/ctrl/xc_vmtrace.c

diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 3796425e1e..0efcdae8b4 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -1583,6 +1583,79 @@ int xc_tbuf_set_cpu_mask(xc_interface *xch, xc_cpumap_t mask);
 
 int xc_tbuf_set_evt_mask(xc_interface *xch, uint32_t mask);
 
+/**
+ * Enable vmtrace for given vCPU.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_enable(xc_interface *xch, uint32_t domid, uint32_t vcpu);
+
+/**
+ * Enable vmtrace for given vCPU.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_disable(xc_interface *xch, uint32_t domid, uint32_t vcpu);
+
+/**
+ * Enable vmtrace for a given vCPU, along with resetting status/offset
+ * details.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_reset_and_enable(xc_interface *xch, uint32_t domid,
+                                uint32_t vcpu);
+
+/**
+ * Get current output position inside the trace buffer.
+ *
+ * Repeated calls will return different values if tracing is enabled.  It is
+ * platform specific what happens when the buffer fills completely.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @parm pos current output position in bytes
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_output_position(xc_interface *xch, uint32_t domid,
+                               uint32_t vcpu, uint64_t *pos);
+
+/**
+ * Get platform specific vmtrace options.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @parm key platform-specific input
+ * @parm value platform-specific output
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_get_option(xc_interface *xch, uint32_t domid,
+                          uint32_t vcpu, uint64_t key, uint64_t *value);
+
+/**
+ * Set platform specific vntvmtrace options.
+ *
+ * @parm xch a handle to an open hypervisor interface
+ * @parm domid domain identifier
+ * @parm vcpu vcpu identifier
+ * @parm key platform-specific input
+ * @parm value platform-specific input
+ * @return 0 on success, -1 on failure
+ */
+int xc_vmtrace_set_option(xc_interface *xch, uint32_t domid,
+                          uint32_t vcpu, uint64_t key, uint64_t value);
+
 int xc_domctl(xc_interface *xch, struct xen_domctl *domctl);
 int xc_sysctl(xc_interface *xch, struct xen_sysctl *sysctl);
 
diff --git a/tools/libs/ctrl/Makefile b/tools/libs/ctrl/Makefile
index 4185dc3f22..449c89f440 100644
--- a/tools/libs/ctrl/Makefile
+++ b/tools/libs/ctrl/Makefile
@@ -22,6 +22,7 @@ SRCS-y       += xc_pm.c
 SRCS-y       += xc_cpu_hotplug.c
 SRCS-y       += xc_resume.c
 SRCS-y       += xc_vm_event.c
+SRCS-y       += xc_vmtrace.c
 SRCS-y       += xc_monitor.c
 SRCS-y       += xc_mem_paging.c
 SRCS-y       += xc_mem_access.c
diff --git a/tools/libs/ctrl/xc_vmtrace.c b/tools/libs/ctrl/xc_vmtrace.c
new file mode 100644
index 0000000000..602502367f
--- /dev/null
+++ b/tools/libs/ctrl/xc_vmtrace.c
@@ -0,0 +1,128 @@
+/******************************************************************************
+ * xc_vmtrace.c
+ *
+ * API for manipulating hardware tracing features
+ *
+ * Copyright (c) 2020, Michal Leszczynski
+ *
+ * Copyright 2020 CERT Polska. All rights reserved.
+ * Use is subject to license terms.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation;
+ * version 2.1 of the License.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xc_private.h"
+
+int xc_vmtrace_enable(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_enable,
+            .vcpu = vcpu,
+        },
+    };
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_vmtrace_disable(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_disable,
+            .vcpu = vcpu,
+        },
+    };
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_vmtrace_reset_and_enable(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_reset_and_enable,
+            .vcpu = vcpu,
+        },
+    };
+
+    return do_domctl(xch, &domctl);
+}
+
+int xc_vmtrace_output_position(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu, uint64_t *pos)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_output_position,
+            .vcpu = vcpu,
+        },
+    };
+    int rc = do_domctl(xch, &domctl);
+
+    if ( !rc )
+        *pos = domctl.u.vmtrace_op.value;
+
+    return rc;
+}
+
+int xc_vmtrace_get_option(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu,
+    uint64_t key, uint64_t *value)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_get_option,
+            .vcpu = vcpu,
+            .key = key,
+        },
+    };
+    int rc = do_domctl(xch, &domctl);
+
+    if ( !rc )
+        *value = domctl.u.vmtrace_op.value;
+
+    return rc;
+}
+
+int xc_vmtrace_set_option(
+    xc_interface *xch, uint32_t domid, uint32_t vcpu,
+    uint64_t key, uint64_t value)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_vmtrace_op,
+        .domain = domid,
+        .u.vmtrace_op = {
+            .cmd = XEN_DOMCTL_vmtrace_set_option,
+            .vcpu = vcpu,
+            .key = key,
+            .value = value,
+        },
+    };
+
+    return do_domctl(xch, &domctl);
+}
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (6 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-22 15:33   ` Ian Jackson
  2021-01-21 21:27 ` [PATCH v7 09/10] xen/vmtrace: support for VM forks Andrew Cooper
  2021-01-21 21:27 ` [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event Andrew Cooper
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Michał Leszczyński, Andrew Cooper, Ian Jackson,
	Wei Liu, Tamas K Lengyel

From: Michał Leszczyński <michal.leszczynski@cert.pl>

Add an demonstration tool that uses xc_vmtrace_* calls in order
to manage external IPT monitoring for DomU.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <iwj@xenproject.org>
CC: Wei Liu <wl@xen.org>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>
---
 tools/misc/.gitignore    |   1 +
 tools/misc/Makefile      |   4 ++
 tools/misc/xen-vmtrace.c | 154 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 159 insertions(+)
 create mode 100644 tools/misc/xen-vmtrace.c

diff --git a/tools/misc/.gitignore b/tools/misc/.gitignore
index b2c3b21f57..ce6f937d0c 100644
--- a/tools/misc/.gitignore
+++ b/tools/misc/.gitignore
@@ -1,3 +1,4 @@
 xen-access
 xen-memshare
 xen-ucode
+xen-vmtrace
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 912c5d4f0e..c5017e6c70 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -25,6 +25,7 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
 INSTALL_SBIN-$(CONFIG_X86)     += xen-memshare
 INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
 INSTALL_SBIN-$(CONFIG_X86)     += xen-ucode
+INSTALL_SBIN-$(CONFIG_X86)     += xen-vmtrace
 INSTALL_SBIN                   += xencov
 INSTALL_SBIN                   += xenhypfs
 INSTALL_SBIN                   += xenlockprof
@@ -90,6 +91,9 @@ xen-hvmcrash: xen-hvmcrash.o
 xen-memshare: xen-memshare.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-vmtrace: xen-vmtrace.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenforeignmemory) $(APPEND_LDFLAGS)
+
 xenperf: xenperf.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-vmtrace.c b/tools/misc/xen-vmtrace.c
new file mode 100644
index 0000000000..47fea871cf
--- /dev/null
+++ b/tools/misc/xen-vmtrace.c
@@ -0,0 +1,154 @@
+/******************************************************************************
+ * tools/vmtrace.c
+ *
+ * Demonstrative tool for collecting Intel Processor Trace data from Xen.
+ *  Could be used to externally monitor a given vCPU in given DomU.
+ *
+ * Copyright (C) 2020 by CERT Polska - NASK PIB
+ *
+ * Authors: Michał Leszczyński, michal.leszczynski@cert.pl
+ * Date:    June, 2020
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <err.h>
+#include <errno.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+
+#include <xenctrl.h>
+#include <xenforeignmemory.h>
+
+#define MSR_RTIT_CTL                        0x00000570
+#define  RTIT_CTL_OS                        (1 <<  2)
+#define  RTIT_CTL_USR                       (1 <<  3)
+#define  RTIT_CTL_BRANCH_EN                 (1 << 13)
+
+static volatile int interrupted = 0;
+
+static xc_interface *xch;
+static xenforeignmemory_handle *fh;
+
+void int_handler(int signum)
+{
+    interrupted = 1;
+}
+
+int main(int argc, char **argv)
+{
+    uint32_t domid, vcpu;
+    int rc, exit = 1;
+    size_t size;
+    char *buf = NULL;
+    xenforeignmemory_resource_handle *fres = NULL;
+    uint64_t last_offset = 0;
+
+    if ( signal(SIGINT, int_handler) == SIG_ERR )
+        err(1, "Failed to register signal handler\n");
+
+    if ( argc != 3 )
+    {
+        fprintf(stderr, "Usage: %s <domid> <vcpu_id>\n", argv[0]);
+        fprintf(stderr, "It's recommended to redirect thisprogram's output to file\n");
+        fprintf(stderr, "or to pipe it's output to xxd or other program.\n");
+        return 1;
+    }
+
+    domid = atoi(argv[1]);
+    vcpu  = atoi(argv[2]);
+
+    xch = xc_interface_open(NULL, NULL, 0);
+    fh = xenforeignmemory_open(NULL, 0);
+
+    if ( !xch )
+        err(1, "xc_interface_open()");
+    if ( !fh )
+        err(1, "xenforeignmemory_open()");
+
+    rc = xenforeignmemory_resource_size(
+        fh, domid, XENMEM_resource_vmtrace_buf, vcpu, &size);
+    if ( rc )
+        err(1, "xenforeignmemory_resource_size()");
+
+    fres = xenforeignmemory_map_resource(
+        fh, domid, XENMEM_resource_vmtrace_buf, vcpu,
+        0, size >> XC_PAGE_SHIFT, (void **)&buf, PROT_READ, 0);
+    if ( !fres )
+        err(1, "xenforeignmemory_map_resource()");
+
+    if ( xc_vmtrace_set_option(
+             xch, domid, vcpu, MSR_RTIT_CTL,
+             RTIT_CTL_BRANCH_EN | RTIT_CTL_USR | RTIT_CTL_OS) )
+    {
+        perror("xc_vmtrace_set_option()");
+        goto out;
+    }
+
+    if ( xc_vmtrace_enable(xch, domid, vcpu) )
+    {
+        perror("xc_vmtrace_enable()");
+        goto out;
+    }
+
+    while ( !interrupted )
+    {
+        xc_dominfo_t dominfo;
+        uint64_t offset;
+
+        if ( xc_vmtrace_output_position(xch, domid, vcpu, &offset) )
+        {
+            perror("xc_vmtrace_output_position()");
+            goto out;
+        }
+
+        if ( offset > last_offset )
+            fwrite(buf + last_offset, offset - last_offset, 1, stdout);
+        else if ( offset < last_offset )
+        {
+            /* buffer wrapped */
+            fwrite(buf + last_offset, size - last_offset, 1, stdout);
+            fwrite(buf, offset, 1, stdout);
+        }
+
+        last_offset = offset;
+        usleep(1000 * 100);
+
+        if ( xc_domain_getinfo(xch, domid, 1, &dominfo) != 1 ||
+             dominfo.domid != domid || dominfo.shutdown )
+            break;
+    }
+
+    exit = 0;
+
+ out:
+    if ( xc_vmtrace_disable(xch, domid, vcpu) )
+        perror("xc_vmtrace_disable()");
+
+    if ( fres && xenforeignmemory_unmap_resource(fh, fres) )
+        perror("xenforeignmemory_unmap_resource()");
+
+    return exit;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 09/10] xen/vmtrace: support for VM forks
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (7 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-26 14:21   ` Jan Beulich
  2021-01-21 21:27 ` [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event Andrew Cooper
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Tamas K Lengyel, Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Tamas K Lengyel

From: Tamas K Lengyel <tamas.lengyel@intel.com>

Implement vmtrace_reset_pt function. Properly set IPT
state for VM forks.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * New
---
 xen/arch/x86/hvm/vmx/vmx.c    | 11 +++++++++++
 xen/arch/x86/mm/mem_sharing.c |  3 +++
 xen/include/asm-x86/hvm/hvm.h |  9 +++++++++
 3 files changed, 23 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d4e7b50b8a..40ae398cf5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2408,6 +2408,16 @@ static int vmtrace_output_position(struct vcpu *v, uint64_t *pos)
     return v->arch.hvm.vmx.ipt_active;
 }
 
+static int vmtrace_reset(struct vcpu *v)
+{
+    if ( !v->arch.hvm.vmx.ipt_active )
+        return -EINVAL;
+
+    v->arch.msrs->rtit.output_offset = 0;
+    v->arch.msrs->rtit.status = 0;
+    return 0;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
     .name                 = "VMX",
     .cpu_up_prepare       = vmx_cpu_up_prepare,
@@ -2467,6 +2477,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .vmtrace_output_position = vmtrace_output_position,
     .vmtrace_set_option = vmtrace_set_option,
     .vmtrace_get_option = vmtrace_get_option,
+    .vmtrace_reset = vmtrace_reset,
     .tsc_scaling = {
         .max_ratio = VMX_TSC_MULTIPLIER_MAX,
     },
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index adaeab4612..fd07eb8d4d 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1632,6 +1632,8 @@ static int copy_vcpu_settings(struct domain *cd, const struct domain *d)
             copy_domain_page(new_vcpu_info_mfn, vcpu_info_mfn);
         }
 
+        hvm_vmtrace_reset(cd_vcpu);
+
         /*
          * TODO: to support VMs with PV interfaces copy additional
          * settings here, such as PV timers.
@@ -1782,6 +1784,7 @@ static int fork(struct domain *cd, struct domain *d)
         cd->max_pages = d->max_pages;
         *cd->arch.cpuid = *d->arch.cpuid;
         *cd->arch.msr = *d->arch.msr;
+        cd->vmtrace_frames = d->vmtrace_frames;
         cd->parent = d;
     }
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 960ec03917..150746de66 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -219,6 +219,7 @@ struct hvm_function_table {
     int (*vmtrace_output_position)(struct vcpu *v, uint64_t *pos);
     int (*vmtrace_set_option)(struct vcpu *v, uint64_t key, uint64_t value);
     int (*vmtrace_get_option)(struct vcpu *v, uint64_t key, uint64_t *value);
+    int (*vmtrace_reset)(struct vcpu *v);
 
     /*
      * Parameters and callbacks for hardware-assisted TSC scaling,
@@ -696,6 +697,14 @@ static inline int hvm_vmtrace_get_option(
     return -EOPNOTSUPP;
 }
 
+static inline int hvm_vmtrace_reset(struct vcpu *v)
+{
+    if ( hvm_funcs.vmtrace_reset )
+        return hvm_funcs.vmtrace_reset(v);
+
+    return -EOPNOTSUPP;
+}
+
 /*
  * This must be defined as a macro instead of an inline function,
  * because it uses 'struct vcpu' and 'struct domain' which have
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
                   ` (8 preceding siblings ...)
  2021-01-21 21:27 ` [PATCH v7 09/10] xen/vmtrace: support for VM forks Andrew Cooper
@ 2021-01-21 21:27 ` Andrew Cooper
  2021-01-26 14:27   ` Jan Beulich
  9 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-21 21:27 UTC (permalink / raw)
  To: Xen-devel
  Cc: Tamas K Lengyel, Andrew Cooper, Jan Beulich, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Tamas K Lengyel

From: Tamas K Lengyel <tamas.lengyel@intel.com>

Add pt_offset field to x86 regs in vm_event. Initialized to ~0 if PT
is not in use.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Michał Leszczyński <michal.leszczynski@cert.pl>
CC: Tamas K Lengyel <tamas@tklengyel.com>

v7:
 * New
---
 xen/arch/x86/vm_event.c       | 3 +++
 xen/include/public/vm_event.h | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/xen/arch/x86/vm_event.c b/xen/arch/x86/vm_event.c
index 848d69c1b0..09dfc0924e 100644
--- a/xen/arch/x86/vm_event.c
+++ b/xen/arch/x86/vm_event.c
@@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
 
     req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
     req->data.regs.x86.dr6 = ctxt.dr6;
+
+    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
+        req->data.regs.x86.pt_offset = ~0;
 #endif
 }
 
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index 141ea024a3..57f34bf902 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
      */
     uint64_t npt_base;
 
+    /*
+     * Current offset in the Processor Trace buffer. For Intel Processor Trace
+     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
+     */
+    uint64_t pt_offset;
+
     uint32_t cs_base;
     uint32_t ss_base;
     uint32_t ds_base;
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
  2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
@ 2021-01-22 15:28   ` Ian Jackson
  2021-01-26  8:58   ` Julien Grall
  1 sibling, 0 replies; 41+ messages in thread
From: Ian Jackson @ 2021-01-22 15:28 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Jan Beulich, Roger Pau Monné,
	Wei Liu, Michał Leszczyński, Tamas K Lengyel

Andrew Cooper writes ("[PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace"):
> We're about to introduce support for Intel Processor Trace, but similar
> functionality exists in other platforms.
> 
> Aspects of vmtrace can reasonably can be common, so start with
> XEN_SYSCTL_PHYSCAP_vmtrace and plumb the signal from Xen all the way down into
> `xl info`.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter
  2021-01-21 21:27 ` [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter Andrew Cooper
@ 2021-01-22 15:29   ` Ian Jackson
  0 siblings, 0 replies; 41+ messages in thread
From: Ian Jackson @ 2021-01-22 15:29 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Anthony PERARD,
	Tamas K  Lengyel

Andrew Cooper writes ("[PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter"):
> From: Michał Leszczyński <michal.leszczynski@cert.pl>
> 
> Allow to specify the size of per-vCPU trace buffer upon
> domain creation. This is zero by default (meaning: not enabled).

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions
  2021-01-21 21:27 ` [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions Andrew Cooper
@ 2021-01-22 15:29   ` Ian Jackson
  0 siblings, 0 replies; 41+ messages in thread
From: Ian Jackson @ 2021-01-22 15:29 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

Andrew Cooper writes ("[PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions"):
> From: Michał Leszczyński <michal.leszczynski@cert.pl>
> 
> Add functions in libxc that use the new XEN_DOMCTL_vmtrace interface.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-21 21:27 ` [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool Andrew Cooper
@ 2021-01-22 15:33   ` Ian Jackson
  2021-01-25 15:30     ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Ian Jackson @ 2021-01-22 15:33 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

Andrew Cooper writes ("[PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
> From: Michał Leszczyński <michal.leszczynski@cert.pl>
...
> +    if ( signal(SIGINT, int_handler) == SIG_ERR )
> +        err(1, "Failed to register signal handler\n");

How bad is it if this signal handler is not effective ?

> +    if ( xc_vmtrace_disable(xch, domid, vcpu) )
> +        perror("xc_vmtrace_disable()");

I guess the tracing will remain on, pointlessly, which has a perf
impact but nothing else ?

How is it possible for the user to clean this up ?

Also: at the very least, you need to trap SIGTERM SIGHUP SIGPIPE.

It would be good to exit with the right signal by re-raising it.

> +static volatile int interrupted = 0;

sig_atomic_t

Ian.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter
  2021-01-21 21:27 ` [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter Andrew Cooper
@ 2021-01-25 15:08   ` Jan Beulich
  2021-01-25 17:17     ` Andrew Cooper
  2021-01-29 16:37     ` Jan Beulich
  0 siblings, 2 replies; 41+ messages in thread
From: Jan Beulich @ 2021-01-25 15:08 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -132,6 +132,48 @@ static void vcpu_info_reset(struct vcpu *v)
>      v->vcpu_info_mfn = INVALID_MFN;
>  }
>  
> +static void vmtrace_free_buffer(struct vcpu *v)
> +{
> +    const struct domain *d = v->domain;
> +    struct page_info *pg = v->vmtrace.buf;
> +    unsigned int i;
> +
> +    if ( !pg )
> +        return;
> +
> +    for ( i = 0; i < d->vmtrace_frames; i++ )
> +    {
> +        put_page_alloc_ref(&pg[i]);
> +        put_page_and_type(&pg[i]);
> +    }
> +
> +    v->vmtrace.buf = NULL;

To set a good precedent, maybe this wants moving up ahead of
the loop and ...

> +}
> +
> +static int vmtrace_alloc_buffer(struct vcpu *v)
> +{
> +    struct domain *d = v->domain;
> +    struct page_info *pg;
> +    unsigned int i;
> +
> +    if ( !d->vmtrace_frames )
> +        return 0;
> +
> +    pg = alloc_domheap_pages(d, get_order_from_pages(d->vmtrace_frames),
> +                             MEMF_no_refcount);
> +    if ( !pg )
> +        return -ENOMEM;
> +
> +    v->vmtrace.buf = pg;

... this wants moving down past the loop, to avoid
globally announcing something that isn't fully initialized
yet / anymore?

> +    for ( i = 0; i < d->vmtrace_frames; i++ )
> +        /* Domain can't know about this page yet - something fishy going on. */
> +        if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
> +            BUG();

Whatever the final verdict to the other similar places
that one of your patch changes should be applied here,
too.

> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -94,6 +94,7 @@ struct xen_domctl_createdomain {
>      uint32_t max_evtchn_port;
>      int32_t max_grant_frames;
>      int32_t max_maptrack_frames;
> +    uint32_t vmtrace_frames;

Considering page size related irritations elsewhere in the
public interface, could you have a comment clarify the unit
of this value (Xen's page size according to the rest of the
patch), and that space will be allocated once per-vCPU
rather than per-domain (to stand a chance of recognizing
the ultimate memory footprint resulting from this)?

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -257,6 +257,10 @@ struct vcpu
>      /* vPCI per-vCPU area, used to store data for long running operations. */
>      struct vpci_vcpu vpci;
>  
> +    struct {
> +        struct page_info *buf;
> +    } vmtrace;

While perhaps minor, I'm unconvinced "buf" is a good name
for a field of this type.

> @@ -470,6 +474,9 @@ struct domain
>      unsigned    pbuf_idx;
>      spinlock_t  pbuf_lock;
>  
> +    /* Used by vmtrace features */
> +    uint32_t    vmtrace_frames;

unsigned int? Also could you move this to an existing 32-bit
hole, like immediately after "monitor"?

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-22 15:33   ` Ian Jackson
@ 2021-01-25 15:30     ` Andrew Cooper
  2021-01-26 11:59       ` Ian Jackson
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-25 15:30 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

On 22/01/2021 15:33, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
>> From: Michał Leszczyński <michal.leszczynski@cert.pl>
> ...
>> +    if ( signal(SIGINT, int_handler) == SIG_ERR )
>> +        err(1, "Failed to register signal handler\n");
> How bad is it if this signal handler is not effective ?

I believe far less so now that I've fixed up everything to use a (fixed)
XENMEM_acquire_resource, so Xen doesn't crash if this process dies in
the wrong order WRT the domain shutting down.

But I would have to defer to Michał on that.

>> +    if ( xc_vmtrace_disable(xch, domid, vcpu) )
>> +        perror("xc_vmtrace_disable()");
> I guess the tracing will remain on, pointlessly, which has a perf
> impact but nothing else ?

The perf hit is substantial, but it is safe to leave enabled.

> How is it possible for the user to clean this up ?

For now, enable/disable can only fail with -EINVAL for calls made in the
wrong context, so a failure here is benign in practice.

I specifically didn't opt for reference counting the enable/disable
calls, because there cannot (usefully) be two users of this interface.

>
> Also: at the very least, you need to trap SIGTERM SIGHUP SIGPIPE.
>
> It would be good to exit with the right signal by re-raising it.

This is example code, not a production utility.

Anything more production-wise using this needs to account for the fact
that Intel Processor Trace can't pause on a full buffer.  (It ought to
be able to on forthcoming hardware, but this facility isn't available yet.)

The use-cases thus far are always "small delta of execution between
introspection events", using a massive buffer as the mitigation for
hardware wrapping.

No amount of additional code here can prevent stream corruption problems
with the buffer wrapping.  As a result, it is kept as simple as possible
as a demonstration of how to use the API.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type
  2021-01-21 21:27 ` [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type Andrew Cooper
@ 2021-01-25 16:31   ` Jan Beulich
  2021-01-26  7:37     ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-25 16:31 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -1068,11 +1068,35 @@ static unsigned int resource_max_frames(const struct domain *d,
>      case XENMEM_resource_grant_table:
>          return gnttab_resource_max_frames(d, id);
>  
> +    case XENMEM_resource_vmtrace_buf:
> +        return d->vmtrace_frames;
> +
>      default:
>          return arch_resource_max_frames(d, type, id);
>      }
>  }
>  
> +static int acquire_vmtrace_buf(
> +    struct domain *d, unsigned int id, unsigned long frame,
> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
> +{
> +    const struct vcpu *v = domain_vcpu(d, id);
> +    unsigned int i;
> +    mfn_t mfn;
> +
> +    if ( !v || !v->vmtrace.buf ||
> +         nr_frames > d->vmtrace_frames ||
> +         (frame + nr_frames) > d->vmtrace_frames )
> +        return -EINVAL;


I think that for this to guard against overflow, the first nr_frames
needs to be replaced by frame (as having the wider type), or else a
very large value of frame coming in will not yield the intended
-EINVAL. If you agree, with this changed,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter
  2021-01-25 15:08   ` Jan Beulich
@ 2021-01-25 17:17     ` Andrew Cooper
  2021-01-26 10:51       ` Jan Beulich
  2021-01-29 16:37     ` Jan Beulich
  1 sibling, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-25 17:17 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Tamas K Lengyel, Xen-devel

On 25/01/2021 15:08, Jan Beulich wrote:
> On 21.01.2021 22:27, Andrew Cooper wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -132,6 +132,48 @@ static void vcpu_info_reset(struct vcpu *v)
>>      v->vcpu_info_mfn = INVALID_MFN;
>>  }
>>  
>> +static void vmtrace_free_buffer(struct vcpu *v)
>> +{
>> +    const struct domain *d = v->domain;
>> +    struct page_info *pg = v->vmtrace.buf;
>> +    unsigned int i;
>> +
>> +    if ( !pg )
>> +        return;
>> +
>> +    for ( i = 0; i < d->vmtrace_frames; i++ )
>> +    {
>> +        put_page_alloc_ref(&pg[i]);
>> +        put_page_and_type(&pg[i]);
>> +    }
>> +
>> +    v->vmtrace.buf = NULL;
> To set a good precedent, maybe this wants moving up ahead of
> the loop and ...
>
>> +}
>> +
>> +static int vmtrace_alloc_buffer(struct vcpu *v)
>> +{
>> +    struct domain *d = v->domain;
>> +    struct page_info *pg;
>> +    unsigned int i;
>> +
>> +    if ( !d->vmtrace_frames )
>> +        return 0;
>> +
>> +    pg = alloc_domheap_pages(d, get_order_from_pages(d->vmtrace_frames),
>> +                             MEMF_no_refcount);
>> +    if ( !pg )
>> +        return -ENOMEM;
>> +
>> +    v->vmtrace.buf = pg;
> ... this wants moving down past the loop, to avoid
> globally announcing something that isn't fully initialized
> yet / anymore?

Fine.

>
>> +    for ( i = 0; i < d->vmtrace_frames; i++ )
>> +        /* Domain can't know about this page yet - something fishy going on. */
>> +        if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
>> +            BUG();
> Whatever the final verdict to the other similar places
> that one of your patch changes should be applied here,
> too.

Obviously, except there's 0 room for manoeuvring on that patch, so this
hunk is correct.

>
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
>> @@ -94,6 +94,7 @@ struct xen_domctl_createdomain {
>>      uint32_t max_evtchn_port;
>>      int32_t max_grant_frames;
>>      int32_t max_maptrack_frames;
>> +    uint32_t vmtrace_frames;
> Considering page size related irritations elsewhere in the
> public interface, could you have a comment clarify the unit
> of this value (Xen's page size according to the rest of the
> patch), and that space will be allocated once per-vCPU
> rather than per-domain (to stand a chance of recognizing
> the ultimate memory footprint resulting from this)?

Well - its hopefully obvious that it shares the same units as the other
*_frames parameters.

But yes - the future ABI fixes, it will be forbidden to use anything in
units of frames, to fix the multitude of interface bugs pertaining to
non-4k page sizes.

I'll switch to using vmtrace_size, in units of bytes, and the per-arch
filtering can enforce being a multiple of 4k.

>
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -257,6 +257,10 @@ struct vcpu
>>      /* vPCI per-vCPU area, used to store data for long running operations. */
>>      struct vpci_vcpu vpci;
>>  
>> +    struct {
>> +        struct page_info *buf;
>> +    } vmtrace;
> While perhaps minor, I'm unconvinced "buf" is a good name
> for a field of this type.

Please suggest a better one then.  This one is properly namespaced as
v->vmtrace.buf which is the least bad option I could come up with.

>
>> @@ -470,6 +474,9 @@ struct domain
>>      unsigned    pbuf_idx;
>>      spinlock_t  pbuf_lock;
>>  
>> +    /* Used by vmtrace features */
>> +    uint32_t    vmtrace_frames;
> unsigned int? Also could you move this to an existing 32-bit
> hole, like immediately after "monitor"?

Ok.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type
  2021-01-25 16:31   ` Jan Beulich
@ 2021-01-26  7:37     ` Jan Beulich
  2021-01-26  9:58       ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-26  7:37 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Tamas K Lengyel, Xen-devel

On 25.01.2021 17:31, Jan Beulich wrote:
> On 21.01.2021 22:27, Andrew Cooper wrote:
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>> @@ -1068,11 +1068,35 @@ static unsigned int resource_max_frames(const struct domain *d,
>>      case XENMEM_resource_grant_table:
>>          return gnttab_resource_max_frames(d, id);
>>  
>> +    case XENMEM_resource_vmtrace_buf:
>> +        return d->vmtrace_frames;
>> +
>>      default:
>>          return arch_resource_max_frames(d, type, id);
>>      }
>>  }
>>  
>> +static int acquire_vmtrace_buf(
>> +    struct domain *d, unsigned int id, unsigned long frame,
>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>> +{
>> +    const struct vcpu *v = domain_vcpu(d, id);
>> +    unsigned int i;
>> +    mfn_t mfn;
>> +
>> +    if ( !v || !v->vmtrace.buf ||
>> +         nr_frames > d->vmtrace_frames ||
>> +         (frame + nr_frames) > d->vmtrace_frames )
>> +        return -EINVAL;
> 
> 
> I think that for this to guard against overflow, the first nr_frames
> needs to be replaced by frame (as having the wider type), or else a
> very large value of frame coming in will not yield the intended
> -EINVAL.

Actually, besides this then wanting to be >= instead of >, this
wouldn't take care of the 32-bit case (or more generally the
sizeof(long) == sizeof(int) one). So I think you want

    if ( !v || !v->vmtrace.buf ||
         (frame + nr_frames) < frame ||
         (frame + nr_frames) > d->vmtrace_frames )
        return -EINVAL;

> If you agree, with this changed,
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

This holds.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
  2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
  2021-01-22 15:28   ` Ian Jackson
@ 2021-01-26  8:58   ` Julien Grall
  2021-01-26 10:04     ` Andrew Cooper
  1 sibling, 1 reply; 41+ messages in thread
From: Julien Grall @ 2021-01-26  8:58 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Jan Beulich, Roger Pau Monné,
	Wei Liu, Ian Jackson, Michał Leszczyński,
	Tamas K Lengyel

Hi Andrew,

On 21/01/2021 21:27, Andrew Cooper wrote:
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index a073647117..d4453d2eab 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -100,6 +100,7 @@ struct xen_sysctl_tbuf_op {
>   #define _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share 5
>   #define XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share  \
>       (1u << _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share)
> +#define XEN_SYSCTL_PHYSCAP_vmtrace       (1 << 6)
>   
>   /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
>   #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share

XEN_SYSCTL_PHYSCAP_MAX needs to be bumped.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type
  2021-01-26  7:37     ` Jan Beulich
@ 2021-01-26  9:58       ` Andrew Cooper
  2021-01-26 10:30         ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-26  9:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Tamas K Lengyel, Xen-devel

On 26/01/2021 07:37, Jan Beulich wrote:
> On 25.01.2021 17:31, Jan Beulich wrote:
>> On 21.01.2021 22:27, Andrew Cooper wrote:
>>> --- a/xen/common/memory.c
>>> +++ b/xen/common/memory.c
>>> @@ -1068,11 +1068,35 @@ static unsigned int resource_max_frames(const struct domain *d,
>>>      case XENMEM_resource_grant_table:
>>>          return gnttab_resource_max_frames(d, id);
>>>  
>>> +    case XENMEM_resource_vmtrace_buf:
>>> +        return d->vmtrace_frames;
>>> +
>>>      default:
>>>          return arch_resource_max_frames(d, type, id);
>>>      }
>>>  }
>>>  
>>> +static int acquire_vmtrace_buf(
>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>> +{
>>> +    const struct vcpu *v = domain_vcpu(d, id);
>>> +    unsigned int i;
>>> +    mfn_t mfn;
>>> +
>>> +    if ( !v || !v->vmtrace.buf ||
>>> +         nr_frames > d->vmtrace_frames ||
>>> +         (frame + nr_frames) > d->vmtrace_frames )
>>> +        return -EINVAL;
>>
>> I think that for this to guard against overflow, the first nr_frames
>> needs to be replaced by frame (as having the wider type), or else a
>> very large value of frame coming in will not yield the intended
>> -EINVAL.
> Actually, besides this then wanting to be >= instead of >, this
> wouldn't take care of the 32-bit case (or more generally the
> sizeof(long) == sizeof(int) one). So I think you want
>
>     if ( !v || !v->vmtrace.buf ||
>          (frame + nr_frames) < frame ||
>          (frame + nr_frames) > d->vmtrace_frames )
>         return -EINVAL;
>
>> If you agree, with this changed,
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> This holds.

I slipped this buggy version in to prove a point.

You're now 3 or 4 attempts into "simplifying" my original version, and
have on at least 2 attempts made your R-b conditional on a buggy version.

This form is clearly too complicated to reason about correctly, and it
is definitely more complicated than I am happy taking.


I am either going to go with my original version, which is trivially and
obviously correct, or I'm considering reducing frame to 32 bits at the
top level to fix this width nonsense throughout Xen.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
  2021-01-26  8:58   ` Julien Grall
@ 2021-01-26 10:04     ` Andrew Cooper
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-26 10:04 UTC (permalink / raw)
  To: Julien Grall, Xen-devel
  Cc: Jan Beulich, Roger Pau Monné,
	Wei Liu, Ian Jackson, Michał Leszczyński,
	Tamas K Lengyel

On 26/01/2021 08:58, Julien Grall wrote:
> Hi Andrew,
>
> On 21/01/2021 21:27, Andrew Cooper wrote:
>> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
>> index a073647117..d4453d2eab 100644
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -100,6 +100,7 @@ struct xen_sysctl_tbuf_op {
>>   #define _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share 5
>>   #define XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share  \
>>       (1u << _XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share)
>> +#define XEN_SYSCTL_PHYSCAP_vmtrace       (1 << 6)
>>     /* Max XEN_SYSCTL_PHYSCAP_* constant.  Used for ABI checking. */
>>   #define XEN_SYSCTL_PHYSCAP_MAX XEN_SYSCTL_PHYSCAP_iommu_hap_pt_share
>
> XEN_SYSCTL_PHYSCAP_MAX needs to be bumped.

Right you are... Why didn't the build break on me...

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type
  2021-01-26  9:58       ` Andrew Cooper
@ 2021-01-26 10:30         ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 10:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Tamas K Lengyel, Xen-devel

On 26.01.2021 10:58, Andrew Cooper wrote:
> On 26/01/2021 07:37, Jan Beulich wrote:
>> On 25.01.2021 17:31, Jan Beulich wrote:
>>> On 21.01.2021 22:27, Andrew Cooper wrote:
>>>> +static int acquire_vmtrace_buf(
>>>> +    struct domain *d, unsigned int id, unsigned long frame,
>>>> +    unsigned int nr_frames, xen_pfn_t mfn_list[])
>>>> +{
>>>> +    const struct vcpu *v = domain_vcpu(d, id);
>>>> +    unsigned int i;
>>>> +    mfn_t mfn;
>>>> +
>>>> +    if ( !v || !v->vmtrace.buf ||
>>>> +         nr_frames > d->vmtrace_frames ||
>>>> +         (frame + nr_frames) > d->vmtrace_frames )
>>>> +        return -EINVAL;
>>>
>>> I think that for this to guard against overflow, the first nr_frames
>>> needs to be replaced by frame (as having the wider type), or else a
>>> very large value of frame coming in will not yield the intended
>>> -EINVAL.
>> Actually, besides this then wanting to be >= instead of >, this
>> wouldn't take care of the 32-bit case (or more generally the
>> sizeof(long) == sizeof(int) one). So I think you want
>>
>>     if ( !v || !v->vmtrace.buf ||
>>          (frame + nr_frames) < frame ||
>>          (frame + nr_frames) > d->vmtrace_frames )
>>         return -EINVAL;
>>
>>> If you agree, with this changed,
>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> This holds.
> 
> I slipped this buggy version in to prove a point.

IOW you've been intentionally submitting buggy code. Very
interesting.

> You're now 3 or 4 attempts into "simplifying" my original version, and
> have on at least 2 attempts made your R-b conditional on a buggy version.

In which way is the last proposed version buggy, and in which
way was the intermediate proposal problematic beyond the
aspects I did recognize myself? (I also see no problem with
taking a number of iterations to arrive at the correct result,
and I also wouldn't view this happening as an indication that
an initial comment was wrong then, unless the final result of
this iterative process matches what there was originally.)

> This form is clearly too complicated to reason about correctly, and it
> is definitely more complicated than I am happy taking.
> 
> 
> I am either going to go with my original version, which is trivially and
> obviously correct,

I've just tried to locate your "original version" in my mailbox.
I don't have an earlier patch there with this same title.
Without being able to locate the prior suggestion of mine, I'm
afraid I won't be able to verify if indeed I did suggest the
variant above before; I wouldn't consider it very likely though.
In any event I think it would have helped more if you had
proven to me where I'm wrong; I can be convinced, but calling
something "trivially and obviously correct" is not a technical
statement in such a situation. It instead feels more like a
killer phrase.

By implication, you saying "trivially and obviously correct"
can really mean only one of two things if indeed I had found a
need to comment on this same piece of code (under a different
title) earlier on: I'm trivially and obviously stupid (and
would better go away), or you're wrong with the statement (at
least in assuming what's trivial and obvious to you also
necessarily is to everyone else). I'm sorry to say it this
bluntly, but your reply above feels pretty blunt as well.

> or I'm considering reducing frame to 32 bits at the
> top level to fix this width nonsense throughout Xen.

I wouldn't mind this (and I've been wondering about the
"unsigned long" a number of times), but I'm afraid I don't see
how your construct above would be correctly rejecting all
overflowing cases then.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter
  2021-01-25 17:17     ` Andrew Cooper
@ 2021-01-26 10:51       ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 10:51 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Tamas K Lengyel, Xen-devel

On 25.01.2021 18:17, Andrew Cooper wrote:
> On 25/01/2021 15:08, Jan Beulich wrote:
>> On 21.01.2021 22:27, Andrew Cooper wrote:
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -257,6 +257,10 @@ struct vcpu
>>>      /* vPCI per-vCPU area, used to store data for long running operations. */
>>>      struct vpci_vcpu vpci;
>>>  
>>> +    struct {
>>> +        struct page_info *buf;
>>> +    } vmtrace;
>> While perhaps minor, I'm unconvinced "buf" is a good name
>> for a field of this type.
> 
> Please suggest a better one then.  This one is properly namespaced as
> v->vmtrace.buf which is the least bad option I could come up with.

"pg", "page", or "pages" are all names we use for variables and
fields of this type. I don't see at all how the namespacing
helps it - to me "v->vmtrace.buf" is still a pointer to the
actual buffer, not the struct page_info describing the 1st page
of it.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-25 15:30     ` Andrew Cooper
@ 2021-01-26 11:59       ` Ian Jackson
  2021-01-26 12:55         ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Ian Jackson @ 2021-01-26 11:59 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

Andrew Cooper writes ("Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
> On 22/01/2021 15:33, Ian Jackson wrote:
> > Andrew Cooper writes ("[PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
> >> +    if ( xc_vmtrace_disable(xch, domid, vcpu) )
> >> +        perror("xc_vmtrace_disable()");
> > I guess the tracing will remain on, pointlessly, which has a perf
> > impact but nothing else ?
> 
> The perf hit is substantial, but it is safe to leave enabled.

And how does one clean up after an unclean exit of this program ?

> > How is it possible for the user to clean this up ?
> 
> For now, enable/disable can only fail with -EINVAL for calls made in the
> wrong context, so a failure here is benign in practice.
> 
> I specifically didn't opt for reference counting the enable/disable
> calls, because there cannot (usefully) be two users of this interface.

Right.

> > Also: at the very least, you need to trap SIGTERM SIGHUP SIGPIPE.
> >
> > It would be good to exit with the right signal by re-raising it.
> 
> This is example code, not a production utility.

Perhaps the utility could print some kind of health warning about it
possibly leaving this perf-impacting feature enabled, and how to clean
up ?

Ian.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-26 11:59       ` Ian Jackson
@ 2021-01-26 12:55         ` Andrew Cooper
  2021-01-26 13:32           ` Ian Jackson
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-26 12:55 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

On 26/01/2021 11:59, Ian Jackson wrote:
>>> Also: at the very least, you need to trap SIGTERM SIGHUP SIGPIPE.
>>>
>>> It would be good to exit with the right signal by re-raising it.
>> This is example code, not a production utility.
> Perhaps the utility could print some kind of health warning about it
> possibly leaving this perf-impacting feature enabled, and how to clean
> up ?

Why?  The theoretical fallout here is not conceptually different to
libxl or qemu segfaulting, or any of the myriad other random utilities
we have.

Printing "Warning - this program, just like everything else in the Xen
tree, might in exceptional circumstances segfault and leave the domain
in a weird state" is obvious, and doesn't need stating.

The domain is stuffed. `xl destroy` may or may not make the problem go away.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-26 12:55         ` Andrew Cooper
@ 2021-01-26 13:32           ` Ian Jackson
  2021-01-26 15:59             ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Ian Jackson @ 2021-01-26 13:32 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

Andrew Cooper writes ("Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
> On 26/01/2021 11:59, Ian Jackson wrote:
> >[Andrew:]
> >> This is example code, not a production utility.
> > Perhaps the utility could print some kind of health warning about it
> > possibly leaving this perf-impacting feature enabled, and how to clean
> > up ?
> 
> Why?  The theoretical fallout here is not conceptually different to
> libxl or qemu segfaulting, or any of the myriad other random utilities
> we have.
> 
> Printing "Warning - this program, just like everything else in the Xen
> tree, might in exceptional circumstances segfault and leave the domain
> in a weird state" is obvious, and doesn't need stating.
> 
> The domain is stuffed. `xl destroy` may or may not make the problem go away.

Firstly, I don't agree with this pessimistic analysis of our current
tooling.  Secondly, I would consider many such behaviours bugs;
certainly we have bugs but we shouldn't introduce more of them.

You are justifying the poor robustness of this tool on the grounds
that it's "example code, not a production utility".

But we are shipping it to bin/ and there is nothing telling anyone
that trying to use it (perhaps wrapped in something of their own
devising) is a bad idea.

Either this is code users might be expected to run in production in
which we need to make it at least have a minimal level of engineering
robustness (which is perhaps too difficult at this stage), or we need
to communicate to our users that it's a programming example, not a
useful utility.

Note that *even if it is a programming example*, we should highlight
its most important deficiencies.  Otherwise it is a hazardously
misleading example.

I hope that makes sense.

Thanks,
Ian.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support
  2021-01-21 21:27 ` [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support Andrew Cooper
@ 2021-01-26 13:35   ` Jan Beulich
  2021-01-29 22:08     ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 13:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> From: Michał Leszczyński <michal.leszczynski@cert.pl>
> 
> Add CPUID/MSR enumeration details for Processor Trace.  For now, we will only
> support its use inside VMX operation.  Fill in the vmtrace_available boolean
> to activate the newly introduced common infrastructure for allocating trace
> buffers.
> 
> For now, Processor Trace is going to be operated in Single Output mode behind
> the guests back.  Add the MSRs to struct vcpu_msrs, and set up the buffer
> limit in vmx_init_pt() as it is fixed for the lifetime of the domain.
> 
> Context switch the most of the MSRs in and out of vCPU context switch, but the
> main control register needs to reside in the MSR load/save lists.  Explicitly
> pull the msrs pointer out into a local variable, because the optimiser cannot
> keep it live across the memory clobbers in the MSR accesses.
> 
> Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit with a few things to consider and with a question to
confirm my understanding.

> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -291,6 +291,20 @@ static int vmx_init_vmcs_config(void)
>          _vmx_cpu_based_exec_control &=
>              ~(CPU_BASED_CR8_LOAD_EXITING | CPU_BASED_CR8_STORE_EXITING);
>  
> +    rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
> +
> +    /* Check whether IPT is supported in VMX operation. */
> +    if ( !smp_processor_id() )

I'm not happy to see another one of these appear. I'd prefer
if the caller passed its "bsp" boolean into here, or if this
was made use system_state.

> +        vmtrace_available = cpu_has_proc_trace &&
> +                            (_vmx_misc_cap & VMX_MISC_PROC_TRACE);
> +    else if ( vmtrace_available &&
> +              !(_vmx_misc_cap & VMX_MISC_PROC_TRACE) )
> +    {
> +        printk("VMX: IPT capabilities differ between CPU%u and CPU0\n",

As a minor follow-on, instead of "CPU0" this may then want
to say "rest of the system" or "BSP", but I can see that at
least the former is also making the message longer (which
may not be desirable).

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -428,6 +428,20 @@ static void vmx_domain_relinquish_resources(struct domain *d)
>      vmx_free_vlapic_mapping(d);
>  }
>  
> +static void vmx_init_pt(struct vcpu *v)

We use "pt" already as an acronym for pass-through. Could we
use "ipt" here, for example (following the new "ipt_active"
field)?

> --- a/xen/include/asm-x86/msr.h
> +++ b/xen/include/asm-x86/msr.h
> @@ -306,6 +306,38 @@ struct vcpu_msrs
>          };
>      } misc_features_enables;
>  
> +    /*
> +     * 0x00000560 ... 57x - MSR_RTIT_*
> +     *
> +     * "Real Time Instruction Trace", now called Processor Trace.
> +     *
> +     * These MSRs are not exposed to guests.  They are controlled by Xen
> +     * behind the scenes, when vmtrace is enabled for the domain.
> +     *
> +     * MSR_RTIT_OUTPUT_BASE not stored here.  It is fixed per vcpu, and
> +     * derived from v->vmtrace.buf.
> +     */
> +    struct {
> +        /*
> +         * Placed in the MSR load/save lists.  Only modified by hypercall in
> +         * the common case.
> +         */
> +        uint64_t ctl;

IIUC this field will be used by subsequent patches only?

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op
  2021-01-21 21:27 ` [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op Andrew Cooper
@ 2021-01-26 14:18   ` Jan Beulich
  2021-01-29 23:01     ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 14:18 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -155,6 +155,55 @@ void arch_get_domain_info(const struct domain *d,
>      info->arch_config.emulation_flags = d->arch.emulation_flags;
>  }
>  
> +static int do_vmtrace_op(struct domain *d, struct xen_domctl_vmtrace_op *op,
> +                         XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> +{
> +    struct vcpu *v;
> +    int rc;
> +
> +    if ( !d->vmtrace_frames || d == current->domain /* No vcpu_pause() */ )
> +        return -EINVAL;
> +
> +    ASSERT(is_hvm_domain(d)); /* Restricted by domain creation logic. */
> +
> +    v = domain_vcpu(d, op->vcpu);
> +    if ( !v )
> +        return -ENOENT;
> +
> +    vcpu_pause(v);
> +    switch ( op->cmd )
> +    {
> +    case XEN_DOMCTL_vmtrace_enable:
> +    case XEN_DOMCTL_vmtrace_disable:
> +    case XEN_DOMCTL_vmtrace_reset_and_enable:
> +        rc = hvm_vmtrace_control(
> +            v, op->cmd != XEN_DOMCTL_vmtrace_disable,
> +            op->cmd == XEN_DOMCTL_vmtrace_reset_and_enable);
> +        break;
> +
> +    case XEN_DOMCTL_vmtrace_output_position:
> +        rc = hvm_vmtrace_output_position(v, &op->value);
> +        if ( rc >= 0 )
> +            rc = 0;

So vmtrace_output_position() effectively returns a boolean, and
there is no other caller of it afaics. I understand the hook and
function return int to allow for error indicators. But what's
the purpose of returning ipt_active when the only caller doesn't
care?

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2261,6 +2261,153 @@ static bool vmx_get_pending_event(struct vcpu *v, struct x86_event *info)
>      return true;
>  }
>  
> +static int vmtrace_get_option(struct vcpu *v, uint64_t key, uint64_t *output)
> +{
> +    const struct vcpu_msrs *msrs = v->arch.msrs;
> +
> +    /*
> +     * We only let vmtrace agents see and modify a subset of bits in
> +     * MSR_RTIT_CTL.  These all pertain to date emitted into the trace

s/date/data/ ?

> +     * buffer(s).  Must not include controls pertaining to the
> +     * structure/position of the trace buffer(s).
> +     */
> +#define RTIT_CTL_MASK                                                   \
> +    (RTIT_CTL_TRACE_EN | RTIT_CTL_OS | RTIT_CTL_USR | RTIT_CTL_TSC_EN | \
> +     RTIT_CTL_DIS_RETC | RTIT_CTL_BRANCH_EN)
> +
> +    /*
> +     * Status bits restricted to the first-gen subset (i.e. no further CPUID
> +     * requirements.)
> +     */
> +#define RTIT_STATUS_MASK \
> +    (RTIT_STATUS_FILTER_EN | RTIT_STATUS_CONTEXT_EN | RTIT_STATUS_TRIGGER_EN | \
> +     RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED)

The placement of these two #define-s kind of suggests they're
intended for this function only, but the next one (at least)
also uses them. May I suggest to move these ahead of this
function?

> +static int vmtrace_set_option(struct vcpu *v, uint64_t key, uint64_t value)
> +{
> +    struct vcpu_msrs *msrs = v->arch.msrs;
> +    bool new_en, old_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
> +
> +    switch ( key )
> +    {
> +    case MSR_RTIT_OUTPUT_MASK:
> +        /*
> +         * MSR_RTIT_OUTPUT_MASK, when using Single Output mode, has a limit
> +         * field in the lower 32 bits, and an offset in the upper 32 bits.
> +         *
> +         * Limit is fixed by the vmtrace buffer size and must not be
> +         * controlled by userspace, while offset must be within the limit.
> +         *
> +         * Drop writes to the limit field to simply userspace wanting to reset
> +         * the offset by writing 0.
> +         */
> +        if ( (value >> 32) > msrs->rtit.output_limit )
> +            return -EINVAL;
> +        msrs->rtit.output_offset = value >> 32;
> +        break;
> +
> +    case MSR_RTIT_CTL:
> +        if ( value & ~RTIT_CTL_MASK )
> +            return -EINVAL;
> +
> +        msrs->rtit.ctl &= ~RTIT_CTL_MASK;
> +        msrs->rtit.ctl |= (value & RTIT_CTL_MASK);
> +        break;
> +
> +    case MSR_RTIT_STATUS:
> +        if ( value & ~RTIT_STATUS_MASK )
> +            return -EINVAL;
> +
> +        msrs->rtit.status &= ~RTIT_STATUS_MASK;
> +        msrs->rtit.status |= (value & RTIT_STATUS_MASK);
> +        break;
> +
> +    default:
> +        return -EINVAL;
> +    }
> +
> +    new_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
> +
> +    /* ctl.trace_en changed => update MSR load/save lists appropriately. */
> +    if ( !old_en && new_en )
> +    {
> +        if ( vmx_add_guest_msr(v, MSR_RTIT_CTL, msrs->rtit.ctl) ||
> +             vmx_add_host_load_msr(v, MSR_RTIT_CTL, 0) )
> +        {
> +            /*
> +             * The only failure cases here are failing the
> +             * singleton-per-domain memory allocation, or exceeding the space
> +             * in the allocation.  We could unwind in principle, but there is
> +             * nothing userspace can usefully do to continue using this VM.
> +             */
> +            domain_crash(v->domain);
> +            return -ENXIO;

I don't think I fully agree with the 2nd half of the last
sentence, but well, so be it then for the time being at least.
Why could userspace not decide to continue running this VM
with ipt disabled?

> +static int vmtrace_control(struct vcpu *v, bool enable, bool reset)
> +{
> +    struct vcpu_msrs *msrs = v->arch.msrs;
> +    uint64_t new_ctl;
> +    int rc;
> +
> +    if ( v->arch.hvm.vmx.ipt_active == enable )
> +        return -EINVAL;

Why is XEN_DOMCTL_vmtrace_reset_and_enable not permitted
when ipt_active is true? And, considering ...

> +    if ( reset )
> +    {
> +        msrs->rtit.status = 0;
> +        msrs->rtit.output_offset = 0;
> +    }
> +
> +    new_ctl = msrs->rtit.ctl & ~RTIT_CTL_TRACE_EN;
> +    if ( enable )
> +        new_ctl |= RTIT_CTL_TRACE_EN;
> +
> +    rc = vmtrace_set_option(v, MSR_RTIT_CTL, new_ctl);

... this is just a wrapper around a function directly
reachable via XEN_DOMCTL_vmtrace_set_option, why any
restriction at all?

> +    if ( rc )
> +        return rc;
> +
> +    v->arch.hvm.vmx.ipt_active = enable;

Shouldn't this be done in vmtrace_set_option(), to also
cover the other path leading there?

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 09/10] xen/vmtrace: support for VM forks
  2021-01-21 21:27 ` [PATCH v7 09/10] xen/vmtrace: support for VM forks Andrew Cooper
@ 2021-01-26 14:21   ` Jan Beulich
  2021-01-27 15:50     ` Lengyel, Tamas
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 14:21 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> From: Tamas K Lengyel <tamas.lengyel@intel.com>
> 
> Implement vmtrace_reset_pt function. Properly set IPT
> state for VM forks.
> 
> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit it strikes me as somewhat odd that ...

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2408,6 +2408,16 @@ static int vmtrace_output_position(struct vcpu *v, uint64_t *pos)
>      return v->arch.hvm.vmx.ipt_active;
>  }
>  
> +static int vmtrace_reset(struct vcpu *v)
> +{
> +    if ( !v->arch.hvm.vmx.ipt_active )
> +        return -EINVAL;
> +
> +    v->arch.msrs->rtit.output_offset = 0;
> +    v->arch.msrs->rtit.status = 0;
> +    return 0;
> +}

... the function/hook return non-void, yet ...

> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1632,6 +1632,8 @@ static int copy_vcpu_settings(struct domain *cd, const struct domain *d)
>              copy_domain_page(new_vcpu_info_mfn, vcpu_info_mfn);
>          }
>  
> +        hvm_vmtrace_reset(cd_vcpu);

... the only caller doesn't care.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-01-21 21:27 ` [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event Andrew Cooper
@ 2021-01-26 14:27   ` Jan Beulich
  2021-01-29 23:22     ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-01-26 14:27 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Tamas K Lengyel, Xen-devel

On 21.01.2021 22:27, Andrew Cooper wrote:
> --- a/xen/arch/x86/vm_event.c
> +++ b/xen/arch/x86/vm_event.c
> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
>  
>      req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
>      req->data.regs.x86.dr6 = ctxt.dr6;
> +
> +    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
> +        req->data.regs.x86.pt_offset = ~0;

Ah. (Regarding my earlier question about this returning -errno or
boolean).

> --- a/xen/include/public/vm_event.h
> +++ b/xen/include/public/vm_event.h
> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
>       */
>      uint64_t npt_base;
>  
> +    /*
> +     * Current offset in the Processor Trace buffer. For Intel Processor Trace
> +     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
> +     */
> +    uint64_t pt_offset;

According to vmtrace_output_position() the value is only one half
of what the named MSR contains. Perhaps "... this is from MSR_..."?
Not sure whether, despite this, there still is a reason to have
this 64-bit wide.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool
  2021-01-26 13:32           ` Ian Jackson
@ 2021-01-26 15:59             ` Andrew Cooper
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-26 15:59 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Xen-devel, Michał Leszczyński, Wei Liu, Tamas K Lengyel

On 26/01/2021 13:32, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool"):
>> On 26/01/2021 11:59, Ian Jackson wrote:
>>> [Andrew:]
>>>> This is example code, not a production utility.
>>> Perhaps the utility could print some kind of health warning about it
>>> possibly leaving this perf-impacting feature enabled, and how to clean
>>> up ?
>> Why?  The theoretical fallout here is not conceptually different to
>> libxl or qemu segfaulting, or any of the myriad other random utilities
>> we have.
>>
>> Printing "Warning - this program, just like everything else in the Xen
>> tree, might in exceptional circumstances segfault and leave the domain
>> in a weird state" is obvious, and doesn't need stating.
>>
>> The domain is stuffed. `xl destroy` may or may not make the problem go away.
> Firstly, I don't agree with this pessimistic analysis of our current
> tooling.  Secondly, I would consider many such behaviours bugs;
> certainly we have bugs but we shouldn't introduce more of them.
>
> You are justifying the poor robustness of this tool on the grounds
> that it's "example code, not a production utility".
>
> But we are shipping it to bin/ and there is nothing telling anyone
> that trying to use it (perhaps wrapped in something of their own
> devising) is a bad idea.
>
> Either this is code users might be expected to run in production in
> which we need to make it at least have a minimal level of engineering
> robustness (which is perhaps too difficult at this stage), or we need
> to communicate to our users that it's a programming example, not a
> useful utility.
>
> Note that *even if it is a programming example*, we should highlight
> its most important deficiencies.  Otherwise it is a hazardously
> misleading example.
>
> I hope that makes sense.

First of all - I'm not the author of this code.  I'm merely the person
who did the latest round of cleanup, and sent the series.

This code is a damn sight more robust than the other utilities, because
in the case that something does go wrong, the domain will still function
correctly.  Almost everything else, qemu included, will leave the VM
totally broken, as it becomes paused waiting on a request which has
escaped into the ether.


I also don't feel that now is an appropriate time to be insisting that
the goalpost's move when it comes to submissions into tools/,
particularly as you were happy (well - didn't comment on) with this
pattern back in v3.

What exact colour do you want this bikeshed?  Anything non-trivial is
going to miss 4.15.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [PATCH v7 09/10] xen/vmtrace: support for VM forks
  2021-01-26 14:21   ` Jan Beulich
@ 2021-01-27 15:50     ` Lengyel, Tamas
  0 siblings, 0 replies; 41+ messages in thread
From: Lengyel, Tamas @ 2021-01-27 15:50 UTC (permalink / raw)
  To: Jan Beulich, Cooper, Andrew
  Cc: Roger Pau Monné,
	Wei Liu, Nakajima, Jun, Tian, Kevin,
	Michał Leszczyński, Tamas K Lengyel, Xen-devel



> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, January 26, 2021 9:22 AM
> To: Cooper, Andrew <andrew.cooper3@citrix.com>
> Cc: Lengyel, Tamas <tamas.lengyel@intel.com>; Roger Pau Monné
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; Nakajima, Jun
> <jun.nakajima@intel.com>; Tian, Kevin <kevin.tian@intel.com>; Michał
> Leszczyński <michal.leszczynski@cert.pl>; Tamas K Lengyel
> <tamas@tklengyel.com>; Xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH v7 09/10] xen/vmtrace: support for VM forks
> 
> On 21.01.2021 22:27, Andrew Cooper wrote:
> > From: Tamas K Lengyel <tamas.lengyel@intel.com>
> >
> > Implement vmtrace_reset_pt function. Properly set IPT state for VM
> > forks.
> >
> > Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com> albeit it strikes me as
> somewhat odd that ...
> 
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2408,6 +2408,16 @@ static int vmtrace_output_position(struct vcpu
> *v, uint64_t *pos)
> >      return v->arch.hvm.vmx.ipt_active;  }
> >
> > +static int vmtrace_reset(struct vcpu *v) {
> > +    if ( !v->arch.hvm.vmx.ipt_active )
> > +        return -EINVAL;
> > +
> > +    v->arch.msrs->rtit.output_offset = 0;
> > +    v->arch.msrs->rtit.status = 0;
> > +    return 0;
> > +}
> 
> ... the function/hook return non-void, yet ...
> 
> > --- a/xen/arch/x86/mm/mem_sharing.c
> > +++ b/xen/arch/x86/mm/mem_sharing.c
> > @@ -1632,6 +1632,8 @@ static int copy_vcpu_settings(struct domain *cd,
> const struct domain *d)
> >              copy_domain_page(new_vcpu_info_mfn, vcpu_info_mfn);
> >          }
> >
> > +        hvm_vmtrace_reset(cd_vcpu);
> 
> ... the only caller doesn't care.

Noted. For now the function could just return void since if IPT is not enabled we don't care from a fork reset perspective. But perhaps in the future we want to return that information to the toolstack. Haven't decided yet whether that's important or not, for now things are OK as-is.

Tamas

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter
  2021-01-25 15:08   ` Jan Beulich
  2021-01-25 17:17     ` Andrew Cooper
@ 2021-01-29 16:37     ` Jan Beulich
  1 sibling, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2021-01-29 16:37 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Anthony PERARD, Tamas K Lengyel, Xen-devel

On 25.01.2021 16:08, Jan Beulich wrote:
> On 21.01.2021 22:27, Andrew Cooper wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -132,6 +132,48 @@ static void vcpu_info_reset(struct vcpu *v)
>>      v->vcpu_info_mfn = INVALID_MFN;
>>  }
>>  
>> +static void vmtrace_free_buffer(struct vcpu *v)
>> +{
>> +    const struct domain *d = v->domain;
>> +    struct page_info *pg = v->vmtrace.buf;
>> +    unsigned int i;
>> +
>> +    if ( !pg )
>> +        return;
>> +
>> +    for ( i = 0; i < d->vmtrace_frames; i++ )
>> +    {
>> +        put_page_alloc_ref(&pg[i]);
>> +        put_page_and_type(&pg[i]);
>> +    }
>> +
>> +    v->vmtrace.buf = NULL;
> 
> To set a good precedent, maybe this wants moving up ahead of
> the loop and ...
> 
>> +}
>> +
>> +static int vmtrace_alloc_buffer(struct vcpu *v)
>> +{
>> +    struct domain *d = v->domain;
>> +    struct page_info *pg;
>> +    unsigned int i;
>> +
>> +    if ( !d->vmtrace_frames )
>> +        return 0;
>> +
>> +    pg = alloc_domheap_pages(d, get_order_from_pages(d->vmtrace_frames),
>> +                             MEMF_no_refcount);
>> +    if ( !pg )
>> +        return -ENOMEM;
>> +
>> +    v->vmtrace.buf = pg;
> 
> ... this wants moving down past the loop, to avoid
> globally announcing something that isn't fully initialized
> yet / anymore?

So having replied on the other thread I looked back here: The
suggested change actually is not just to set a good precedent.
By recording the page(s) in v->vmtrace_buf, ...

>> +    for ( i = 0; i < d->vmtrace_frames; i++ )
>> +        /* Domain can't know about this page yet - something fishy going on. */
>> +        if ( !get_page_and_type(&pg[i], d, PGT_writable_page) )
>> +            BUG();

... failure here (if handled better than by BUG()) will lead
vmtrace_free_buffer() to believe it needs to put references
(besides freeing the page(s), which is fine). So your claimed
bug really is just here, not everywhere else, and there is no
reason to go backwards in terms of error handling "quality",
as per the bottom of ./CODING_STYLE.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support
  2021-01-26 13:35   ` Jan Beulich
@ 2021-01-29 22:08     ` Andrew Cooper
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-29 22:08 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel, Xen-devel

On 26/01/2021 13:35, Jan Beulich wrote:
> On 21.01.2021 22:27, Andrew Cooper wrote:
>> --- a/xen/include/asm-x86/msr.h
>> +++ b/xen/include/asm-x86/msr.h
>> @@ -306,6 +306,38 @@ struct vcpu_msrs
>>          };
>>      } misc_features_enables;
>>  
>> +    /*
>> +     * 0x00000560 ... 57x - MSR_RTIT_*
>> +     *
>> +     * "Real Time Instruction Trace", now called Processor Trace.
>> +     *
>> +     * These MSRs are not exposed to guests.  They are controlled by Xen
>> +     * behind the scenes, when vmtrace is enabled for the domain.
>> +     *
>> +     * MSR_RTIT_OUTPUT_BASE not stored here.  It is fixed per vcpu, and
>> +     * derived from v->vmtrace.buf.
>> +     */
>> +    struct {
>> +        /*
>> +         * Placed in the MSR load/save lists.  Only modified by hypercall in
>> +         * the common case.
>> +         */
>> +        uint64_t ctl;
> IIUC this field will be used by subsequent patches only?

Correct.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op
  2021-01-26 14:18   ` Jan Beulich
@ 2021-01-29 23:01     ` Andrew Cooper
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-01-29 23:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Michał Leszczyński, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Tamas K Lengyel, Xen-devel

On 26/01/2021 14:18, Jan Beulich wrote:
>> +static int vmtrace_set_option(struct vcpu *v, uint64_t key, uint64_t value)
>> +{
>> +    struct vcpu_msrs *msrs = v->arch.msrs;
>> +    bool new_en, old_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
>> +
>> +    switch ( key )
>> +    {
>> +    case MSR_RTIT_OUTPUT_MASK:
>> +        /*
>> +         * MSR_RTIT_OUTPUT_MASK, when using Single Output mode, has a limit
>> +         * field in the lower 32 bits, and an offset in the upper 32 bits.
>> +         *
>> +         * Limit is fixed by the vmtrace buffer size and must not be
>> +         * controlled by userspace, while offset must be within the limit.
>> +         *
>> +         * Drop writes to the limit field to simply userspace wanting to reset
>> +         * the offset by writing 0.
>> +         */
>> +        if ( (value >> 32) > msrs->rtit.output_limit )
>> +            return -EINVAL;
>> +        msrs->rtit.output_offset = value >> 32;
>> +        break;
>> +
>> +    case MSR_RTIT_CTL:
>> +        if ( value & ~RTIT_CTL_MASK )
>> +            return -EINVAL;
>> +
>> +        msrs->rtit.ctl &= ~RTIT_CTL_MASK;
>> +        msrs->rtit.ctl |= (value & RTIT_CTL_MASK);
>> +        break;
>> +
>> +    case MSR_RTIT_STATUS:
>> +        if ( value & ~RTIT_STATUS_MASK )
>> +            return -EINVAL;
>> +
>> +        msrs->rtit.status &= ~RTIT_STATUS_MASK;
>> +        msrs->rtit.status |= (value & RTIT_STATUS_MASK);
>> +        break;
>> +
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +
>> +    new_en = msrs->rtit.ctl & RTIT_CTL_TRACE_EN;
>> +
>> +    /* ctl.trace_en changed => update MSR load/save lists appropriately. */
>> +    if ( !old_en && new_en )
>> +    {
>> +        if ( vmx_add_guest_msr(v, MSR_RTIT_CTL, msrs->rtit.ctl) ||
>> +             vmx_add_host_load_msr(v, MSR_RTIT_CTL, 0) )
>> +        {
>> +            /*
>> +             * The only failure cases here are failing the
>> +             * singleton-per-domain memory allocation, or exceeding the space
>> +             * in the allocation.  We could unwind in principle, but there is
>> +             * nothing userspace can usefully do to continue using this VM.
>> +             */
>> +            domain_crash(v->domain);
>> +            return -ENXIO;
> I don't think I fully agree with the 2nd half of the last
> sentence, but well, so be it then for the time being at least.
> Why could userspace not decide to continue running this VM
> with ipt disabled?

Technically speaking, it could.  That wouldn't malfunction.

However, it would be exceedingly poor behaviour.

One major limitation IPT has is that it cant pause on a full ring (or at
least, not in any shipping hardware yet, and this series works back to
Broadwell).  You can't just leave IPT enabled and let the VM run,
because the buffer will wrap and corrupt itself.

The driving usecase for adding IPT is introspection based.  Frequent
breaks, combined with massive trace buffers, is the best effort attempt
not to lose data.

IPT is a niche usecase - it does come with a substantial frequency hit,
and lots of userspace complexity to do anything interesting with. 
Anyone who turns it on to begin with has a usecase which totally depends
on it working.

>> +static int vmtrace_control(struct vcpu *v, bool enable, bool reset)
>> +{
>> +    struct vcpu_msrs *msrs = v->arch.msrs;
>> +    uint64_t new_ctl;
>> +    int rc;
>> +
>> +    if ( v->arch.hvm.vmx.ipt_active == enable )
>> +        return -EINVAL;
> Why is XEN_DOMCTL_vmtrace_reset_and_enable not permitted
> when ipt_active is true?

Because absolutely nothing good can come of userspace and Xen getting
out of sync with their combined idea of whether IPT is active or not.

And I really don't feel like doing an ipt_pause reference count, because
there cannot plausibly be more than one entity handling the data.

>  And, considering ...
>
>> +    if ( reset )
>> +    {
>> +        msrs->rtit.status = 0;
>> +        msrs->rtit.output_offset = 0;
>> +    }
>> +
>> +    new_ctl = msrs->rtit.ctl & ~RTIT_CTL_TRACE_EN;
>> +    if ( enable )
>> +        new_ctl |= RTIT_CTL_TRACE_EN;
>> +
>> +    rc = vmtrace_set_option(v, MSR_RTIT_CTL, new_ctl);
> ... this is just a wrapper around a function directly
> reachable via XEN_DOMCTL_vmtrace_set_option, why any
> restriction at all?

This partial alias is a consequence of the split between the platform
neutral, and platform specific parts of the interface.

It is by no means certain that such an alias would exist on other
platforms, and passing TRACE_EN to set_option() falls firmly in the
"don't do that" category IMO.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-01-26 14:27   ` Jan Beulich
@ 2021-01-29 23:22     ` Andrew Cooper
  2021-01-29 23:40       ` Tamas K Lengyel
  0 siblings, 1 reply; 41+ messages in thread
From: Andrew Cooper @ 2021-01-29 23:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Tamas K Lengyel, Xen-devel

On 26/01/2021 14:27, Jan Beulich wrote:
> On 21.01.2021 22:27, Andrew Cooper wrote:
>> --- a/xen/arch/x86/vm_event.c
>> +++ b/xen/arch/x86/vm_event.c
>> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
>>  
>>      req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
>>      req->data.regs.x86.dr6 = ctxt.dr6;
>> +
>> +    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
>> +        req->data.regs.x86.pt_offset = ~0;
> Ah. (Regarding my earlier question about this returning -errno or
> boolean).
>
>> --- a/xen/include/public/vm_event.h
>> +++ b/xen/include/public/vm_event.h
>> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
>>       */
>>      uint64_t npt_base;
>>  
>> +    /*
>> +     * Current offset in the Processor Trace buffer. For Intel Processor Trace
>> +     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
>> +     */
>> +    uint64_t pt_offset;
> According to vmtrace_output_position() the value is only one half
> of what the named MSR contains. Perhaps "... this is from MSR_..."?
> Not sure whether, despite this, there still is a reason to have
> this 64-bit wide.

This is a vestigial remnant which escaped the "use vmtrace uniformly" work.

It should match the domctl_vmtrace_output_position() format, so each
interface gives the same content for the attempted-platform-neutral version.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-01-29 23:22     ` Andrew Cooper
@ 2021-01-29 23:40       ` Tamas K Lengyel
  2021-02-01  8:55         ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Tamas K Lengyel @ 2021-01-29 23:40 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Jan Beulich, Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Xen-devel

On Fri, Jan 29, 2021 at 6:22 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 26/01/2021 14:27, Jan Beulich wrote:
> > On 21.01.2021 22:27, Andrew Cooper wrote:
> >> --- a/xen/arch/x86/vm_event.c
> >> +++ b/xen/arch/x86/vm_event.c
> >> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
> >>
> >>      req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
> >>      req->data.regs.x86.dr6 = ctxt.dr6;
> >> +
> >> +    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
> >> +        req->data.regs.x86.pt_offset = ~0;
> > Ah. (Regarding my earlier question about this returning -errno or
> > boolean).
> >
> >> --- a/xen/include/public/vm_event.h
> >> +++ b/xen/include/public/vm_event.h
> >> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
> >>       */
> >>      uint64_t npt_base;
> >>
> >> +    /*
> >> +     * Current offset in the Processor Trace buffer. For Intel Processor Trace
> >> +     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
> >> +     */
> >> +    uint64_t pt_offset;
> > According to vmtrace_output_position() the value is only one half
> > of what the named MSR contains. Perhaps "... this is from MSR_..."?
> > Not sure whether, despite this, there still is a reason to have
> > this 64-bit wide.
>
> This is a vestigial remnant which escaped the "use vmtrace uniformly" work.
>
> It should match the domctl_vmtrace_output_position() format, so each
> interface gives the same content for the attempted-platform-neutral version.

From the vm_event ABI perspective it's simpler to have a 64-bit value
here even if the max value it may possibly carry is never going to use
the whole 64-bit width. I rather not play with shortening it just to
add padding somewhere else.

As for what it's called is not that important from my perspective,
vmtrace_pos or something like that for example is fine by me.

Tamas


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-01-29 23:40       ` Tamas K Lengyel
@ 2021-02-01  8:55         ` Jan Beulich
  2021-02-01  9:06           ` Andrew Cooper
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2021-02-01  8:55 UTC (permalink / raw)
  To: Tamas K Lengyel
  Cc: Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Xen-devel, Andrew Cooper

On 30.01.2021 00:40, Tamas K Lengyel wrote:
> On Fri, Jan 29, 2021 at 6:22 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>
>> On 26/01/2021 14:27, Jan Beulich wrote:
>>> On 21.01.2021 22:27, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/vm_event.c
>>>> +++ b/xen/arch/x86/vm_event.c
>>>> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
>>>>
>>>>      req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
>>>>      req->data.regs.x86.dr6 = ctxt.dr6;
>>>> +
>>>> +    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
>>>> +        req->data.regs.x86.pt_offset = ~0;
>>> Ah. (Regarding my earlier question about this returning -errno or
>>> boolean).
>>>
>>>> --- a/xen/include/public/vm_event.h
>>>> +++ b/xen/include/public/vm_event.h
>>>> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
>>>>       */
>>>>      uint64_t npt_base;
>>>>
>>>> +    /*
>>>> +     * Current offset in the Processor Trace buffer. For Intel Processor Trace
>>>> +     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
>>>> +     */
>>>> +    uint64_t pt_offset;
>>> According to vmtrace_output_position() the value is only one half
>>> of what the named MSR contains. Perhaps "... this is from MSR_..."?
>>> Not sure whether, despite this, there still is a reason to have
>>> this 64-bit wide.
>>
>> This is a vestigial remnant which escaped the "use vmtrace uniformly" work.
>>
>> It should match the domctl_vmtrace_output_position() format, so each
>> interface gives the same content for the attempted-platform-neutral version.
> 
> From the vm_event ABI perspective it's simpler to have a 64-bit value
> here even if the max value it may possibly carry is never going to use
> the whole 64-bit width. I rather not play with shortening it just to
> add padding somewhere else.
> 
> As for what it's called is not that important from my perspective,
> vmtrace_pos or something like that for example is fine by me.

The important thing to me is that the comment not be misleading.
Whether that's arranged for by adjusting the comment of the
commented declaration is up to what you deem better, i.e. I
understand the comment it is.

Jan


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event
  2021-02-01  8:55         ` Jan Beulich
@ 2021-02-01  9:06           ` Andrew Cooper
  0 siblings, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2021-02-01  9:06 UTC (permalink / raw)
  To: Jan Beulich, Tamas K Lengyel
  Cc: Tamas K Lengyel, Roger Pau Monné,
	Wei Liu, Jun Nakajima, Kevin Tian, Michał Leszczyński,
	Xen-devel

On 01/02/2021 08:55, Jan Beulich wrote:
> On 30.01.2021 00:40, Tamas K Lengyel wrote:
>> On Fri, Jan 29, 2021 at 6:22 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 26/01/2021 14:27, Jan Beulich wrote:
>>>> On 21.01.2021 22:27, Andrew Cooper wrote:
>>>>> --- a/xen/arch/x86/vm_event.c
>>>>> +++ b/xen/arch/x86/vm_event.c
>>>>> @@ -251,6 +251,9 @@ void vm_event_fill_regs(vm_event_request_t *req)
>>>>>
>>>>>      req->data.regs.x86.shadow_gs = ctxt.shadow_gs;
>>>>>      req->data.regs.x86.dr6 = ctxt.dr6;
>>>>> +
>>>>> +    if ( hvm_vmtrace_output_position(curr, &req->data.regs.x86.pt_offset) != 1 )
>>>>> +        req->data.regs.x86.pt_offset = ~0;
>>>> Ah. (Regarding my earlier question about this returning -errno or
>>>> boolean).
>>>>
>>>>> --- a/xen/include/public/vm_event.h
>>>>> +++ b/xen/include/public/vm_event.h
>>>>> @@ -223,6 +223,12 @@ struct vm_event_regs_x86 {
>>>>>       */
>>>>>      uint64_t npt_base;
>>>>>
>>>>> +    /*
>>>>> +     * Current offset in the Processor Trace buffer. For Intel Processor Trace
>>>>> +     * this is MSR_RTIT_OUTPUT_MASK. Set to ~0 if no Processor Trace is active.
>>>>> +     */
>>>>> +    uint64_t pt_offset;
>>>> According to vmtrace_output_position() the value is only one half
>>>> of what the named MSR contains. Perhaps "... this is from MSR_..."?
>>>> Not sure whether, despite this, there still is a reason to have
>>>> this 64-bit wide.
>>> This is a vestigial remnant which escaped the "use vmtrace uniformly" work.
>>>
>>> It should match the domctl_vmtrace_output_position() format, so each
>>> interface gives the same content for the attempted-platform-neutral version.
>> From the vm_event ABI perspective it's simpler to have a 64-bit value
>> here even if the max value it may possibly carry is never going to use
>> the whole 64-bit width. I rather not play with shortening it just to
>> add padding somewhere else.
>>
>> As for what it's called is not that important from my perspective,
>> vmtrace_pos or something like that for example is fine by me.
> The important thing to me is that the comment not be misleading.
> Whether that's arranged for by adjusting the comment of the
> commented declaration is up to what you deem better, i.e. I
> understand the comment it is.

Please see v8.  I rewrote the comment.

~Andrew


^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2021-02-01  9:07 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-21 21:27 [PATCH v7 00/10] Implement support for external IPT monitoring Andrew Cooper
2021-01-21 21:27 ` [PATCH v7 01/10] xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace Andrew Cooper
2021-01-22 15:28   ` Ian Jackson
2021-01-26  8:58   ` Julien Grall
2021-01-26 10:04     ` Andrew Cooper
2021-01-21 21:27 ` [PATCH v7 02/10] xen/domain: Add vmtrace_frames domain creation parameter Andrew Cooper
2021-01-25 15:08   ` Jan Beulich
2021-01-25 17:17     ` Andrew Cooper
2021-01-26 10:51       ` Jan Beulich
2021-01-29 16:37     ` Jan Beulich
2021-01-21 21:27 ` [PATCH v7 03/10] tools/[lib]xl: Add vmtrace_buf_size parameter Andrew Cooper
2021-01-22 15:29   ` Ian Jackson
2021-01-21 21:27 ` [PATCH v7 04/10] xen/memory: Add a vmtrace_buf resource type Andrew Cooper
2021-01-25 16:31   ` Jan Beulich
2021-01-26  7:37     ` Jan Beulich
2021-01-26  9:58       ` Andrew Cooper
2021-01-26 10:30         ` Jan Beulich
2021-01-21 21:27 ` [PATCH v7 05/10] x86/vmx: Add Intel Processor Trace support Andrew Cooper
2021-01-26 13:35   ` Jan Beulich
2021-01-29 22:08     ` Andrew Cooper
2021-01-21 21:27 ` [PATCH v7 06/10] xen/domctl: Add XEN_DOMCTL_vmtrace_op Andrew Cooper
2021-01-26 14:18   ` Jan Beulich
2021-01-29 23:01     ` Andrew Cooper
2021-01-21 21:27 ` [PATCH v7 07/10] tools/libxc: Add xc_vmtrace_* functions Andrew Cooper
2021-01-22 15:29   ` Ian Jackson
2021-01-21 21:27 ` [PATCH v7 08/10] tools/misc: Add xen-vmtrace tool Andrew Cooper
2021-01-22 15:33   ` Ian Jackson
2021-01-25 15:30     ` Andrew Cooper
2021-01-26 11:59       ` Ian Jackson
2021-01-26 12:55         ` Andrew Cooper
2021-01-26 13:32           ` Ian Jackson
2021-01-26 15:59             ` Andrew Cooper
2021-01-21 21:27 ` [PATCH v7 09/10] xen/vmtrace: support for VM forks Andrew Cooper
2021-01-26 14:21   ` Jan Beulich
2021-01-27 15:50     ` Lengyel, Tamas
2021-01-21 21:27 ` [PATCH v7 10/10] x86/vm_event: Carry Processor Trace buffer offset in vm_event Andrew Cooper
2021-01-26 14:27   ` Jan Beulich
2021-01-29 23:22     ` Andrew Cooper
2021-01-29 23:40       ` Tamas K Lengyel
2021-02-01  8:55         ` Jan Beulich
2021-02-01  9:06           ` Andrew Cooper

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.