* [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
@ 2015-11-19 13:19 Paul Durrant
  2015-11-19 15:35 ` Jan Beulich
  2015-11-19 16:06 ` Andrew Cooper
  0 siblings, 2 replies; 14+ messages in thread
From: Paul Durrant @ 2015-11-19 13:19 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper,
	Ian Jackson, Paul Durrant, Jan Beulich, Keir Fraser

The Microsoft Hypervisor Top Level Functional Spec. (section 3.4) defines
two bits in CPUID leaf 0x40000004:EAX for the hypervisor to recommend
whether or not to issue a hypercall for local or remote TLB flush.

Whilst it's doubtful whether using a hypercall for local TLB flush would
be any more efficient than a specific INVLPG VMEXIT, a remote TLB flush
may well be done more efficiently. This is because the alternative
mechanism is to IPI all the vCPUs in question, which (in the absence of
APIC virtualisation) requires emulation and scheduling of the vCPUs
only to have them immediately VMEXIT for a local TLB flush.

This patch therefore adds a viridian option which, if selected, enables
the hypercall for remote TLB flush and implements it using ASID
invalidation for targeted vCPUs followed by an IPI only to the set of
CPUs that happened to be running a targeted vCPU (which may be the empty
set). The flush may be more severe than requested, since the hypercall
can request a flush for a specific address space (CR3) only, whereas Xen
neither keeps a mapping of ASID to guest CR3 nor allows invalidation of
a specific ASID. On a host with contended CPUs, however, performance is
still likely to be better than a more specific flush using IPIs.
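
In outline, the flush path added here behaves like the following simplified, self-contained model (hypothetical stand-in types and bitmaps; the real code in the diff below uses Xen's struct vcpu, cpumask_t, hvm_asid_flush_vcpu() and flush_tlb_mask()):

```c
#include <stdint.h>

/* Simplified model of the remote TLB flush path (hypothetical types;
 * the real implementation operates on Xen's struct vcpu and cpumask_t). */
#define NR_VCPUS 4

struct model_vcpu {
    int is_running;      /* currently scheduled on a physical CPU? */
    int processor;       /* which physical CPU, if running */
    int asid_flushed;    /* ASID invalidated -> TLB flush on next VMENTER */
};

/* Flush the vCPUs named in vcpu_mask; return a bitmap of physical CPUs
 * that must be IPI'd because a target vCPU is running on them right now. */
uint64_t flush_remote_tlbs(struct model_vcpu vcpus[], uint64_t vcpu_mask)
{
    uint64_t pcpu_mask = 0;

    for (int i = 0; i < NR_VCPUS; i++) {
        if (!(vcpu_mask & (1ull << i)))
            continue;

        vcpus[i].asid_flushed = 1;   /* lazy flush when next scheduled */
        if (vcpus[i].is_running)
            pcpu_mask |= 1ull << vcpus[i].processor;
    }

    /* The caller would now IPI only the CPUs in pcpu_mask (possibly none),
     * forcing any currently running target vCPU out of non-root mode. */
    return pcpu_mask;
}
```

Note that a descheduled target vCPU contributes nothing to the IPI mask: its ASID invalidation alone guarantees a flush before it next runs.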

The implementation introduces per-vCPU viridian_vcpu_init() and
viridian_vcpu_deinit() functions to allow a scratch cpumask to be
allocated. This avoids needing to put this potentially large data
structure on the stack during hypercall processing. The patch also
modifies the hypercall input and output bit-fields to allow a check for
the 'fast' calling convention, and includes a white-space fix in the
definition of HVMPV_feature_mask (to remove hard tabs).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---

v3:
 - Correct use of cpumask_var_t
 - Extend comment to explain pcpu_mask flush
 - Other cosmetic changes

v2:
 - Re-name viridian_init/deinit() to viridian_vcpu_init/deinit()
 - Use alloc/free_cpumask_var()
 - Use hvm_copy_from_guest_phys() to get hypercall arguments
---
 docs/man/xl.cfg.pod.5              |   6 ++
 tools/libxl/libxl_dom.c            |   3 +
 tools/libxl/libxl_types.idl        |   1 +
 xen/arch/x86/hvm/hvm.c             |  12 ++++
 xen/arch/x86/hvm/viridian.c        | 120 +++++++++++++++++++++++++++++++++----
 xen/include/asm-x86/hvm/viridian.h |   4 ++
 xen/include/asm-x86/perfc_defn.h   |   1 +
 xen/include/public/hvm/params.h    |  14 +++--
 8 files changed, 144 insertions(+), 17 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index b63846a..1a88e36 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1466,6 +1466,12 @@ This set incorporates the Partition Reference TSC MSR. This
 enlightenment can improve performance of Windows 7 and Windows
 Server 2008 R2 onwards.
 
+=item B<hcall_remote_tlb_flush>
+
+This set incorporates use of hypercalls for remote TLB flushing.
+This enlightenment may improve performance of Windows guests running
+on hosts with higher levels of (physical) CPU contention.
+
 =item B<defaults>
 
 This is a special value that enables the default set of groups, which
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 44d481b..009ca9c 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -251,6 +251,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid,
     if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_REFERENCE_TSC))
         mask |= HVMPV_reference_tsc;
 
+    if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_REMOTE_TLB_FLUSH))
+        mask |= HVMPV_hcall_remote_tlb_flush;
+
     if (mask != 0 &&
         xc_hvm_param_set(CTX->xch,
                          domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 4d78f86..0aa5b9d 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -219,6 +219,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [
     (1, "freq"),
     (2, "time_ref_count"),
     (3, "reference_tsc"),
+    (4, "hcall_remote_tlb_flush"),
     ])
 
 libxl_hdtype = Enumeration("hdtype", [
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 21f42a7..910d2be 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2452,6 +2452,13 @@ int hvm_vcpu_initialise(struct vcpu *v)
     if ( rc != 0 )
         goto fail6;
 
+    if ( is_viridian_domain(d) )
+    {
+        rc = viridian_vcpu_init(v);
+        if ( rc != 0 )
+            goto fail7;
+    }
+
     if ( v->vcpu_id == 0 )
     {
         /* NB. All these really belong in hvm_domain_initialise(). */
@@ -2468,6 +2475,8 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
     return 0;
 
+ fail7:
+    hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
  fail6:
     nestedhvm_vcpu_destroy(v);
  fail5:
@@ -2484,6 +2493,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
 void hvm_vcpu_destroy(struct vcpu *v)
 {
+    if ( is_viridian_domain(v->domain) )
+        viridian_vcpu_deinit(v);
+
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
     if ( hvm_altp2m_supported() )
diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
index 2f22783..65420eb 100644
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -33,9 +33,15 @@
 /* Viridian Hypercall Status Codes. */
 #define HV_STATUS_SUCCESS                       0x0000
 #define HV_STATUS_INVALID_HYPERCALL_CODE        0x0002
+#define HV_STATUS_INVALID_PARAMETER             0x0005
 
-/* Viridian Hypercall Codes and Parameters. */
-#define HvNotifyLongSpinWait    8
+/* Viridian Hypercall Codes. */
+#define HvFlushVirtualAddressSpace 2
+#define HvFlushVirtualAddressList  3
+#define HvNotifyLongSpinWait       8
+
+/* Viridian Hypercall Flags. */
+#define HV_FLUSH_ALL_PROCESSORS 1
 
 /* Viridian CPUID 4000003, Viridian MSR availability. */
 #define CPUID3A_MSR_TIME_REF_COUNT (1 << 1)
@@ -46,8 +52,9 @@
 #define CPUID3A_MSR_FREQ           (1 << 11)
 
 /* Viridian CPUID 4000004, Implementation Recommendations. */
-#define CPUID4A_MSR_BASED_APIC  (1 << 3)
-#define CPUID4A_RELAX_TIMER_INT (1 << 5)
+#define CPUID4A_HCALL_REMOTE_TLB_FLUSH (1 << 2)
+#define CPUID4A_MSR_BASED_APIC         (1 << 3)
+#define CPUID4A_RELAX_TIMER_INT        (1 << 5)
 
 /* Viridian CPUID 4000006, Implementation HW features detected and in use. */
 #define CPUID6A_APIC_OVERLAY    (1 << 0)
@@ -107,6 +114,8 @@ int cpuid_viridian_leaves(unsigned int leaf, unsigned int *eax,
              (d->arch.hvm_domain.viridian.guest_os_id.fields.os < 4) )
             break;
         *eax = CPUID4A_RELAX_TIMER_INT;
+        if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
+            *eax |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
         if ( !cpu_has_vmx_apic_reg_virt )
             *eax |= CPUID4A_MSR_BASED_APIC;
         *ebx = 2047; /* long spin count */
@@ -512,9 +521,22 @@ int rdmsr_viridian_regs(uint32_t idx, uint64_t *val)
     return 1;
 }
 
+int viridian_vcpu_init(struct vcpu *v)
+{
+    return alloc_cpumask_var(&v->arch.hvm_vcpu.viridian.flush_cpumask) ?
+           0 : -ENOMEM;
+}
+
+void viridian_vcpu_deinit(struct vcpu *v)
+{
+    free_cpumask_var(v->arch.hvm_vcpu.viridian.flush_cpumask);
+}
+
 int viridian_hypercall(struct cpu_user_regs *regs)
 {
-    int mode = hvm_guest_x86_mode(current);
+    struct vcpu *curr = current;
+    struct domain *currd = curr->domain;
+    int mode = hvm_guest_x86_mode(curr);
     unsigned long input_params_gpa, output_params_gpa;
     uint16_t status = HV_STATUS_SUCCESS;
 
@@ -522,11 +544,12 @@ int viridian_hypercall(struct cpu_user_regs *regs)
         uint64_t raw;
         struct {
             uint16_t call_code;
-            uint16_t rsvd1;
-            unsigned rep_count:12;
-            unsigned rsvd2:4;
-            unsigned rep_start:12;
-            unsigned rsvd3:4;
+            uint16_t fast:1;
+            uint16_t rsvd1:15;
+            uint16_t rep_count:12;
+            uint16_t rsvd2:4;
+            uint16_t rep_start:12;
+            uint16_t rsvd3:4;
         };
     } input;
 
@@ -535,12 +558,12 @@ int viridian_hypercall(struct cpu_user_regs *regs)
         struct {
             uint16_t result;
             uint16_t rsvd1;
-            unsigned rep_complete:12;
-            unsigned rsvd2:20;
+            uint32_t rep_complete:12;
+            uint32_t rsvd2:20;
         };
     } output = { 0 };
 
-    ASSERT(is_viridian_domain(current->domain));
+    ASSERT(is_viridian_domain(currd));
 
     switch ( mode )
     {
@@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs *regs)
     switch ( input.call_code )
     {
     case HvNotifyLongSpinWait:
+        /*
+         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
+         */
         perfc_incr(mshv_call_long_wait);
         do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL, void));
         status = HV_STATUS_SUCCESS;
         break;
+
+    case HvFlushVirtualAddressSpace:
+    case HvFlushVirtualAddressList:
+    {
+        cpumask_t *pcpu_mask;
+        struct vcpu *v;
+        struct {
+            uint64_t address_space;
+            uint64_t flags;
+            uint64_t vcpu_mask;
+        } input_params;
+
+        /*
+         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
+         * and 12.4.3.
+         */
+        perfc_incr(mshv_flush);
+
+        /* These hypercalls should never use the fast-call convention. */
+        status = HV_STATUS_INVALID_PARAMETER;
+        if ( input.fast )
+            break;
+
+        /* Get input parameters. */
+        if ( hvm_copy_from_guest_phys(&input_params, input_params_gpa,
+                                      sizeof(input_params)) != HVMCOPY_okay )
+            break;
+
+        /*
+         * It is not clear from the spec. if we are supposed to
+         * include current virtual CPU in the set or not in this case,
+         * so err on the safe side.
+         */
+        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
+            input_params.vcpu_mask = ~0ul;
+
+        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
+        cpumask_clear(pcpu_mask);
+
+        /*
+         * For each specified virtual CPU flush all ASIDs to invalidate
+         * TLB entries the next time it is scheduled and then, if it
+         * is currently running, add its physical CPU to a mask of
+         * those which need to be interrupted to force a flush.
+         */
+        for_each_vcpu ( currd, v )
+        {
+            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
+                continue;
+
+            hvm_asid_flush_vcpu(v);
+            if ( v->is_running )
+                cpumask_set_cpu(v->processor, pcpu_mask);
+        }
+
+        /*
+         * Since ASIDs have now been flushed it just remains to
+         * force any CPUs currently running target vCPUs out of non-
+         * root mode. It's possible that re-scheduling has taken place
+         * so we may unnecessarily IPI some CPUs.
+         */
+        if ( !cpumask_empty(pcpu_mask) )
+            flush_tlb_mask(pcpu_mask);
+
+        status = HV_STATUS_SUCCESS;
+        break;
+    }
+
     default:
         status = HV_STATUS_INVALID_HYPERCALL_CODE;
         break;
diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h
index c4319d7..2eec85e 100644
--- a/xen/include/asm-x86/hvm/viridian.h
+++ b/xen/include/asm-x86/hvm/viridian.h
@@ -22,6 +22,7 @@ union viridian_apic_assist
 struct viridian_vcpu
 {
     union viridian_apic_assist apic_assist;
+    cpumask_var_t flush_cpumask;
 };
 
 union viridian_guest_os_id
@@ -117,6 +118,9 @@ viridian_hypercall(struct cpu_user_regs *regs);
 void viridian_time_ref_count_freeze(struct domain *d);
 void viridian_time_ref_count_thaw(struct domain *d);
 
+int viridian_vcpu_init(struct vcpu *v);
+void viridian_vcpu_deinit(struct vcpu *v);
+
 #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */
 
 /*
diff --git a/xen/include/asm-x86/perfc_defn.h b/xen/include/asm-x86/perfc_defn.h
index 9ef092e..aac9331 100644
--- a/xen/include/asm-x86/perfc_defn.h
+++ b/xen/include/asm-x86/perfc_defn.h
@@ -115,6 +115,7 @@ PERFCOUNTER(mshv_call_sw_addr_space,    "MS Hv Switch Address Space")
 PERFCOUNTER(mshv_call_flush_tlb_list,   "MS Hv Flush TLB list")
 PERFCOUNTER(mshv_call_flush_tlb_all,    "MS Hv Flush TLB all")
 PERFCOUNTER(mshv_call_long_wait,        "MS Hv Notify long wait")
+PERFCOUNTER(mshv_call_flush,            "MS Hv Flush TLB")
 PERFCOUNTER(mshv_rdmsr_osid,            "MS Hv rdmsr Guest OS ID")
 PERFCOUNTER(mshv_rdmsr_hc_page,         "MS Hv rdmsr hypercall page")
 PERFCOUNTER(mshv_rdmsr_vp_index,        "MS Hv rdmsr vp index")
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 356dfd3..5e54a84 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -1,3 +1,4 @@
+
 /*
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to
@@ -98,11 +99,16 @@
 #define _HVMPV_reference_tsc 3
 #define HVMPV_reference_tsc  (1 << _HVMPV_reference_tsc)
 
+/* Use Hypercall for remote TLB flush */
+#define _HVMPV_hcall_remote_tlb_flush 4
+#define HVMPV_hcall_remote_tlb_flush (1 << _HVMPV_hcall_remote_tlb_flush)
+
 #define HVMPV_feature_mask \
-	(HVMPV_base_freq | \
-	 HVMPV_no_freq | \
-	 HVMPV_time_ref_count | \
-	 HVMPV_reference_tsc)
+        (HVMPV_base_freq | \
+         HVMPV_no_freq | \
+         HVMPV_time_ref_count | \
+         HVMPV_reference_tsc | \
+         HVMPV_hcall_remote_tlb_flush)
 
 #endif
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-19 13:19 [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall Paul Durrant
@ 2015-11-19 15:35 ` Jan Beulich
  2015-11-19 16:06 ` Andrew Cooper
  1 sibling, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2015-11-19 15:35 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Andrew Cooper,
	Ian Jackson, xen-devel, Keir Fraser

>>> On 19.11.15 at 14:19, <paul.durrant@citrix.com> wrote:
> The Microsoft Hypervisor Top Level Functional Spec. (section 3.4) defines
> two bits in CPUID leaf 0x40000004:EAX for the hypervisor to recommend
> whether or not to issue a hypercall for local or remote TLB flush.
> 
> Whilst it's doubtful whether using a hypercall for local TLB flush would
> be any more efficient than a specific INVLPG VMEXIT, a remote TLB flush
> may well be more efficiently done. This is because the alternative
> mechanism is to IPI all the vCPUs in question which (in the absence of
> APIC virtualisation) will require emulation and scheduling of the vCPUs
> only to have them immediately VMEXIT for local TLB flush.
> 
> This patch therefore adds a viridian option which, if selected, enables
> the hypercall for remote TLB flush and implements it using ASID
> invalidation for targetted vCPUs followed by an IPI only to the set of
> CPUs that happened to be running a targetted vCPU (which may be the empty
> set). The flush may be more severe than requested since the hypercall can
> request flush only for a specific address space (CR3) but Xen neither
> keeps a mapping of ASID to guest CR3 nor allows invalidation of a specific
> ASID, but on a host with contended CPUs performance is still likely to
> be better than a more specific flush using IPIs.
> 
> The implementation of the patch introduces per-vCPU viridian_init() and
> viridian_deinit() functions to allow a scratch cpumask to be allocated.
> This avoids needing to put this potentially large data structure on stack
> during hypercall processing. It also modifies the hypercall input and
> output bit-fields to allow a check for the 'fast' calling convention,
> and a white-space fix in the definition of HVMPV_feature_mask (to remove
> hard tabs).
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-19 13:19 [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall Paul Durrant
  2015-11-19 15:35 ` Jan Beulich
@ 2015-11-19 16:06 ` Andrew Cooper
  2015-11-19 16:57   ` Paul Durrant
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Cooper @ 2015-11-19 16:06 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Wei Liu, Ian Campbell, Stefano Stabellini, Ian Jackson,
	Jan Beulich, Keir Fraser

On 19/11/15 13:19, Paul Durrant wrote:
> @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs *regs)
>      switch ( input.call_code )
>      {
>      case HvNotifyLongSpinWait:
> +        /*
> +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
> +         */
>          perfc_incr(mshv_call_long_wait);
>          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL, void));
>          status = HV_STATUS_SUCCESS;
>          break;
> +
> +    case HvFlushVirtualAddressSpace:
> +    case HvFlushVirtualAddressList:
> +    {
> +        cpumask_t *pcpu_mask;
> +        struct vcpu *v;
> +        struct {
> +            uint64_t address_space;
> +            uint64_t flags;
> +            uint64_t vcpu_mask;
> +        } input_params;
> +
> +        /*
> +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
> +         * and 12.4.3.
> +         */
> +        perfc_incr(mshv_flush);
> +
> +        /* These hypercalls should never use the fast-call convention. */
> +        status = HV_STATUS_INVALID_PARAMETER;
> +        if ( input.fast )
> +            break;
> +
> +        /* Get input parameters. */
> +        if ( hvm_copy_from_guest_phys(&input_params, input_params_gpa,
> +                                      sizeof(input_params)) != HVMCOPY_okay )
> +            break;
> +
> +        /*
> +         * It is not clear from the spec. if we are supposed to
> +         * include current virtual CPU in the set or not in this case,
> +         * so err on the safe side.
> +         */
> +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
> +            input_params.vcpu_mask = ~0ul;
> +
> +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
> +        cpumask_clear(pcpu_mask);
> +
> +        /*
> +         * For each specified virtual CPU flush all ASIDs to invalidate
> +         * TLB entries the next time it is scheduled and then, if it
> +         * is currently running, add its physical CPU to a mask of
> +         * those which need to be interrupted to force a flush.
> +         */
> +        for_each_vcpu ( currd, v )
> +        {
> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
> +                continue;

You need to cap this loop at a vcpu_id of 63, or the above conditional
will become undefined.

It might also be wise to fail the vcpu_initialise() for a
viridian-enabled domain having more than 64 vcpus.

> +
> +            hvm_asid_flush_vcpu(v);
> +            if ( v->is_running )
> +                cpumask_set_cpu(v->processor, pcpu_mask);

__cpumask_set_cpu().  No need for atomic operations here.

> +        }
> +
> +        /*
> +         * Since ASIDs have now been flushed it just remains to
> +         * force any CPUs currently running target vCPUs out of non-
> +         * root mode. It's possible that re-scheduling has taken place
> +         * so we may unnecessarily IPI some CPUs.
> +         */
> +        if ( !cpumask_empty(pcpu_mask) )
> +            flush_tlb_mask(pcpu_mask);

Wouldn't it be easier to simply AND input_params.vcpu_mask with
d->vcpu_dirty_mask?

> +
> +        status = HV_STATUS_SUCCESS;
> +        break;
> +    }
> +
>      default:
>          status = HV_STATUS_INVALID_HYPERCALL_CODE;
>          break;
> diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h
> index c4319d7..2eec85e 100644
> --- a/xen/include/asm-x86/hvm/viridian.h
> +++ b/xen/include/asm-x86/hvm/viridian.h
> @@ -22,6 +22,7 @@ union viridian_apic_assist
>  struct viridian_vcpu
>  {
>      union viridian_apic_assist apic_assist;
> +    cpumask_var_t flush_cpumask;
>  };
>  
>  union viridian_guest_os_id
> @@ -117,6 +118,9 @@ viridian_hypercall(struct cpu_user_regs *regs);
>  void viridian_time_ref_count_freeze(struct domain *d);
>  void viridian_time_ref_count_thaw(struct domain *d);
>  
> +int viridian_vcpu_init(struct vcpu *v);
> +void viridian_vcpu_deinit(struct vcpu *v);
> +
>  #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */
>  
>  /*
> diff --git a/xen/include/asm-x86/perfc_defn.h b/xen/include/asm-x86/perfc_defn.h
> index 9ef092e..aac9331 100644
> --- a/xen/include/asm-x86/perfc_defn.h
> +++ b/xen/include/asm-x86/perfc_defn.h
> @@ -115,6 +115,7 @@ PERFCOUNTER(mshv_call_sw_addr_space,    "MS Hv Switch Address Space")
>  PERFCOUNTER(mshv_call_flush_tlb_list,   "MS Hv Flush TLB list")
>  PERFCOUNTER(mshv_call_flush_tlb_all,    "MS Hv Flush TLB all")
>  PERFCOUNTER(mshv_call_long_wait,        "MS Hv Notify long wait")
> +PERFCOUNTER(mshv_call_flush,            "MS Hv Flush TLB")
>  PERFCOUNTER(mshv_rdmsr_osid,            "MS Hv rdmsr Guest OS ID")
>  PERFCOUNTER(mshv_rdmsr_hc_page,         "MS Hv rdmsr hypercall page")
>  PERFCOUNTER(mshv_rdmsr_vp_index,        "MS Hv rdmsr vp index")
> diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
> index 356dfd3..5e54a84 100644
> --- a/xen/include/public/hvm/params.h
> +++ b/xen/include/public/hvm/params.h
> @@ -1,3 +1,4 @@
> +

Spurious change.

~Andrew


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-19 16:06 ` Andrew Cooper
@ 2015-11-19 16:57   ` Paul Durrant
  2015-11-19 17:08     ` Andrew Cooper
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Durrant @ 2015-11-19 16:57 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 19 November 2015 16:07
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan
> Beulich
> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> On 19/11/15 13:19, Paul Durrant wrote:
> > @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs *regs)
> >      switch ( input.call_code )
> >      {
> >      case HvNotifyLongSpinWait:
> > +        /*
> > +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
> > +         */
> >          perfc_incr(mshv_call_long_wait);
> >          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL, void));
> >          status = HV_STATUS_SUCCESS;
> >          break;
> > +
> > +    case HvFlushVirtualAddressSpace:
> > +    case HvFlushVirtualAddressList:
> > +    {
> > +        cpumask_t *pcpu_mask;
> > +        struct vcpu *v;
> > +        struct {
> > +            uint64_t address_space;
> > +            uint64_t flags;
> > +            uint64_t vcpu_mask;
> > +        } input_params;
> > +
> > +        /*
> > +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
> > +         * and 12.4.3.
> > +         */
> > +        perfc_incr(mshv_flush);
> > +
> > +        /* These hypercalls should never use the fast-call convention. */
> > +        status = HV_STATUS_INVALID_PARAMETER;
> > +        if ( input.fast )
> > +            break;
> > +
> > +        /* Get input parameters. */
> > +        if ( hvm_copy_from_guest_phys(&input_params, input_params_gpa,
> > +                                      sizeof(input_params)) != HVMCOPY_okay )
> > +            break;
> > +
> > +        /*
> > +         * It is not clear from the spec. if we are supposed to
> > +         * include current virtual CPU in the set or not in this case,
> > +         * so err on the safe side.
> > +         */
> > +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
> > +            input_params.vcpu_mask = ~0ul;
> > +
> > +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
> > +        cpumask_clear(pcpu_mask);
> > +
> > +        /*
> > +         * For each specified virtual CPU flush all ASIDs to invalidate
> > +         * TLB entries the next time it is scheduled and then, if it
> > +         * is currently running, add its physical CPU to a mask of
> > +         * those which need to be interrupted to force a flush.
> > +         */
> > +        for_each_vcpu ( currd, v )
> > +        {
> > +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
> > +                continue;
> 
> You need to cap this loop at a vcpu_id of 63, or the above conditional
> will become undefined.

The guest should not issue the hypercall if it has more than 64 vCPUs, so to some extent I don't care what happens, as long as it is not harmful to the hypervisor in general, and I don't think that it is in this case.

> 
> It might also be wise to fail the vcpu_initialise() for a
> viridian-enabled domain having more than 64 vcpus.
> 
> > +
> > +            hvm_asid_flush_vcpu(v);
> > +            if ( v->is_running )
> > +                cpumask_set_cpu(v->processor, pcpu_mask);
> 
> __cpumask_set_cpu().  No need for atomic operations here.
> 

Ok.

> > +        }
> > +
> > +        /*
> > +         * Since ASIDs have now been flushed it just remains to
> > +         * force any CPUs currently running target vCPUs out of non-
> > +         * root mode. It's possible that re-scheduling has taken place
> > +         * so we may unnecessarily IPI some CPUs.
> > +         */
> > +        if ( !cpumask_empty(pcpu_mask) )
> > +            flush_tlb_mask(pcpu_mask);
> 
> Wouldn't it be easier to simply and input_params.vcpu_mask with
> d->vcpu_dirty_mask ?
> 

No, that may yield much too big a mask. All we need here is a mask of where the vcpus are running *now*, not everywhere they've been.

> > +
> > +        status = HV_STATUS_SUCCESS;
> > +        break;
> > +    }
> > +
> >      default:
> >          status = HV_STATUS_INVALID_HYPERCALL_CODE;
> >          break;
> > diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-
> x86/hvm/viridian.h
> > index c4319d7..2eec85e 100644
> > --- a/xen/include/asm-x86/hvm/viridian.h
> > +++ b/xen/include/asm-x86/hvm/viridian.h
> > @@ -22,6 +22,7 @@ union viridian_apic_assist
> >  struct viridian_vcpu
> >  {
> >      union viridian_apic_assist apic_assist;
> > +    cpumask_var_t flush_cpumask;
> >  };
> >
> >  union viridian_guest_os_id
> > @@ -117,6 +118,9 @@ viridian_hypercall(struct cpu_user_regs *regs);
> >  void viridian_time_ref_count_freeze(struct domain *d);
> >  void viridian_time_ref_count_thaw(struct domain *d);
> >
> > +int viridian_vcpu_init(struct vcpu *v);
> > +void viridian_vcpu_deinit(struct vcpu *v);
> > +
> >  #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */
> >
> >  /*
> > diff --git a/xen/include/asm-x86/perfc_defn.h b/xen/include/asm-
> x86/perfc_defn.h
> > index 9ef092e..aac9331 100644
> > --- a/xen/include/asm-x86/perfc_defn.h
> > +++ b/xen/include/asm-x86/perfc_defn.h
> > @@ -115,6 +115,7 @@ PERFCOUNTER(mshv_call_sw_addr_space,    "MS
> Hv Switch Address Space")
> >  PERFCOUNTER(mshv_call_flush_tlb_list,   "MS Hv Flush TLB list")
> >  PERFCOUNTER(mshv_call_flush_tlb_all,    "MS Hv Flush TLB all")
> >  PERFCOUNTER(mshv_call_long_wait,        "MS Hv Notify long wait")
> > +PERFCOUNTER(mshv_call_flush,            "MS Hv Flush TLB")
> >  PERFCOUNTER(mshv_rdmsr_osid,            "MS Hv rdmsr Guest OS ID")
> >  PERFCOUNTER(mshv_rdmsr_hc_page,         "MS Hv rdmsr hypercall page")
> >  PERFCOUNTER(mshv_rdmsr_vp_index,        "MS Hv rdmsr vp index")
> > diff --git a/xen/include/public/hvm/params.h
> b/xen/include/public/hvm/params.h
> > index 356dfd3..5e54a84 100644
> > --- a/xen/include/public/hvm/params.h
> > +++ b/xen/include/public/hvm/params.h
> > @@ -1,3 +1,4 @@
> > +
> 
> Spurious change.

Yes, it is.

  Paul

> 
> ~Andrew


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-19 16:57   ` Paul Durrant
@ 2015-11-19 17:08     ` Andrew Cooper
  2015-11-20  9:15       ` Paul Durrant
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Cooper @ 2015-11-19 17:08 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Jan Beulich, Ian Jackson

On 19/11/15 16:57, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 19 November 2015 16:07
>> To: Paul Durrant; xen-devel@lists.xenproject.org
>> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan
>> Beulich
>> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
>>
>> On 19/11/15 13:19, Paul Durrant wrote:
>>> @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs *regs)
>>>      switch ( input.call_code )
>>>      {
>>>      case HvNotifyLongSpinWait:
>>> +        /*
>>> +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
>>> +         */
>>>          perfc_incr(mshv_call_long_wait);
>>>          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL, void));
>>>          status = HV_STATUS_SUCCESS;
>>>          break;
>>> +
>>> +    case HvFlushVirtualAddressSpace:
>>> +    case HvFlushVirtualAddressList:
>>> +    {
>>> +        cpumask_t *pcpu_mask;
>>> +        struct vcpu *v;
>>> +        struct {
>>> +            uint64_t address_space;
>>> +            uint64_t flags;
>>> +            uint64_t vcpu_mask;
>>> +        } input_params;
>>> +
>>> +        /*
>>> +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
>>> +         * and 12.4.3.
>>> +         */
>>> +        perfc_incr(mshv_flush);
>>> +
>>> +        /* These hypercalls should never use the fast-call convention. */
>>> +        status = HV_STATUS_INVALID_PARAMETER;
>>> +        if ( input.fast )
>>> +            break;
>>> +
>>> +        /* Get input parameters. */
>>> +        if ( hvm_copy_from_guest_phys(&input_params, input_params_gpa,
>>> +                                      sizeof(input_params)) != HVMCOPY_okay )
>>> +            break;
>>> +
>>> +        /*
>>> +         * It is not clear from the spec. if we are supposed to
>>> +         * include current virtual CPU in the set or not in this case,
>>> +         * so err on the safe side.
>>> +         */
>>> +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
>>> +            input_params.vcpu_mask = ~0ul;
>>> +
>>> +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
>>> +        cpumask_clear(pcpu_mask);
>>> +
>>> +        /*
>>> +         * For each specified virtual CPU flush all ASIDs to invalidate
>>> +         * TLB entries the next time it is scheduled and then, if it
>>> +         * is currently running, add its physical CPU to a mask of
>>> +         * those which need to be interrupted to force a flush.
>>> +         */
>>> +        for_each_vcpu ( currd, v )
>>> +        {
>>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
>>> +                continue;
>> You need to cap this loop at a vcpu_id of 63, or the above conditional
>> will become undefined.
> The guest should not issue the hypercall if it has more than 64 vCPUs, so to some extent I don't care what happens, as long as it is not harmful to the hypervisor in general, and I don't think that it is in this case.

The compiler is free to do anything it wishes when encountering
undefined behaviour, including crash the hypervisor.

Any undefined-behaviour which can be triggered by a guest action
warrants an XSA, because there is no telling what might happen.

>
>> It might also be wise to fail the vcpu_initialise() for a
>> viridian-enabled domain having more than 64 vcpus.
>>
>>> +
>>> +            hvm_asid_flush_vcpu(v);
>>> +            if ( v->is_running )
>>> +                cpumask_set_cpu(v->processor, pcpu_mask);
>> __cpumask_set_cpu().  No need for atomic operations here.
>>
> Ok.
>
>>> +        }
>>> +
>>> +        /*
>>> +         * Since ASIDs have now been flushed it just remains to
>>> +         * force any CPUs currently running target vCPUs out of non-
>>> +         * root mode. It's possible that re-scheduling has taken place
>>> +         * so we may unnecessarily IPI some CPUs.
>>> +         */
>>> +        if ( !cpumask_empty(pcpu_mask) )
>>> +            flush_tlb_mask(pcpu_mask);
>> Wouldn't it be easier to simply and input_params.vcpu_mask with
>> d->vcpu_dirty_mask ?
>>
> No, that may yield much too big a mask. All we need here is a mask of where the vcpus are running *now*, not everywhere they've been.

The dirty mask is a "currently scheduled on" mask.

~Andrew

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-19 17:08     ` Andrew Cooper
@ 2015-11-20  9:15       ` Paul Durrant
  2015-11-20  9:44         ` Jan Beulich
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Paul Durrant @ 2015-11-20  9:15 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 19 November 2015 17:09
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan
> Beulich
> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> On 19/11/15 16:57, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: 19 November 2015 16:07
> >> To: Paul Durrant; xen-devel@lists.xenproject.org
> >> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org);
> Jan
> >> Beulich
> >> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> >>
> >> On 19/11/15 13:19, Paul Durrant wrote:
> >>> @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs
> *regs)
> >>>      switch ( input.call_code )
> >>>      {
> >>>      case HvNotifyLongSpinWait:
> >>> +        /*
> >>> +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
> >>> +         */
> >>>          perfc_incr(mshv_call_long_wait);
> >>>          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL,
> void));
> >>>          status = HV_STATUS_SUCCESS;
> >>>          break;
> >>> +
> >>> +    case HvFlushVirtualAddressSpace:
> >>> +    case HvFlushVirtualAddressList:
> >>> +    {
> >>> +        cpumask_t *pcpu_mask;
> >>> +        struct vcpu *v;
> >>> +        struct {
> >>> +            uint64_t address_space;
> >>> +            uint64_t flags;
> >>> +            uint64_t vcpu_mask;
> >>> +        } input_params;
> >>> +
> >>> +        /*
> >>> +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
> >>> +         * and 12.4.3.
> >>> +         */
> >>> +        perfc_incr(mshv_flush);
> >>> +
> >>> +        /* These hypercalls should never use the fast-call convention. */
> >>> +        status = HV_STATUS_INVALID_PARAMETER;
> >>> +        if ( input.fast )
> >>> +            break;
> >>> +
> >>> +        /* Get input parameters. */
> >>> +        if ( hvm_copy_from_guest_phys(&input_params,
> input_params_gpa,
> >>> +                                      sizeof(input_params)) != HVMCOPY_okay )
> >>> +            break;
> >>> +
> >>> +        /*
> >>> +         * It is not clear from the spec. if we are supposed to
> >>> +         * include current virtual CPU in the set or not in this case,
> >>> +         * so err on the safe side.
> >>> +         */
> >>> +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
> >>> +            input_params.vcpu_mask = ~0ul;
> >>> +
> >>> +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
> >>> +        cpumask_clear(pcpu_mask);
> >>> +
> >>> +        /*
> >>> +         * For each specified virtual CPU flush all ASIDs to invalidate
> >>> +         * TLB entries the next time it is scheduled and then, if it
> >>> +         * is currently running, add its physical CPU to a mask of
> >>> +         * those which need to be interrupted to force a flush.
> >>> +         */
> >>> +        for_each_vcpu ( currd, v )
> >>> +        {
> >>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
> >>> +                continue;
> >> You need to cap this loop at a vcpu_id of 63, or the above conditional
> >> will become undefined.
> > The guest should not issue the hypercall if it has more than 64 vCPUs so to
> some extent I don't care what happens, as long as it is not harmful to the
> hypervisor in general, and I don't think that it is in this case.
> 
> The compiler is free to do anything it wishes when encountering
> undefined behaviour, including crash the hypervisor.
> 

I don't believe so. See section 6.5.7 of the C99 spec. (top of page 85):

"The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with
zeros. If E1 has an unsigned type, the value of the result is E1 x 2^E2, reduced modulo
one more than the maximum value representable in the result type. If E1 has a signed
type and nonnegative value, and E1 x 2^E2 is representable in the result type, then that is
the resulting value; otherwise, the behavior is undefined"

So, the undefined behaviour you refer to is only in the case that E1 is signed and in this case it is most definitely unsigned, so the modulo rule applies.

> Any undefined-behaviour which can be triggered by a guest action
> warrants an XSA, because there is no telling what might happen.
> 
> >
> >> It might also be wise to fail the vcpu_initialise() for a
> >> viridian-enabled domain having more than 64 vcpus.
> >>
> >>> +
> >>> +            hvm_asid_flush_vcpu(v);
> >>> +            if ( v->is_running )
> >>> +                cpumask_set_cpu(v->processor, pcpu_mask);
> >> __cpumask_set_cpu().  No need for atomic operations here.
> >>
> > Ok.
> >
> >>> +        }
> >>> +
> >>> +        /*
> >>> +         * Since ASIDs have now been flushed it just remains to
> >>> +         * force any CPUs currently running target vCPUs out of non-
> >>> +         * root mode. It's possible that re-scheduling has taken place
> >>> +         * so we may unnecessarily IPI some CPUs.
> >>> +         */
> >>> +        if ( !cpumask_empty(pcpu_mask) )
> >>> +            flush_tlb_mask(pcpu_mask);
> >> Wouldn't it be easier to simply and input_params.vcpu_mask with
> >> d->vcpu_dirty_mask ?
> >>

Actually I realise your original statement makes no sense anyway. There is no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask, which is a mask of *physical* CPUs, but since (as the name implies) input_params.vcpu_mask is a mask of *virtual* CPUs, ANDing the two together would just yield garbage.

> > No, that may yield much too big a mask. All we need here is a mask of
> where the vcpus are running *now*, not everywhere they've been.
> 
> The dirty mask is a "currently scheduled on" mask.

No it's not. The comment in sched.h clearly states that domain_dirty_cpumask is a "Bitmask of CPUs which are holding onto this domain's state", which is, as I said before, essentially everywhere the domain's vCPUs have been scheduled since the last time state was flushed. Since, in this case, I have already invalidated ASIDs for all targeted virtual CPUs I don't need to IPI that many physical CPUs; I only need the mask of where the virtual CPUs are *currently* running. If one of them gets descheduled before the IPI then the IPI was unnecessary (but there is no low-cost way of determining or preventing that).

  Paul

> 
> ~Andrew


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20  9:15       ` Paul Durrant
@ 2015-11-20  9:44         ` Jan Beulich
  2015-11-20  9:50           ` Paul Durrant
  2015-11-20 12:26         ` Jan Beulich
  2015-11-20 13:06         ` Andrew Cooper
  2 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2015-11-20  9:44 UTC (permalink / raw)
  To: Andrew Cooper, Paul Durrant
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, xen-devel, Ian Jackson

>>> On 20.11.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 19 November 2015 17:09
>> On 19/11/15 16:57, Paul Durrant wrote:
>> >>> +        /*
>> >>> +         * For each specified virtual CPU flush all ASIDs to invalidate
>> >>> +         * TLB entries the next time it is scheduled and then, if it
>> >>> +         * is currently running, add its physical CPU to a mask of
>> >>> +         * those which need to be interrupted to force a flush.
>> >>> +         */
>> >>> +        for_each_vcpu ( currd, v )
>> >>> +        {
>> >>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
>> >>> +                continue;
>> >> You need to cap this loop at a vcpu_id of 63, or the above conditional
>> >> will become undefined.
>> > The guest should not issue the hypercall if it has more than 64 vCPUs so to
>> some extent I don't care what happens, as long as it is not harmful to the
>> hypervisor in general, and I don't think that it is in this case.
>> 
>> The compiler is free to do anything it wishes when encountering
>> undefined behaviour, including crash the hypervisor.
>> 
> 
> I don't believe so. See section 6.5.7 of the C99 spec. (top of page 85):
> 
> "The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are 
> filled with
> zeros. If E1 has an unsigned type, the value of the result is E1 x 2^E2, 
> reduced modulo
> one more than the maximum value representable in the result type. If E1 has 
> a signed
> type and nonnegative value, and E1 x 2^E2 is representable in the result 
> type, then that is
> the resulting value; otherwise, the behavior is undefined"
> 
> So, the undefined behaviour you refer to is only in the case that E1 is 
> signed and in this case it is most definitely unsigned, so the modulo rule 
> applies.

Looks like you missed the earlier clause 3 (at the bottom of page 84):
"If the value of the right operand is negative or is greater than or
 equal to the width of the promoted left operand, the behavior is
 undefined." And if you look at generated code, I think you'll find
that the result is not zero for too large shift counts, as would need
to be the case if that clause wasn't there.

I'll actually make my previous Reviewed-by dependent on this issue
getting fixed.

>> Any undefined-behaviour which can be triggered by a guest action
>> warrants an XSA, because there is no telling what might happen.

Well, it warrants fixing, but I don't think it unconditionally would
be XSA-worthy. After all, the compiler (normally) does sane
things in response - in order to crash the hypervisor or produce
truly undefined behavior, it would need to generate extra code
(since, unlike in C, at the assembly level the instruction's
behavior is defined), which I think we can take for granted it
won't.

Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20  9:44         ` Jan Beulich
@ 2015-11-20  9:50           ` Paul Durrant
  0 siblings, 0 replies; 14+ messages in thread
From: Paul Durrant @ 2015-11-20  9:50 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 20 November 2015 09:45
> To: Andrew Cooper; Paul Durrant
> Cc: Ian Campbell; Ian Jackson; Stefano Stabellini; Wei Liu; xen-
> devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> >>> On 20.11.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: 19 November 2015 17:09
> >> On 19/11/15 16:57, Paul Durrant wrote:
> >> >>> +        /*
> >> >>> +         * For each specified virtual CPU flush all ASIDs to invalidate
> >> >>> +         * TLB entries the next time it is scheduled and then, if it
> >> >>> +         * is currently running, add its physical CPU to a mask of
> >> >>> +         * those which need to be interrupted to force a flush.
> >> >>> +         */
> >> >>> +        for_each_vcpu ( currd, v )
> >> >>> +        {
> >> >>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
> >> >>> +                continue;
> >> >> You need to cap this loop at a vcpu_id of 63, or the above conditional
> >> >> will become undefined.
> >> > The guest should not issue the hypercall if it has more than 64 vCPUs so
> to
> >> some extent I don't care what happens, as long as it is not harmful to the
> >> hypervisor in general, and I don't think that it is in this case.
> >>
> >> The compiler is free to do anything it wishes when encountering
> >> undefined behaviour, including crash the hypervisor.
> >>
> >
> > I don't believe so. See section 6.5.7 of the C99 spec. (top of page 85):
> >
> > "The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are
> > filled with
> > zeros. If E1 has an unsigned type, the value of the result is E1 x 2^E2,
> > reduced modulo
> > one more than the maximum value representable in the result type. If E1
> has
> > a signed
> > type and nonnegative value, and E1 x 2^E2 is representable in the result
> > type, then that is
> > the resulting value; otherwise, the behavior is undefined"
> >
> > So, the undefined behaviour you refer to is only in the case that E1 is
> > signed and in this case it is most definitely unsigned, so the modulo rule
> > applies.
> 
> Looks like you missed the earlier clause 3 (at the bottom of page 84):
> "If the value of the right operand is negative or is greater than or
>  equal to the width of the promoted left operand, the behavior is
>  undefined." And if you look at generated code, I think you'll find
> that the result is not zero for too large shift counts, as would need
> to be the case if that clause wasn't there.
> 

Hmm. That does seem contradictory. It does indeed say the 'behavior' is undefined. If it were merely the 'value' that was undefined then I would not see that as a problem.

> I'll actually make my previous Reviewed-by dependent on this issue
> getting fixed.
> 

Since you've convinced me that the check is necessary, I'll post v5 with it in.

  Paul

> >> Any undefined-behaviour which can be triggered by a guest action
> >> warrants an XSA, because there is no telling what might happen.
> 
> Well, it warrants fixing, but I don't think it unconditionally would
> be XSA-worthy. After all, the compiler (normally) does sane
> things in response - in order to crash the hypervisor or produce
> truly undefined behavior, it would need to generate extra code
> (since, unlike in C, at the assembly level the instruction's
> behavior is defined), which I think we can take for granted it
> won't.
> 
> Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20  9:15       ` Paul Durrant
  2015-11-20  9:44         ` Jan Beulich
@ 2015-11-20 12:26         ` Jan Beulich
  2015-11-20 13:41           ` Paul Durrant
  2015-11-20 13:45           ` Paul Durrant
  2015-11-20 13:06         ` Andrew Cooper
  2 siblings, 2 replies; 14+ messages in thread
From: Jan Beulich @ 2015-11-20 12:26 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
	Ian Jackson, xen-devel, Keir (Xen.org)

>>> On 20.11.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 19 November 2015 17:09
>> On 19/11/15 16:57, Paul Durrant wrote:
>> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> >> Sent: 19 November 2015 16:07
>> >> On 19/11/15 13:19, Paul Durrant wrote:
>> >>> +        /*
>> >>> +         * Since ASIDs have now been flushed it just remains to
>> >>> +         * force any CPUs currently running target vCPUs out of non-
>> >>> +         * root mode. It's possible that re-scheduling has taken place
>> >>> +         * so we may unnecessarily IPI some CPUs.
>> >>> +         */
>> >>> +        if ( !cpumask_empty(pcpu_mask) )
>> >>> +            flush_tlb_mask(pcpu_mask);
>> >> Wouldn't it be easier to simply and input_params.vcpu_mask with
>> >> d->vcpu_dirty_mask ?
>> >>
> 
> Actually I realise your original statement makes no sense anyway. There is 
> no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask which is a 
> mask of *physical* CPUs, but since (as the name implies) input_params. 
> vcpu_mask is a mask of *virtual* CPUs then ANDing the two together would just 
> yield garbage. 
> 
>> > No, that may yield much too big a mask. All we need here is a mask of
>> where the vcpus are running *now*, not everywhere they've been.
>> 
>> The dirty mask is a "currently scheduled on" mask.
> 
> No it's not. The comment in sched.h clearly states that domain_dirty_cpumask 
> is a "Bitmask of CPUs which are holding onto this domain's state" which is, 
> as I said before, essentially everywhere the domain's vCPUs have been 
> scheduled since the last time state was flushed. Since, in this case, I have 
> already invalidated ASIDs for all targeted virtual CPUs I don't need to IPI 
> that many physical CPUs, I only need the mask of where the virtual CPUs are 
> *currently* running. If one of them gets descheduled before the IPI then 
> the IPI was unnecessary (but there is no low-cost way of determining or 
> preventing that).

While you can't "and" that mask into input_params.vcpu_mask,
wouldn't using it allow you to avoid the scratch pCPU mask
variable?

Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20  9:15       ` Paul Durrant
  2015-11-20  9:44         ` Jan Beulich
  2015-11-20 12:26         ` Jan Beulich
@ 2015-11-20 13:06         ` Andrew Cooper
  2 siblings, 0 replies; 14+ messages in thread
From: Andrew Cooper @ 2015-11-20 13:06 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Wei Liu, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Jan Beulich, Ian Jackson

On 20/11/15 09:15, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: 19 November 2015 17:09
>> To: Paul Durrant; xen-devel@lists.xenproject.org
>> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan
>> Beulich
>> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
>>
>> On 19/11/15 16:57, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>>>> Sent: 19 November 2015 16:07
>>>> To: Paul Durrant; xen-devel@lists.xenproject.org
>>>> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org);
>> Jan
>>>> Beulich
>>>> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
>>>>
>>>> On 19/11/15 13:19, Paul Durrant wrote:
>>>>> @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs
>> *regs)
>>>>>      switch ( input.call_code )
>>>>>      {
>>>>>      case HvNotifyLongSpinWait:
>>>>> +        /*
>>>>> +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
>>>>> +         */
>>>>>          perfc_incr(mshv_call_long_wait);
>>>>>          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL,
>> void));
>>>>>          status = HV_STATUS_SUCCESS;
>>>>>          break;
>>>>> +
>>>>> +    case HvFlushVirtualAddressSpace:
>>>>> +    case HvFlushVirtualAddressList:
>>>>> +    {
>>>>> +        cpumask_t *pcpu_mask;
>>>>> +        struct vcpu *v;
>>>>> +        struct {
>>>>> +            uint64_t address_space;
>>>>> +            uint64_t flags;
>>>>> +            uint64_t vcpu_mask;
>>>>> +        } input_params;
>>>>> +
>>>>> +        /*
>>>>> +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
>>>>> +         * and 12.4.3.
>>>>> +         */
>>>>> +        perfc_incr(mshv_flush);
>>>>> +
>>>>> +        /* These hypercalls should never use the fast-call convention. */
>>>>> +        status = HV_STATUS_INVALID_PARAMETER;
>>>>> +        if ( input.fast )
>>>>> +            break;
>>>>> +
>>>>> +        /* Get input parameters. */
>>>>> +        if ( hvm_copy_from_guest_phys(&input_params,
>> input_params_gpa,
>>>>> +                                      sizeof(input_params)) != HVMCOPY_okay )
>>>>> +            break;
>>>>> +
>>>>> +        /*
>>>>> +         * It is not clear from the spec. if we are supposed to
>>>>> +         * include current virtual CPU in the set or not in this case,
>>>>> +         * so err on the safe side.
>>>>> +         */
>>>>> +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
>>>>> +            input_params.vcpu_mask = ~0ul;
>>>>> +
>>>>> +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
>>>>> +        cpumask_clear(pcpu_mask);
>>>>> +
>>>>> +        /*
>>>>> +         * For each specified virtual CPU flush all ASIDs to invalidate
>>>>> +         * TLB entries the next time it is scheduled and then, if it
>>>>> +         * is currently running, add its physical CPU to a mask of
>>>>> +         * those which need to be interrupted to force a flush.
>>>>> +         */
>>>>> +        for_each_vcpu ( currd, v )
>>>>> +        {
>>>>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
>>>>> +                continue;
>>>> You need to cap this loop at a vcpu_id of 63, or the above conditional
>>>> will become undefined.
>>> The guest should not issue the hypercall if it has more than 64 vCPUs so to
>> some extent I don't care what happens, as long as it is not harmful to the
>> hypervisor in general, and I don't think that it is in this case.
>>
>> The compiler is free to do anything it wishes when encountering
>> undefined behaviour, including crash the hypervisor.
>>
> I don't believe so. See section 6.5.7 of the C99 spec. (top of page 85):
>
> "The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with
> zeros. If E1 has an unsigned type, the value of the result is E1 x 2^E2, reduced modulo
> one more than the maximum value representable in the result type. If E1 has a signed
> type and nonnegative value, and E1 x 2^E2 is representable in the result type, then that is
> the resulting value; otherwise, the behavior is undefined"
>
> So, the undefined behaviour you refer to is only in the case that E1 is signed and in this case it is most definitely unsigned, so the modulo rule applies.

Ok.

>
>> Any undefined-behaviour which can be triggered by a guest action
>> warrants an XSA, because there is no telling what might happen.
>>
>>>> It might also be wise to fail the vcpu_initialise() for a
>>>> viridian-enabled domain having more than 64 vcpus.
>>>>
>>>>> +
>>>>> +            hvm_asid_flush_vcpu(v);
>>>>> +            if ( v->is_running )
>>>>> +                cpumask_set_cpu(v->processor, pcpu_mask);
>>>> __cpumask_set_cpu().  No need for atomic operations here.
>>>>
>>> Ok.
>>>
>>>>> +        }
>>>>> +
>>>>> +        /*
>>>>> +         * Since ASIDs have now been flushed it just remains to
>>>>> +         * force any CPUs currently running target vCPUs out of non-
>>>>> +         * root mode. It's possible that re-scheduling has taken place
>>>>> +         * so we may unnecessarily IPI some CPUs.
>>>>> +         */
>>>>> +        if ( !cpumask_empty(pcpu_mask) )
>>>>> +            flush_tlb_mask(pcpu_mask);
>>>> Wouldn't it be easier to simply and input_params.vcpu_mask with
>>>> d->vcpu_dirty_mask ?
>>>>
> Actually I realise your original statement makes no sense anyway. There is no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask, which is a mask of *physical* CPUs, but since (as the name implies) input_params.vcpu_mask is a mask of *virtual* CPUs, ANDing the two together would just yield garbage.
>
>>> No, that may yield much too big a mask. All we need here is a mask of
>> where the vcpus are running *now*, not everywhere they've been.
>>
>> The dirty mask is a "currently scheduled on" mask.
> No it's not. The comment in sched.h clearly states that domain_dirty_cpumask is a "Bitmask of CPUs which are holding onto this domain's state", which is, as I said before, essentially everywhere the domain's vCPUs have been scheduled since the last time state was flushed. Since, in this case, I have already invalidated ASIDs for all targeted virtual CPUs I don't need to IPI that many physical CPUs; I only need the mask of where the virtual CPUs are *currently* running. If one of them gets descheduled before the IPI then the IPI was unnecessary (but there is no low-cost way of determining or preventing that).

In the case of a FlushAll, the domain dirty mask would be fine, as you
don't care about a subset of vcpus.  Having said that, it is likely not
worth doing unless we can optimise away the loop entirely.

~Andrew


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20 12:26         ` Jan Beulich
@ 2015-11-20 13:41           ` Paul Durrant
  2015-11-20 15:02             ` Jan Beulich
  2015-11-20 13:45           ` Paul Durrant
  1 sibling, 1 reply; 14+ messages in thread
From: Paul Durrant @ 2015-11-20 13:41 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
	Ian Jackson, xen-devel, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 20 November 2015 12:27
> To: Paul Durrant
> Cc: Andrew Cooper; Ian Campbell; Ian Jackson; Stefano Stabellini; Wei Liu;
> xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> >>> On 20.11.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: 19 November 2015 17:09
> >> On 19/11/15 16:57, Paul Durrant wrote:
> >> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> >> Sent: 19 November 2015 16:07
> >> >> On 19/11/15 13:19, Paul Durrant wrote:
> >> >>> +        /*
> >> >>> +         * Since ASIDs have now been flushed it just remains to
> >> >>> +         * force any CPUs currently running target vCPUs out of non-
> >> >>> +         * root mode. It's possible that re-scheduling has taken place
> >> >>> +         * so we may unnecessarily IPI some CPUs.
> >> >>> +         */
> >> >>> +        if ( !cpumask_empty(pcpu_mask) )
> >> >>> +            flush_tlb_mask(pcpu_mask);
> >> >> Wouldn't it be easier to simply and input_params.vcpu_mask with
> >> >> d->vcpu_dirty_mask ?
> >> >>
> >
> > Actually I realise your original statement makes no sense anyway. There is
> > no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask
> which is a
> > mask of *physical* CPUs, but since (as the name implies) input_params.
> > vcpu_mask is a mask of *virtual* CPUs then ANDing the two together
> would just
> > yield garbage.
> >
> >> > No, that may yield much too big a mask. All we need here is a mask of
> >> where the vcpus are running *now*, not everywhere they've been.
> >>
> >> The dirty mask is a "currently scheduled on" mask.
> >
> > No it's not. The comment in sched.h clearly states that
> domain_dirty_cpumask
> > is a "Bitmask of CPUs which are holding onto this domain's state" which is,
> > as I said before, essentially everywhere the domain's vCPUs have been
> > scheduled since the last time state was flushed. Since, in this case, I have
> > already invalidated ASIDs for all targeted virtual CPUs I don't need to IPI
> > that many physical CPUs, I only need the mask of where the virtual CPUs
> are
> > *currently* running. If one of them gets descheduled before the IPI
> then
> > the IPI was unnecessary (but there is no low-cost way of determining or
> > preventing that).
> 
> While you can't "and" that mask into input_params.vcpu_mask,
> wouldn't using it allow you to avoid the scratch pCPU mask
> variable?
> 

I'm not sure. After flushing the ASIDs I guess I could start with the domain_dirty_cpumask, remove the pCPUs of non-running vCPUs from it, and then use it as the flush mask. If I do that, I suppose I ought to reset the vcpu_dirty_cpumask values too. Something like...

        for_each_vcpu ( currd, v )
        {
            if ( v->vcpu_id >= (sizeof(input_params.vcpu_mask) * 8) )
                break;

            if ( !((input_params.vcpu_mask >> v->vcpu_id) & 1) )
                continue;

            hvm_asid_flush_vcpu(v);
            cpumask_clear(v->vcpu_dirty_cpumask);

            if ( v->is_running )
                cpumask_set_cpu(v->processor, v->vcpu_dirty_cpumask);
            else
                cpumask_clear_cpu(v->processor, currd->domain_dirty_cpumask);
        }

        if ( !cpumask_empty(currd->domain_dirty_cpumask) )
            flush_tlb_mask(currd->domain_dirty_cpumask);

Maybe?

  Paul

> Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20 12:26         ` Jan Beulich
  2015-11-20 13:41           ` Paul Durrant
@ 2015-11-20 13:45           ` Paul Durrant
  1 sibling, 0 replies; 14+ messages in thread
From: Paul Durrant @ 2015-11-20 13:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
	Ian Jackson, xen-devel, Keir (Xen.org)

> -----Original Message-----
> From: Paul Durrant
> Sent: 20 November 2015 13:41
> To: 'Jan Beulich'
> Cc: Andrew Cooper; Ian Campbell; Ian Jackson; Stefano Stabellini; Wei Liu;
> xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 20 November 2015 12:27
> > To: Paul Durrant
> > Cc: Andrew Cooper; Ian Campbell; Ian Jackson; Stefano Stabellini; Wei Liu;
> > xen-devel@lists.xenproject.org; Keir (Xen.org)
> > Subject: RE: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> >
> > >>> On 20.11.15 at 10:15, <Paul.Durrant@citrix.com> wrote:
> > >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> > >> Sent: 19 November 2015 17:09
> > >> On 19/11/15 16:57, Paul Durrant wrote:
> > >> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> > >> >> Sent: 19 November 2015 16:07
> > >> >> On 19/11/15 13:19, Paul Durrant wrote:
> > >> >>> +        /*
> > >> >>> +         * Since ASIDs have now been flushed it just remains to
> > >> >>> +         * force any CPUs currently running target vCPUs out of non-
> > >> >>> +         * root mode. It's possible that re-scheduling has taken place
> > >> >>> +         * so we may unnecessarily IPI some CPUs.
> > >> >>> +         */
> > >> >>> +        if ( !cpumask_empty(pcpu_mask) )
> > >> >>> +            flush_tlb_mask(pcpu_mask);
> > >> >> Wouldn't it be easier to simply and input_params.vcpu_mask with
> > >> >> d->vcpu_dirty_mask ?
> > >> >>
> > >
> > > Actually I realise your original statement makes no sense anyway. There is
> > > no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask,
> > > which is a mask of *physical* CPUs, but since (as the name implies)
> > > input_params.vcpu_mask is a mask of *virtual* CPUs then ANDing the two
> > > together would just yield garbage.
> > >
> > >> > No, that may yield much too big a mask. All we need here is a mask of
> > >> > where the vcpus are running *now*, not everywhere they've been.
> > >>
> > >> The dirty mask is a "currently scheduled on" mask.
> > >
> > > No it's not. The comment in sched.h clearly states that
> > > domain_dirty_cpumask is a "Bitmask of CPUs which are holding onto this
> > > domain's state" which is, as I said before, essentially everywhere the
> > > domain's vcpus have been scheduled since the last time state was
> > > flushed. Since, in this case, I have already invalidated ASIDs for all
> > > targeted virtual CPUs I don't need to IPI that many physical CPUs, I
> > > only need the mask of where the virtual CPUs are *currently* running.
> > > If one of them gets descheduled before the IPI then the IPI was
> > > unnecessary (but there is no low-cost way of determining or preventing
> > > that).
> >
> > While you can't "and" that mask into input_params.vcpu_mask,
> > wouldn't using it allow you to avoid the scratch pCPU mask
> > variable?
> >
> 
> I'm not sure. After flushing the ASIDs I guess I could start with the
> domain_dirty_cpumask and remove non-running vcpus from it, and then
> use it as the flush mask. If I do that, I suppose I ought to reset the
> vcpu_dirty_vcpumask values too. Something like...
> 
>         for_each_vcpu ( currd, v )
>         {
>             if ( v->vcpu_id >= (sizeof(input_params.vcpu_mask) * 8) )
>                 break;
> 
>             if ( !((input_params.vcpu_mask >> v->vcpu_id) & 1) )
>                 continue;
> 
>             hvm_asid_flush_vcpu(v);
>             cpumask_clear(v->vcpu_dirty_vcpumask);
> 
>             if ( v->is_running )
>                 cpumask_set_cpu(v->processor, v->vcpu_dirty_vcpumask);
>             else
>                 cpumask_clear_cpu(v->processor, d->domain_dirty_vcpumask);
>         }
> 
>         if ( !cpumask_empty(d->domain_dirty_vcpumask) )
>             flush_tlb_mask(d->domain_dirty_vcpumask);
> 

s/vcpumask/cpumask in the above. You hopefully got what I meant though.

> Maybe?
> 
>   Paul
> 
> > Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20 13:41           ` Paul Durrant
@ 2015-11-20 15:02             ` Jan Beulich
  2015-11-20 15:06               ` Paul Durrant
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2015-11-20 15:02 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
	Ian Jackson, xen-devel, Keir (Xen.org)

>>> On 20.11.15 at 14:41, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 20 November 2015 12:27
>> While you can't "and" that mask into input_params.vcpu_mask,
>> wouldn't using it allow you to avoid the scratch pCPU mask
>> variable?
> 
> I'm not sure. After flushing the ASIDs I guess I could start with the 
> domain_dirty_cpumask and remove non-running vcpus from it, and then use it as 
> the flush mask. If I do that, I suppose I ought to reset the 
> vcpu_dirty_vcpumask values too. Something like...
> 
>         for_each_vcpu ( currd, v )
>         {
>             if ( v->vcpu_id >= (sizeof(input_params.vcpu_mask) * 8) )
>                 break;
> 
>             if ( !((input_params.vcpu_mask >> v->vcpu_id) & 1) )
>                 continue;
> 
>             hvm_asid_flush_vcpu(v);
>             cpumask_clear(v->vcpu_dirty_vcpumask);
>             
>             if ( v->is_running )
>                 cpumask_set_cpu(v->processor, v->vcpu_dirty_vcpumask);
>             else
>                 cpumask_clear_cpu(v->processor, d->domain_dirty_vcpumask);
>         }
> 
>         if ( !cpumask_empty(d->domain_dirty_vcpumask) )
>             flush_tlb_mask(d->domain_dirty_vcpumask);

For one I don't think you should be modifying either of the two
dirty masks here - nothing outside of context switch code does or
should: Both really ought to be redundant with what context
switch code already does, plus the "clear" would even have the
potential of suppressing further flushes if you race with context
switch code. And if you wanted to subtract the vCPU-s you did the
ASID flush for from d->domain_dirty_cpumask, you'd again need
a temporary mask, i.e. not much (if anything) gained compared to v5.

Jan


* Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
  2015-11-20 15:02             ` Jan Beulich
@ 2015-11-20 15:06               ` Paul Durrant
  0 siblings, 0 replies; 14+ messages in thread
From: Paul Durrant @ 2015-11-20 15:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Stefano Stabellini,
	Ian Jackson, xen-devel, Keir (Xen.org)

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 20 November 2015 15:03
> To: Paul Durrant
> Cc: Andrew Cooper; Ian Campbell; Ian Jackson; Stefano Stabellini; Wei Liu;
> xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: RE: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
> 
> >>> On 20.11.15 at 14:41, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 20 November 2015 12:27
> >> While you can't "and" that mask into input_params.vcpu_mask,
> >> wouldn't using it allow you to avoid the scratch pCPU mask
> >> variable?
> >
> > I'm not sure. After flushing the ASIDs I guess I could start with the
> > domain_dirty_cpumask and remove non-running vcpus from it, and then use
> > it as the flush mask. If I do that, I suppose I ought to reset the
> > vcpu_dirty_vcpumask values too. Something like...
> >
> >         for_each_vcpu ( currd, v )
> >         {
> >             if ( v->vcpu_id >= (sizeof(input_params.vcpu_mask) * 8) )
> >                 break;
> >
> >             if ( !((input_params.vcpu_mask >> v->vcpu_id) & 1) )
> >                 continue;
> >
> >             hvm_asid_flush_vcpu(v);
> >             cpumask_clear(v->vcpu_dirty_vcpumask);
> >
> >             if ( v->is_running )
> >                 cpumask_set_cpu(v->processor, v->vcpu_dirty_vcpumask);
> >             else
> >                 cpumask_clear_cpu(v->processor, d->domain_dirty_vcpumask);
> >         }
> >
> >         if ( !cpumask_empty(d->domain_dirty_vcpumask) )
> >             flush_tlb_mask(d->domain_dirty_vcpumask);
> 
> For one I don't think you should be modifying either of the two
> dirty masks here - nothing outside of context switch code does or
> should: Both really ought to be redundant with what context
> switch code already does, plus the "clear" would even have the
> potential of suppressing further flushes if you race with context
> switch code. And if you wanted to subtract the vCPU-s you did the
> ASID flush for from d->domain_dirty_cpumask, you'd again need
> a temporary mask, i.e. not much (if anything) gained compared to v5.
> 

Yes, racing with the context switch code is a problem. I'll stick with the v5 patch. Just need an ack for the libxl bits now I guess.

  Paul

> Jan


end of thread, other threads:[~2015-11-20 15:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-19 13:19 [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall Paul Durrant
2015-11-19 15:35 ` Jan Beulich
2015-11-19 16:06 ` Andrew Cooper
2015-11-19 16:57   ` Paul Durrant
2015-11-19 17:08     ` Andrew Cooper
2015-11-20  9:15       ` Paul Durrant
2015-11-20  9:44         ` Jan Beulich
2015-11-20  9:50           ` Paul Durrant
2015-11-20 12:26         ` Jan Beulich
2015-11-20 13:41           ` Paul Durrant
2015-11-20 15:02             ` Jan Beulich
2015-11-20 15:06               ` Paul Durrant
2015-11-20 13:45           ` Paul Durrant
2015-11-20 13:06         ` Andrew Cooper
