* [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-04-23  8:10 Andrii Anisov
  2019-04-23  8:10   ` [Xen-devel] " Andrii Anisov
                   ` (3 more replies)
  0 siblings, 4 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-04-23  8:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Julien Grall, Andrii Anisov, Jan Beulich, Roger Pau Monné

From: Andrii Anisov <andrii_anisov@epam.com>

Following the discussion [1], this series introduces and implements a runstate
registration interface which uses a guest physical address instead of a virtual
one. The new hypercall employs the same data structures as its predecessor, but
expects the vcpu_runstate_info structure to not cross a page boundary.
The interface is implemented so that the vcpu_runstate_info structure is mapped
into the hypervisor while the hypercall is processed and is then accessed
directly during runstate updates. This runstate area mapping follows the
vcpu_info structure registration approach.
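
For illustration, guest-side registration could look like the minimal sketch
below (hypothetical Linux-style code, not the actual change in [2];
register_runstate_area() is an invented helper, and how the guest obtains a
stable physical address for the area is exactly what is discussed later in
this thread):

    /* Sketch: register the per-vCPU runstate area by guest physical address,
     * falling back to the existing virtual-address op on older hypervisors.
     * 'ri' must be allocated so that it does not cross a page boundary. */
    static int register_runstate_area(unsigned int cpu,
                                      struct vcpu_runstate_info *ri)
    {
        struct vcpu_register_runstate_memory_area area = { };
        int rc;

        BUG_ON(offset_in_page(ri) + sizeof(*ri) > PAGE_SIZE);

        area.addr.p = virt_to_phys(ri);         /* new op: physical address */
        rc = HYPERVISOR_vcpu_op(VCPUOP_register_runstate_phys_memory_area,
                                cpu, &area);
        if (rc == -EOPNOTSUPP || rc == -ENOSYS) {
            area.addr.v = ri;                   /* old op: virtual address */
            rc = HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area,
                                    cpu, &area);
        }
        return rc;
    }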

Permanently mapping the runstate area consumes vmap space on arm32, which is
limited to 1G. However, it is assumed that ARM32 does not target the server
market, and the remaining likely applications will not host enough VCPUs for
this limitation to become an issue.
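
(For scale: each registered area fits within a single 4K page, so even 1024
VCPUs would pin roughly 4M of globally mapped pages out of that 1G vmap space,
assuming one mapped page per VCPU.)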

The series is tested on ARM64 and build-tested for x86. I'd appreciate it if
someone could check it on x86.
The Linux kernel patch is here [2], though it is against 4.14.

Changes in:

  v2: The new runstate interface implementation was reconsidered. The new
      interface is made independent of the old one: it does not share the
      runstate_area field, and consequently avoids excessive concurrency with
      the old runstate interface usage.
      Introduced locks in order to resolve possible concurrency between
      runstate area registration and use.
      Addressed comments from Jan Beulich [3][4] about coding style nits,
      though some of them became obsolete with the refactoring and a few are
      picked up in this thread for further discussion.

      Performance measurements were made for both approaches (runstate mapped
      on access vs. mapped on registration). The test setup is as follows:

      Thin Dom0 (Linux with initramfs) with DomD running a rich Yocto Linux.
      3D benchmark numbers in DomD are compared. The benchmark is GLMark2,
      run at different resolutions in order to produce different IRQ loads:
      320x240 produces a high IRQ load, while 1920x1080 produces a low one.
      The DomD benchmark run was additionally tested with a primitive Dom0
      CPU burn (dd) running in the background, in order to stimulate
      VCPU(dX)->VCPU(dY) switches rather than VCPU(dX)->idle->VCPU(dX),
      with the following results:

                            mapped               mapped
                            on access            on init
      GLMark2 320x240       2852                 2877          +0.8%
          +Dom0 CPUBurn     2088                 2094          +0.2%
      GLMark2 800x600       2368                 2375          +0.3%
          +Dom0 CPUBurn     1868                 1921          +2.8%
      GLMark2 1920x1080     931                  931            0%
          +Dom0 CPUBurn     892                  894           +0.2%

      Please note that "mapped on access" means using the old runstate
      registration interface. Runstate updates in that case still often fail
      to map the runstate area, as in [5], despite the fact that our Linux
      kernel does not have KPTI enabled. So the runstate area update in that
      case is effectively shortened.


      The IRQ latency difference was also checked using TBM in a setup similar
      to [5]. Please note that the IRQ rate is one per 30 seconds, and only the
      VCPU->idle->VCPU use case is considered. The results are as follows (in
      ns, with a timer granularity of 120 ns):

      mapped on access:
          max=9960 warm_max=8640 min=7200 avg=7626
      mapped on init:
          max=9480 warm_max=8400 min=7080 avg=7341

      Unfortunately there are no consistent results yet from profiling with
      Lauterbach PowerTrace. We are still in communication with the tracer
      vendor in order to set up the proper configuration.

[1] https://lists.xenproject.org/archives/html/xen-devel/2019-02/msg00416.html
[2] https://github.com/aanisov/linux/commit/ba34d2780f57ea43f81810cd695aace7b55c0f29
[3] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00936.html
[4] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00934.html
[5] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg02369.html
[6] https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg02297.html


Andrii Anisov (2):
  xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall
  xen: implement VCPUOP_register_runstate_phys_memory_area

 xen/arch/arm/domain.c        |  62 +++++++++++++++++--------
 xen/arch/x86/domain.c        | 105 +++++++++++++++++++++++++++++++------------
 xen/common/domain.c          |  81 +++++++++++++++++++++++++++++++++
 xen/include/asm-arm/domain.h |   2 +
 xen/include/public/vcpu.h    |  15 +++++++
 xen/include/xen/domain.h     |   2 +
 xen/include/xen/sched.h      |   8 ++++
 7 files changed, 227 insertions(+), 48 deletions(-)

-- 
2.7.4



* [PATCH v2 1/2] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall
@ 2019-04-23  8:10   ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-04-23  8:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Wei Liu,
	Roger Pau Monné

From: Andrii Anisov <andrii_anisov@epam.com>

The hypercall employs the same vcpu_register_runstate_memory_area
structure for the interface, but requires the registered area to not
cross a page boundary.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/common/domain.c       |  5 ++++-
 xen/include/public/vcpu.h | 15 +++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 88bbe98..ae22049 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1532,10 +1532,13 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
             vcpu_runstate_get(v, &runstate);
             __copy_to_guest(runstate_guest(v), &runstate, 1);
         }
-
         break;
     }
 
+    case VCPUOP_register_runstate_phys_memory_area:
+        rc = -EOPNOTSUPP;
+        break;
+
 #ifdef VCPU_TRAP_NMI
     case VCPUOP_send_nmi:
         if ( !guest_handle_is_null(arg) )
diff --git a/xen/include/public/vcpu.h b/xen/include/public/vcpu.h
index 3623af9..d7da4a3 100644
--- a/xen/include/public/vcpu.h
+++ b/xen/include/public/vcpu.h
@@ -235,6 +235,21 @@ struct vcpu_register_time_memory_area {
 typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t;
 DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t);
 
+/*
+ * Register a shared memory area from which the guest may obtain its own
+ * runstate information without needing to execute a hypercall.
+ * Notes:
+ *  1. The registered address must be a guest physical address.
+ *  2. The registered runstate area must not cross a page boundary.
+ *  3. Only one shared area may be registered per VCPU. The shared area is
+ *     updated by the hypervisor each time the VCPU is scheduled. Thus
+ *     runstate.state will always be RUNSTATE_running and
+ *     runstate.state_entry_time will indicate the system time at which the
+ *     VCPU was last scheduled to run.
+ * @extra_arg == pointer to vcpu_register_runstate_memory_area structure.
+ */
+#define VCPUOP_register_runstate_phys_memory_area 14
+
 #endif /* __XEN_PUBLIC_VCPU_H__ */
 
 /*
-- 
2.7.4
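
As a usage note for the above: with the runstate_update_flag VM_ASSIST
enabled, a guest reading the registered area is expected to follow the usual
seqlock-like protocol on state_entry_time. A minimal, hypothetical reader
sketch (Linux-style primitives assumed; runstate_snapshot() is an invented
name):

    /* Sketch only: take a consistent snapshot of the shared runstate area.
     * Assumes the hypervisor sets XEN_RUNSTATE_UPDATE in state_entry_time
     * around its writes (VM_ASSIST runstate_update_flag). */
    static void runstate_snapshot(struct vcpu_runstate_info *res,
                                  const struct vcpu_runstate_info *shared)
    {
        uint64_t t;

        do {
            t = READ_ONCE(shared->state_entry_time);
            smp_rmb();          /* read the flag/time before the payload */
            *res = *shared;
            smp_rmb();          /* read the payload before the re-check */
        } while ((t & XEN_RUNSTATE_UPDATE) ||
                 t != READ_ONCE(shared->state_entry_time));
    }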



* [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-04-23  8:10   ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-04-23  8:10 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Wei Liu,
	Roger Pau Monné

From: Andrii Anisov <andrii_anisov@epam.com>

VCPUOP_register_runstate_phys_memory_area is implemented via runstate
area mapping.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
---
 xen/arch/arm/domain.c        |  62 +++++++++++++++++--------
 xen/arch/x86/domain.c        | 105 +++++++++++++++++++++++++++++++------------
 xen/common/domain.c          |  80 ++++++++++++++++++++++++++++++++-
 xen/include/asm-arm/domain.h |   2 +
 xen/include/xen/domain.h     |   2 +
 xen/include/xen/sched.h      |   8 ++++
 6 files changed, 210 insertions(+), 49 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 6dc633e..8e24e63 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -275,32 +275,55 @@ static void ctxt_switch_to(struct vcpu *n)
 }
 
 /* Update per-VCPU guest runstate shared memory area (if registered). */
-static void update_runstate_area(struct vcpu *v)
+void update_runstate_area(struct vcpu *v)
 {
-    void __user *guest_handle = NULL;
+    if ( !guest_handle_is_null(runstate_guest(v)) )
+    {
+        void __user *guest_handle = NULL;
+        if ( VM_ASSIST(v->domain, runstate_update_flag) )
+        {
+            guest_handle = &v->runstate_guest.p->state_entry_time + 1;
+            guest_handle--;
+            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+            __raw_copy_to_guest(guest_handle,
+                                (void *)(&v->runstate.state_entry_time + 1) - 1,
+                                1);
+            smp_wmb();
+        }
 
-    if ( guest_handle_is_null(runstate_guest(v)) )
-        return;
+        __copy_to_guest(runstate_guest(v), &v->runstate, 1);
 
-    if ( VM_ASSIST(v->domain, runstate_update_flag) )
-    {
-        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
-        guest_handle--;
-        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
-        smp_wmb();
+        if ( guest_handle )
+        {
+            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+            smp_wmb();
+            __raw_copy_to_guest(guest_handle,
+                                (void *)(&v->runstate.state_entry_time + 1) - 1,
+                                1);
+        }
     }
 
-    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
-
-    if ( guest_handle )
+    spin_lock(&v->mapped_runstate_lock);
+    if ( v->mapped_runstate )
     {
-        v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
-        smp_wmb();
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        if ( VM_ASSIST(v->domain, runstate_update_flag) )
+        {
+            v->mapped_runstate->state_entry_time |= XEN_RUNSTATE_UPDATE;
+            smp_wmb();
+            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+        }
+
+        memcpy(v->mapped_runstate, &v->runstate, sizeof(v->runstate));
+
+        if ( VM_ASSIST(v->domain, runstate_update_flag) )
+        {
+            v->mapped_runstate->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+            smp_wmb();
+            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        }
     }
+    spin_unlock(&v->mapped_runstate_lock);
+
 }
 
 static void schedule_tail(struct vcpu *prev)
@@ -998,6 +1021,7 @@ long do_arm_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) a
     {
         case VCPUOP_register_vcpu_info:
         case VCPUOP_register_runstate_memory_area:
+        case VCPUOP_register_runstate_phys_memory_area:
             return do_vcpu_op(cmd, vcpuid, arg);
         default:
             return -EINVAL;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 9eaa978..46c2219 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1558,51 +1558,98 @@ void paravirt_ctxt_switch_to(struct vcpu *v)
         wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
 }
 
-/* Update per-VCPU guest runstate shared memory area (if registered). */
-bool update_runstate_area(struct vcpu *v)
+static void update_mapped_runstate_area_native(struct vcpu *v)
 {
-    bool rc;
-    struct guest_memory_policy policy = { .nested_guest_mode = false };
-    void __user *guest_handle = NULL;
-
-    if ( guest_handle_is_null(runstate_guest(v)) )
-        return true;
-
-    update_guest_memory_policy(v, &policy);
-
     if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
-        guest_handle = has_32bit_shinfo(v->domain)
-            ? &v->runstate_guest.compat.p->state_entry_time + 1
-            : &v->runstate_guest.native.p->state_entry_time + 1;
-        guest_handle--;
         v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        v->mapped_runstate.native->state_entry_time |= XEN_RUNSTATE_UPDATE;
         smp_wmb();
     }
 
-    if ( has_32bit_shinfo(v->domain) )
+    memcpy(v->mapped_runstate.native, &v->runstate, sizeof(v->runstate));
+
+    if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
-        struct compat_vcpu_runstate_info info;
+        v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        v->mapped_runstate.native->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        smp_wmb();
+    }
+}
 
-        XLAT_vcpu_runstate_info(&info, &v->runstate);
-        __copy_to_guest(v->runstate_guest.compat, &info, 1);
-        rc = true;
+static void update_mapped_runstate_area_compat(struct vcpu *v)
+{
+    if ( VM_ASSIST(v->domain, runstate_update_flag) )
+    {
+        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+        v->mapped_runstate.compat->state_entry_time |= XEN_RUNSTATE_UPDATE;
+        smp_wmb();
     }
-    else
-        rc = __copy_to_guest(runstate_guest(v), &v->runstate, 1) !=
-             sizeof(v->runstate);
 
-    if ( guest_handle )
+    memcpy(v->mapped_runstate.compat, &v->runstate, sizeof(v->runstate));
+
+    if ( VM_ASSIST(v->domain, runstate_update_flag) )
     {
         v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+        v->mapped_runstate.compat->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
         smp_wmb();
-        __raw_copy_to_guest(guest_handle,
-                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
     }
+}
 
-    update_guest_memory_policy(v, &policy);
+/* Update per-VCPU guest runstate shared memory area (if registered). */
+bool update_runstate_area(struct vcpu *v)
+{
+    bool rc = true;
+
+    if ( !guest_handle_is_null(runstate_guest(v)) )
+    {
+        struct guest_memory_policy policy = { .nested_guest_mode = false };
+        void __user *guest_handle = NULL;
+
+        update_guest_memory_policy(v, &policy);
+        if ( VM_ASSIST(v->domain, runstate_update_flag) )
+        {
+            guest_handle = has_32bit_shinfo(v->domain)
+                ? &v->runstate_guest.compat.p->state_entry_time + 1
+                : &v->runstate_guest.native.p->state_entry_time + 1;
+            guest_handle--;
+            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
+            __raw_copy_to_guest(guest_handle,
+                                (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+            smp_wmb();
+        }
+
+        if ( has_32bit_shinfo(v->domain) )
+        {
+            struct compat_vcpu_runstate_info info;
+
+            XLAT_vcpu_runstate_info(&info, &v->runstate);
+            __copy_to_guest(v->runstate_guest.compat, &info, 1);
+            rc = true;
+        }
+        else
+            rc = __copy_to_guest(runstate_guest(v), &v->runstate, 1) !=
+                 sizeof(v->runstate);
+
+        if ( guest_handle )
+        {
+            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
+            smp_wmb();
+            __raw_copy_to_guest(guest_handle,
+                                (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
+        }
+        update_guest_memory_policy(v, &policy);
+    }
+
+    spin_lock(&v->mapped_runstate_lock);
+    if ( v->mapped_runstate )
+    {
+        if ( has_32bit_shinfo((v)->domain) )
+            update_mapped_runstate_area_compat(v);
+        else
+            update_mapped_runstate_area_native(v);
+    }
+    spin_unlock(&v->mapped_runstate_lock);
 
     return rc;
 }
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ae22049..6df76c6 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -149,6 +149,7 @@ struct vcpu *vcpu_create(
     v->dirty_cpu = VCPU_CPU_CLEAN;
 
     spin_lock_init(&v->virq_lock);
+    spin_lock_init(&v->mapped_runstate_lock);
 
     tasklet_init(&v->continue_hypercall_tasklet, NULL, 0);
 
@@ -699,6 +700,69 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
     return 0;
 }
 
+static void _unmap_runstate_area(struct vcpu *v)
+{
+    mfn_t mfn;
+
+    if ( !v->mapped_runstate )
+        return;
+
+    mfn = _mfn(virt_to_mfn(runstate_guest(v).p));
+
+    unmap_domain_page_global((void *)
+                             ((unsigned long)v->mapped_runstate &
+                              PAGE_MASK));
+
+    v->mapped_runstate = NULL;
+    put_page_and_type(mfn_to_page(mfn));
+}
+
+static int map_runstate_area(struct vcpu *v,
+                      struct vcpu_register_runstate_memory_area *area)
+{
+    unsigned long offset = area->addr.p & ~PAGE_MASK;
+    gfn_t gfn = gaddr_to_gfn(area->addr.p);
+    struct domain *d = v->domain;
+    void *mapping;
+    struct page_info *page;
+    size_t size = sizeof(struct vcpu_runstate_info);
+
+    if ( offset > (PAGE_SIZE - size) )
+        return -EINVAL;
+
+    page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);
+    if ( !page )
+        return -EINVAL;
+
+    if ( !get_page_type(page, PGT_writable_page) )
+    {
+        put_page(page);
+        return -EINVAL;
+    }
+
+    mapping = __map_domain_page_global(page);
+
+    if ( mapping == NULL )
+    {
+        put_page_and_type(page);
+        return -ENOMEM;
+    }
+
+    spin_lock(&v->mapped_runstate_lock);
+    _unmap_runstate_area(v);
+    v->mapped_runstate = mapping + offset;
+    spin_unlock(&v->mapped_runstate_lock);
+
+    return 0;
+}
+
+static void unmap_runstate_area(struct vcpu *v)
+{
+    spin_lock(&v->mapped_runstate_lock);
+    _unmap_runstate_area(v);
+    spin_unlock(&v->mapped_runstate_lock);
+}
+
 int domain_kill(struct domain *d)
 {
     int rc = 0;
@@ -737,7 +801,11 @@ int domain_kill(struct domain *d)
         if ( cpupool_move_domain(d, cpupool0) )
             return -ERESTART;
         for_each_vcpu ( d, v )
+        {
+            set_xen_guest_handle(runstate_guest(v), NULL);
+            unmap_runstate_area(v);
             unmap_vcpu_info(v);
+        }
         d->is_dying = DOMDYING_dead;
         /* Mem event cleanup has to go here because the rings 
          * have to be put before we call put_domain. */
@@ -1192,6 +1260,7 @@ int domain_soft_reset(struct domain *d)
     for_each_vcpu ( d, v )
     {
         set_xen_guest_handle(runstate_guest(v), NULL);
+        unmap_runstate_area(v);
         unmap_vcpu_info(v);
     }
 
@@ -1536,8 +1605,17 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
     }
 
     case VCPUOP_register_runstate_phys_memory_area:
-        rc = -EOPNOTSUPP;
+    {
+        struct vcpu_register_runstate_memory_area area;
+
+        rc = -EFAULT;
+        if ( copy_from_guest(&area, arg, 1) )
+            break;
+
+        rc = map_runstate_area(v, &area);
+
         break;
+    }
 
 #ifdef VCPU_TRAP_NMI
     case VCPUOP_send_nmi:
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 312fec8..3fb6ea2 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -217,6 +217,8 @@ void vcpu_show_execution_state(struct vcpu *);
 void vcpu_show_registers(const struct vcpu *);
 void vcpu_switch_to_aarch64_mode(struct vcpu *);
 
+void update_runstate_area(struct vcpu *);
+
 /*
  * Due to the restriction of GICv3, the number of vCPUs in AFF0 is
  * limited to 16, thus only the first 4 bits of AFF0 are legal. We will
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index d1bfc82..ecddcfe 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -118,4 +118,6 @@ struct vnuma_info {
 
 void vnuma_destroy(struct vnuma_info *vnuma);
 
+struct vcpu_register_runstate_memory_area;
+
 #endif /* __XEN_DOMAIN_H__ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 748bb0f..2afe31c 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -163,15 +163,23 @@ struct vcpu
     void            *sched_priv;    /* scheduler-specific data */
 
     struct vcpu_runstate_info runstate;
+
+    spinlock_t      mapped_runstate_lock;
+
 #ifndef CONFIG_COMPAT
 # define runstate_guest(v) ((v)->runstate_guest)
     XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
+    vcpu_runstate_info_t *mapped_runstate;
 #else
 # define runstate_guest(v) ((v)->runstate_guest.native)
     union {
         XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
         XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
     } runstate_guest; /* guest address */
+    union {
+        vcpu_runstate_info_t* native;
+        vcpu_runstate_info_compat_t* compat;
+    } mapped_runstate; /* guest address */
 #endif
 
     /* last time when vCPU is scheduled out */
-- 
2.7.4



* Re: [PATCH v2 1/2] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall
@ 2019-05-08 10:10     ` George Dunlap
  0 siblings, 0 replies; 83+ messages in thread
From: George Dunlap @ 2019-05-08 10:10 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, xen-devel, Wei Liu,
	Roger Pau Monné

On 4/23/19 9:10 AM, Andrii Anisov wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> The hypercall employs the same vcpu_register_runstate_memory_area
> structure for the interface, but requires the registered area to not
> cross a page boundary.

This needs a lot more information.

I use the following template when writing changelog entries:

* What's the current situation
* Why is that a problem
* How this patch fixes it
* Any other information needed to understand the patch.

You can drop things when it's obvious, but it's not at all obvious in
this case, at least to me.

This seems to be implementing a VCPUOP which duplicates another VCPUOP's
functionality, with some tweaks.  I can back-guess what the differences
are from the description (e.g., the other one must be a virtual address,
may cross a page boundary, &c), but I can't tell why that's a problem.

Also, I don't see the point of separating the definition from the
implementation -- why not squash these two patches together?

 -George


* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-08 13:39           ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-08 13:39 UTC (permalink / raw)
  To: Andrii Anisov
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini

(+xen-devel, juergen, boris)

On 08/05/2019 10:25, Andrii Anisov wrote:
> Hello Julien,
> 
> On 03.05.19 13:19, Julien Grall wrote:
>> Could you be a bit more specific about the failure? Do you see "Failed to walk 
>> page-table"?
> Sorry for a delayed answer.
> 
> Yes, I see the following, also with your patch (and replacing dprintk with 
> printk, 'cause it is more convenient to me to issue a non-debug build):
> 
> (XEN) d1v3 par 0x809
> (XEN) d1v1 par 0x809
> (XEN) d1v2 par 0x809
> (XEN) d1v0 par 0x809
> (XEN) d1v3: Failed to walk page-table va 0xffff80002ff7e348
> (XEN) d1v1: Failed to walk page-table va 0xffff80002ff4e348
> (XEN) d1v2: Failed to walk page-table va 0xffff80002ff66348
> (XEN) d1v0: Failed to walk page-table va 0xffff80002ff36348
> 
> 
> As I understand ARM ARM, EL1_PAR says it is the "translation fault level" (as 
> per "ISS encoding for an exception from a Data Abort", DFSC bits).

That's a translation fault, level 0, on a stage-1 page-table walk. To confirm:
you have KPTI disabled, right? Does it always fail, or only from time to time?
Could you dump the guest registers when this happens?
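
For reference, 0x809 decodes as follows under the Armv8 PAR_EL1 failure
layout (a sketch; bit positions are from the Arm ARM and worth double-checking
there):

    0x809 = 0b1000_0000_1001
      bit 0     F   = 1          -> the translation aborted
      bits 6:1  FST = 0b000100   -> translation fault, level 0
      bit 9     S   = 0          -> fault on the stage-1 walk
      bit 11                     -> RES1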

Even with KPTI disabled, you are still potentially facing issues when using a
virtual address (although they may be more difficult to trigger). Indeed, you
are relying on the guest OS not to unmap the region or touch the page-table
entries used for the walk.

Unfortunately we can't prevent a guest from playing with its page-tables. So at
best we can only work around it.

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-08 13:54             ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-08 13:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini

Hello Julien,

On 08.05.19 16:39, Julien Grall wrote:
> That's a translation fault, level 0, on a stage-1 page-table walk. To confirm: you have KPTI disabled, right?
Yes, KPTI is disabled. That's why I'm curious what is going wrong here.

> Does it always fail, or only from time to time?
It happens on boot, and those prints are constant and make the boot time enormous. I once waited until the user prompt (half an hour or so); they calmed down when the system was visibly idle, but still appeared from time to time (maybe on userspace activity).

> Could you dump the guest registers when this happens?
Could you please remind me how to do that?

> Even with KPTI disabled, you are still potentially facing issues when using a virtual address (although they may be more difficult to trigger). Indeed, you are relying on the guest OS not to unmap the region or touch the page-table entries used for the walk.

> Unfortunately we can't prevent a guest from playing with its page-tables. So at best we can only work around it.
And that's why we turned to using a guest physical address for the runstate area.

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-04-23  8:10 [PATCH v2 0/2] Introduce runstate area registration with phys address Andrii Anisov
                   ` (2 preceding siblings ...)
       [not found] ` <fa126315-31af-854e-817a-8640b431c82b@arm.com>
@ 2019-05-08 13:59 ` Julien Grall
  2019-05-13 10:15   ` Andrii Anisov
  3 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-08 13:59 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné

Hi,

On 23/04/2019 09:10, Andrii Anisov wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> Following discussion [1] it is introduced and implemented a runstate
> registration interface which uses guest's phys address instead of a virtual one.
> The new hypercall employes the same data structures as a predecessor, but
> expects the vcpu_runstate_info structure to not cross a page boundary.
> The interface is implemented in a way vcpu_runstate_info structure is mapped to
> the hypervisor on the hypercall processing and is directly accessed during its
> updates. This runstate area mapping follows vcpu_info structure registration.
> 
> Permanent mapping of runstate area would consume vmap area on arm32 what is
> limited to 1G. Though it is assumed that ARM32 does not target the server market
> and the rest of possible applications will not host a huge number of VCPUs to
> render the limitation into the issue.

I am afraid I can't possibly back this assumption. As I pointed out in the 
previous version, I would be OK with the always-map solution on Arm32 (pending 
performance numbers) because it would be possible to increase the virtual 
address area by reworking the address space.

> 
> The series is tested for ARM64. Build tested for x86. I'd appreciate if someone
> could check it with x86.
> The Linux kernel patch is here [2]. Though it is for 4.14.

The patch looks wrong to me. You are using virt_to_phys() on a percpu area. What 
actually guarantees that the physical address will always be the same?

> 
> Changes in:
> 
>    v2: It was reconsidered the new runstate interface implementation. The new
>        interface is made independent of the old one. Do not share runstate_area
>        field, and consequently avoid excessive concurrency with the old runstate
>        interface usage.
>        Introduced locks in order to resolve possible concurrency between runstate
>        area registration and usage.
>        Addressed comments from Jan Beulich [3][4] about coding style nits. Though
>        some of them become obsolete with refactoring and few are picked into this
>        thread for further discussion.
> 
>        There were made performance measurements of approaches (runstate mapped on
>        access vs mapped on registration). The test setups are as following:
>       
>        Thin Dom0 (Linux with intiramfs) with DomD running rich Yocto Linux. In
>        DomD 3d benchmark numbers are compared. The benchmark is GlMark2. GLMark2
>        is ran with different resolutions in order to emit different irq load,
>        where 320x240 emits high IRQ load, but 1920x1080 emits low irq load.
>        Separately tested baking DomD benchmark run with primitive Dom0 CPU burn
>        (dd), in order to stimulate VCPU(dX)->VCPU(dY) switches rather than

Are you saying that the dd command is the CPUBurn? I am not sure how this could 
be considered a CPU burn. IMHO, it is more I/O related.

>        VCPU(dX)->idle->VCPU(dX).
>        with following results:
> 
>                              mapped               mapped
>                              on access            on init
>        GLMark2 320x240       2852                 2877          +0.8%
>            +Dom0 CPUBurn     2088                 2094          +0.2%
>        GLMark2 800x600       2368                 2375          +0.3%
>            +Dom0 CPUBurn     1868                 1921          +2.8%
>        GLMark2 1920x1080     931                  931            0%
>            +Dom0 CPUBurn     892                  894           +0.2%
> 
>        Please note that "mapped on access" means using the old runstate
>        registering interface. And runstate update in this case still often fails
>        to map runstate area like [5], despite the fact that our Linux kernel
>        does not have KPTI enabled. So runstate area update, in this case, is
>        really shortened.

We know that the old interface is broken, so telling us the new interface is 
faster is not entirely useful. What I am more interested in is how it performs 
if you use a guest physical address with the "mapped on access" version.
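
For comparison, a "mapped on access" variant driven by a guest physical
address could look roughly like the sketch below (an Arm-flavoured,
hypothetical helper reusing the primitives from patch 2/2; it omits the
writable page-type check and the XEN_RUNSTATE_UPDATE handling for brevity):

    static void update_runstate_by_gpa(struct vcpu *v, paddr_t gpa)
    {
        unsigned long offset = gpa & ~PAGE_MASK;
        struct page_info *page;
        void *va;

        /* Translate and take a reference on the single backing page... */
        page = get_page_from_gfn(v->domain, gfn_x(gaddr_to_gfn(gpa)),
                                 NULL, P2M_ALLOC);
        if ( !page )
            return;

        /* ...map it around this one update only... */
        va = __map_domain_page(page);
        memcpy(va + offset, &v->runstate, sizeof(v->runstate));

        /* ...then drop the mapping and the reference straight away. */
        unmap_domain_page(va);
        put_page(page);
    }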

> 
> 
>        Also it was checked IRQ latency difference using TBM in a setup similar to
>        [5]. Please note that the IRQ rate is one in 30 seconds, and only
>        VCPU->idle->VCPU use-case is considered. With following results (in ns,
>        the timer granularity 120ns):

How long did you run the benchmark?

> 
>        mapped on access:
>            max=9960 warm_max=8640 min=7200 avg=7626
>        mapped on init:
>            max=9480 warm_max=8400 min=7080 avg=7341
> 
>        Unfortunately there are no consitent results yet from profiling using
>        Lauterbach PowerTrace. Still in communication with the tracer vendor in
>        order to setup the proper configuration.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2019-02/msg00416.html
> [2] https://github.com/aanisov/linux/commit/ba34d2780f57ea43f81810cd695aace7b55c0f29
> [3] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00936.html
> [4] https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00934.html
> [5] https://lists.xenproject.org/archives/html/xen-devel/2019-01/msg02369.html
> [6] https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg02297.html
> 
> 
> Andrii Anisov (2):
>    xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall
>    xen: implement VCPUOP_register_runstate_phys_memory_area
> 
>   xen/arch/arm/domain.c        |  62 +++++++++++++++++--------
>   xen/arch/x86/domain.c        | 105 +++++++++++++++++++++++++++++++------------
>   xen/common/domain.c          |  81 +++++++++++++++++++++++++++++++++
>   xen/include/asm-arm/domain.h |   2 +
>   xen/include/public/vcpu.h    |  15 +++++++
>   xen/include/xen/domain.h     |   2 +
>   xen/include/xen/sched.h      |   8 ++++
>   7 files changed, 227 insertions(+), 48 deletions(-)
> 

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-08 14:31               ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-08 14:31 UTC (permalink / raw)
  To: Andrii Anisov
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini



On 08/05/2019 14:54, Andrii Anisov wrote:
>> Does it always fail, or only from time to time?
> It happens on boot. And those prints are permanent and make boot time enormous. 
> I once waited until the user prompt (half an hour or so); they calmed down once 
> the system was visibly idle, but still appeared from time to time (maybe on 
> userspace activity).

I haven't seen them with nokpti platform so far. I am curious to know what is 
your configuration here.

> 
>> Could you dump the guest register when this happen?
> Could you please remember how to do that?

vcpu_show_execution_state(current) should do the job here.
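
For illustration only (the exact call site is an assumption, not something this
series defines), it can be dropped right next to the diagnostic you are seeing,
with va being the faulting address already available in the surrounding code:

    /* Illustration: dump the faulting vCPU state where the existing
     * "Failed to walk page-table" message is printed. */
    gprintk(XENLOG_ERR, "Failed to walk page-table va %#lx\n", va);
    vcpu_show_execution_state(current);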

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-08 15:40     ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-08 15:40 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich, xen-devel, Wei Liu, Roger Pau Monné

Hi,

On 23/04/2019 09:10, Andrii Anisov wrote:
> From: Andrii Anisov <andrii_anisov@epam.com>
> 
> VCPUOP_register_runstate_phys_memory_area is implemented via runstate
> area mapping.
>
> Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
> ---
>   xen/arch/arm/domain.c        |  62 +++++++++++++++++--------
>   xen/arch/x86/domain.c        | 105 +++++++++++++++++++++++++++++++------------
>   xen/common/domain.c          |  80 ++++++++++++++++++++++++++++++++-
>   xen/include/asm-arm/domain.h |   2 +
>   xen/include/xen/domain.h     |   2 +
>   xen/include/xen/sched.h      |   8 ++++
>   6 files changed, 210 insertions(+), 49 deletions(-)

This patch is quite hard to read because you are reworking the code and at the 
same time implementing the new VCPUOP. How about moving the rework into a separate 
patch? The implementation can then be folded into the previous patch as suggested 
by George.

> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 6dc633e..8e24e63 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -275,32 +275,55 @@ static void ctxt_switch_to(struct vcpu *n)
>   }
>   
>   /* Update per-VCPU guest runstate shared memory area (if registered). */
> -static void update_runstate_area(struct vcpu *v)
> +void update_runstate_area(struct vcpu *v)

Why do you export update_runstate_area? The function does not seem to be called 
from outside this file.

>   {
> -    void __user *guest_handle = NULL;
> +    if ( !guest_handle_is_null(runstate_guest(v)) )
> +    {
> +        void __user *guest_handle = NULL;
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> +            guest_handle--;
> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +            __raw_copy_to_guest(guest_handle,
> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
> +                                1);
> +            smp_wmb();
> +        }
>   
> -    if ( guest_handle_is_null(runstate_guest(v)) )
> -        return;
> +        __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>   
> -    if ( VM_ASSIST(v->domain, runstate_update_flag) )
> -    {
> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
> -        guest_handle--;
> -        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> -        __raw_copy_to_guest(guest_handle,
> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
> -        smp_wmb();
> +        if ( guest_handle )
> +        {
> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            __raw_copy_to_guest(guest_handle,
> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
> +                                1);
> +        }
>       }
>   
> -    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
> -
> -    if ( guest_handle )
> +    spin_lock(&v->mapped_runstate_lock);
> +    if ( v->mapped_runstate )

The code looks a bit odd to me: you seem to allow a guest to provide 2 runstate 
areas, one using a guest virtual address and the other using a guest physical 
address.

It would be best if we prevent a guest from mixing and matching them. IOW, if the 
guest provides a physical address first, then *all* subsequent calls should use a 
physical address. Alternatively this could be a per-vCPU decision.
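
Something along these lines in the registration paths (just a sketch, the exact
error value does not matter) would already rule out mixing:

    /* Sketch: refuse a guest-physical registration while the legacy
     * virtual-address interface is still registered (and similarly refuse
     * the legacy call once a physical address has been registered). */
    if ( !guest_handle_is_null(runstate_guest(v)) )
        return -EBUSY;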

>       {
> -        v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> -        smp_wmb();
> -        __raw_copy_to_guest(guest_handle,
> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            v->mapped_runstate->state_entry_time |= XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
> +        }
> +
> +        memcpy(v->mapped_runstate, &v->runstate, sizeof(v->runstate));
> +
> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
> +        {
> +            v->mapped_runstate->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +            smp_wmb();
> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
> +        }
>       }
> +    spin_unlock(&v->mapped_runstate_lock);
> +

NIT: The newline is not necessary here.

>   }
>   
>   static void schedule_tail(struct vcpu *prev)
> @@ -998,6 +1021,7 @@ long do_arm_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) a
>       {
>           case VCPUOP_register_vcpu_info:
>           case VCPUOP_register_runstate_memory_area:
> +        case VCPUOP_register_runstate_phys_memory_area:
>               return do_vcpu_op(cmd, vcpuid, arg);
>           default:
>               return -EINVAL;


[...]

> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index ae22049..6df76c6 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -149,6 +149,7 @@ struct vcpu *vcpu_create(
>       v->dirty_cpu = VCPU_CPU_CLEAN;
>   
>       spin_lock_init(&v->virq_lock);
> +    spin_lock_init(&v->mapped_runstate_lock);
>   
>       tasklet_init(&v->continue_hypercall_tasklet, NULL, 0);
>   
> @@ -699,6 +700,69 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
>       return 0;
>   }
>   
> +static void _unmap_runstate_area(struct vcpu *v)
A better name would be unmap_runstate_area_locked() so you avoid the reserved 
name and make the locking expectation clear.

> +{
> +    mfn_t mfn;
> +
> +    if ( !v->mapped_runstate )
> +        return;
> +
> +    mfn = _mfn(virt_to_mfn(runstate_guest(v).p));

As pointed out by Jan in the previous version:

The pointer is the result of __map_domain_page_global(), so I don't think you
can legitimately use virt_to_mfn() on it, at least not on x86;
domain_page_map_to_mfn() is what you want to use here.

> +
> +    unmap_domain_page_global((void *)
> +                             ((unsigned long)v->mapped_runstate &
> +                              PAGE_MASK));
> +
> +    v->mapped_runstate = NULL;
> +    put_page_and_type(mfn_to_page(mfn));
> +}

We seem to have this pattern in a few places now (see unmap_guest_page). It 
would be good to introduce helpers that can be used everywhere (probably lifted 
from common/event_fifo.c).
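
With the rename and Jan's point applied, the helper would look roughly like
this (a sketch only; the MFN derivation and the name are the only changes
from what you posted):

    static void unmap_runstate_area_locked(struct vcpu *v)
    {
        mfn_t mfn;

        if ( !v->mapped_runstate )
            return;

        /* Derive the MFN from the global mapping itself. */
        mfn = domain_page_map_to_mfn(v->mapped_runstate);

        unmap_domain_page_global((void *)
                                 ((unsigned long)v->mapped_runstate &
                                  PAGE_MASK));

        v->mapped_runstate = NULL;
        put_page_and_type(mfn_to_page(mfn));
    }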

> +
> +static int map_runstate_area(struct vcpu *v,
> +                      struct vcpu_register_runstate_memory_area *area)
> +{
> +    unsigned long offset = area->addr.p & ~PAGE_MASK;
> +    gfn_t gfn = gaddr_to_gfn(area->addr.p);
> +    struct domain *d = v->domain;
> +    void *mapping;
> +    struct page_info *page;
> +    size_t size = sizeof (struct vcpu_runstate_info );

The space before ) is not necessary.

But is the variable really necessary?
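
i.e. the check could simply be:

    if ( offset > (PAGE_SIZE - sizeof(struct vcpu_runstate_info)) )
        return -EINVAL;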

> +
> +    if ( offset > (PAGE_SIZE - size) )
> +        return -EINVAL;
> +
> +    page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);
> +    if ( !page )
> +        return -EINVAL;
> +
> +    if ( !get_page_type(page, PGT_writable_page) )
> +    {
> +        put_page(page);
> +        return -EINVAL;
> +    }
> +
> +    mapping = __map_domain_page_global(page);
> +
> +    if ( mapping == NULL )
> +    {
> +        put_page_and_type(page);
> +        return -ENOMEM;
> +    }
> +
> +    spin_lock(&v->mapped_runstate_lock);
> +    _unmap_runstate_area(v);
> +    v->mapped_runstate = mapping + offset;
> +    spin_unlock(&v->mapped_runstate_lock);
> +
> +    return 0;
> +}
> +
> +static void unmap_runstate_area(struct vcpu *v)
> +{
> +    spin_lock(&v->mapped_runstate_lock);
> +    _unmap_runstate_area(v);
> +    spin_unlock(&v->mapped_runstate_lock);
> +}
> +
>   int domain_kill(struct domain *d)
>   {
>       int rc = 0;
> @@ -737,7 +801,11 @@ int domain_kill(struct domain *d)
>           if ( cpupool_move_domain(d, cpupool0) )
>               return -ERESTART;
>           for_each_vcpu ( d, v )
> +        {
> +            set_xen_guest_handle(runstate_guest(v), NULL);
> +            unmap_runstate_area(v);
>               unmap_vcpu_info(v);
> +        }
>           d->is_dying = DOMDYING_dead;
>           /* Mem event cleanup has to go here because the rings
>            * have to be put before we call put_domain. */
> @@ -1192,6 +1260,7 @@ int domain_soft_reset(struct domain *d)
>       for_each_vcpu ( d, v )
>       {
>           set_xen_guest_handle(runstate_guest(v), NULL);
> +        unmap_runstate_area(v);
>           unmap_vcpu_info(v);
>       }
>   
> @@ -1536,8 +1605,17 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>       }
>   
>       case VCPUOP_register_runstate_phys_memory_area:
> -        rc = -EOPNOTSUPP;
> +    {
> +        struct vcpu_register_runstate_memory_area area;
> +
> +        rc = -EFAULT;
> +        if ( copy_from_guest(&area, arg, 1) )
> +            break;
> +
> +        rc = map_runstate_area(v, &area);
> +
>           break;
> +    }
>   
>   #ifdef VCPU_TRAP_NMI
>       case VCPUOP_send_nmi:
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 312fec8..3fb6ea2 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -217,6 +217,8 @@ void vcpu_show_execution_state(struct vcpu *);
>   void vcpu_show_registers(const struct vcpu *);
>   void vcpu_switch_to_aarch64_mode(struct vcpu *);
>   
> +void update_runstate_area(struct vcpu *);
> +
>   /*
>    * Due to the restriction of GICv3, the number of vCPUs in AFF0 is
>    * limited to 16, thus only the first 4 bits of AFF0 are legal. We will
> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
> index d1bfc82..ecddcfe 100644
> --- a/xen/include/xen/domain.h
> +++ b/xen/include/xen/domain.h
> @@ -118,4 +118,6 @@ struct vnuma_info {
>   
>   void vnuma_destroy(struct vnuma_info *vnuma);
>   
> +struct vcpu_register_runstate_memory_area;
> +
>   #endif /* __XEN_DOMAIN_H__ */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 748bb0f..2afe31c 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -163,15 +163,23 @@ struct vcpu
>       void            *sched_priv;    /* scheduler-specific data */
>   
>       struct vcpu_runstate_info runstate;
> +
> +    spinlock_t      mapped_runstate_lock;
> +
>   #ifndef CONFIG_COMPAT
>   # define runstate_guest(v) ((v)->runstate_guest)
>       XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
> +    vcpu_runstate_info_t *mapped_runstate;
>   #else
>   # define runstate_guest(v) ((v)->runstate_guest.native)
>       union {
>           XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>           XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
>       } runstate_guest; /* guest address */
> +    union {
> +        vcpu_runstate_info_t* native;
> +        vcpu_runstate_info_compat_t* compat;
> +    } mapped_runstate; /* guest address */

The combination of mapped_runstate and runstate_guest is a bit confusing. I 
think you want to rework the interface to show that only one is possible at a 
time and make clear which one is used by whom. Maybe:

union
{
    /* Legacy interface to be used when the guest provides a virtual address */
    union {
       XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
       ...
    } virt;

    /* Interface used when the guest provides a physical address */
    union {
    } phys;
} runstate_guest;

runstate_guest_type /* could be a bool or enum */
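
For concreteness, a possible full layout (all names below are only
illustrative) could be:

union
{
    /* Legacy interface: the guest registered a virtual address. */
    union {
        XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
        XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
    } virt;

    /* New interface: the guest registered a physical address and Xen keeps
     * a mapping of it. */
    union {
        vcpu_runstate_info_t *native;
        vcpu_runstate_info_compat_t *compat;
    } phys;
} runstate_guest;

bool runstate_guest_is_phys; /* or an enum, as said above */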

Jan what do you think?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-08 16:01                 ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-08 16:01 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini

Hello Julien,


On 08.05.19 17:31, Julien Grall wrote:
> I haven't seen them with nokpti platform so far. I am curious to know what is your configuration here.

XEN 4.12 with our patches. Thin Dom0 is a generic armv8 Linux, LK 4.14.75 with patches from Renesas and us.
DomD is LK 4.14.75 with HW assigned and the corresponding drivers. You can find the LK configs on my Google Drive [1].

Those faults fire only for DomD (on its start).

> vcpu_show_execution_state(current) should do the job here.

Here it is:

(XEN) d1v2 par 0x809
(XEN) d1v2: Failed to walk page-table va 0xffff80002ff66357
(XEN) *** Dumping Dom1 vcpu#2 state: ***
(XEN) ----[ Xen-4.12.0  arm64  debug=n   Not tainted ]----
(XEN) CPU:    2
(XEN) PC:     0000ffffbd28dc88
(XEN) LR:     0000ffffbd28e674
(XEN) SP_EL0: 0000ffffe9890410
(XEN) SP_EL1: ffff00000803c000
(XEN) CPSR:   40000000 MODE:64-bit EL0t (Guest User)
(XEN)      X0: 0000000000000000  X1: 0000ffffe98907bc  X2: 0000ffffe9890430
(XEN)      X3: 0000000000000000  X4: 0000000000000000  X5: 0000000000000000
(XEN)      X6: 0000000000000001  X7: 0000ffffe98907b8  X8: 0101010101010101
(XEN)      X9: 0000000000000000 X10: 0000ffffe9890570 X11: 0000000000000020
(XEN)     X12: 0000000000000000 X13: 0000000000000000 X14: 0000000000000015
(XEN)     X15: 000000000000000a X16: 0000ffffbd6502c8 X17: 0000ffffbd28e660
(XEN)     X18: 0000ffffe98902cf X19: 0000ffffe98907b8 X20: 0000ffffe98907b8
(XEN)     X21: 0000ffffbd653000 X22: 0000ffffe98906f8 X23: 000000000000001e
(XEN)     X24: 0000ffffe98906a8 X25: 0000ffffbd584000 X26: 0000aaaac8736f68
(XEN)     X27: 0000ffffbd584978 X28: 0000000000000002  FP: 0000ffffe9890410
(XEN)
(XEN)    ELR_EL1: 0000ffffbd2b4698
(XEN)    ESR_EL1: 56000000
(XEN)    FAR_EL1: 0000ffffbd525014
(XEN)
(XEN)  SCTLR_EL1: 34d5d91d
(XEN)    TCR_EL1: 34b5503510
(XEN)  TTBR0_EL1: 000000006c056000
(XEN)  TTBR1_EL1: 00030000412d8000
(XEN)
(XEN)   VTCR_EL2: 80023558
(XEN)  VTTBR_EL2: 000200073fed6000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 000000008078663f
(XEN)  TTBR0_EL2: 0000000078184000
(XEN)
(XEN)    ESR_EL2: 5a000ea1
(XEN)  HPFAR_EL2: 0000000000030010
(XEN)    FAR_EL2: ffff000008005f00
(XEN)
(XEN) No stack trace for guest user-mode

BTW, we have another public holiday tomorrow, so I'll get back to you on Friday.

[1] https://drive.google.com/drive/folders/1SX0FhAOKrkFdbNELFuBUt7fmLZKb8XmK?usp=sharing

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-09  9:27       ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-09  9:27 UTC (permalink / raw)
  To: Julien Grall, andrii.anisov
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	andrii_anisov, Roger Pau Monne

>>> On 08.05.19 at 17:40, <julien.grall@arm.com> wrote:
> On 23/04/2019 09:10, Andrii Anisov wrote:
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -163,15 +163,23 @@ struct vcpu
>>       void            *sched_priv;    /* scheduler-specific data */
>>   
>>       struct vcpu_runstate_info runstate;
>> +
>> +    spinlock_t      mapped_runstate_lock;
>> +
>>   #ifndef CONFIG_COMPAT
>>   # define runstate_guest(v) ((v)->runstate_guest)
>>       XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
>> +    vcpu_runstate_info_t *mapped_runstate;
>>   #else
>>   # define runstate_guest(v) ((v)->runstate_guest.native)
>>       union {
>>           XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>>           XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
>>       } runstate_guest; /* guest address */
>> +    union {
>> +        vcpu_runstate_info_t* native;
>> +        vcpu_runstate_info_compat_t* compat;
>> +    } mapped_runstate; /* guest address */
> 
> The combination of mapped_runstate and runstate_guest is a bit confusing. I 
> think you want to rework the interface to show that only one is possible at the 
> time and make clear which one is used by who. Maybe:
> 
> union
> {
>     /* Legacy interface to be used when the guest provides a virtual address */
>     union {
>        XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>        ...
>     } virt;
> 
>     /* Interface used when the guest provides a physical address */
>     union {
>     } phys;
> } runstate_guest.
> 
> runstate_guest_type /* could be a bool or enum */
> 
> Jan what do you think?

I fully agree - no mixing of approaches here, if possible. However,
care needs to be taken that after a domain reset the new kernel
can choose the opposite model. The question is whether there are even
other cases where independent components (say boot loader and
OS) may need to be permitted to choose models independently of
one another.
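
I.e. (just a sketch, reusing the flag name suggested above) the reset paths
would then also need to forget the previously chosen model:

    for_each_vcpu ( d, v )
    {
        unmap_runstate_area(v);                        /* phys model */
        set_xen_guest_handle(runstate_guest(v), NULL); /* virt model */
        v->runstate_guest_type = RUNSTATE_NONE;        /* illustrative value */
    }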

As a side note, Andrii - would you please avoid sending double mail
to xen-devel (addresses differing just by domain used)?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-08 13:59 ` Julien Grall
@ 2019-05-13 10:15   ` Andrii Anisov
  2019-05-13 11:16     ` Julien Grall
  0 siblings, 1 reply; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 10:15 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné

Hello Julien,

On 08.05.19 16:59, Julien Grall wrote:
> Hi,
> 
> On 23/04/2019 09:10, Andrii Anisov wrote:
>> From: Andrii Anisov <andrii_anisov@epam.com>
>>
>> Following discussion [1] it is introduced and implemented a runstate
>> registration interface which uses guest's phys address instead of a virtual one.
>> The new hypercall employes the same data structures as a predecessor, but
>> expects the vcpu_runstate_info structure to not cross a page boundary.
>> The interface is implemented in a way vcpu_runstate_info structure is mapped to
>> the hypervisor on the hypercall processing and is directly accessed during its
>> updates. This runstate area mapping follows vcpu_info structure registration.
>>
>> Permanent mapping of runstate area would consume vmap area on arm32 what is
>> limited to 1G. Though it is assumed that ARM32 does not target the server market
>> and the rest of possible applications will not host a huge number of VCPUs to
>> render the limitation into the issue.
> 
> I am afraid I can't possible back this assumption. As I pointed out in the previous version, I would be OK with the always map solution on Arm32 (pending performance) because it would be possible to increase the virtual address area by reworking the address space.

I'm sorry, but I'm not sure what I should do about that.

>>
>> The series is tested for ARM64. Build tested for x86. I'd appreciate if someone
>> could check it with x86.
>> The Linux kernel patch is here [2]. Though it is for 4.14.
> 
> The patch looks wrong to me. You are using virt_to_phys() on a percpu area. 
> What does actually promise you the physical address will always be the same?

Sorry for my ignorance here, but could you please elaborate on what is wrong?


> Are you saying that the dd command is the CPU burn? I am not sure how this could be considered a CPU burn. IMHO, this is more IO related.

Both /dev/null and /dev/zero are virtual devices; no actual IO is performed during their operation, so all the load is CPU (user and sys).

> 
>>        VCPU(dX)->idle->VCPU(dX).
>>        with following results:
>>
>>                              mapped               mapped
>>                              on access            on init
>>        GLMark2 320x240       2852                 2877          +0.8%
>>            +Dom0 CPUBurn     2088                 2094          +0.2%
>>        GLMark2 800x600       2368                 2375          +0.3%
>>            +Dom0 CPUBurn     1868                 1921          +2.8%
>>        GLMark2 1920x1080     931                  931            0%
>>            +Dom0 CPUBurn     892                  894           +0.2%
>>
>>        Please note that "mapped on access" means using the old runstate
>>        registering interface. And runstate update in this case still often fails
>>        to map runstate area like [5], despite the fact that our Linux kernel
>>        does not have KPTI enabled. So runstate area update, in this case, is
>>        really shortened.
> 
> We know that the old interface is broken, so telling us the new interface is faster is not entirely useful. What I am more interested in is how it performs if you use a guest physical address with the "mapped on access" version.

Hm, I see your point. Well, I can do that measurement for ARM to compare performance.

> 
>>
>>
>>        Also it was checked IRQ latency difference using TBM in a setup similar to
>>        [5]. Please note that the IRQ rate is one in 30 seconds, and only
>>        VCPU->idle->VCPU use-case is considered. With following results (in ns,
>>        the timer granularity 120ns):
> 
> How long did you run the benchmark?

I ran it until the avg more or less stabilized (2-3 minutes), then took the minimal avg (note, we have a moving average there).

> 
>>
>>        mapped on access:
>>            max=9960 warm_max=8640 min=7200 avg=7626
>>        mapped on init:
>>            max=9480 warm_max=8400 min=7080 avg=7341
>>
>>        Unfortunately there are no consitent results yet from profiling using
>>        Lauterbach PowerTrace. Still in communication with the tracer vendor in
>>        order to setup the proper configuration.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-13 10:50                   ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-13 10:50 UTC (permalink / raw)
  To: Andrii Anisov
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini

Hi,

On 5/8/19 5:01 PM, Andrii Anisov wrote:
> On 08.05.19 17:31, Julien Grall wrote:
>> I haven't seen them with nokpti platform so far. I am curious to know 
>> what is your configuration here.
> 
> XEN 4.12 with our patches. Thin Dom0 is a generic armv8 Linux, LK 
> 4.14.75 with patches from Renesas and us.
> DomD is LK 4.14.75 with HW assigned and drivers. LK configs you can find 
> on my google drive [1].
> 
> Those faults fire only for DomD (on its start).
> 
>> vcpu_show_execution_state(current) should do the job here.
> 
> Here it is:
> 
> (XEN) d1v2 par 0x809
> (XEN) d1v2: Failed to walk page-table va 0xffff80002ff66357
> (XEN) *** Dumping Dom1 vcpu#2 state: ***
> (XEN) ----[ Xen-4.12.0  arm64  debug=n   Not tainted ]----
> (XEN) CPU:    2
> (XEN) PC:     0000ffffbd28dc88
> (XEN) LR:     0000ffffbd28e674
> (XEN) SP_EL0: 0000ffffe9890410
> (XEN) SP_EL1: ffff00000803c000
> (XEN) CPSR:   40000000 MODE:64-bit EL0t (Guest User)

This one is happening when the guest was running in user mode. Is it 
always the case?

Also, your DomD .config has CONFIG_UNMAP_KERNEL_AT_EL0. So how do you 
disable kpti?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 10:15   ` Andrii Anisov
@ 2019-05-13 11:16     ` Julien Grall
  2019-05-13 14:14       ` Andrii Anisov
  0 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-13 11:16 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 5/13/19 11:15 AM, Andrii Anisov wrote:
> Hello Julien,
> 
> On 08.05.19 16:59, Julien Grall wrote:
>> Hi,
>>
>> On 23/04/2019 09:10, Andrii Anisov wrote:
>>> From: Andrii Anisov <andrii_anisov@epam.com>
>>>
>>> Following discussion [1] it is introduced and implemented a runstate
>>> registration interface which uses guest's phys address instead of a 
>>> virtual one.
>>> The new hypercall employes the same data structures as a predecessor, 
>>> but
>>> expects the vcpu_runstate_info structure to not cross a page boundary.
>>> The interface is implemented in a way vcpu_runstate_info structure is 
>>> mapped to
>>> the hypervisor on the hypercall processing and is directly accessed 
>>> during its
>>> updates. This runstate area mapping follows vcpu_info structure 
>>> registration.
>>>
>>> Permanent mapping of runstate area would consume vmap area on arm32 
>>> what is
>>> limited to 1G. Though it is assumed that ARM32 does not target the 
>>> server market
>>> and the rest of possible applications will not host a huge number of 
>>> VCPUs to
>>> render the limitation into the issue.
>>
>> I am afraid I can't possible back this assumption. As I pointed out in 
>> the previous version, I would be OK with the always map solution on 
>> Arm32 (pending performance) because it would be possible to increase 
>> the virtual address area by reworking the address space.
> 
> I'm sorry, I'm not sure what should be my actions about that.

There is no code modification involved so far. Just update your cover 
letter with what I said above.

> 
>>>
>>> The series is tested for ARM64. Build tested for x86. I'd appreciate 
>>> if someone
>>> could check it with x86.
>>> The Linux kernel patch is here [2]. Though it is for 4.14.
>>
>> The patch looks wrong to me. You are using virt_to_phys() on a percpu 
>> area. What does actually promise you the physical address will always 
>> be the same?
> 
> Sorry for my ignorance here, could you please elaborate more about what 
> is wrong here?

While the virtual address will never change over the life cycle of 
a variable, I am not entirely sure we can make the same assumption for 
the physical address.

I know that kmalloc() promises you that the physical address will 
not change. But percpu does not seem to use kmalloc(), so have you 
confirmed this assumption holds?
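
For reference, the guest-side registration under discussion looks roughly like
the following (illustrative only, this is not the posted patch [2]; the
function name is made up and arch details are omitted):

    /* Register the per-vCPU runstate area by guest physical address. The
     * question above is whether virt_to_phys() on a static percpu variable
     * is guaranteed to stay stable for the lifetime of the guest. */
    static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);

    static void xen_register_runstate_phys(int cpu)
    {
        struct vcpu_register_runstate_memory_area area;

        area.addr.p = virt_to_phys(&per_cpu(xen_runstate, cpu));
        if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_phys_memory_area,
                               cpu, &area))
            pr_warn("xen: runstate registration failed for cpu %d\n", cpu);
    }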

> 
> 
>> Are you saying that the dd command is the CPU burn? I am not sure how 
>> this could be considered a CPU burn. IMHO, this is more IO related.
> 
> Both /dev/null and /dev/zero are virtual devices; no actual IO is 
> performed during their operation, so all the load is CPU (user and sys).

Thank you for the explanation. Shall I guess this is an existing 
benchmark [1]?

> 
>>
>>>        VCPU(dX)->idle->VCPU(dX).
>>>        with following results:
>>>
>>>                              mapped               mapped
>>>                              on access            on init
>>>        GLMark2 320x240       2852                 2877          +0.8%
>>>            +Dom0 CPUBurn     2088                 2094          +0.2%
>>>        GLMark2 800x600       2368                 2375          +0.3%
>>>            +Dom0 CPUBurn     1868                 1921          +2.8%
>>>        GLMark2 1920x1080     931                  931            0%
>>>            +Dom0 CPUBurn     892                  894           +0.2%
>>>
>>>        Please note that "mapped on access" means using the old runstate
>>>        registering interface. And runstate update in this case still 
>>> often fails
>>>        to map runstate area like [5], despite the fact that our Linux 
>>> kernel
>>>        does not have KPTI enabled. So runstate area update, in this 
>>> case, is
>>>        really shortened.
>>
>> We know that the old interface is broken, so telling us the new 
>> interface is faster is not entirely useful. What I am more interested in 
>> is how it performs if you use a guest physical address with the "mapped 
>> on access" version.
> 
> Hm, I see your point. Well, I can make it for ARM to compare performance.
> 
>>
>>>
>>>
>>>        Also it was checked IRQ latency difference using TBM in a 
>>> setup similar to
>>>        [5]. Please note that the IRQ rate is one in 30 seconds, and only
>>>        VCPU->idle->VCPU use-case is considered. With following 
>>> results (in ns,
>>>        the timer granularity 120ns):
>>
>> How long did you run the benchmark?
> 
> I ran it until the avg more or less stabilized (2-3 minutes), then took 
> the minimal avg (note, we have a moving average there).
Did you re-run it multiple times?

> 
>>
>>>
>>>        mapped on access:
>>>            max=9960 warm_max=8640 min=7200 avg=7626
>>>        mapped on init:
>>>            max=9480 warm_max=8400 min=7080 avg=7341
>>>
>>>        Unfortunately there are no consitent results yet from 
>>> profiling using
>>>        Lauterbach PowerTrace. Still in communication with the tracer 
>>> vendor in
>>>        order to setup the proper configuration.
> 

[1] https://patrickmn.com/projects/cpuburn/ ? If so, a link to the 
benchmark would be useful.

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-13 12:30       ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 12:30 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich, xen-devel, Wei Liu, Roger Pau Monné



On 08.05.19 18:40, Julien Grall wrote:
> This patch is quite hard to read because you are reworking the code and at the same time implementing the new VCPUOP. How about moving the rework in a separate patch? The implementation can then be fold in the previous patch as suggested by George.

OK.

> 
>>
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 6dc633e..8e24e63 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -275,32 +275,55 @@ static void ctxt_switch_to(struct vcpu *n)
>>   }
>>   /* Update per-VCPU guest runstate shared memory area (if registered). */
>> -static void update_runstate_area(struct vcpu *v)
>> +void update_runstate_area(struct vcpu *v)
> 
> Why do you export update_runstate_area? The function does not seem to be called outside.

Ouch, this was left over from one of the previous versions.

> 
>>   {
>> -    void __user *guest_handle = NULL;
>> +    if ( !guest_handle_is_null(runstate_guest(v)) )
>> +    {
>> +        void __user *guest_handle = NULL;
>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +        {
>> +            guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>> +            guest_handle--;
>> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>> +            __raw_copy_to_guest(guest_handle,
>> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
>> +                                1);
>> +            smp_wmb();
>> +        }
>> -    if ( guest_handle_is_null(runstate_guest(v)) )
>> -        return;
>> +        __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>> -    if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> -    {
>> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>> -        guest_handle--;
>> -        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>> -        __raw_copy_to_guest(guest_handle,
>> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
>> -        smp_wmb();
>> +        if ( guest_handle )
>> +        {
>> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +            smp_wmb();
>> +            __raw_copy_to_guest(guest_handle,
>> +                                (void *)(&v->runstate.state_entry_time + 1) - 1,
>> +                                1);
>> +        }
>>       }
>> -    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>> -
>> -    if ( guest_handle )
>> +    spin_lock(&v->mapped_runstate_lock);
>> +    if ( v->mapped_runstate )
> 
> The code looks a bit odd to me, you seem to allow a guest to provide 2 runstate areas: one using guest virtual address the other using guest physical address.
> 
> It would be best if we prevent a guest to mix match them. 

At first I tried implementing it that way, but the locking and decision code became really ugly and complex while trying to cover guest-misbehaviour scenarios.

> IOW, if the guest provide a physical address first, then *all* the call should be physical address. Alternatively this could be a per vCPU decision.

I guess we should agree on what to implement first.

> 
>>       {
>> -        v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> -        smp_wmb();
>> -        __raw_copy_to_guest(guest_handle,
>> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 1);
>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +        {
>> +            v->mapped_runstate->state_entry_time |= XEN_RUNSTATE_UPDATE;
>> +            smp_wmb();
>> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>> +        }
>> +
>> +        memcpy(v->mapped_runstate, &v->runstate, sizeof(v->runstate));
>> +
>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>> +        {
>> +            v->mapped_runstate->state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +            smp_wmb();
>> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>> +        }
>>       }
>> +    spin_unlock(&v->mapped_runstate_lock);
>> +
> 
> NIT: The newline is not necessary here.

OK.

> 
>>   }
>>   static void schedule_tail(struct vcpu *prev)
>> @@ -998,6 +1021,7 @@ long do_arm_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) a
>>       {
>>           case VCPUOP_register_vcpu_info:
>>           case VCPUOP_register_runstate_memory_area:
>> +        case VCPUOP_register_runstate_phys_memory_area:
>>               return do_vcpu_op(cmd, vcpuid, arg);
>>           default:
>>               return -EINVAL;
> 
> 
> [...]
> 
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index ae22049..6df76c6 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -149,6 +149,7 @@ struct vcpu *vcpu_create(
>>       v->dirty_cpu = VCPU_CPU_CLEAN;
>>       spin_lock_init(&v->virq_lock);
>> +    spin_lock_init(&v->mapped_runstate_lock);
>>       tasklet_init(&v->continue_hypercall_tasklet, NULL, 0);
>> @@ -699,6 +700,69 @@ int rcu_lock_live_remote_domain_by_id(domid_t dom, struct domain **d)
>>       return 0;
>>   }
>> +static void _unmap_runstate_area(struct vcpu *v)
> A better name would be unmap_runstate_area_locked(), so you avoid the reserved name and make the intended use clear.

OK.

> 
>> +{
>> +    mfn_t mfn;
>> +
>> +    if ( !v->mapped_runstate )
>> +        return;
>> +
>> +    mfn = _mfn(virt_to_mfn(runstate_guest(v).p));
> 
> As pointed out by Jan in the previous version:
> 
> The pointer is the result of __map_domain_page_global(), so I don't think
> you can legitimately use virt_to_mfn() on it, at least not on x86;
> domain_page_map_to_mfn() is what you want to use here.

Yep.

> 
>> +
>> +    unmap_domain_page_global((void *)
>> +                             ((unsigned long)v->mapped_runstate &
>> +                              PAGE_MASK));
>> +
>> +    v->mapped_runstate = NULL;
>> +    put_page_and_type(mfn_to_page(mfn));
>> +}
> 
> We seem to have this pattern in a few places now (see unmap_guest_page). It would be good to introduce helpers that can be used everywhere (probably lifted from common/event_fifo.c).

I'll check.
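
Something along these lines, perhaps (a rough sketch only; helper names are tentative, and it also switches to domain_page_map_to_mfn() as Jan suggested):

/* Tentative generic helpers, mirroring the event_fifo pattern. */
static void *map_guest_area(struct domain *d, paddr_t gaddr)
{
    gfn_t gfn = gaddr_to_gfn(gaddr);
    struct page_info *page;
    void *mapping;

    /* Page-offset handling is left to the caller for brevity. */
    page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);
    if ( !page )
        return NULL;

    if ( !get_page_type(page, PGT_writable_page) )
    {
        put_page(page);
        return NULL;
    }

    mapping = __map_domain_page_global(page);
    if ( !mapping )
        put_page_and_type(page);

    return mapping;
}

static void unmap_guest_area(void *mapping)
{
    mfn_t mfn;

    if ( !mapping )
        return;

    /* Not virt_to_mfn(): the mapping came from __map_domain_page_global(). */
    mfn = domain_page_map_to_mfn(mapping);
    unmap_domain_page_global(mapping);
    put_page_and_type(mfn_to_page(mfn));
}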

> 
>> +
>> +static int map_runstate_area(struct vcpu *v,
>> +                      struct vcpu_register_runstate_memory_area *area)
>> +{
>> +    unsigned long offset = area->addr.p & ~PAGE_MASK;
>> +    gfn_t gfn = gaddr_to_gfn(area->addr.p);
>> +    struct domain *d = v->domain;
>> +    void *mapping;
>> +    struct page_info *page;
>> +    size_t size = sizeof (struct vcpu_runstate_info );
> 
> space is not necessary before ).
> 
> But is the variable really necessary?

Well, I think it could be dropped.
> 
>> +
>> +    if ( offset > (PAGE_SIZE - size) )
>> +        return -EINVAL;
>> +
>> +    page = get_page_from_gfn(d, gfn_x(gfn), NULL, P2M_ALLOC);
>> +    if ( !page )
>> +        return -EINVAL;
>> +
>> +    if ( !get_page_type(page, PGT_writable_page) )
>> +    {
>> +        put_page(page);
>> +        return -EINVAL;
>> +    }
>> +
>> +    mapping = __map_domain_page_global(page);
>> +
>> +    if ( mapping == NULL )
>> +    {
>> +        put_page_and_type(page);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    spin_lock(&v->mapped_runstate_lock);
>> +    _unmap_runstate_area(v);
>> +    v->mapped_runstate = mapping + offset;
>> +    spin_unlock(&v->mapped_runstate_lock);
>> +
>> +    return 0;
>> +}
>> +
>> +static void unmap_runstate_area(struct vcpu *v)
>> +{
>> +    spin_lock(&v->mapped_runstate_lock);
>> +    _unmap_runstate_area(v);
>> +    spin_unlock(&v->mapped_runstate_lock);
>> +}
>> +
>>   int domain_kill(struct domain *d)
>>   {
>>       int rc = 0;
>> @@ -737,7 +801,11 @@ int domain_kill(struct domain *d)
>>           if ( cpupool_move_domain(d, cpupool0) )
>>               return -ERESTART;
>>           for_each_vcpu ( d, v )
>> +        {
>> +            set_xen_guest_handle(runstate_guest(v), NULL);
>> +            unmap_runstate_area(v);
>>               unmap_vcpu_info(v);
>> +        }
>>           d->is_dying = DOMDYING_dead;
>>           /* Mem event cleanup has to go here because the rings
>>            * have to be put before we call put_domain. */
>> @@ -1192,6 +1260,7 @@ int domain_soft_reset(struct domain *d)
>>       for_each_vcpu ( d, v )
>>       {
>>           set_xen_guest_handle(runstate_guest(v), NULL);
>> +        unmap_runstate_area(v);
>>           unmap_vcpu_info(v);
>>       }
>> @@ -1536,8 +1605,17 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
>>       }
>>       case VCPUOP_register_runstate_phys_memory_area:
>> -        rc = -EOPNOTSUPP;
>> +    {
>> +        struct vcpu_register_runstate_memory_area area;
>> +
>> +        rc = -EFAULT;
>> +        if ( copy_from_guest(&area, arg, 1) )
>> +            break;
>> +
>> +        rc = map_runstate_area(v, &area);
>> +
>>           break;
>> +    }
>>   #ifdef VCPU_TRAP_NMI
>>       case VCPUOP_send_nmi:
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 312fec8..3fb6ea2 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -217,6 +217,8 @@ void vcpu_show_execution_state(struct vcpu *);
>>   void vcpu_show_registers(const struct vcpu *);
>>   void vcpu_switch_to_aarch64_mode(struct vcpu *);
>> +void update_runstate_area(struct vcpu *);
>> +
>>   /*
>>    * Due to the restriction of GICv3, the number of vCPUs in AFF0 is
>>    * limited to 16, thus only the first 4 bits of AFF0 are legal. We will
>> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
>> index d1bfc82..ecddcfe 100644
>> --- a/xen/include/xen/domain.h
>> +++ b/xen/include/xen/domain.h
>> @@ -118,4 +118,6 @@ struct vnuma_info {
>>   void vnuma_destroy(struct vnuma_info *vnuma);
>> +struct vcpu_register_runstate_memory_area;
>> +
>>   #endif /* __XEN_DOMAIN_H__ */
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> index 748bb0f..2afe31c 100644
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -163,15 +163,23 @@ struct vcpu
>>       void            *sched_priv;    /* scheduler-specific data */
>>       struct vcpu_runstate_info runstate;
>> +
>> +    spinlock_t      mapped_runstate_lock;
>> +
>>   #ifndef CONFIG_COMPAT
>>   # define runstate_guest(v) ((v)->runstate_guest)
>>       XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
>> +    vcpu_runstate_info_t *mapped_runstate;
>>   #else
>>   # define runstate_guest(v) ((v)->runstate_guest.native)
>>       union {
>>           XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>>           XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
>>       } runstate_guest; /* guest address */
>> +    union {
>> +        vcpu_runstate_info_t* native;
>> +        vcpu_runstate_info_compat_t* compat;
>> +    } mapped_runstate; /* guest address */
> The combination of mapped_runstate and runstate_guest is a bit confusing. I think you want to rework the interface to show that only one is possible at a time and make clear which one is used by whom. Maybe:

As I said before, IMO coupling those interfaces makes the code complicated and ugly.

> 
> union
> {
>     /* Legacy interface to be used when the guest provides a virtual address */
>     union {
>        XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>        ...
>     } virt;
> 
>     /* Interface used when the guest provides a physical address */
>     union {
>     } phys;
> } runstate_guest.
> 
> runstate_guest_type /* could be a bool or enum */
> 
> Jan what do you think?
> 
> Cheers,
> 

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 11:16     ` Julien Grall
@ 2019-05-13 14:14       ` Andrii Anisov
  2019-05-13 14:34         ` Julien Grall
  0 siblings, 1 reply; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 14:14 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné

Hello Julien,

On 13.05.19 14:16, Julien Grall wrote:
>>> I am afraid I can't possibly back this assumption. As I pointed out in the previous version, I would be OK with the always-map solution on Arm32 (pending performance) because it would be possible to increase the virtual address area by reworking the address space.
>>
>> I'm sorry, I'm not sure what should be my actions about that.
> 
> There no code modification involved so far. Just updating your cover letter with what I just said above.

OK, I'll take it for the next version.

>>> The patch looks wrong to me. You are using virt_to_phys() on a percpu area. What does actually promise you the physical address will always be the same?
>>
>> Sorry for my ignorance here, could you please elaborate more about what is wrong here?
> 
> While the virtual address will never change over the life cycle of a variable, I am not entirely sure we can make the same assumption for the physical address.
> 
> I know that kmalloc() is promising you that the physical address will not change. But percpu does not seem to use kmalloc() so have you confirmed this assumption can hold?

I need a bit more time to investigate.


>>> Are you saying that the command dd is the CPUBurn? I am not sure how this could be considered a CPUBurn. IMHO, this is more IO related.
>>
>> Both /dev/null and /dev/zero are virtual devices; no actual IO is performed during their operation, and all the load is CPU (user and sys).
> 
> Thank you for the explanation. Shall I guess this is an existing benchmark [1]?

Well, "dd if=/dev/zero of=/dev/null" is just a trivial way to get one CPU core busy. It works on (almost) any Linux system without any additional setup.
Using another trivial Go application for that purpose seems like overkill to me. Yet if you insist, I can use it.


>> I did run it until the avg more or less stabilizes (2-3 minutes), then took the minimal avg (note, we have a moving average there).
> Did you re-run multiple times?
Yes, sure. These are not one-try results.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
@ 2019-05-13 14:34                     ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 14:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: Juergen Gross, xen-devel, Boris Ostrovsky, Stefano Stabellini

Hello Julien,

On 13.05.19 13:50, Julien Grall wrote:
> Also, your DomD .config has CONFIG_UNMAP_KERNEL_AT_EL0. So how do you disable kpti?

Sorry for the mess; I was looking for "CONFIG_PAGE_TABLE_ISOLATION" and did not find it. That is what I had googled as the KPTI enable option, but it is x86-only, which I overlooked.

So I have KPTI enabled here. Thank you.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 14:14       ` Andrii Anisov
@ 2019-05-13 14:34         ` Julien Grall
  2019-05-13 15:29           ` Andrii Anisov
  0 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-13 14:34 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 5/13/19 3:14 PM, Andrii Anisov wrote:
> Hello Julien,

Hello,

> On 13.05.19 14:16, Julien Grall wrote:
>>>> I am afraid I can't possibly back this assumption. As I pointed out 
>>>> in the previous version, I would be OK with the always-map solution 
>>>> on Arm32 (pending performance) because it would be possible to 
>>>> increase the virtual address area by reworking the address space.
>>>
>>> I'm sorry, I'm not sure what should be my actions about that.
>>
>> There no code modification involved so far. Just updating your cover 
>> letter with what I just said above.
> 
> OK, I'll take it for the next version.
> 
>>>> The patch looks wrong to me. You are using virt_to_phys() on a 
>>>> percpu area. What does actually promise you the physical address 
>>>> will always be the same?
>>>
>>> Sorry for my ignorance here, could you please elaborate more about 
>>> what is wrong here?
>>
>> While the virtual address will never change over the life cycle 
>> of a variable, I am not entirely sure we can make the same assumption 
>> for the physical address.
>>
>> I know that kmalloc() is promising you that the physical address will 
>> not change. But percpu does not seem to use kmalloc() so have you 
>> confirmed this assumption can hold?
> 
> I need a bit more time to investigate.
> 
> 
>>>> Are you saying that the command dd is the CPUBurn? I am not sure how 
>>>> this could be considered a CPUBurn. IMHO, this is more IO related.
>>>
>>> Both /dev/null and /dev/zero are virtual devices; no actual IO is 
>>> performed during their operation, and all the load is CPU (user and sys).
>>
>> Thank you for the explanation. Shall I guess this is an existing 
>> benchmark [1]?
> 
> Well, "dd if=/dev/zero of=/dev/null" is just a trivial way to get one 
> CPU core busy. It works on (almost) any Linux system without any 
> additional setup.
> Using another trivial Go application for that purpose seems like 
> overkill to me. Yet if you insist, I can use it.

The point of my message is to understand the exact workload/setup you are 
using. "dd" was not an entirely obvious choice for CPUBurn, and Google 
didn't provide many websites backing this information.

Anyway, now that I understand the workload a bit better, a couple more 
questions:
    - How many vCPUs are you using in each domain?
    - What scheduler are you using?
    - What is the affinity?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 14:34         ` Julien Grall
@ 2019-05-13 15:29           ` Andrii Anisov
  2019-05-13 15:31             ` Julien Grall
  0 siblings, 1 reply; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 15:29 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 13.05.19 17:34, Julien Grall wrote:
> The point of my message is to understand the exact workload/setup you are using. "dd" was not an entirely obvious choice for CPUBurn, and Google didn't provide many websites backing this information.

> Anyway, now that I understand the workload a bit better, a couple more questions:
>     - How many vCPUs are you using in each domain?
>     - What scheduler are you using?
>     - What is the affinity?

For the test with glmark2: Dom0 (4 VCPUs), DomD (4 VCPUs), 4 PCPUs, no affinity specified.

For the test with TBM: Dom0 (2 VCPUs) pinned to PCPUs 0 and 1, and TBM domain with one VCPU pinned to PCPU 2.

The scheduler is the default (credit2).

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:29           ` Andrii Anisov
@ 2019-05-13 15:31             ` Julien Grall
  2019-05-13 15:38               ` Andrii Anisov
  0 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-13 15:31 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné

Hi,

On 5/13/19 4:29 PM, Andrii Anisov wrote:
> 
> 
> On 13.05.19 17:34, Julien Grall wrote:
>> My point of my message is to understand the exact workload/setup you 
>> are using. "dd" was not an entirely obvious choice for CPUBurn and 
>> Google didn't provide a lot of website backing this information.
> 
>> Anyway, now I understand a bit more the workload, a couple of more 
>> questions:
>>     - How many vCPUs are you using in each domain?
>>     - What scheduler are you using?
>>     - What is the affinity?
> 
> For the test with glmark2: Dom0 (4 VCPUs), DomD (4 VCPUs), 4 PCPUs, no 
> affinity specified.

So, are you running 4 dd (one for each core) in parallel? Are they pinned?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:31             ` Julien Grall
@ 2019-05-13 15:38               ` Andrii Anisov
  2019-05-13 15:40                 ` Julien Grall
  0 siblings, 1 reply; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 15:38 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 13.05.19 18:31, Julien Grall wrote:
> So, are you running 4 dd (one for each core) in parallel? Are they pinned?

Yes, sure, I run 4 dd's in parallel to get all VCPUs loaded. No, they are not pinned.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:38               ` Andrii Anisov
@ 2019-05-13 15:40                 ` Julien Grall
  2019-05-13 15:42                   ` Andrii Anisov
  0 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-13 15:40 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 5/13/19 4:38 PM, Andrii Anisov wrote:
> 
> 
> On 13.05.19 18:31, Julien Grall wrote:
>> So, are you running 4 dd (one for each core) in parallel? Are they 
>> pinned?
> 
> Yes, sure I run 4 dd's in parallel to get all VCPUs loaded. No they are 
> not pinned.

 From my understanding, if you want consistency, then setting the 
affinity will definitely help. Otherwise, you may have the scheduler 
kick in and rebalance things.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:40                 ` Julien Grall
@ 2019-05-13 15:42                   ` Andrii Anisov
  2019-05-13 15:45                     ` Julien Grall
  0 siblings, 1 reply; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 15:42 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné


On 13.05.19 18:40, Julien Grall wrote:
>  From my understanding, if you want consistency, then setting the affinity will definitely help. Otherwise, you may have the scheduler kick in and rebalance things.
Sorry, do you mean setting affinity for dd processes, or Dom0 VCPUs, or both?

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:42                   ` Andrii Anisov
@ 2019-05-13 15:45                     ` Julien Grall
  2019-05-13 16:05                       ` Andrii Anisov
  0 siblings, 1 reply; 83+ messages in thread
From: Julien Grall @ 2019-05-13 15:45 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné

Hi.

On 5/13/19 4:42 PM, Andrii Anisov wrote:
> 
> On 13.05.19 18:40, Julien Grall wrote:
>>  From my understanding, if you want consistency, then setting the 
>> affinity will definitely help. Otherwise, you may have the scheduler 
>> to kick up and also balancing.
> Sorry, do you mean setting affinity for dd processes, or Dom0 VCPUs, or 
> both?

I was speaking about the dd processes. But Dom0 vCPUs might also be a good idea.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 0/2] Introduce runstate area registration with phys address
  2019-05-13 15:45                     ` Julien Grall
@ 2019-05-13 16:05                       ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-13 16:05 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Andrii Anisov, Jan Beulich, Roger Pau Monné



On 13.05.19 18:45, Julien Grall wrote:
> I was speaking about dd process. But Dom0 vCPUs might also be a good idea.

I see.

-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14  9:35         ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-14  9:35 UTC (permalink / raw)
  To: Jan Beulich, andrii.anisov
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	andrii_anisov, Roger Pau Monne

Hi Jan,

On 09/05/2019 10:27, Jan Beulich wrote:
>>>> On 08.05.19 at 17:40, <julien.grall@arm.com> wrote:
>> On 23/04/2019 09:10, Andrii Anisov wrote:
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -163,15 +163,23 @@ struct vcpu
>>>        void            *sched_priv;    /* scheduler-specific data */
>>>    
>>>        struct vcpu_runstate_info runstate;
>>> +
>>> +    spinlock_t      mapped_runstate_lock;
>>> +
>>>    #ifndef CONFIG_COMPAT
>>>    # define runstate_guest(v) ((v)->runstate_guest)
>>>        XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
>>> +    vcpu_runstate_info_t *mapped_runstate;
>>>    #else
>>>    # define runstate_guest(v) ((v)->runstate_guest.native)
>>>        union {
>>>            XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>>>            XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
>>>        } runstate_guest; /* guest address */
>>> +    union {
>>> +        vcpu_runstate_info_t* native;
>>> +        vcpu_runstate_info_compat_t* compat;
>>> +    } mapped_runstate; /* guest address */
>>
>> The combination of mapped_runstate and runstate_guest is a bit confusing. I
>> think you want to rework the interface to show that only one is possible at a
>> time and make clear which one is used by whom. Maybe:
>>
>> union
>> {
>>      /* Legacy interface to be used when the guest provides a virtual address */
>>      union {
>>         XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>>         ...
>>      } virt;
>>
>>      /* Interface used when the guest provides a physical address */
>>      union {
>>      } phys;
>> } runstate_guest.
>>
>> runstate_guest_type /* could be a bool or enum */
>>
>> Jan what do you think?
> 
> I fully agree - no mixing of approaches here, if possible. However,
> care needs to be taken that after a domain reset the new kernel
> can choose the opposite model. The question is whether there are even
> other cases where independent components (say boot loader and
> OS) may need to be permitted to choose models independently of
> one another.
Good point. On a similar topic, how does kexec work on Xen? Do we reset the 
domain as well?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14  9:48           ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-14  9:48 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andrii.anisov,
	xen-devel, andrii_anisov, Roger Pau Monne

>>> On 14.05.19 at 11:35, <julien.grall@arm.com> wrote:
> On a similar topic, how does kexec work on Xen? Do we reset the 
> domain as well?

No, how could we? What gets invoked isn't Xen in the common case,
but some other (typically bare metal) OS like Linux.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14  9:58         ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-14  9:58 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich, xen-devel, Wei Liu, Roger Pau Monné



On 13/05/2019 13:30, Andrii Anisov wrote:
> 
> 
> On 08.05.19 18:40, Julien Grall wrote:
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 6dc633e..8e24e63 100644
>>
>>>   {
>>> -    void __user *guest_handle = NULL;
>>> +    if ( !guest_handle_is_null(runstate_guest(v)) )
>>> +    {
>>> +        void __user *guest_handle = NULL;
>>> +        if ( VM_ASSIST(v->domain, runstate_update_flag) )
>>> +        {
>>> +            guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>>> +            guest_handle--;
>>> +            v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>>> +            __raw_copy_to_guest(guest_handle,
>>> +                                (void *)(&v->runstate.state_entry_time + 1) 
>>> - 1,
>>> +                                1);
>>> +            smp_wmb();
>>> +        }
>>> -    if ( guest_handle_is_null(runstate_guest(v)) )
>>> -        return;
>>> +        __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>>> -    if ( VM_ASSIST(v->domain, runstate_update_flag) )
>>> -    {
>>> -        guest_handle = &v->runstate_guest.p->state_entry_time + 1;
>>> -        guest_handle--;
>>> -        v->runstate.state_entry_time |= XEN_RUNSTATE_UPDATE;
>>> -        __raw_copy_to_guest(guest_handle,
>>> -                            (void *)(&v->runstate.state_entry_time + 1) - 1, 
>>> 1);
>>> -        smp_wmb();
>>> +        if ( guest_handle )
>>> +        {
>>> +            v->runstate.state_entry_time &= ~XEN_RUNSTATE_UPDATE;
>>> +            smp_wmb();
>>> +            __raw_copy_to_guest(guest_handle,
>>> +                                (void *)(&v->runstate.state_entry_time + 1) 
>>> - 1,
>>> +                                1);
>>> +        }
>>>       }
>>> -    __copy_to_guest(runstate_guest(v), &v->runstate, 1);
>>> -
>>> -    if ( guest_handle )
>>> +    spin_lock(&v->mapped_runstate_lock);
>>> +    if ( v->mapped_runstate )
>>
>> The code looks a bit odd to me: you seem to allow a guest to provide two 
>> runstate areas, one using a guest virtual address and the other using a 
>> guest physical address.
>>
>> It would be best if we prevent a guest from mixing and matching them.
> 
> Firstly I tried implementing it that way, but the locking and decision 
> code became really ugly and complex while trying to cover 'guest misbehavior' 
> scenarios.

I think it is possible to have a simple version that decides which 
method to use. You can either use the spin_lock to protect everything or use 
something like:

update_runstate_area():

if ( xchg(&v->runstate_in_use, 1) )
   return;

switch ( v->runstate_type )
{
GVADDR:
    update_runstate_by_gvaddr();
GPADDR:
    update_runstate_by_gpaddr();
}

xchg(&v->runstate_in_use, 0);

registering an area

while ( xchg(&v->runstate_in_use, 1) );
/* Clean-up and registering the area */
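
Fleshed out a bit (all names here are placeholders, not a proposal for the final interface), that would look roughly like:

/* Placeholder sketch only. Assumes struct vcpu grows:
 *   unsigned int runstate_in_use;
 *   enum runstate_guest_type runstate_type;
 */
enum runstate_guest_type {
    RUNSTATE_NONE,
    RUNSTATE_GVADDR,     /* legacy interface: guest virtual address */
    RUNSTATE_GPADDR,     /* new interface: guest physical address, kept mapped */
};

static void update_runstate_area(struct vcpu *v)
{
    /* If a (re-)registration is in flight, simply skip this update. */
    if ( xchg(&v->runstate_in_use, 1) )
        return;

    switch ( v->runstate_type )
    {
    case RUNSTATE_GVADDR:
        update_runstate_by_gvaddr(v);   /* the existing copy_to_guest() path */
        break;
    case RUNSTATE_GPADDR:
        update_runstate_by_gpaddr(v);   /* memcpy() into the mapped area */
        break;
    case RUNSTATE_NONE:
        break;
    }

    xchg(&v->runstate_in_use, 0);
}

static int register_runstate_area(struct vcpu *v, enum runstate_guest_type type,
                                  struct vcpu_register_runstate_memory_area *area)
{
    int rc;

    /* Wait for any in-progress update to drain before touching the state. */
    while ( xchg(&v->runstate_in_use, 1) )
        cpu_relax();

    /* Tear down whatever was registered before, then install the new area. */
    discard_runstate_area(v);           /* placeholder for the cleanup */
    rc = setup_runstate_area(v, type, area);
    v->runstate_type = rc ? RUNSTATE_NONE : type;

    xchg(&v->runstate_in_use, 0);

    return rc;
}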

> 
>> IOW, if the guest provides a physical address first, then *all* the calls 
>> should use a physical address. Alternatively this could be a per-vCPU decision.
> 
> I guess we should agree what to implement first.

I think there is an agreement that the two methods should not be used together.

[..]

>>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>>> index 312fec8..3fb6ea2 100644
>>> --- a/xen/include/asm-arm/domain.h
>>> +++ b/xen/include/asm-arm/domain.h
>>> @@ -217,6 +217,8 @@ void vcpu_show_execution_state(struct vcpu *);
>>>   void vcpu_show_registers(const struct vcpu *);
>>>   void vcpu_switch_to_aarch64_mode(struct vcpu *);
>>> +void update_runstate_area(struct vcpu *);
>>> +
>>>   /*
>>>    * Due to the restriction of GICv3, the number of vCPUs in AFF0 is
>>>    * limited to 16, thus only the first 4 bits of AFF0 are legal. We will
>>> diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
>>> index d1bfc82..ecddcfe 100644
>>> --- a/xen/include/xen/domain.h
>>> +++ b/xen/include/xen/domain.h
>>> @@ -118,4 +118,6 @@ struct vnuma_info {
>>>   void vnuma_destroy(struct vnuma_info *vnuma);
>>> +struct vcpu_register_runstate_memory_area;
>>> +
>>>   #endif /* __XEN_DOMAIN_H__ */
>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>> index 748bb0f..2afe31c 100644
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -163,15 +163,23 @@ struct vcpu
>>>       void            *sched_priv;    /* scheduler-specific data */
>>>       struct vcpu_runstate_info runstate;
>>> +
>>> +    spinlock_t      mapped_runstate_lock;
>>> +
>>>   #ifndef CONFIG_COMPAT
>>>   # define runstate_guest(v) ((v)->runstate_guest)
>>>       XEN_GUEST_HANDLE(vcpu_runstate_info_t) runstate_guest; /* guest address */
>>> +    vcpu_runstate_info_t *mapped_runstate;
>>>   #else
>>>   # define runstate_guest(v) ((v)->runstate_guest.native)
>>>       union {
>>>           XEN_GUEST_HANDLE(vcpu_runstate_info_t) native;
>>>           XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
>>>       } runstate_guest; /* guest address */
>>> +    union {
>>> +        vcpu_runstate_info_t* native;
>>> +        vcpu_runstate_info_compat_t* compat;
>>> +    } mapped_runstate; /* guest address */
>> The combination of mapped_runstate and runstate_guest is a bit confusing. I 
>> think you want to rework the interface to show that only one is possible at 
>> a time and make clear which one is used by whom. Maybe:
> 
> As I said before, IMO coupling those interfaces makes the code complicated and 
> ugly.

Well, I can't see how it can be ugly (see my example above).

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 10:08           ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-14 10:08 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich, xen-devel, Wei Liu, Roger Pau Monné

Hello Julien,

On 14.05.19 12:58, Julien Grall wrote:
>> I guess we should agree what to implement first.
> 
> I think there is an agreement that the two methods should not be used together.

For a domain or for a particular VCPU?
How should we respond to a request to register a paddr when a vaddr is already registered, and vice versa?


-- 
Sincerely,
Andrii Anisov.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 11:23             ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-14 11:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andrii.anisov,
	xen-devel, andrii_anisov, Roger Pau Monne



On 14/05/2019 10:48, Jan Beulich wrote:
>>>> On 14.05.19 at 11:35, <julien.grall@arm.com> wrote:
>> On a similar topic, how does Kexec works on Xen? Do we reset the
>> domain as well?
> 
> No, how could we? What gets invoked isn't Xen in the common case,
> but some other (typically bare metal) OS like Linux.

That is not what I asked. What I asked is whether Xen is involved when a guest
kernel kexecs to another kernel.

I don't know enough about kexec to be able to answer that question myself.

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 11:24             ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-14 11:24 UTC (permalink / raw)
  To: Andrii Anisov, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Jan Beulich, xen-devel, Wei Liu, Roger Pau Monné



On 14/05/2019 11:08, Andrii Anisov wrote:
> Hello Julien,
> 
> On 14.05.19 12:58, Julien Grall wrote:
>>> I guess we should agree what to implement first.
>>
>> I think there are an agreement that the two methods should not be used together.
> 
> For a domain or for a particular VCPU?
> How should we response on the request to register paddr when we already have 
> registered vaddr and vice versa?

From the discussion with Jan, you would tear down the previous registration and
then register the new one. This allows cases like a bootloader and a kernel
using different versions of the interface.
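
Roughly, the registration paths could share a helper along these lines (a
sketch only, assuming the v2 patch's mapped_runstate field and a global
mapping; the helper name is made up here, and compat handling is elided):

    /* Forget whatever was registered before, whichever interface was used. */
    static void discard_runstate_area(struct vcpu *v)
    {
        /* Old interface: simply drop the recorded guest virtual address. */
        set_xen_guest_handle(runstate_guest(v), NULL);

        /* New interface: unmap the page mapped at registration time. */
        if ( v->mapped_runstate.native )
        {
            unmap_domain_page_global(v->mapped_runstate.native);
            v->mapped_runstate.native = NULL;
        }
    }

Each registration hypercall would call this first, so a kernel registering by
physical address transparently supersedes whatever its bootloader registered
by virtual address.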

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 11:29               ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-14 11:29 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, andrii.anisov,
	xen-devel, andrii_anisov, Roger Pau Monne

>>> On 14.05.19 at 13:23, <julien.grall@arm.com> wrote:
> On 14/05/2019 10:48, Jan Beulich wrote:
>>>>> On 14.05.19 at 11:35, <julien.grall@arm.com> wrote:
>>> On a similar topic, how does Kexec works on Xen? Do we reset the
>>> domain as well?
>> 
>> No, how could we? What gets invoked isn't Xen in the common case,
>> but some other (typically bare metal) OS like Linux.
> 
> It is not what I asked. What I asked is whether Xen is involved when a guest 
> kernel is kexecing to another kernel.

I don't think I've ever heard of such a thing (outside of perhaps
using domain reset), so I don't think I can answer the (originally
ambiguous) question then.

Jan




* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 11:45               ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-14 11:45 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich, xen-devel
  Cc: Stefano Stabellini, Andrii Anisov, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, Wei Liu,
	Roger Pau Monné



On 14.05.19 14:24, Julien Grall wrote:
>>> I think there are an agreement that the two methods should not be used together.
>>
>> For a domain or for a particular VCPU?
>> How should we response on the request to register paddr when we already have registered vaddr and vice versa?
> 
>  From the discussion with Jan, you would tear down the previous solution and then
> register the new version. So this allows cases like a bootloader and a kernel using different version of the interface.

I'm not sure Jan stated that; he rather questioned it.

Jan, could you please confirm you agree that an already registered runstate area (and, potentially, a changed interface version) should be dropped on a new registration request?

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 12:02                 ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-14 12:02 UTC (permalink / raw)
  To: andrii.anisov
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

>>> On 14.05.19 at 13:45, <andrii.anisov@gmail.com> wrote:
> On 14.05.19 14:24, Julien Grall wrote:
>>>> I think there are an agreement that the two methods should not be used together.
>>>
>>> For a domain or for a particular VCPU?
>>> How should we response on the request to register paddr when we already have registered vaddr and vice versa?
>> 
>>  From the discussion with Jan, you would tear down the previous solution and then
>> register the new version. So this allows cases like a bootloader and a 
> kernel using different version of the interface.
> 
> I'm not sure Jan stated that, he rather questioned that.
> 
> Jan, could you please confirm you agree that it should be dropped already 
> registered runstate and (potentially) changed version in case of the new 
> register request?

Well, I think Julien's implication was that we can't support in particular
the boot loader -> kernel handover case without extra measures (if
at all), and hence he added things together and said what results
from this. Of course ideally we'd reject mixed requests, but unless
someone can come up with a clever means of determining entity
boundaries, I'm afraid this is not going to be possible without breaking
certain setups.

Jan




* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 13:05                   ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-14 13:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne



On 14.05.19 15:02, Jan Beulich wrote:
> Well, I think Julian's implication was that we can't support in particular
> the boot loader -> kernel handover case without extra measures (if
> at all), and hence he added things together and said what results
> from this. Of course ideally we'd reject mixed requests, but unless
> someone can come up with a clever means of how to determine entity
> boundaries I'm afraid this is not going to be possible without breaking
> certain setups.

From my understanding, if we are speaking of different entities in a domain and their boundaries, we have to define an unregister interface as well, so that those entities would be able to take care of themselves explicitly.

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 13:49                     ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-14 13:49 UTC (permalink / raw)
  To: Andrii Anisov, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	andrii_anisov, Roger Pau Monne



On 14/05/2019 14:05, Andrii Anisov wrote:
> 
> 
> On 14.05.19 15:02, Jan Beulich wrote:
>> Well, I think Julian's implication was that we can't support in particular
>> the boot loader -> kernel handover case without extra measures (if
>> at all), and hence he added things together and said what results
>> from this. Of course ideally we'd reject mixed requests, but unless
>> someone can come up with a clever means of how to determine entity
>> boundaries I'm afraid this is not going to be possible without breaking
>> certain setups.
> 
>  From my understanding, if we are speaking of different entities in a domain and 
> their boundaries, we have to define unregister interface as well. So that those 
> entities would be able to take care of themselves explicitly.

You have to keep in mind that existing OSes have to run on newer Xen without any
modification.

The existing hypercall allows you to:
    1) De-register the interface by passing the value 0.
    2) Replace the currently registered interface.

You probably can't use 2) for a bootloader -> kernel handover because we are
dealing with guest virtual addresses. There is a high chance the virtual address
space layout is going to be different, or the MMU may even be turned off for a
bit (done on Arm). So you have to use 1), otherwise you might write to a random
place in memory.

I am not entirely sure whether there is actual value in 2). The only reason I
can think of is if you want to move the runstate around in your virtual address
space. But that sounds a bit weird, at least on Arm.

For the new hypercall, I think we at least want 1) (with a magic value TBD). 2)
might be helpful in case the bootloader didn't do the right thing or we are
using kexec to boot a new kernel. This would also be safer, as a physical
address could be excluded more easily.

2) should not be too difficult to implement. It is just a matter of cleaning up
whatever was used previously before registering the new interface.
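
To make 1) and 2) concrete, here is a guest-side sketch (Linux-flavoured; the
argument structure accepted by the proposed
VCPUOP_register_runstate_phys_memory_area is an assumption here, not a settled
ABI):

    #include <xen/interface/vcpu.h>
    #include <asm/xen/hypercall.h>

    /* Register the runstate area by guest virtual address (existing interface). */
    static int register_runstate_vaddr(int cpu, struct vcpu_runstate_info *info)
    {
        struct vcpu_register_runstate_memory_area area = { .addr.v = info };

        return HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, cpu, &area);
    }

    /* 1) De-register by passing 0/NULL, e.g. before handing over to a kernel. */
    static int unregister_runstate_vaddr(int cpu)
    {
        struct vcpu_register_runstate_memory_area area = { .addr.v = NULL };

        return HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area, cpu, &area);
    }

    /* Proposed: register by guest physical address; a magic value would
     * similarly mean "de-register" (exact value TBD in this thread). */
    static int register_runstate_paddr(int cpu, uint64_t paddr)
    {
        struct vcpu_register_runstate_memory_area area = { .addr.p = paddr };

        return HYPERVISOR_vcpu_op(VCPUOP_register_runstate_phys_memory_area, cpu, &area);
    }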

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-14 13:49                     ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-14 13:49 UTC (permalink / raw)
  To: andrii.anisov
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

>>> On 14.05.19 at 15:05, <andrii.anisov@gmail.com> wrote:
> On 14.05.19 15:02, Jan Beulich wrote:
>> Well, I think Julian's implication was that we can't support in particular
>> the boot loader -> kernel handover case without extra measures (if
>> at all), and hence he added things together and said what results
>> from this. Of course ideally we'd reject mixed requests, but unless
>> someone can come up with a clever means of how to determine entity
>> boundaries I'm afraid this is not going to be possible without breaking
>> certain setups.
> 
>  From my understanding, if we are speaking of different entities in a domain 
> and their boundaries, we have to define unregister interface as well. So that 
> those entities would be able to take care of themselves explicitly.

If this were a concern only for newly written code, this would be fine.
But you need to make sure all existing code also continues to work
with whatever new interface you implement. Just because a kernel
uses your new physical-address-based mechanism doesn't mean the
boot loader knows to unregister what it has registered.

Jan




* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-15  8:44                       ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-15  8:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

Hello Jan,

On 14.05.19 16:49, Jan Beulich wrote:
> If this was a concern only for newly written code, this would be fine.
> But you need to make sure all existing code also continues to work
> with whatever new interface you implement.

And that is one more reason why I tend to introduce the new interface in parallel, fully independent from the old one,
rather than the mixed implementation you and Julien suggest.

> Just because a kernel
> uses your new physical address based mechanism doesn't mean the
> boot loader knows to unregister what it has registered.

As Julien already said, the current interface has an implicit mechanism to unregister the runstate area.
Also, if the bootloader fails to unregister the runstate area with the current interface, we already have a broken system.
So it is really up to the guest system to take care of its own transitions.

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-15  9:04                       ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-15  9:04 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	andrii_anisov, Roger Pau Monne



On 14.05.19 16:49, Julien Grall wrote:
> You have to keep in mind that existing OS have to run on newer Xen without any modification.

As I have just written to Jan, it is one more reason to keep those interfaces living in parallel and not mix their implementations.

> The existing hypercall allows you to:
>     1) De-register an interface using the value 0.

My current implementation can easily be updated with the same behavior.

>     2) Replacing a current existing interface

> You probably can't use 2) for a bootloader -> kernel handover because we are dealing with guest virtual address. There is an high chance the virtual address space layout is going to be different or even turning off MMU for a bit (done on Arm). So you have to use 1) otherwise you might write in a random place in memory.

This is definitely not the way to handle transitions between systems in a guest domain.

> I am not entirely sure whether there are actual value for 2). The only reason I can think of is if you want to move around the runstate in your virtual address space. But that's sounds a bit weird at least on Arm.
> For the new hypercall, I think we at least want 1) (with a magic value TBD). 

The magic value 0x0 can easily be introduced.

>  2) might be helpful in the case the bootloader didn't do the right thing or we are using Kexec to boot a new kernel. This would also be safer as physical address could be excluded more easily.

But the new system has to learn that the previous phys addr is reserved (used by the hypervisor), and must not use it prior to registering a new runstate area.
Providing such knowledge is something the bootloader, for example, should take care of. But, IMO, it is better to require the bootloader (for example) to unregister its runstate area prior to switching to the new system.

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-15 10:31                         ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-15 10:31 UTC (permalink / raw)
  To: Andrii Anisov, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel,
	andrii_anisov, Roger Pau Monne

Hi Andrii,

On 15/05/2019 10:04, Andrii Anisov wrote:
> 
> 
> On 14.05.19 16:49, Julien Grall wrote:
>> You have to keep in mind that existing OS have to run on newer Xen without any 
>> modification.
> 
> As I just written to Jan, it is one more reason to keep those interfaces living 
> in parallel and do not mix their implementation.

There is actually no good reason for a guest to register via the two interfaces
at the same time. All the more so as we want to encourage OS developers to
switch to the new interface.

I also provided in my previous e-mails a way to make the two work together
without much trouble.

> 
>> The existing hypercall allows you to:
>>     1) De-register an interface using the value 0.
> 
> My current implementation can easily be updated with the same behavior.
> 
>>     2) Replacing a current existing interface
> 
>> You probably can't use 2) for a bootloader -> kernel handover because we are 
>> dealing with guest virtual address. There is an high chance the virtual 
>> address space layout is going to be different or even turning off MMU for a 
>> bit (done on Arm). So you have to use 1) otherwise you might write in a random 
>> place in memory.
> 
> This definitely not the way to handle transitions between systems in a guest 
> domain.
> 
>> I am not entirely sure whether there are actual value for 2). The only reason 
>> I can think of is if you want to move around the runstate in your virtual 
>> address space. But that's sounds a bit weird at least on Arm.
>> For the new hypercall, I think we at least want 1) (with a magic value TBD). 
> 
> The magic value 0x0 can easily be introduced.

0x0 is not an option. It could be a valid physical address. We need a value that 
cannot be used by anyone.
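
As an illustration only (the actual choice is exactly what is being discussed
here), the magic could be an address no guest frame can ever have, e.g.:

    /* Hypothetical unregister magic for the phys-address interface. */
    #define RUNSTATE_PADDR_UNREGISTER   (~(uint64_t)0)   /* all-ones GPA */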

> 
>>  2) might be helpful in the case the bootloader didn't do the right thing or 
>> we are using Kexec to boot a new kernel. This would also be safer as physical 
>> address could be excluded more easily.
> 
> But the new system have to get some knowledge about the previous phys addr is 
> reserved (used by hypervisor), and do not use it prior to registering new 
> runstate area.
> Providing such a knowledge is something (e.g.) the bootloader should take care 
> of. But, IMO, it is better to require from (e.g.) the bootloader to unregister 
> its runstate area prior to switching to the new system.

Well, if a bootloader keeps some parts in memory (such as for handling runtime
services), it will usually mark those pages as reserved, so they can't be used
by the kernel.

But here, the point is it would not be difficult to handle 2). So why would you 
try to forbid it?

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-15 11:59                         ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-15 11:59 UTC (permalink / raw)
  To: andrii.anisov
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

>>> On 15.05.19 at 10:44, <andrii.anisov@gmail.com> wrote:
> On 14.05.19 16:49, Jan Beulich wrote:
>> If this was a concern only for newly written code, this would be fine.
>> But you need to make sure all existing code also continues to work
>> with whatever new interface you implement.
> 
> And that is one more reason why I tend to introduce the new interface in 
> parallel to be fully independent from the old one.
> But not do a mixed implementation as you and Julien suggest.

What behavior guests see and how it is implemented in the hypervisor
are two largely independent things. That is, we could choose to forbid
mixing of registration methods while still having some level of code
sharing in how both hypercall variants are actually processed.
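
A sketch of that separation (illustrative only; the runstate_reg field and the
helper are made-up names, not the patch):

    enum runstate_reg { RUNSTATE_REG_NONE, RUNSTATE_REG_VADDR, RUNSTATE_REG_PADDR };

    /* Common entry point for both hypercall variants. */
    static int runstate_register(struct vcpu *v, enum runstate_reg kind,
                                 uint64_t addr)
    {
        /* Guest-visible policy: no mixing of the two registration methods. */
        if ( v->runstate_reg != RUNSTATE_REG_NONE && v->runstate_reg != kind )
            return -EBUSY;

        /* Shared implementation: validation, (re)mapping, bookkeeping, ... */
        v->runstate_reg = kind;

        return 0;
    }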

Jan




* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 12:09     ` Jan Beulich
  0 siblings, 0 replies; 83+ messages in thread
From: Jan Beulich @ 2019-05-16 12:09 UTC (permalink / raw)
  To: andrii.anisov
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

>>> On 23.04.19 at 10:10, <andrii.anisov@gmail.com> wrote:
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -163,15 +163,23 @@ struct vcpu
>      void            *sched_priv;    /* scheduler-specific data */
>  
>      struct vcpu_runstate_info runstate;
> +
> +    spinlock_t      mapped_runstate_lock;

Besides other comments given elsewhere already - does this
really need to be a per-vCPU lock? Guests aren't expected to
invoke the API frequently, so quite likely a per-domain lock
would suffice. Quite possibly domain_{,un}lock() could
actually be (re-)used.
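
A sketch of that suggestion (illustrative; map_runstate_area() is a made-up
helper name):

    /* Serialise the (rare) registration hypercall with the per-domain lock. */
    static long register_runstate(struct vcpu *v, paddr_t gaddr)
    {
        long rc;

        domain_lock(v->domain);
        rc = map_runstate_area(v, gaddr);
        domain_unlock(v->domain);

        return rc;
    }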

Jan




* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 13:30       ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-16 13:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel,
	Julien Grall, xen-devel, andrii_anisov, Roger Pau Monne

Hello Jan,

On 16.05.19 15:09, Jan Beulich wrote:
>>>> On 23.04.19 at 10:10, <andrii.anisov@gmail.com> wrote:
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -163,15 +163,23 @@ struct vcpu
>>       void            *sched_priv;    /* scheduler-specific data */
>>   
>>       struct vcpu_runstate_info runstate;
>> +
>> +    spinlock_t      mapped_runstate_lock;
> 
> Besides other comments given elsewhere already - does this
> really need to be a per-vCPU lock? Guests aren't expected to
> invoke the API frequently, so quite likely a per-domain lock
> would suffice. Quite possibly domain_{,un}lock() could
> actually be (re-)used.

I'd not use a per-domain lock here. This lock will be taken on every runstate area update, which happens on every context switch. And simultaneous switching of several vCPUs from the same domain is quite likely.
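
For context, a sketch of where the lock sits in the update path (simplified;
the XEN_RUNSTATE_UPDATE marking done by the real code is elided):

    static void update_runstate_area(struct vcpu *v)
    {
        spin_lock(&v->mapped_runstate_lock);

        if ( v->mapped_runstate.native )
            memcpy(v->mapped_runstate.native, &v->runstate,
                   sizeof(v->runstate));

        spin_unlock(&v->mapped_runstate_lock);
    }

With a per-domain lock the same critical section would serialise context
switches of all vCPUs of the domain.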

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 13:48         ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-16 13:48 UTC (permalink / raw)
  To: Andrii Anisov, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel, xen-devel,
	andrii_anisov, Roger Pau Monne



On 16/05/2019 14:30, Andrii Anisov wrote:
> Hello Jan,
> 
> On 16.05.19 15:09, Jan Beulich wrote:
>>>>> On 23.04.19 at 10:10, <andrii.anisov@gmail.com> wrote:
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -163,15 +163,23 @@ struct vcpu
>>>       void            *sched_priv;    /* scheduler-specific data */
>>>       struct vcpu_runstate_info runstate;
>>> +
>>> +    spinlock_t      mapped_runstate_lock;
>>
>> Besides other comments given elsewhere already - does this
>> really need to be a per-vCPU lock? Guests aren't expected to
>> invoke the API frequently, so quite likely a per-domain lock
>> would suffice. Quite possibly domain_{,un}lock() could
>> actually be (re-)used.
> 
> I'd not use a per-domain lock here. This lock will be locked on every runstate 
> area update, what's happening on every context switch. And the event of 
> simultaneous switching of several vcpus from the same domain has quite high 
> probability.

The lock can be completely removed anyway. See my previous comments.

Cheers,

-- 
Julien Grall


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 14:25           ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-16 14:25 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel, xen-devel,
	andrii_anisov, Roger Pau Monne

Hello Julien,

On 16.05.19 16:48, Julien Grall wrote:
> The lock can be completely removed anyway. See my previous comments.

You suggested a kind of simplified try_lock, with the runstate update skipped in case of failure.
The question here is whether we are OK with not updating the runstate under some circumstances.
Even in this case, the question might be whether we need `runstate_in_use` per vCPU or per domain. My answer is: per vCPU.
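
For illustration, the skip-on-contention variant could look roughly like this
(a sketch only; the flag handling on the registration side is not shown, and
runstate_in_use is assumed to be an unsigned long in struct vcpu):

    static void update_runstate_area(struct vcpu *v)
    {
        /* Registration currently owns the area: skip this update. */
        if ( test_and_set_bit(0, &v->runstate_in_use) )
            return;

        if ( v->mapped_runstate.native )
            memcpy(v->mapped_runstate.native, &v->runstate,
                   sizeof(v->runstate));

        clear_bit(0, &v->runstate_in_use);
    }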

-- 
Sincerely,
Andrii Anisov.


* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 14:28             ` Julien Grall
  0 siblings, 0 replies; 83+ messages in thread
From: Julien Grall @ 2019-05-16 14:28 UTC (permalink / raw)
  To: Andrii Anisov, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel, xen-devel,
	andrii_anisov, Roger Pau Monne

Hi Andrii,

On 16/05/2019 15:25, Andrii Anisov wrote:
> Hello Julien,
> 
> On 16.05.19 16:48, Julien Grall wrote:
>> The lock can be completely removed anyway. See my previous comments.
> 
> You suggested kinda simplified try_lock with runstate update skipping in case of 
> fail.
> The question here is if we are OK with not updating runstate under circumstances?

Well, if the check fails it means someone is modifying the area behind your 
back. That someone can refresh the runstate with the current information once 
it is done. So I can't see an issue with not updating the runstate in the 
context switch (see the sketch after this message).

Cheers,

-- 
Julien Grall
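
Continuing the assumed names from the earlier sketch, the registration path
can simply republish the runstate once its new mapping is in place, so a
skipped context-switch update is not simply lost:

static int map_runstate_area(struct vcpu *v, paddr_t gaddr)
{
    while ( test_and_set_bool(v->runstate_in_use) )
        cpu_relax();

    /* ... switch the mapping to the page containing gaddr ... */

    /*
     * Publish the current runstate before releasing the flag, so any
     * context-switch update skipped while we held it is not lost.
     */
    memcpy(v->mapped_runstate, &v->runstate, sizeof(v->runstate));

    v->runstate_in_use = false;
    return 0;
}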

* Re: [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area
@ 2019-05-16 14:29               ` Andrii Anisov
  0 siblings, 0 replies; 83+ messages in thread
From: Andrii Anisov @ 2019-05-16 14:29 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, xen-devel, xen-devel,
	andrii_anisov, Roger Pau Monne



On 16.05.19 17:28, Julien Grall wrote:
> Well, if the check fails it means someone is modifying the area behind your back. That someone can refresh the runstate with the current information once it is done. So I can't see an issue with not updating the runstate in the context switch.

That's fair enough.

-- 
Sincerely,
Andrii Anisov.

end of thread, other threads:[~2019-05-16 14:29 UTC | newest]

Thread overview: 83+ messages
2019-04-23  8:10 [PATCH v2 0/2] Introduce runstate area registration with phys address Andrii Anisov
2019-04-23  8:10 ` [PATCH v2 1/2] xen: introduce VCPUOP_register_runstate_phys_memory_area hypercall Andrii Anisov
2019-04-23  8:10   ` [Xen-devel] " Andrii Anisov
2019-05-08 10:10   ` George Dunlap
2019-05-08 10:10     ` [Xen-devel] " George Dunlap
2019-04-23  8:10 ` [PATCH v2 2/2] xen: implement VCPUOP_register_runstate_phys_memory_area Andrii Anisov
2019-04-23  8:10   ` [Xen-devel] " Andrii Anisov
2019-05-08 15:40   ` Julien Grall
2019-05-08 15:40     ` [Xen-devel] " Julien Grall
2019-05-09  9:27     ` Jan Beulich
2019-05-09  9:27       ` [Xen-devel] " Jan Beulich
2019-05-14  9:35       ` Julien Grall
2019-05-14  9:35         ` [Xen-devel] " Julien Grall
2019-05-14  9:48         ` Jan Beulich
2019-05-14  9:48           ` [Xen-devel] " Jan Beulich
2019-05-14 11:23           ` Julien Grall
2019-05-14 11:23             ` [Xen-devel] " Julien Grall
2019-05-14 11:29             ` Jan Beulich
2019-05-14 11:29               ` [Xen-devel] " Jan Beulich
2019-05-13 12:30     ` Andrii Anisov
2019-05-13 12:30       ` [Xen-devel] " Andrii Anisov
2019-05-14  9:58       ` Julien Grall
2019-05-14  9:58         ` [Xen-devel] " Julien Grall
2019-05-14 10:08         ` Andrii Anisov
2019-05-14 10:08           ` [Xen-devel] " Andrii Anisov
2019-05-14 11:24           ` Julien Grall
2019-05-14 11:24             ` [Xen-devel] " Julien Grall
2019-05-14 11:45             ` Andrii Anisov
2019-05-14 11:45               ` [Xen-devel] " Andrii Anisov
2019-05-14 12:02               ` Jan Beulich
2019-05-14 12:02                 ` [Xen-devel] " Jan Beulich
2019-05-14 13:05                 ` Andrii Anisov
2019-05-14 13:05                   ` [Xen-devel] " Andrii Anisov
2019-05-14 13:49                   ` Julien Grall
2019-05-14 13:49                     ` [Xen-devel] " Julien Grall
2019-05-15  9:04                     ` Andrii Anisov
2019-05-15  9:04                       ` [Xen-devel] " Andrii Anisov
2019-05-15 10:31                       ` Julien Grall
2019-05-15 10:31                         ` [Xen-devel] " Julien Grall
2019-05-14 13:49                   ` Jan Beulich
2019-05-14 13:49                     ` [Xen-devel] " Jan Beulich
2019-05-15  8:44                     ` Andrii Anisov
2019-05-15  8:44                       ` [Xen-devel] " Andrii Anisov
2019-05-15 11:59                       ` Jan Beulich
2019-05-15 11:59                         ` [Xen-devel] " Jan Beulich
2019-05-16 12:09   ` Jan Beulich
2019-05-16 12:09     ` [Xen-devel] " Jan Beulich
2019-05-16 13:30     ` Andrii Anisov
2019-05-16 13:30       ` [Xen-devel] " Andrii Anisov
2019-05-16 13:30     ` Andrii Anisov
2019-05-16 13:30       ` [Xen-devel] " Andrii Anisov
2019-05-16 13:48       ` Julien Grall
2019-05-16 13:48         ` [Xen-devel] " Julien Grall
2019-05-16 14:25         ` Andrii Anisov
2019-05-16 14:25           ` [Xen-devel] " Andrii Anisov
2019-05-16 14:28           ` Julien Grall
2019-05-16 14:28             ` [Xen-devel] " Julien Grall
2019-05-16 14:29             ` Andrii Anisov
2019-05-16 14:29               ` [Xen-devel] " Andrii Anisov
     [not found] ` <fa126315-31af-854e-817a-8640b431c82b@arm.com>
     [not found]   ` <CAC1WxdiMzAq5hRC-mhRQuFDs7z_Hj5w7VAy52ec87SJQOGmp3w@mail.gmail.com>
     [not found]     ` <a28f95a1-d9da-2caf-f4b4-013100176b02@arm.com>
     [not found]       ` <090ce8cc-f329-fe54-4894-b7f12e3cd5a6@gmail.com>
2019-05-08 13:39         ` [PATCH v2 0/2] Introduce runstate area registration with phys address Julien Grall
2019-05-08 13:39           ` [Xen-devel] " Julien Grall
2019-05-08 13:54           ` Andrii Anisov
2019-05-08 13:54             ` [Xen-devel] " Andrii Anisov
2019-05-08 14:31             ` Julien Grall
2019-05-08 14:31               ` [Xen-devel] " Julien Grall
2019-05-08 16:01               ` Andrii Anisov
2019-05-08 16:01                 ` [Xen-devel] " Andrii Anisov
2019-05-13 10:50                 ` Julien Grall
2019-05-13 10:50                   ` [Xen-devel] " Julien Grall
2019-05-13 14:34                   ` Andrii Anisov
2019-05-13 14:34                     ` [Xen-devel] " Andrii Anisov
2019-05-08 13:59 ` Julien Grall
2019-05-13 10:15   ` Andrii Anisov
2019-05-13 11:16     ` Julien Grall
2019-05-13 14:14       ` Andrii Anisov
2019-05-13 14:34         ` Julien Grall
2019-05-13 15:29           ` Andrii Anisov
2019-05-13 15:31             ` Julien Grall
2019-05-13 15:38               ` Andrii Anisov
2019-05-13 15:40                 ` Julien Grall
2019-05-13 15:42                   ` Andrii Anisov
2019-05-13 15:45                     ` Julien Grall
2019-05-13 16:05                       ` Andrii Anisov
